MRI-Q project: Overall description

This project computes a matrix Q, which represents the scanner configuration and is used in a 3D magnetic resonance image reconstruction algorithm in non-Cartesian space.
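To give a feel for the workload, here is a minimal sequential sketch of the Q computation. It assumes the formulation from the paper referenced in this description, where each voxel accumulates |phi(k)|^2 * exp(i * 2 * pi * k.x) over all k-space samples. The function name and the tuple layouts are hypothetical and are not taken from the starter code. Python floats are double precision, so the numeric results are illustrative only.

```python
import math

def compute_q(samples, voxels):
    """Sequentially accumulate Q for every voxel.

    samples: list of (kx, ky, kz, phi_r, phi_i) tuples (assumed layout)
    voxels:  list of (x, y, z) tuples (assumed layout)
    Returns one (q_real, q_imag) pair per voxel.
    """
    result = []
    for (x, y, z) in voxels:
        q_r = 0.0
        q_i = 0.0
        # The inner loop visits the samples in input order.
        for (kx, ky, kz, phi_r, phi_i) in samples:
            phi_mag = phi_r * phi_r + phi_i * phi_i  # |phi(k)|^2
            arg = 2.0 * math.pi * (kx * x + ky * y + kz * z)
            q_r += phi_mag * math.cos(arg)
            q_i += phi_mag * math.sin(arg)
        result.append((q_r, q_i))
    return result
```

Each voxel is independent of the others, and the inner loop over samples is where almost all of the arithmetic happens, which is why the problem suits a GPU.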

See also: Sam S. Stone, Justin P. Haldar, Stephanie C. Tsao, Wen-Mei W. Hwu, Zhi-Pei Liang, and Bradley P. Sutton. "Accelerating Advanced MRI Reconstructions on GPUs." In Computing Frontiers, 2008.

Download the starter code, which is the sequential version of the program. Your task is to accelerate it using GPGPUs. Read the paper above for ideas. Your goal is to make the GPU kernel execution as fast as you can, subject to the following restriction.

The results must be deterministic and match the result of the sequential code (within rounding errors). This means you may not use the fast math versions of sin and cos, and the order of accumulation operations must be the same. While some optimizations can trade off accuracy for speed, we're asking you to maintain current semantics exactly.
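The reason accumulation order matters is that floating-point addition is not associative. The short sketch below illustrates this with double-precision values chosen to make the effect obvious; the function name is hypothetical and the numbers are not related to the MRI-Q data.

```python
def accumulate(values):
    """Sum the values strictly left to right, like a sequential loop."""
    total = 0.0
    for v in values:
        total += v
    return total

# Same three numbers, two different orders.
in_order = accumulate([1e16, 1.0, -1e16])   # the 1.0 is absorbed by 1e16
reordered = accumulate([1e16, -1e16, 1.0])  # the large terms cancel first

print(in_order, reordered)
print(in_order == reordered)
```

The two sums differ even though they add the same numbers, so a parallel reduction that regroups the additions can change the result in the same way.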

The given interface for the application is as follows.

You must specify the input file using the -i option. The dataset directory includes input files of three different sizes.

You may specify the -S option to get more accurate timings (it inserts synchronization after non-blocking events). This is how we will measure your final speed.

You may specify an output file using the -o option. You can then analyse the output file however you like, including comparing it to other output files using the Python script in the tools directory.
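If you want to write your own comparison, the idea can be sketched as below. This is not the script from the tools directory: the function names and the tolerance values are assumptions, and reading the output file is left out because its format is not described here. The sketch works on two sequences of already-parsed floating-point values.

```python
import math

def first_mismatch(reference, candidate, rel_tol=1e-6, abs_tol=1e-9):
    """Return the index of the first differing value, or None if all match.

    rel_tol and abs_tol are placeholder tolerances; pick values that
    reflect the rounding error you consider acceptable.
    """
    for i, (r, c) in enumerate(zip(reference, candidate)):
        if not math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol):
            return i
    return None

def outputs_match(reference, candidate, rel_tol=1e-6, abs_tol=1e-9):
    """True when both sequences have equal length and all values agree."""
    if len(reference) != len(candidate):
        return False
    return first_mismatch(reference, candidate, rel_tol, abs_tol) is None
```

Reporting the index of the first mismatch, rather than only pass or fail, makes it easier to tell whether a bug affects every voxel or only those at a block boundary.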

You may specify an integer as the last command-line parameter to limit the number of input samples used. This can be useful for testing or verifying your code in a shorter amount of time. For reference, we also provide correct output files for runs using 512 or 10000 samples. Keep in mind that your optimizations should not put restrictions on the number of samples you may be provided with as input, although you could pad the input or otherwise handle odd sizes internally.
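One way to handle a sample count that is not a multiple of your chunk size is to pad the sample list with entries whose phi components are zero, so that they add nothing to the sums. The sketch below shows the idea under assumed names and an assumed tuple layout (kx, ky, kz, phi_r, phi_i); it is not taken from the starter code, and a bounds check in the kernel is an equally valid alternative.

```python
import math

def pad_samples(samples, chunk_size):
    """Append zero-magnitude samples until len is a multiple of chunk_size."""
    remainder = len(samples) % chunk_size
    if remainder == 0:
        return list(samples)
    padding = [(0.0, 0.0, 0.0, 0.0, 0.0)] * (chunk_size - remainder)
    return list(samples) + padding

def accumulate_q(samples, voxel):
    """Sequentially accumulate (q_real, q_imag) for a single voxel."""
    x, y, z = voxel
    q_r = 0.0
    q_i = 0.0
    for (kx, ky, kz, phi_r, phi_i) in samples:
        phi_mag = phi_r * phi_r + phi_i * phi_i
        arg = 2.0 * math.pi * (kx * x + ky * y + kz * z)
        q_r += phi_mag * math.cos(arg)
        q_i += phi_mag * math.sin(arg)
    return (q_r, q_i)

samples = [(0.1, 0.2, 0.3, 1.0, 0.5),
           (0.4, 0.1, 0.2, 0.3, 0.7),
           (0.2, 0.2, 0.2, 0.9, 0.1)]
padded = pad_samples(samples, 4)
voxel = (1.0, 2.0, 3.0)
print(len(padded))
print(accumulate_q(samples, voxel) == accumulate_q(padded, voxel))
```

Because the padding sits at the end and each padded term is exactly zero, the real samples are still accumulated in their original order and the result is unchanged.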

Your report should detail all optimizations you tried, including those that were ultimately abandoned or that worsened performance. For every optimization tried, each entry should note:

Grading:

Your submission will be graded on the following parameters.

Demo/Knowledge: 25%

Demo/Functionality: 40%

Report: 35%

The project will be graded based on a one-on-one demo to the instructor, as well as the proposal and report. The project is due in the middle of the final week, when the grading interviews start. If you need to be graded earlier (for example, due to travel plans), I can work with your schedule.