Notes on FreeSurfer code optimization

This page is for free-form notes on ways to optimize the FreeSurfer code base, from simple fixes to larger-scale problems.

Format is: name:, <short label> - description

* nick:, -ffast-math - Try the -ffast-math flag of gcc v4.x. Prior experiments with this on the AMD compiler showed output differences in recon-all, but perhaps selective use of this flag is possible.

* nick:, SSE math lib - Replace instances of sin, cos, log and exp with routines optimized for the SSE instructions found on Intel processors. See http://gruntthepeon.free.fr/ssemath/. Tried this, but ran into problems and gave up. With some wrangling, it could become an optional build-time choice via #ifdefs. Actually, it's not a lib but a header file.

* Richard:, MRI data structure - The pointer-chasing implied by ***slices is horrific for the CPU caches (and a non-starter on the GPU). The 'chunking' alternative is much better, but needs to be used uniformly.

* Richard:, MATRIX data type - If this is only used for 4x4 affine transformations, it should be coded as such. If it is also used more generally, then a separate 'Affine' class should be considered (one already exists for the GPU in the file affinegpu.cu).
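
On the -ffast-math note: one general reason the flag can change numerical output is that it lets the compiler reorder floating-point arithmetic. The sketch below is a textbook illustration, not code from FreeSurfer, and nothing here claims this is what caused the recon-all differences. It compares a plain sum with Kahan (compensated) summation; the function names are made up for the example.

```c
#include <stddef.h>

/* Plain left-to-right accumulation in single precision. */
static float naive_sum(const float *x, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum;
}

/* Kahan (compensated) summation: c carries the rounding error
 * lost by each addition and feeds it back into the next one. */
static float kahan_sum(const float *x, size_t n)
{
    float sum = 0.0f;
    float c = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float y = x[i] - c;
        float t = sum + y;
        /* Algebraically this is zero, so a compiler that is allowed to
         * reassociate (as -ffast-math permits) may simplify it away. */
        c = (t - sum) - y;
        sum = t;
    }
    return sum;
}
```

Built without -ffast-math on a platform that evaluates float expressions in single precision (x86-64 with SSE, for example), the input {1.0f, 3e-8f, 3e-8f, 3e-8f, 3e-8f} gives exactly 1.0f from naive_sum and a value just above 1.0f from kahan_sum. With -ffast-math the two may agree. If selective use is wanted, one option is to pass the flag only to chosen source files in the build system and compare their output against a reference build.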
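
On the MRI data structure note: a minimal sketch of what a chunked layout could look like, assuming a single voxel type and x-fastest ordering. The Volume struct and the volume_* functions are hypothetical stand-ins for illustration, not the actual FreeSurfer MRI structure or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical volume type: one contiguous block instead of ***slices. */
typedef struct {
    int width, height, depth;
    float *chunk;              /* width * height * depth voxels, x fastest */
} Volume;

static Volume *volume_alloc(int width, int height, int depth)
{
    Volume *v = malloc(sizeof *v);
    if (!v)
        return NULL;
    v->width = width;
    v->height = height;
    v->depth = depth;
    v->chunk = calloc((size_t)width * height * depth, sizeof *v->chunk);
    if (!v->chunk) {
        free(v);
        return NULL;
    }
    return v;
}

static void volume_free(Volume *v)
{
    if (v) {
        free(v->chunk);
        free(v);
    }
}

/* One multiply-add chain replaces three dependent pointer loads. */
static size_t volume_index(const Volume *v, int x, int y, int z)
{
    assert(x >= 0 && x < v->width);
    assert(y >= 0 && y < v->height);
    assert(z >= 0 && z < v->depth);
    return ((size_t)z * v->height + y) * v->width + x;
}

/* Whole-volume passes become a single linear sweep over memory. */
static double volume_sum(const Volume *v)
{
    size_t n = (size_t)v->width * v->height * v->depth;
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v->chunk[i];
    return sum;
}
```

Because the voxels sit in one allocation, the same buffer can in principle be copied to the GPU in a single transfer, which is not possible when every row is a separate allocation reached through pointers.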
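
On the MATRIX data type note: a rough sketch of a dedicated fixed-size affine type, assuming row-major storage and a bottom row of (0, 0, 0, 1). Affine4 and its functions are hypothetical names for this example; they are not taken from affinegpu.cu or from the existing MATRIX code.

```c
/* Hypothetical fixed-size affine transform, row-major. No heap
 * allocation and no runtime dimensions, so loops can be unrolled. */
typedef struct {
    double m[4][4];
} Affine4;

static Affine4 affine4_identity(void)
{
    Affine4 a = {{{1, 0, 0, 0},
                  {0, 1, 0, 0},
                  {0, 0, 1, 0},
                  {0, 0, 0, 1}}};
    return a;
}

static Affine4 affine4_translation(double tx, double ty, double tz)
{
    Affine4 a = affine4_identity();
    a.m[0][3] = tx;
    a.m[1][3] = ty;
    a.m[2][3] = tz;
    return a;
}

/* Returns a * b, i.e. the transform that applies b first, then a. */
static Affine4 affine4_multiply(const Affine4 *a, const Affine4 *b)
{
    Affine4 out;
    for (int r = 0; r < 4; r++) {
        for (int c = 0; c < 4; c++) {
            double s = 0.0;
            for (int k = 0; k < 4; k++)
                s += a->m[r][k] * b->m[k][c];
            out.m[r][c] = s;
        }
    }
    return out;
}

/* Transforms a 3D point; assumes the bottom row is (0, 0, 0, 1),
 * so only the top three rows are read. */
static void affine4_apply(const Affine4 *a, const double in[3], double out[3])
{
    for (int r = 0; r < 3; r++)
        out[r] = a->m[r][0] * in[0] + a->m[r][1] * in[1]
               + a->m[r][2] * in[2] + a->m[r][3];
}
```

Passing the struct by value or by const pointer keeps the interface free of allocation and dimension checks, which a general-purpose matrix type typically needs on every call.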