= CUDA Developers Guide =

''See also:'' [[GpuDevelopersGuide]], [[CudaPortNotes]], ''and'' [[http://wiki.accelereyes.com/wiki/index.php/Installing_CUDA_Under_Ubuntu_10.04|Installing CUDA under Ubuntu 10.04]]

This is the root of the CUDA development documentation for FreeSurfer. All CUDA development pages should be created underneath it. For instance, a new `CUDATesting` page should live at `http://surfer.nmr.mgh.harvard.edu/fswiki/CUDADevelopersGuide/CUDATesting`, and a link to that page should be added here.

<<TableOfContents(2)>>

== Enabling CUDA in the Build Environment ==

The [[DevelopersGuide/CUDAEnabling|CUDAEnabling]] page describes the changes that have been made to the build environment to support CUDA, and the changes you need to make to add your own CUDA-enabled binary to it.

== Development Notes ==

 * [[mrisurf_CUDA|mrisurf and CUDA]]

== Running within recon-all ==

The convention for running CUDA-enabled executables within recon-all is as follows. Rather than giving a single executable a `--use-cuda` switch, each GPU-enabled program is built as a separate executable with the suffix `_cuda`, e.g. `mri_em_register_cuda` (paired with `mri_em_register`). The recon-all script itself accepts a `-use-cuda` switch, which makes it run the CUDA-enabled executable in place of the default one (i.e. `mri_em_register_cuda` instead of `mri_em_register`).
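
The naming convention can be illustrated with a minimal shell sketch. It is not taken from recon-all: the helper name `pick_exe` is hypothetical, and the sketch is written in POSIX sh for illustration only. It shows only the rule that a `-use-cuda` switch selects the `_cuda` executable.

{{{
# Hypothetical helper (an assumption, not code from recon-all).
# Usage: pick_exe <base executable> [switches...]
# Prints the name of the executable that should be run.
pick_exe() {
    base="$1"
    shift
    for arg in "$@"; do
        if [ "$arg" = "-use-cuda" ]; then
            # -use-cuda was given: use the GPU-enabled twin
            echo "${base}_cuda"
            return 0
        fi
    done
    # No -use-cuda switch: use the default CPU executable
    echo "$base"
}

pick_exe mri_em_register -use-cuda   # prints mri_em_register_cuda
pick_exe mri_em_register             # prints mri_em_register
}}}

Keeping the choice in one place like this means a stage that has no `_cuda` twin can fall back to the default executable.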

Refer to [[DevelopersGuide/CUDAEnabling|CUDAEnabling]], particularly the Makefile.am example.  

== Benchmarks ==

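Executables with the `_cuda` suffix are the GPU-enabled versions. Speed-ups are rough, order-of-magnitude figures; blank cells have no recorded value.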

||'''CPU'''||'''GPU'''||'''Approximate speed-up'''||
||mri_em_register: 33min ||mri_em_register_cuda: 3min ||10x||
||mri_ca_register||mri_ca_register_cuda|| ||
||mri_watershed: 10min||mri_watershed_cuda||10x||
||mri_ca_label: 60min||mri_ca_label_cuda||15x||
||mri_robust_register: 30min||mri_robust_register_cuda||15x||
||mri_glmfit (monte-carlo sim): 600min||mri_glmfit_cuda||150x||
||mri_glmfit (permutation): 300min||mri_glmfit_cuda||100x||
||mris_fix_topology||mris_fix_topology_cuda||?x||
||mris_sphere: 140min||mris_sphere_cuda: 20min||5x||
||mris_inflate||mris_inflate_cuda||1x||
||mris_flatten||mris_flatten_cuda|| ||
||mris_volmask: 60min||mris_volmask_cuda|| ||


== HPC with CUDA Tutorials ==

[[http://www.nvidia.com/object/SC09_Tutorial.html|NVIDIA's HPC with CUDA Tutorials]]

== CUDA on machines without a GPU card ==

To build the CUDA-enabled code on a machine that does not have a CUDA-compatible GPU card, such as the CentOS 4 build machines 'minerva' and 'fishie', do the following:

 * Install the CUDA v3.0 toolkit, which neither complains about nor requires a GPU card.
 * Extract the NVIDIA driver package and copy `libcuda.*` to `/usr/lib` and `/usr/lib64`. The following commands extract the driver files and install the library in `/usr/lib`:
{{{
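# Unpack the driver package without installing it
sh ./NVIDIA-Linux-x86-195.36.15-pkg1.run -x
cd NVIDIA-Linux-x86-195.36.15-pkg1/usr/lib/
# Copy the driver library into the system library directory
sudo cp libcuda.so.195.36.15 /usr/lib
cd /usr/lib
# Create the unversioned and major-version links to the library
sudo ln -s libcuda.so.195.36.15 libcuda.so
sudo ln -s libcuda.so.195.36.15 libcuda.so.1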
}}}

On such a machine the `cudadetect` utility will still run, but it should report something like:
{{{
Detecting CUDA... *** No CUDA enabled device(s) detected! ***
}}}
However, utilities such as `mri_em_register_cuda` that are built on minerva will run on GPU-enabled machines such as terrier.
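
Before building, it can help to confirm that the `libcuda` links described in this section are in place. The following is a minimal sketch, not part of FreeSurfer: the function name `check_libcuda` is hypothetical, and it assumes the two link names used in the commands shown earlier in this section (`libcuda.so` and `libcuda.so.1`).

{{{
# Hypothetical check (an assumption, not a FreeSurfer utility).
# Usage: check_libcuda [library directory]   (defaults to /usr/lib)
# Returns 0 if both links resolve to an existing file, 1 otherwise.
check_libcuda() {
    dir="${1:-/usr/lib}"
    status=0
    for name in libcuda.so libcuda.so.1; do
        # -e follows symbolic links, so a dangling link counts as missing
        if [ -e "$dir/$name" ]; then
            echo "found:   $dir/$name"
        else
            echo "missing: $dir/$name"
            status=1
        fi
    done
    return "$status"
}

check_libcuda /usr/lib || echo "libcuda is not fully installed in /usr/lib"
}}}

The same function can be pointed at `/usr/lib64` on 64-bit build machines.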