Diff for "DevelopersGuide/CUDAEnabling"

Differences between revisions 4 and 5

Contents

Detection of CUDA
Changes to enable CUDA during configuration ( configure.in )
Tweaking Makefile.am of the binary
Steps to CUDA-ize a binary
Attribution

Detection of CUDA

Any script (such as recon-all) can detect CUDA at run time by executing a binary called cudadetect. If it exits with status 0, CUDA is available; any other exit status means it is not.

The following csh-style pseudocode shows how to use the detection scheme in your own scripts:

..
cudadetect
if ($status == 0) then
  setenv CUDAENABLED 1
endif
..
..
if ($?CUDAENABLED) then
  <cuda_enabled_binary> <options>
else
  <normal_binary> <options>
endif
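The same selection can be made from C, for example in a small launcher program. The sketch below is illustrative only: `choose_binary` is a hypothetical helper that is not part of FreeSurfer, and it assumes a POSIX system where the detection command (cudadetect in the text above) can be run through the shell.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: run detect_cmd through the shell and return
 * cuda_binary if it exits with status 0, otherwise cpu_binary. */
const char *choose_binary(const char *detect_cmd,
                          const char *cuda_binary,
                          const char *cpu_binary)
{
    int status = system(detect_cmd);

    /* system() returns -1 if the command could not be run at all;
     * otherwise it returns a wait status that must be decoded. */
    if (status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        return cuda_binary;
    }
    return cpu_binary;
}
```

A caller might use `choose_binary("cudadetect", "mri_em_register_cuda", "mri_em_register")` and then exec the returned name. Falling back to the CPU binary on any failure keeps the script usable on machines without CUDA.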

There are a few reasons why cudadetect might not detect your CUDA setup.

* LD_LIBRARY_PATH (or DYLD_LIBRARY_PATH on Mac OS X) does not include the directory where the CUDA libraries reside, usually /usr/local/cuda/lib. Check that the relevant variable contains this directory; it is best to set it in .profile or .cshrc.

* cudadetect works by calling dlopen(), a library function that loads a shared library at run time so that a function in it can be looked up and called. In this case, cudadetect tries to call cudaGetDeviceCount from libcudart. If either the name of the library or the name of the function changes, cudadetect might stop working. Currently, cudadetect works for CUDA version 2.0; the intent is to keep it calling the correct library and API function whenever CUDA is upgraded.
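To make the dlopen() mechanism concrete, here is a minimal sketch of this kind of run-time probe. It is not the actual cudadetect source: `cuda_device_count` is a hypothetical helper, the library name is passed in by the caller (for example "libcudart.so" on Linux, which is an assumption), and the function pointer type assumes that cudaGetDeviceCount takes an `int *` and returns 0 on success.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Assumed signature of cudaGetDeviceCount: writes the device count
 * through the pointer and returns an error code, 0 meaning success. */
typedef int (*device_count_fn)(int *);

/* Hypothetical helper: returns the number of CUDA devices reported by
 * the runtime library lib_name, or -1 if the library or the symbol
 * cannot be found or the call itself fails. */
int cuda_device_count(const char *lib_name)
{
    void *handle = dlopen(lib_name, RTLD_LAZY);
    if (handle == NULL) {
        return -1;  /* library is not on the loader search path */
    }

    /* Store dlsym's void * result into the function pointer without
     * a direct object-to-function pointer cast. */
    device_count_fn get_count = NULL;
    *(void **)(&get_count) = dlsym(handle, "cudaGetDeviceCount");

    int count = -1;
    if (get_count != NULL) {
        int n = 0;
        if (get_count(&n) == 0) {
            count = n;
        }
    }

    dlclose(handle);
    return count;
}
```

A detector built on this could exit with 0 when `cuda_device_count("libcudart.so") > 0` and with a nonzero status otherwise. Both failure modes described above (library not on the search path, renamed library or function) show up as a return value of -1. On older glibc versions the program may need to be linked with -ldl.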

Changes to enable CUDA during configuration ( configure.in )

This section gives an overview of how configure.in is changed and what it exports. In most cases there is no need to modify configure.in, since it has already been changed to accommodate CUDA.

* CUDA support is enabled by passing the --with-cuda=<path of cuda> switch to ./configure. For example, if CUDA is installed in /usr/local/cuda (usually the default path), run ./configure with --with-cuda=/usr/local/cuda. As a sanity check, the script looks for nvcc, the CUDA compiler. If it finds nvcc, ./configure assumes CUDA exists and exports several variables that help in compiling CUDA programs and that are meant to be used in Makefile.am.

Flags which configure.in exports

These flags are typically used in the Makefile.am of any binary which has a CUDA replacement.

* CUDA_CFLAGS: the include path of the CUDA header files (typically CUDA_CFLAGS="-I/usr/local/cuda/include").
* CUDA_LIBS: the CUDA libraries to link against (typically CUDA_LIBS="-L/usr/local/cuda/lib -lcuda -lcudart").
* BUILDCUDA: this important variable is exported as an AM_CONDITIONAL. If configure finds CUDA, it sets this conditional; any Makefile.am can test it and, when it is set, build the CUDA-specific targets. This is illustrated in the next section.
* configure.in does not export a #define directive (such as FS_CUDA). Instead, each Makefile.am should define FS_CUDA where needed (see examples below). C programs should test this flag in an #ifdef ... #endif block, an approach best suited to cases where only a few CUDA functions replace standard functions. Typical usage:

#ifdef FS_CUDA
  #include "cudaproc.h"
  cuda_procedure(args);
#else
  cpu_procedure(args);
#endif // FS_CUDA

Specific rules in Makefile.am can then be used to compile the .cu file where cuda_procedure() usually resides. cudaproc.h contains the declaration of the cuda_procedure() function. More in the next section.

If a large number of functions are to be replaced this way, a separate source file should be written with a _cuda suffix. For instance, if foo_file.c is rewritten with functions calling CUDA code, the new file is named foo_file_cuda.c.
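As a fuller illustration of the #ifdef FS_CUDA pattern, the sketch below keeps a single call site and chooses the implementation at compile time. All names here (`scale`, `cpu_scale`, `cuda_scale`) are hypothetical and not taken from FreeSurfer; in a real build the GPU function would live in a .cu file and be declared in a header.

```c
#include <stddef.h>

#ifdef FS_CUDA
/* Hypothetical GPU entry point, implemented in a .cu file and normally
 * declared in a header such as cudaproc.h. */
void cuda_scale(float *data, size_t n, float factor);
#endif

/* CPU reference implementation: multiply every element by factor. */
void cpu_scale(float *data, size_t n, float factor)
{
    for (size_t i = 0; i < n; i++) {
        data[i] *= factor;
    }
}

/* Single call site: callers never need to know which path was built. */
void scale(float *data, size_t n, float factor)
{
#ifdef FS_CUDA
    cuda_scale(data, n, factor);
#else
    cpu_scale(data, n, factor);
#endif
}
```

Compiled without -DFS_CUDA, this builds the plain CPU binary; compiled with -DFS_CUDA and linked against the object built from the .cu file, the same source yields the CUDA variant. Keeping the CPU function available in both builds also makes it easy to compare GPU results against the reference.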

Tweaking Makefile.am of the binary

Listing of Makefile.am of mri_em_register to illustrate how Makefile.am changes:

## 
## Makefile.am 
##

AM_CFLAGS=-I$(top_srcdir)/include
AM_LDFLAGS=

bin_PROGRAMS = mri_em_register
mri_em_register_SOURCES=mri_em_register.c
mri_em_register_LDADD= $(addprefix $(top_builddir)/, $(LIBS_MGH))
mri_em_register_LDFLAGS=$(OS_LDFLAGS)

## ----
## CUDA
## ----

# BUILDCUDA is defined if configure.in finds CUDA
if BUILDCUDA
# rules for building cuda files
.cu.o:
        $(NVCC) -o $@ -c $< $(NVCCFLAGS) $(AM_CFLAGS) $(MNI_CFLAGS)
bin_PROGRAMS += mri_em_register_cuda
mri_em_register_cuda_SOURCES = mri_em_register.c \
computelogsampleprob.cu findoptimaltransform.cu findoptimaltranslation.cu
mri_em_register_cuda_CFLAGS = $(AM_CFLAGS) $(CUDA_CFLAGS) -DFS_CUDA
mri_em_register_cuda_CXXFLAGS = $(AM_CFLAGS) $(CUDA_CFLAGS) -DFS_CUDA
mri_em_register_cuda_LDADD = $(addprefix $(top_builddir)/, $(LIBS_MGH)) $(CUDA_LIBS)
mri_em_register_cuda_LDFLAGS = $(OS_LDFLAGS) 
mri_em_register_cuda_LINK = $(LIBTOOL) --tag=CC $(AM_LIBTOOLFLAGS) \
        $(LIBTOOLFLAGS) --mode=link $(CCLD) $(mri_em_register_cuda_CFLAGS) \
        $(CFLAGS) $(mri_em_register_cuda_LDFLAGS) $(LDFLAGS) -o $@
endif

# Our release target. Include files to be excluded here. They will be
# found and removed after 'make install' is run during the 'make
# release' target.
EXCLUDE_FILES=""
include $(top_srcdir)/Makefile.extra

Note that the Makefile.am of a CUDA-enabled binary differs from a normal Makefile.am only in the if BUILDCUDA ... endif block.

Notes :

* CUDA support must be enclosed in an if BUILDCUDA ... endif block (BUILDCUDA is exported by configure.in if it finds CUDA).
* The .cu.o: Makefile rule describes how to build any .cu source file, which is the extension used for CUDA C files. Note that NVCCFLAGS, which contains the default nvcc flags, is also exported by configure.in. In addition, mri_em_register_cuda needs MNI_CFLAGS to compile.
* bin_PROGRAMS - the += makes the Makefile build mri_em_register_cuda in addition to mri_em_register.
* mri_em_register_cuda_SOURCES - should list every C source file (here mri_em_register.c, which makes use of #ifdef FS_CUDA code blocks) and every source file containing the CUDA functions it calls (*.cu files).
* mri_em_register_cuda_CFLAGS - in addition to Automake's CFLAGS, it contains CUDA_CFLAGS exported by configure.in and, most importantly, -DFS_CUDA, which is what distinguishes the mri_em_register_cuda binary from mri_em_register.
* mri_em_register_cuda_LDADD - in addition to the other FreeSurfer libraries, it contains CUDA_LIBS exported by configure.in, which lists the CUDA libraries the binary links against.
* mri_em_register_cuda_LINK - specifies the command used to link the program together with the compiled CUDA code.

Steps to CUDA-ize a binary

Suppose you have a CUDA replacement for a binary (as in the mri_em_register case above). The following steps serve as guidelines:

* The file could be named <binary>_cuda.c and placed in the same directory as the *.c files of the binary. If at all possible, however, keep the original source and add #ifdef FS_CUDA blocks, since this leaves only one file to maintain (as is done with mri_em_register.c).
* All the *.cu files go in that same directory, and all the *.h files go in the dev/include directory.
* If the code needs timing support, see dev/include/chronometer.h and dev/utils/chronometer.c for functions that support profiling both CUDA and CPU code.
* Modify the Makefile.am of the binary to include the CUDA-specific information, that is, the if BUILDCUDA ... endif block. The following is a template:

if BUILDCUDA
# rules for building cuda files
.cu.o:
        $(NVCC) -o $@ -c $< $(NVCCFLAGS) $(AM_CFLAGS)
bin_PROGRAMS += dummy_cuda
dummy_cuda_SOURCES = dummy.c \
dummy.cu
dummy_cuda_CFLAGS = $(AM_CFLAGS) $(CUDA_CFLAGS) -DFS_CUDA
dummy_cuda_CXXFLAGS = $(AM_CFLAGS) $(CUDA_CFLAGS) -DFS_CUDA
dummy_cuda_LDADD = $(addprefix $(top_builddir)/, $(LIBS_MGH)) $(CUDA_LIBS)
dummy_cuda_LDFLAGS = $(OS_LDFLAGS)
dummy_cuda_LINK = $(LIBTOOL) --tag=CC $(AM_LIBTOOLFLAGS) \
        $(LIBTOOLFLAGS) --mode=link $(CCLD) $(dummy_cuda_CFLAGS) \
        $(CFLAGS) $(dummy_cuda_LDFLAGS) $(LDFLAGS) -o $@
endif
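For the timing step in the guidelines above, the sketch below shows one generic way to measure wall-clock time around a CPU or CUDA call. It is not the chronometer.h API, whose functions are not described here: `now_seconds` and `elapsed_since` are hypothetical helpers built on the POSIX clock_gettime() call.

```c
#define _POSIX_C_SOURCE 199309L
#include <time.h>

/* Hypothetical helper: current time in seconds from a monotonic clock,
 * which is unaffected by changes to the system date. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical helper: seconds elapsed since a value previously
 * returned by now_seconds(). */
double elapsed_since(double start)
{
    return now_seconds() - start;
}
```

A typical use is `double t0 = now_seconds();` before the call under test and `elapsed_since(t0)` after it. When timing GPU code this way, keep in mind that kernel launches are generally asynchronous, so the host should wait for the device to finish before reading the clock.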

Attribution

A lot of ideas on Autotools and CUDA integration were taken from beagle-lib.

- Revision 4 as of 2009-07-16 18:18:51
  Size: 7948
  Editor: KrishSubramaniam
+ Revision 5 as of 2010-01-05 15:33:33
  Size: 8334
  Editor: NickSchmansky
  Comment: updated notes on building mri_em_register_cuda using #ifdef FS_CUDA blocks
 Line 45:
- * Another flag which is exported by configure.in is the preprocessor directive '''CUDA'''. This flag should be used in C programs in a #ifdef #endif directive. Which is used preferably if there are only a very few number of CUDA functions which replace standard functions.Usually like,
+ * configure.in does '''not''' export a #define directive (such as '''FS_CUDA''').  Instead, each Makefile.am should define '''FS_CUDA''' where needed (see examples below). This flag should be used in C programs in a #ifdef #endif directive. Which is used preferably if there are only a very few number of CUDA functions which replace standard functions. Usually like,
 Line 47:
-#ifdef CUDA
+#ifdef FS_CUDA
 Line 52:
-#endif
+#endif // FS_CUDA
 Line 56:
-If there are a large number of functions which are to be replaced this way, a separate source file should be written with `_cuda` suffix. For instance, if `mri_em_register` is re-written with functions calling CUDA code, the file is named `mri_em_register_cuda.c`.
+If there are a large number of functions which are to be replaced this way, a separate source file should be written with `_cuda` suffix. For instance, if `foo_file.c` is re-written with functions calling CUDA code, the file is named `foo_file_cuda.c`.
 Line 66:
-AM_CFLAGS=-I$(top_srcdir)/include -I$(top_srcdir)/include/dicom
+AM_CFLAGS=-I$(top_srcdir)/include
 Line 84:
-mri_em_register_cuda_SOURCES = mri_em_register_cuda.c \
+mri_em_register_cuda_SOURCES = mri_em_register.c \
 Line 86:
-mri_em_register_cuda_CFLAGS = $(AM_CFLAGS)  $(CUDA_CFLAGS)
mri_em_register_cuda_CXXFLAGS = $(AM_CFLAGS) $(CUDA_CFLAGS)
+mri_em_register_cuda_CFLAGS = $(AM_CFLAGS) $(CUDA_CFLAGS) -DFS_CUDA
mri_em_register_cuda_CXXFLAGS = $(AM_CFLAGS) $(CUDA_CFLAGS) -DFS_CUDA
 Line 111:
- * `mri_em_register_cuda_SOURCES` - should contain all replacement C file's name ( mri_em_register_cuda.c ) and all the sources having the CUDA functions it calls ( *.cu files )
+ * `mri_em_register_cuda_SOURCES` - should contain all C file's name ( mri_em_register.c making use of #ifdef FS_CUDA code blocks) and all the sources having the CUDA functions it calls ( *.cu files )
 Line 113:
- * `mri_em_register_cuda_CFLAGS` - in addition to the Automake's CFLAGS, it contains `CUDA_CFLAGS` exported by `configure.in`
+ * `mri_em_register_cuda_CFLAGS` - in addition to the Automake's CFLAGS, it contains `CUDA_CFLAGS` exported by `configure.in`, and most importantly, -DFS_CUDA, which is what distinguishes the mri_em_register binary from mri_em_register_cuda.
 Line 124:
- * The file has to be named `<binary>_cuda.c` and put in the ''same directory'' where the `*.c` files of the binary reside.
+ * The file could be named `<binary>_cuda.c` and put in the ''same directory'' where the `*.c` files of the binary reside.  But if at all possible, keep the original source and include #ifdef FS_CUDA blocks, as this allows maintenance of only one file (as is done with mri_em_register.c).
 Line 137:
-dummy_SOURCES = dummy_cuda.c \
+dummy_SOURCES = dummy.c \
 Line 139:
-dummy_CFLAGS = $(AM_CFLAGS)  $(CUDA_CFLAGS)
dummy_CXXFLAGS = $(AM_CFLAGS) $(CUDA_CFLAGS)
+dummy_CFLAGS = $(AM_CFLAGS) $(CUDA_CFLAGS) -DFS_CUDA
dummy_CXXFLAGS = $(AM_CFLAGS) $(CUDA_CFLAGS) -D_FS_CUDA