mri_ca_register timing info

  • tests conducted by NJS, 17-20 March 2012
  • using subject 'ernie'
  • using 'dev' build(s)
  • command line:

mri_ca_register \
  -nobigventricles \
  -T transforms/talairach.lta \
  -align-after \
  -mask brainmask.mgz \
  norm.mgz \
  /autofs/cluster/freesurfer/centos6_x86_64/dev/average/RB_all_2008-03-26.gca \
  transforms/talairach.m3z
  • the Opteron was a 'seychelles' cluster node (node0355), running CentOS 4.8
  • the 2.66GHz Intel was machine 'namic', running CentOS 6.2
  • the 3GHz Intel was a 'launchpad' cluster node, running CentOS 5
  • the 3.3GHz Intel was machine 'monster', which has 8 processors, running CentOS 6.0

||processor||gcc version||flags||OMP threads||mri_ca_register runtime||
||2GHz AMD Opteron 246||3.4.6||-O3 -msse2 -mfpmath=sse||NA||12 hours, 46 minutes||
||2.66GHz Intel Xeon E5430 (Core)||3.4.6||-O3 -msse2 -mfpmath=sse||NA||6 hours, 3 minutes||
||3GHz Intel Xeon E5472 (Core)||3.4.6||-O3 -msse2 -mfpmath=sse||NA||5 hours, 46 minutes||
||3.3GHz Intel Xeon W5590 (Nehalem)||3.4.6||-O3 -msse2 -mfpmath=sse||NA||3 hours, 8 minutes||
||3.3GHz Intel Xeon W5590 (Nehalem)||4.1.2||-O3 -msse2 -mfpmath=sse||NA||3 hours, 10 minutes||
||3.3GHz Intel Xeon W5590 (Nehalem)||4.4.5||-O3 -msse2 -mfpmath=sse||NA||1 hours, 56 minutes||
||3.3GHz Intel Xeon W5590 (Nehalem)||4.4.5||-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse||1||1 hours, 58 minutes||
||3.3GHz Intel Xeon W5590 (Nehalem)||4.4.5||-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse||2||1 hours, 14 minutes||
||3.3GHz Intel Xeon W5590 (Nehalem)||4.4.5||-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse||3||0 hours, 57 minutes||
||3.3GHz Intel Xeon W5590 (Nehalem)||4.4.5||-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse||4||0 hours, 50 minutes||
||3.3GHz Intel Xeon W5590 (Nehalem)||4.4.5||-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse||5||0 hours, 44 minutes||
||3.3GHz Intel Xeon W5590 (Nehalem)||4.4.5||-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse||6||0 hours, 41 minutes||
||3.3GHz Intel Xeon W5590 (Nehalem)||4.4.5||-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse||7||0 hours, 40 minutes||
||3.3GHz Intel Xeon W5590 (Nehalem)||4.4.5||-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse||8||0 hours, 38 minutes||
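
The page does not say how the OMP thread count was set for each run. A minimal sketch of how such a sweep could be scripted, assuming the standard OpenMP environment variable OMP_NUM_THREADS controls the thread count and assuming the relative paths in the command line resolve from the subject's mri directory. The helper names (thread_env, format_runtime, time_run, sweep) are hypothetical and not part of FreeSurfer.

```python
import os
import subprocess
import time

# Arguments copied from the command line shown earlier on this page.
CA_REGISTER_ARGS = [
    "mri_ca_register",
    "-nobigventricles",
    "-T", "transforms/talairach.lta",
    "-align-after",
    "-mask", "brainmask.mgz",
    "norm.mgz",
    "/autofs/cluster/freesurfer/centos6_x86_64/dev/average/RB_all_2008-03-26.gca",
    "transforms/talairach.m3z",
]


def thread_env(threads, base_env=None):
    """Copy the environment and set OMP_NUM_THREADS (assumed mechanism)."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(threads)
    return env


def format_runtime(seconds):
    """Format wall-clock seconds in the 'H hours, M minutes' style of the table."""
    minutes = int(round(seconds / 60))
    return f"{minutes // 60} hours, {minutes % 60} minutes"


def time_run(threads, cwd):
    """Run mri_ca_register once with the given thread count; return elapsed seconds."""
    start = time.monotonic()
    subprocess.run(CA_REGISTER_ARGS, cwd=cwd, env=thread_env(threads), check=True)
    return time.monotonic() - start


def sweep(cwd, max_threads=8):
    """Time one run per thread count from 1 to max_threads."""
    return {n: time_run(n, cwd) for n in range(1, max_threads + 1)}
```

Calling sweep() with the subject's mri directory would launch eight full registrations in sequence, each overwriting transforms/talairach.m3z, so the output should be copied aside between runs if the results are to be compared afterwards.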

observations

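  • asegstatsdiff comparisons show minimal differences in results
  • the Nehalem architecture makes a difference compared to the AMD Opteron 200 series: 3 hours, 8 minutes on the 3.3GHz W5590 versus 12 hours, 46 minutes on the 2GHz Opteron 246, with the same gcc 3.4.6 build and flags
  • gcc 4.4.5 alone saves over 1 hour on the W5590 (1 hour, 56 minutes versus 3 hours, 8 minutes with gcc 3.4.6, same flags)
  • the -ftree-vectorize -msse4.1 flags do not make any difference over -msse2 (1 hour, 58 minutes with one OMP thread versus 1 hour, 56 minutes)
  • OMP threads plot: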

runtimes.jpg
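
The thread-scaling numbers behind the plot can be summarized as speedup and parallel efficiency relative to the 1-thread run. The sketch below uses the runtimes from the table (converted to minutes) and additionally computes the Karp-Flatt metric as a rough estimate of the serial fraction. That metric is an illustration added here, not an analysis from the original tests, and the function names are hypothetical.

```python
# Runtimes in minutes for the gcc 4.4.5 -fopenmp build on the W5590,
# keyed by OMP thread count (values taken from the table).
RUNTIME_MIN = {1: 118, 2: 74, 3: 57, 4: 50, 5: 44, 6: 41, 7: 40, 8: 38}


def speedup(threads, runtimes=RUNTIME_MIN):
    """Speedup relative to the 1-thread run."""
    return runtimes[1] / runtimes[threads]


def efficiency(threads, runtimes=RUNTIME_MIN):
    """Speedup divided by thread count (1.0 would be perfect scaling)."""
    return speedup(threads, runtimes) / threads


def karp_flatt(threads, runtimes=RUNTIME_MIN):
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    if threads < 2:
        raise ValueError("the Karp-Flatt metric needs at least 2 threads")
    s = speedup(threads, runtimes)
    return (1 / s - 1 / threads) / (1 - 1 / threads)


if __name__ == "__main__":
    for n in sorted(RUNTIME_MIN):
        line = f"{n} threads: speedup {speedup(n):.2f}, efficiency {efficiency(n):.2f}"
        if n > 1:
            line += f", serial fraction {karp_flatt(n):.2f}"
        print(line)
```

With these numbers, 8 threads give roughly a 3.1x speedup over 1 thread, which suggests that a substantial part of the run is not parallelized, consistent with the flattening runtimes beyond 5 or 6 threads.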

CaRegTimings (last edited 2012-06-20 21:04:57 by NickSchmansky)