Please refer to this site/make edits here for the most updated information:

#acl LcnGroup:read,write,delete,revert CmetGroup:read,write,delete,revert All: This page is readable only by those in the LcnGroup and CmetGroup.

Author(s): Nick Schmansky, Andrew Hoopes

See: Samseg

Samseg Testing

This page documents one front of the work on Samseg: testing a version of Samseg that operates on T1-weighted input and uses the existing RB FS atlas, in order to compare its performance against the existing FreeSurfer v6.0 subcortical segmentation processing stream and against manual segmentations. Initial work was conducted in an evaluation task: SamsegEvaluationMarch2016. The current work pertaining to this page extends that work by:

Those working on this project include: Doug Greve (DG), Nick Schmansky (NS), Andrew Hoopes (AH), Christian Larsen (CL), and Lee Tirrell (LT).

Test Data


The test subjects and scripts are located here: /cluster/fsm/users/samseg

├── scripts
├── subjects
│   ├── ADNI60
│   ├── Buckner40
│   └── ...
└── tests
    └── ...

Test sets used to compare samseg results will include:

Running a Test

To run a test, use the runtest script located in /cluster/fsm/users/samseg/scripts. To test samseg with a multi-subject set, use the -set flag and indicate the test set directory (located in /cluster/fsm/users/samseg/subjects) as well as a test results output dir:

./runtest -all -set <subjects dir> <test outdir>

example: scripts/runtest -all -set subjects/ADNI60 tests/newADNItest

-all runs each test step, including running samseg, computing dice scores, and creating charts. To specify which steps to run, use -samseg, -dice, and/or -chart instead. To test samseg with an individual subject, use the -ind flag and indicate the path to the subject as well as a test results output dir:

./runtest -all -ind <subject> <test outdir>

NOTE: I was under the impression that matplotlib (the Python plotting package) was installed for the center, but apparently it is not. Instead, I've pointed the runtest shebang to my anaconda install on topaz.
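For reference, the overlap metric that the -dice step reports is the Dice coefficient. A minimal numpy sketch of the per-label computation (an illustration only, not the actual mri_compute_seg_overlap implementation; the label value and toy volumes below are made up):

```python
import numpy as np

def dice_per_label(seg_a, seg_b, labels):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for each label in two label volumes."""
    scores = {}
    for label in labels:
        a = (seg_a == label)
        b = (seg_b == label)
        denom = a.sum() + b.sum()
        # Undefined when the label is absent from both volumes
        scores[label] = 2.0 * np.logical_and(a, b).sum() / denom if denom else float('nan')
    return scores

# Toy 1-D "volumes" with labels 0 (background) and 17 (e.g. an aseg structure)
auto   = np.array([0, 17, 17, 17, 0, 0])
manual = np.array([0, 17, 17, 0, 0, 0])
print(dice_per_label(auto, manual, labels=[17]))  # {17: 0.8}
```

A score of 1.0 means the automatic and manual labels coincide exactly; the 0.8-ish values in the test runs below are per-subject means of this quantity across structures.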


In the test output directory, the script creates a dir for each subject and an analysis dir (the latter is not produced for an individual-subject test). Each subject dir contains its samseg output as well as dice.dat and dice.log, produced by mri_compute_seg_overlap. In the analysis dir, subjs_dice.log summarizes the mean overlap for each subject, and labels_dice.log summarizes the mean overlap for each brain structure across all subjects. The runtest script also creates labels_dice.no_outliers.log, which ignores any subject whose mean overlap falls below a threshold (default 0.2; change it with -outlier); the ignored subjects are written to outliers. An associated chart is created for each of these three files and saved as a .png.
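The outlier-filtering step can be sketched roughly as below. This is an illustrative stdlib-Python version, not the runtest implementation, and it assumes subjs_dice.log holds one "subject mean-dice" pair per line (the subject names are made up):

```python
OUTLIER_THRESHOLD = 0.2  # runtest default; overridable there with -outlier

def split_outliers(lines, threshold=OUTLIER_THRESHOLD):
    """Partition 'subject mean_dice' lines into (kept, outliers) by mean overlap."""
    kept, outliers = [], []
    for line in lines:
        subject, mean_dice = line.split()
        (kept if float(mean_dice) >= threshold else outliers).append(subject)
    return kept, outliers

# Toy summary in the assumed two-column format
summary = ["subj01 0.86", "subj02 0.12", "subj03 0.84"]
kept, outliers = split_outliers(summary)
print(kept)      # ['subj01', 'subj03']
print(outliers)  # ['subj02']
```

The kept subjects would then feed the labels_dice.no_outliers.log summary, while the outliers list corresponds to the outliers file.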

Test Runs

Robust, 2244-subject test:

06.22.2017 - Buckner39, Buckner40, ADNI60, ADNI_714_1.5T, ADNI_1150_3T, IXI_79, GE14, Siemens13 - JuneTest

Kvlreg vs Elastix:

06.18.2017 - Buckner39 and ADNI60 - multi-resolution parameter estimation and tuned optimization convergence criteria - 2017-06-18

Kvlreg vs Elastix:

06.05.2017 - comparison of initial affine registration techniques - elastix_vs_kvlreg


06.08.2017 - Siemens13, GE14, Buckner40, and ADNI-HIPPO - 2017-06-08
06.01.2017 - Buckner39 - all subjects registered - (mean: 0.83, overall: 0.86) - 2017-06-01_Buckner39
06.01.2017 - ADNI60 - all subjects registered - (mean: 0.846, overall: 0.88) - 2017-06-01_ADNI60

Elastix test:

05.26.2017 - Buckner39 - all subjects registered - (mean: 0.83, overall: 0.86) - 2017-05-26_Buckner39
05.26.2017 - ADNI60 - all subjects registered - (mean: 0.82, overall: 0.87) - 2017-05-26_ADNI60

Code clean-up:

05.23.2017 - GE14 - no change since the last test - (mean: 0.77, overall: 0.83)
05.21.2017 - Buckner39 - no change since the last test - still 21 failed registrations, and an average overall dice of 0.86 for successful subjects
05.21.2017 - ADNI60 - basically no change since the last test - 18 failed registrations, and still an average overall dice of 0.84 for successful subjects

ADNI135 hippocampus:

03.25.2017 - 135 ADNI subjects with manual hippocampus labels - mean hippocampus overlap (0.96) - 2017-04-14_ADNIHIPPO

Timing test:

03.25.2017 - 2 Buckner39 subjects - average elapsed time 1256 s (20.9 min) - 2017-03-25_timing

After adding Rician noise, includes comparison to v6 aseg:

03.25.2017 - Siemens13 - all subjects successful (mean: 0.83, overall: 0.87) - 2017-03-25_Siemens13
03.25.2017 - GE14 - registration fixed - all subjects successful (mean: 0.77, overall: 0.83) - 2017-03-25_GE14
03.25.2017 - Buckner39 - no change - 21 subjects with unsuccessful registration - 2017-03-25_Buckner39
03.25.2017 - Buckner40 - registration fixed - all subjects successful (mean: 0.86, overall: 0.89) - 2017-03-25_Buckner40
03.25.2017 - ADNI60 - no change - 19 subjects with unsuccessful registration

V6 aseg:

Siemens13 - V6.0 aseg - overlap between v6 aseg and manualseg - Siemens13_V60_aseg
Buckner39 - V6.0 aseg - overlap between v6 aseg and seg_edited - Buckner39_V60_aseg
GE14 - V6.0 aseg - overlap between v6 aseg and manualseg - GE14_V60_aseg

March 2017 initial test:

03.19.2017 - Siemens13 - all subjects produced dice scores above 0.8! - 2017-03-19_Siemens13
03.19.2017 - GE14 - 2 subjects had unsuccessful registration - 2017-03-19_GE14
03.19.2017 - Buckner39 - 21 subjects had unsuccessful registration! - 2017-03-19_Buckner39
03.17.2017 - Buckner40 - 2 subjects with low dice due to poor reg, and 1 subject failed - 2017-03-17_Buckner40
03.06.2017 - ADNI60 - 19 subjects with low dice due to poor registration - 2017-03-06_ADNI60

Outlier Subjects

A list of subjects that are known to fail or produce poor results.

expected failures

subject set outliers (subjects that do poorly relative to other subjects in their test set)


SamsegTesting (last edited 2021-04-20 18:17:35 by DevaniCordero)