Samseg Aseg Testing

This page documents one front of the Samseg work: testing a version of Samseg that operates on T1-weighted input and uses the existing RB FS atlas, in order to compare its performance against the existing FreeSurfer v6.0 subcortical segmentation processing stream. Initial work was conducted in an evaluation task: SamsegEvaluationMarch2016. The current work extends that evaluation by:

Creating a test cycle that compares a subject run of Samseg (which is under algorithmic development) to the FS v6 aseg results for that subject. The test cycle should generate reports on each run, allowing debugging and evaluation of performance, initially against the aims of the CorticoMetrics grant (i.e., an accelerated FS based on T1 input only).
Creating a data set with enough subjects to cover the wide variety of possible inputs, starting with Buckner40 and ADNI60 and continuing with many others, particularly subjects with a lot of neck in the FOV and 'tough cases' for both registration and segmentation, plus a variety of scanner models, ages, and disease states.
Documenting design decisions.

Those working on this project include: Doug Greve (DG), Nick Schmansky (NS), Andrew Hoopes (AH), Christian Larsen (CH) and Lee Tirrell (LT)

Test Data

structure:

The test subjects and scripts are located here: /cluster/fsm/users/samseg

samseg
├── scripts
├── subjects
│   ├── ADNI60
│   ├── Buckner40
│   └── ...
└── tests
    ├── test1_ADNI60
    ├── test2_Buckner40
    └── ...

Test sets used to compare samseg results will include:

Buckner39 - 39 manually labeled subjects used to create the FS subcortical atlas
Siemens13 - 13 manually labeled subjects scanned on Siemens Sonata (currently used in testing the FS atlas: AsegTestNotes)
GE14 - 14 manually labeled subjects scanned on GE Signa (currently used in testing the FS atlas: AsegTestNotes)
Buckner40 - public Buckner40, processed by FS v6, where asegs have been manually inspected as part of FS release testing
ADNI60 - processed by FS v6, where asegs have been manually inspected as part of FS release testing
ADNI-HIPPO - 135 ADNI subjects with manual segmentations of hippocampus
ADNI-1.5T-FS-5.3 - 1.5T ADNI data processed by FS 5.3, and QA'd by Tian Ge (but asegs not each manually inspected)
ADNI-3T-FS-5.3 - 3T ADNI data processed by FS 5.3, and QA'd by Tian Ge (but asegs not each manually inspected)
Outliers - consisting of subject scans known to be problematic (lots of neck, skewed placement in scanner, etc.)
IXI, HCP, OASIS, HABS, BRAINS, Mindboggle-101, BGSP - all publicly available datasets, to be processed by FS v6 for inclusion in the aseg test set
ADNI-Long - test-retest set
Bammer120 - test-retest set
disorder sets - data from various disorder studies (outside of AD), e.g., MS, schizophrenia, autism, tumor, child, addiction, etc.

running a test:

To run a test, use the runtest script located in /cluster/fsm/users/samseg/scripts. To test samseg with a multi-subject set, use the -set flag and indicate the test set directory (located in /cluster/fsm/users/samseg/subjects) as well as a test results output dir:

./runtest -all -set <subjects dir> <test outdir>

example: scripts/runtest -all -set subjects/ADNI60 tests/newADNItest

-all runs each test step, including running samseg, computing dice scores, and creating charts. To specify which steps to run, use -samseg, -dice, and/or -chart instead. To test samseg with an individual subject, use the -ind flag and indicate the path to the subject as well as a test results output dir:

./runtest -all -ind <subject> <test outdir>

NOTE: I was under the impression that matplotlib (the Python plotting package) was installed for the center, but it appears not to be. Instead, I've pointed the runtest shebang to my anaconda install on topaz.

results:

In the test output directory, the script creates a directory for each subject and an analysis directory (the latter is not produced for an individual-subject test). Each subject directory contains the samseg output as well as dice.dat and dice.log, produced by mri_compute_seg_overlap. In the analysis directory, subjs_dice.log summarizes the mean overlap for each subject, and labels_dice.log summarizes the mean overlap for each brain structure across all subjects. The runtest script also creates labels_dice.no_outliers.log, which ignores any subject whose mean overlap falls below a threshold (default 0.2; change it with -outlier); the ignored subjects are written to outliers. A chart is created for each of these three files and saved as a .png.
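
For readers unfamiliar with these summaries, here is a minimal Python sketch of the arithmetic behind them. It is not the runtest code: the function names (dice, summarize), the input layout (a dict mapping subject name to per-structure Dice), and the example subject and structure names are all hypothetical, and the sketch does not read dice.dat, whose format is not described here. In the real pipeline the per-structure overlap comes from mri_compute_seg_overlap.

```python
from collections import defaultdict
from statistics import mean


def dice(seg, ref, label):
    """Dice overlap of one label between two equally sized label sequences."""
    if len(seg) != len(ref):
        raise ValueError("segmentations must have the same number of voxels")
    in_seg = sum(1 for v in seg if v == label)
    in_ref = sum(1 for v in ref if v == label)
    both = sum(1 for s, r in zip(seg, ref) if s == label and r == label)
    total = in_seg + in_ref
    # Dice is undefined when the label is absent from both volumes.
    return 2.0 * both / total if total else float("nan")


def summarize(per_subject, outlier_threshold=0.2):
    """Build the three summaries from {subject: {structure: dice}}.

    Returns (mean per subject, mean per structure over all subjects,
    mean per structure without outliers, sorted list of outlier subjects).
    A subject is an outlier when its mean overlap is below the threshold.
    """
    subj_means = {s: mean(d.values()) for s, d in per_subject.items()}
    outliers = sorted(s for s, m in subj_means.items() if m < outlier_threshold)

    def label_means(subjects):
        by_label = defaultdict(list)
        for subj in subjects:
            for label, value in per_subject[subj].items():
                by_label[label].append(value)
        return {label: mean(values) for label, values in by_label.items()}

    kept = [s for s in per_subject if s not in outliers]
    return subj_means, label_means(per_subject), label_means(kept), outliers


# Hypothetical example: subj02 has a mean overlap of 0.125, below the 0.2 default.
scores = {
    "subj01": {"Hippocampus": 0.75, "Thalamus": 0.5},
    "subj02": {"Hippocampus": 0.125, "Thalamus": 0.125},
}
subj_means, all_labels, kept_labels, outliers = summarize(scores)
```

In this sketch, subj_means corresponds to the content of subjs_dice.log, all_labels to labels_dice.log, kept_labels to labels_dice.no_outliers.log, and outliers to the outliers file.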

Tasks

AH - determine reasons for ADNI60 failures

Tests Run

03.17.2017 - Buckner40 - 2 subjects with low dice due to poor reg, and 1 subject failed - 2017-03-17_Buckner40
03.06.2017 - ADNI60 - 19 subjects with low dice due to poor registration - 2017-03-06_ADNI60
