FreeSurfer Slides

Functional Analysis with FS-FAST

This tutorial steps you through the analysis of an fMRI data set with the FreeSurfer Functional Analysis Stream (FSFAST) version 5.1, from organizing the data to group analysis.

Contents

FreeSurfer Slides
Tutorial Data Description
Getting and Organizing the Tutorial
Understanding the FS-FAST Directory Structure
Preprocessing

Tutorial Data Description

The data being analyzed were collected as part of the Functional Biomedical Informatics Research Network (fBIRN, www.nbirn.net).

Working-memory paradigm with distractors
18 subjects
Each subject has 1 run (except sess01 which has 4 runs)
Collected at MGH Bay 4 (3T Siemens)
FreeSurfer anatomical analyses

Getting and Organizing the Tutorial

If you do not have the tutorial data set up, then consult the FsFastTutorialData page. You will need to set the FSFTUTDIR environment variable. NOTE: if you are taking a class at MGH, the data have already been set up on your computer.

cd into the tutorial data directory and run ls:

cd $FSFTUTDIR
ls

You will see 18 folders with names like "sess09". These are the 18 subjects. There are some other files and folders there but don't worry about them now.

Understanding the FS-FAST Directory Structure

All of these sessions have been analyzed with the exception of sess01.noproc. This session shows what the directory structure should look like immediately before analysis begins. It includes:

Directory structure
Raw data
subjectname file
Paradigm files for each run

The directory structure and raw data are usually created by "unpacking" the data with dcmunpack or unpacksdcmdir, but it could also be done by hand. The subjectname file and paradigm files must be added manually.
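As a rough sketch of the by-hand route, the commands below build an empty session skeleton in a scratch directory. The names "mysess" and "mysubject" are hypothetical, and the empty placeholder files only stand in for real raw data and paradigm files; the layout itself follows the sess01.noproc example described in this section.

```shell
# Hypothetical names: "mysess" is the session, "mysubject" the FreeSurfer subject.
# Everything is created in a scratch directory so no real data is touched.
study=$(mktemp -d)
sess="$study/mysess"

# One functional subdirectory (bold) holding two runs with three-digit names.
mkdir -p "$sess/bold/001" "$sess/bold/002"

# The subjectname file holds the name of the FreeSurfer subject.
echo "mysubject" > "$sess/subjectname"

# Real raw data (f.nii.gz) and paradigm files would be copied into each run;
# empty placeholders are used here only to show where they belong.
for run in 001 002; do
  : > "$sess/bold/$run/f.nii.gz"
  : > "$sess/bold/$run/workmem.par"
done

# List the resulting layout.
find "$sess" | sort
```

In practice the unpacking tools create the folders and raw data, so only the subjectname and paradigm steps are normally done by hand.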

The 'Session' Folder

The folder/directory where all the data for a session are stored is called the 'session' or the 'sessid'. There may be more than one session for a given subject (eg, in a longitudinal analysis). Go into the sess01.noproc folder and run 'ls':

cd sess01.noproc
ls

You will see two folders ('bold' and 'rest') and a file called 'subjectname'.

The 'subjectname' File

subjectname is a text file with the name of the FreeSurfer subject as found in $SUBJECTS_DIR (ie, the location of the anatomical analysis).

View the contents of the text file (with 'cat', 'more', 'less', 'gedit', or 'emacs') and verify that this subject is in the $SUBJECTS_DIR.

NOTE: it is important that the anatomical data and the functional data be from the same subject. The contents of the subjectname file are the only link! Make sure that they are right! There is a check for this below.
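A quick sanity check can be scripted. The sketch below defines a hypothetical helper, check_subjectname (not part of FS-FAST), that reads the subjectname file and tests whether a folder of that name exists in a subjects directory. It only confirms that the folder exists; it cannot tell whether the anatomical really belongs to the same person. The demonstration runs on throwaway directories with made-up names.

```shell
# check_subjectname SESSDIR SUBJDIR
# Prints "ok: <name>" and returns 0 if SUBJDIR/<name> exists, otherwise
# prints "missing: <name>" and returns 1. Hypothetical helper, not an FS-FAST tool.
check_subjectname() {
  name=$(cat "$1/subjectname") || return 1
  if [ -d "$2/$name" ]; then
    echo "ok: $name"
  else
    echo "missing: $name"
    return 1
  fi
}

# Demonstration on throwaway directories (made-up names).
tmp=$(mktemp -d)
mkdir -p "$tmp/sessA" "$tmp/subjects/subjA"
echo "subjA" > "$tmp/sessA/subjectname"
check_subjectname "$tmp/sessA" "$tmp/subjects"
```

On the tutorial data the equivalent call would be something like check_subjectname $FSFTUTDIR/sess01.noproc $SUBJECTS_DIR.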

Functional Subdirectories (FSDs)

The other two directories (bold and rest) are 'functional subdirectories' (FSDs) and contain functional data. If you 'ls rest' you will see '001'. If you 'ls bold' you will see '001 002 003 004'. Each of these is a 'run' of fMRI data, ie, all the data collected from a start and stop of the scanner.

Raw Data

Go into the first run of the bold directory with 'cd bold/001' and run ls. You will see 'f.nii.gz wmfir.par workmem.par'. The raw data is stored in f.nii.gz (compressed NIFTI) and is directly converted from the DICOM file. Examine this file with mri_info:

mri_info --dim f.nii.gz
mri_info --res f.nii.gz

The first command results in '64 64 30 142'. This is the dimension of the functional data. Since it is functional, it has 4 dimensions: 3 spatial and one temporal (ie, 64 rows, 64 cols, 30 slices, and 142 time points or TRs or frames). The second command results in '3.438 3.437 5.000 2000.000'. This is the resolution of the data, ie, each voxel is 3.438mm by 3.437mm by 5.000mm and the TR is 2000ms (2 sec).
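The two outputs can be combined to get the length of the run. The sketch below hard-codes the values quoted above rather than calling mri_info, so it runs anywhere; the variable names are only illustrative.

```shell
# Values reported by mri_info in the text above.
dim="64 64 30 142"
res="3.438 3.437 5.000 2000.000"

# The 4th field of --dim is the number of time points;
# the 4th field of --res is the TR in milliseconds.
ntp=$(echo "$dim" | awk '{ print $4 }')
tr_ms=$(echo "$res" | awk '{ print $4 }')

# Run duration in seconds = number of time points x TR.
dur=$(awk -v n="$ntp" -v tr="$tr_ms" 'BEGIN { printf "%.0f", n * tr / 1000 }')
echo "$ntp time points x $tr_ms ms = $dur sec"
```

So this run lasts 284 seconds, a little under five minutes.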

View the functional data with:

tkmedit -f f.nii.gz -t f.nii.gz

Click on a point to view the waveform at that point.

Paradigm Files

The workmem.par and wmfir.par files are paradigm files. They are text files that you create that indicate the stimulus schedule (ie, which stimulus was presented when).

View workmem.par. Each row indicates a stimulus presentation. You will see that each row has 5 columns. The columns are:

Stimulus Onset Time (sec)
Numeric Stimulus Identifier
Stimulus Duration (sec)
Weight (usually 1)
Text Stimulus Identifier (redundant with Numeric Stimulus Identifier)

The Stimulus Onset Time is the onset relative to the acquisition time of the first time point in f.nii.gz. The Numeric and Text Stimulus Identifiers indicate which stimulus was presented. The Stimulus Duration is the amount of time the stimulus was presented.

In this case, there are 5 event types:

Encode - encode phase
EDistrator - emotional distractor
EProbe - probe after emotional distractor
NDistrator - neutral distractor
NProbe - probe after neutral distractor

Note two things: (1) Not all the time is taken up, and (2) Baseline/Fixation is not explicitly represented. By default, any time not covered by stimulation is assumed to be baseline.

What time was the third Encode presented?
What time was the last Neutral Distractor presented?
Is this timing for all runs the same?
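Questions like these can be answered by eye, but the five-column layout also lends itself to awk. The sketch below uses a small invented paradigm file (the rows are NOT the tutorial's workmem.par) and two hypothetical helper functions: onsets lists the onset times of one event type, and stim_time sums the duration column, assuming presentations do not overlap.

```shell
# Invented paradigm file in the five-column layout described above:
# onset, numeric id, duration, weight, text id. Not the real workmem.par.
par=$(mktemp)
cat > "$par" <<'EOF'
0.0   1 16.0 1 Encode
16.0  2 16.0 1 EDistrator
32.0  3 16.0 1 EProbe
60.0  1 16.0 1 Encode
76.0  4 16.0 1 NDistrator
92.0  5 16.0 1 NProbe
EOF

# onsets FILE NAME: print the onset (column 1) of every row whose
# text identifier (column 5) equals NAME.
onsets() {
  awk -v name="$2" '$5 == name { print $1 }' "$1"
}

# stim_time FILE: total seconds covered by stimulation (sum of column 3),
# assuming no two presentations overlap.
stim_time() {
  awk '{ total += $3 } END { print total }' "$1"
}

onsets "$par" Encode
stim_time "$par"
```

Run against a real paradigm file, for example onsets workmem.par Encode, the third line of output is the onset of the third Encode. Any run time beyond what stim_time reports is treated as baseline.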

Preprocessing

Once the data have been arranged in the above directory structure and naming convention, they are ready to be preprocessed. In FS-FAST, it is assumed that each data set will be analyzed in three ways:

Left Cortical Hemisphere
Right Cortical Hemisphere
Subcortical Structures

You will need to decide how much to smooth the data and whether you want to do slice-timing correction. In this analysis, we will smooth the data by 5mm Full-Width/Half-Max (FWHM) and correct for slice timing. The slice-timing for this particular data set was 'Ascending', meaning that the first slice was acquired first, the second slice was acquired second, etc. To preprocess the data, run:

preproc-sess -s sess01 -surface fsaverage lhrh -mni305 \
   -fwhm 5 -stc up -fsd bold -per-run

This data has already been preprocessed, so it should just verify that it is up-to-date and return. This command has several arguments:

-s sess01 : indicates which session to preprocess
-surface fsaverage lhrh : indicates that data should be sampled to the left and right hemispheres of the 'fsaverage' subject
-mni305 : indicates that data should be sampled to the mni305 volume (2mm isotropic voxel size)
-fwhm 5 : smooth by 5mm FWHM
-stc up : slice-timing correction with ascending ('up') slice order
-fsd bold : indicate the functional subdirectory
-per-run : register each run separately
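To apply the same settings to several sessions, the command can be generated in a loop. The sketch below is a dry run: the hypothetical helper preproc_cmds only prints the command line for each session name it is given and executes nothing. It uses only the flags listed above, and the session names other than sess01 are just examples.

```shell
# preproc_cmds SESSION...: print (do not run) the preproc-sess command
# from the text for each session name. Hypothetical helper for illustration.
preproc_cmds() {
  for s in "$@"; do
    echo "preproc-sess -s $s -surface fsaverage lhrh -mni305 -fwhm 5 -stc up -fsd bold -per-run"
  done
}

# Example session names; substitute the folders found in $FSFTUTDIR.
preproc_cmds sess01 sess02 sess03
```

Reviewing the printed commands before running them is a cheap way to catch a mistyped session name, since preprocessing can take a long time.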

This command does a lot (and it can take quite a long time to run). To understand what it does, we will go back into one of the run directories and see what it creates. To do this, 'cd sess01/bold/001' and type 'ls'. This directory previously held only f.nii.gz, workmem.par, and wmfir.par; now there are a lot of files, each indicative of a different preprocessing stage. Now type 'ls -ltr'. This command sorts the files by modification time with the oldest at the top and the newest at the bottom. The preprocessing is progressive, meaning that the output of one stage is the input to the next.

Template

This stage creates template.nii.gz (and template.log). This is the middle time point from the raw functional data (f.nii.gz). This is the reference used to motion correct and register the functionals for this run to the anatomical. It is also used to create masks of the brain.

* Verify that it has the same dimension and resolution as the raw data using mri_info.

Masking

The masks for this run are stored in the 'masks' directory. Run 'ls -ltr masks'. You will see a file called 'brain.nii.gz'. This is a binary mask created using the FSL BET program. There is also a file called 'brain.e3.nii.gz' which is the mask eroded by three voxels. These have the same dimensions as the template. View the masks with:

tkmedit -f template.nii.gz -overlay masks/brain.nii.gz -fthresh 0.5
tkmedit -f template.nii.gz -overlay masks/brain.e3.nii.gz -fthresh 0.5

The brain.nii.gz is used to constrain voxel-wise operations. The eroded mask (brain.e3.nii.gz) is used to compute the mean functional value used for intensity normalization and global mean time course. There are other masks there that we will get to later.

Intensity Normalization and Global Mean Time Course

By default, FSFAST will scale the intensities of all voxels and time points to help ensure that they are on the same scale across runs, sessions, and subjects. It does this by dividing by the mean across all voxels and time points inside the brain.e3.nii.gz mask, then multiplying by 100. The mean is stored in global.meanval.dat, a simple text file which you can view. At this point the value is only computed and stored; it is used later. A waveform is also constructed of the mean at each time point (text file global.waveform.dat). This can be used as a nuisance regressor.
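The arithmetic is simple enough to show directly. In the sketch below the global mean (1250) and the three raw intensities are made-up numbers, and rescale is a hypothetical helper, not an FS-FAST command; it just applies "divide by the mean, multiply by 100".

```shell
# rescale VALUE MEAN: intensity after normalization, i.e. 100 * VALUE / MEAN.
# Hypothetical helper; FS-FAST applies this scaling internally.
rescale() {
  awk -v v="$1" -v m="$2" 'BEGIN { printf "%.1f\n", 100 * v / m }'
}

# Made-up global mean, standing in for the number in global.meanval.dat.
meanval=1250

# Made-up raw voxel intensities.
for v in 1000 1250 1500; do
  echo "$v -> $(rescale "$v" "$meanval")"
done
```

A voxel equal to the global mean maps to 100, so after scaling the values can be read roughly as a percentage of the in-brain mean.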

* What are the global means for runs 1, 2, 3, and 4?

Function-Anatomical Registration

The next six files (init.register.dof6.dat, register.dof6.dat, register.dof6.dat.mincost, register.dof6.dat.sum, register.dof6.dat.log, register.dof6.dat.param) deal with the registration from the functional to the same-subject FreeSurfer anatomical. There are only two files here that are really important: register.dof6.dat and register.dof6.dat.mincost.

The registration will be revisited below when we talk about Quality Assurance.

Motion Correction (MC)

The motion correction stage produces these files: fmcpr.mat.aff12.1D, fmcpr.nii.gz, mcprextreg, mcdat2extreg.log, fmcpr.nii.gz.mclog, fmcpr.mcdat. There are only three important files here:

fmcpr.nii.gz - this is the motion corrected functional data. It has the same size and dimension as the raw functional data.
fmcpr.mcdat - text file of the amount of motion at each time point. This is important for Quality Assurance (see below).
mcprextreg - text file of the motion correction parameters assembled into an orthogonalized matrix that can be used as nuisance regressors

* Verify that fmcpr.nii.gz has the same dimension as f.nii.gz using mri_info.

Slice-Timing Correction (STC)

Slice-timing correction compensates for the fact that each of the 30 slices was acquired separately over the course of 2 sec. It does this by interpolating between time points to align each slice to the time of the middle of the TR. The file created with this is fmcpr.up.nii.gz (and fmcpr.up.nii.gz.log).

* Verify that fmcpr.up.nii.gz has the same dimension as f.nii.gz using mri_info.
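To get a feel for the size of the timing differences being corrected, the sketch below computes when each slice is acquired within the TR for an ascending order. It assumes the 30 slices are spaced evenly across the 2 sec TR, which is an assumption for illustration rather than something read from the data; slice_time is a hypothetical helper.

```shell
# slice_time K N TR: acquisition time in seconds, measured from the start of
# the TR, of the K-th slice (1-based) out of N acquired in ascending order.
# Assumes even spacing across the TR. Hypothetical helper for illustration.
slice_time() {
  awk -v k="$1" -v n="$2" -v tr="$3" 'BEGIN { printf "%.3f\n", (k - 1) * tr / n }'
}

# 30 slices and a 2 sec TR, as in this data set.
for k in 1 15 16 30; do
  echo "slice $k: $(slice_time "$k" 30 2) sec"
done
```

Under this assumption the first and last slices are acquired almost a full TR apart, while the slices near the middle of the stack are already close to the middle of the TR (1 sec) and need the least interpolation.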

Resampling to Common Spaces and Spatial Smoothing

At this point, the functional data has stayed in the 'native functional space', ie, 64x64x30, 3.4x3.4x5mm3. Now it will be sampled into the 'Common Space'. The Common Space is a geometry where all subjects are in voxel-for-voxel registration. There are three such spaces in FSFAST:

Left hemisphere of fsaverage (fmcpr.up.sm5.fsaverage.lh.nii.gz)
Right hemisphere of fsaverage (fmcpr.up.sm5.fsaverage.rh.nii.gz)
Volume of fsaverage (MNI305 space) - for subcortical analyses (fmcpr.up.sm5.mni305.2mm.nii.gz)

Each of these is the entire 4D functional data set resampled into the common space. The spatial smoothing is performed after resampling. Surface-based (2D) smoothing is used for the surfaces; 3D for the volumes.

Check the dimensions of the MNI305 space volume:

mri_info --dim fmcpr.up.sm5.mni305.2mm.nii.gz
mri_info --res fmcpr.up.sm5.mni305.2mm.nii.gz

The dimension will be '76 76 93 142' meaning that there are 76 columns, 76 rows, 93 slices but still 142 time points (same as the raw data). The resolution will be '2.0 2.0 2.0 2000' meaning that each voxel is 2mm in size and the TR is still 2 sec.

Check the dimensions of the Left Hemisphere 'volume':

mri_info --dim fmcpr.up.sm5.fsaverage.lh.nii.gz
mri_info --res fmcpr.up.sm5.fsaverage.lh.nii.gz

The dimension is '163842 1 1 142'. This 'volume' has 163842 'columns', 1 'row', and 1 'slice' (still 142 time points). You are probably confused right now. That's ok, it's natural. At this point the notion of a 'volume' has been lost. Each 'voxel' is actually a vertex (of which there are 163842 in the left hemisphere of fsaverage). Storing it in a NIFTI 'volume' is just a convenience.

The 'resolution' is '1.0 1.0 1.0 2000'. The values for the first 3 dimensions are meaningless because there are no columns, rows, or slices on the surface so the distances between them are meaningless. The last value indicates the time between frames and is still accurate (2 sec).
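One thing all three common-space files share with the raw data is the number of time points. The sketch below checks this using the dimension strings quoted in this tutorial instead of calling mri_info, so it runs without FreeSurfer; nframes is a hypothetical helper that extracts the fourth field.

```shell
# nframes "DIMS": print the 4th field (number of time points) of a
# dimension string such as the one printed by mri_info --dim.
nframes() {
  echo "$1" | awk '{ print $4 }'
}

# Dimension strings quoted in this tutorial.
raw="64 64 30 142"
mni="76 76 93 142"
lh="163842 1 1 142"

# Every resampled data set should keep the raw data's number of time points.
for d in "$raw" "$mni" "$lh"; do
  if [ "$(nframes "$d")" = "$(nframes "$raw")" ]; then
    echo "ok:       $d"
  else
    echo "MISMATCH: $d"
  fi
done
```

With FreeSurfer available, the literal strings could be replaced by the output of mri_info --dim on each file.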