USAGE: optseq2
Data Acquistion Parameters
--ntp Ntp : number of time points
--tr TR : temporal resolution of acquisition (in sec)
--tprescan t : start events t sec before first acquisition
Event Response and Nuisance Descriptors
--psdwin psdmin psdmax : PSD window specifications
--ev label duration nrepetitions
--repvar pct : allow nrepetitions to vary by +/- percent
--polyfit order
--tnullmin tnullmin : limit min null duration to tnullmin sec
--tnullmax tnullmax : limit max null duration to tnullmax sec
Searching and Cost Parameters
--nsearch n : search over n schedules
--tsearch t : search for t hours
--focb n : pre-optimize first order counter-balancing
--ar1 rho : optimize assuming whitening with AR1
--pen alpha T dtmin: penalize for presentations being too close
--evc c1 c2 ... cN : event contrast
--cost name : eff, vrfavg, vrfavgstd
--sumdelays : sum delays when forming contrast matrix
--seed seedval : initialize random number generator to seedval
Output Options
--nkeep n : keep n schedules
--o outstem : save schedules in outstem-RRR.par
--mtx mtxstem : save design matrices in mtxstem_RRR.mat
--cmtx cmtxfile : save contrast matrix in cmtxfile
--sum file : save summary in file (outstem.sum)
--log file : save log in file (outstem.log)
--pctupdate pct : print an update after each pct done
--sviter file : save info from each iteration in file
Input/Initialization Options
--i instem : initialize with instem-RRR.par
--in input-schedule <--in input-schedule >
--nosearch : just print output for input files
Help, Documentation, and Bug Reporting
--help : print help page
--version : print version string
$Id: optseq2.c,v 2.12 2006/12/29 02:09:11 nicks Exp $
Optseq Home Page:
http://surfer.nmr.mgh.harvard.edu/optseq
SUMMARY
optseq2 is a tool for automatically scheduling events for
rapid-presentation event-related (RPER) fMRI experiments (the schedule
is the order and timing of events). Events in RPER are presented
closely enough in time that their hemodynamic responses will
overlap. This requires that the onset times of the events be jittered
in order to remove the overlap from the estimate of the hemodynamic
response. RPER is highly resistant to habituation, expectation, and set
because the subject does not know when the next stimulus will appear
or which stimulus type it will be. RPER is also more efficient than
fixed-interval event related (FIER) because more stimuli can be
presented within a given scanning interval at the cost of assuming
that the overlap in the hemodynamic responses will be linear. In SPM
parlance, RPER is referred to as 'stochastic design'.
The flexibility of RPER means that there are a huge number of possible
schedules, and they are not equal. optseq2 randomly samples the space
of possible schedules and returns the 'best' one, where the user can
control the definition of 'best'. Cost functions include: average
efficiency, average variance reduction factor (VRF), and a weighted
combination of average and stddev of the VRF. The user can also
specify that the first order counter-balancing of the sequence of
event-types be pre-optimized.
Visit the Optseq Home Page at:
http://surfer.nmr.mgh.harvard.edu/optseq
COMMAND-LINE ARGUMENTS
--ntp Ntp
Number of time points to be acquired during the scan. This should be
for one 'run' not for the entire session. The Total Scanning Time
is the number of time points times the TR plus the prescan period,
ie, tScanTot = Ntp*TR+tPreScan.
--tr TR
Time between functional volumes (in seconds).
--tprescan tPreScan
Time before the acquisition of the first volume to be processed to
begin stimulation.
--psdwin PSDMin PSDMax
Specifications for the FIR event response window. It will be assumed that
the entire response can be captured within this window. PSDMin is the
minimum PostStimulus Delay (PSD), PSDMax is the maximum PSD. dPSD
is the sampling interval within the window. dPSD is optional; if
left unset, it will default to the TR. dPSD controls how finely spaced
the event onsets can be scheduled (ie, the onsets will only appear at
integer multiples of the dPSD).
--ev label duration nrepetitions
Event Type specification. The label is just a text label (which may be
more informative than a numeric id). Duration is the number of seconds
that the stimulus will be presented; it should be an integer multiple
of the dPSD (see --psdwin). Nrepetitions is the number of times that
this event type will be presented during the course of the run. The
number of repetitions can be optimized using the --repvar option. Use
a different --ev flag for each event type. NOTE: DO NOT INCLUDE THE
NULL STIMULUS AS AN EVENT TYPE. The total stimulation time, tStimTot,
equals the product of the duration and the number of repetitions
summed over all the events. It should be obvious that the total
stimulation time must be less than the total scanning time.
--repvar pct
Allow the number of repetitions of each event type to randomly vary by
+/- pct percent from the number specified with --ev. This allows the
user to optimize over the number of repetitions. The total stimulation
time is computed from the maximum possible number of repetitions. If
only the percentage is given, then the relative number of repetitions
of each event type will stay constant. If the string 'per-evt' is
appended, then the number of reps for each event type can change
independently to each other.
--polyfit order
Add polynomial regressors as nuisance variables. Order N includes the
Nth order polynomial was well as all lower orders. Max order is currently
2. Order 0 is a baseline offset; Order 1 is a linear trend; Order 2
is a quadradic trend. Cost functions will not explicitly include the
nuisance variables.
--tnullmin tNullMin
Force the NULL stimulus to be at least tNullMin sec between stimuli.
Note that this means that the stimulus duration + tNullMin must be
an integer multiple of the dPSD.
--tnullmax tNullMax
Limit the maximum duration of the NULL stimulus to be tNullMax sec.
Note: it may not be possible for a given parameter set to keep the NULL
stimulus below a certain amount. In this case, the following error
message will be printed out 'ERROR: could not enforce tNullMax'. By
default, tNullMax is infinite.
--nsearch Nsearch
Search over Nsearch iterations. optseq will randomly construct Nsearch
schedules, compute the cost of each one, and keep the ones with the
highest cost. It is not permitted to specify both Nsearch and Tsearch.
--tsearch Tsearch
Search for Tsearch hours. optseq will randomly construct as many
schedules schedules as it can in the given time, compute the cost of
each one, and keep the ones with the highest cost. It is not
permitted to specify both Nsearch and Tsearch.
--focb nCB1Opt
Pre-optimize the first order counter-balancing (FOCB) of the event
sequence. This will cause optseq2 to construct nCB1Opt random
sequences and keep the one with the best FOCB properties. This will be
done for each iteration. Counter balance optimization is not allowed
when there is only one event type.
--ar1 rho
Optimize while whitening with an AR(1) model with parameter rho. rho must
be between -1 and +1.
--pen alpha T dtmin
Penalize for one presentation starting too soon after the previous
presentation. The weight is computed as 1 - alpha*exp(-(dt+dtmin)/T), where
dt is the time from the offset of the previous stimulus to the onset
of the next stimulus. The basic idea here is that the second stimulus
will be reduced in amplitude by the weight factor. alpha and T were fit
from data presented in Huettel and McCarthy (NI, 2000) to be alpha=0.8
and T = 2.2 sec.
--evc C1 C2 ... CN
Optimize based on a contrast of the event types. Ci is the contrast
weight for event type i. There must be as many weights as event types.
Weights are NOT renormalized such that the sum to 1.
--cost costname
Specify cost function. Legal values are eff, vrfavg,
vrfavgstd. Default is eff. params as any parameters which accompany
the given cost function. eff is the cost function which maximizes
efficiency (no parameters). vrfavg is the cost function which
maximizes the average Variance Reduction Factor (VRF) (no
parameters). vrfavgstd maximizes a weighted combination of the average
and stddev VRF; there is one parameter, the weight give to the stddev
component.
--sumdelays
Sum the delay regression parameters when computing contrast matrix.
The event contrast (--evc) specifies how to weight the events when
forming the contrast vector. However, there are multiple coefficients
per event type corresponding to the delay in the FIR window. By default,
a separate row in the contrast matrix is provided for each delay. To
sum across the delays instead, use --sumdelays. The contrast matrix
will have only one row in this case.
--seed seedval
Initialize the random number generator to seedval. If no seedval is
specified, then one will be picked based on the time of day. optseq2
uses drand48().
--pctupdate pct
Print an update line to stdout and the log file after completing each
pct percent of the search.
--nkeep nKeep
Save nKeep of the best schedules. Increasing this number does not
substantially increase the search time, so it is a good idea to
specify more than you think you will need.
--o outstem
Save schedules in outstem-RRR.par, where RRR is the 3-digit
zero-padded schedule rank number (there will be nKeep of them).
The schedules will be saved in the Paradigm File Format (see below).
--mtx mtxstem
Save the FIR design matrices to mtxstem_RRR.mat in Matlab 4 binary
format.
--cmtx cmtxfile
Save the contrast matrix in Matlab 4 binary format.
--sum summaryfile
optseq2 will create a file which summarizes the search, including
all the input parameters as well as characteristics of each of
the schedules kept. By default, the summary file will be outstem.sum,
but it can be specified explicitly using this flag. See THE SUMMARY
FILE below.
--log logfile
During the course of the search, optseq2 will print information about
the current search status to stdio and to the log file. By default
the log file will be outstem.log. The log file will contain a summary
of input arguments as well as a series of status lines. A status line
will be printed each time there is a change in the list of nKeep best
schedules as well as at prespecified regular intervals. By default,
the interval is 10% of the search time, but this can be changed
with --pctupdate. Each status line has 12 columns: (1) percent complete,
(2) iteration number, (3) minutes since start, (4) best cost,
(5) efficiency, (6) CB1Error, (7) vrfavg, (8) vrfstd, (9) vrfmin,
(10) vrfmax, (11) vrfrange, and (12) number of iterations since
last substitution.
--pctupdate pct
Print a search status to stdio and the log file at regular intervals
corresponding to pct percent of the search time. Default is 10%.
--sviter SvIterFile
Save information summary about all the schedules to SvIterFile in
ASCII format. Each line will have 7 columns corresponding to:
(1) cost, (2) efficiency, (3) cb1err, (4) vrfavg, (5) vrfstd,
(6) vrfmin, (7) vrfmax. This is mainly for exploring the distribution
of the various costs. WARNING: this file can grow to be very large.
--i instem
Load all input schedules that match instem-RRR.par. These can be used
to initialize the search (for example, if you want to continue a
previous optimization). It is also possible to only generate a summary
and/or design matrices of the given input schedules by include the
--nosearch flag. This can be useful for testing schedules that were
optimized under one cost function against another cost function or
for testing independently generated schedules. See also --in.
--in input-schedule <--in input-schedule >
This does the same thing as --i except that each file is specified
separately.
--nosearch
Do not search for optimal schedules. This can only be used when
reading schedules in using --i or --in. See --i for more information.
ALGORITHM OVERVIEW
optseq2 randomly searches the space of schedules given the constraints
on the command-line and keeps the ones that maximize the given cost
function. Each search iteration begins by creating a random order of
events with the appropriate number of repetitions for each event
type. First order counter-balancing optimization, if done, is
performed here. Next, the timing is generated by inserting random
amounts of NULL stimulus so that the total stimulation time plus null
time is equal to the total scan time. Event onset times are
constrained to be integer multiples of dPSD. An FIR design matrix is
created from this schedule. The FIR peristimulus window begins at
PSDMin and ends at PSDMax and is incremented by dPSD. If polynomial
regressors are specified, they are appended to the FIR matrix to give
the final design matrix, hereafter referred as X. The various costs
are computed from X. The forward model is then y = XB+n, which has the
solution Bhat = inv(XtX)Xy. A contrast is Ghat = C*Bhat, where C is the
the contrast matrix.
CONTRAST MATRIX
By default, the contrast matrix is the identity over all task-related
components. The contrast matrix can be changed by specifying --evc
(and possibly --sumdelays).
COST FUNCTIONS
First-Order Counter-Balancing (FOCB). The FOCB matrix is the
Nevt-by-Nevt matrix of probabilities that one event type follows
another, where Nevt is the number of event types (excluding the NULL
condition). This is computed only from the sequence of events and is
independent of the timing (this is why it is referred to as
'pre-optimization'). The ideal FOCB matrix can be computed from the
number of repetitions for each event type. The FOCB cost matrix is
computed by subtracting the actual probability from the ideal and then
dividing by the ideal. The final cost is computed by averaging the
absolute values of all elements in the cost matrix. This cost is
minimized during pre-optimization. FOCB optimization can be combined
with any other cost function. Note: FOCB requires that there be at
least 2 event types.
Efficiency (eff). Efficiency is defined as eff = 1/trace(C*inv(Xt*X)*Ct)
(note: any nuisance regressors are not included in the computation of
the trace but are included in the computation of the inverse). The
quantity trace(C*inv(XtX)*Ct) is a measure of the sum square error in Ghat
(ie, G-Bhat) relative to the noise inherent in the experiment. Therefore,
maximizing eff is a way of finding a schedule that will result in, on
average, the least error in Ghat.
Average Variance Reduction Factor (vrfavg). The Variance Reduction Factor
(VRF) is the amount by which the variance of an individual estimator (ie,
a component of Ghat) is reduced relative to the noise inherent in the
experiment. The VRF for a estimator is the inverse of the corresponding
component on the diagonal of C*inv(XtX)*Ct. The average VRF is this value
averaged across all estimators. This will yield similar results as when
the efficiency is optimized.
Average/StdDev Variance Reduction Factor (vrfavgstd). The cost is defined
as cost = vrfavg - W*vrfstd, where vrfstd is the standard deviation of
the VRFs and W is a weighting factor specified as a parameter on the
command-line. This penalizes schedules that result in large variations
in the individual VRFs of the estimators. There is currently a bug in
the implementation that causes it to mis-state the cost when the number
of repetitions are different for different event types. Also, only use
this cost when using a prescan window equal to or greater than the
PSD window (otherwise there will be a tendency not to schedule events
near the end of the run).
THE SUMMARY FILE
The summary file summarizes the conditions under which the search was
performed as well as the properties of each schedule found. It also
includes the number of iterations searched and the time it took to
search them as well as the average and standard deviation of the cost
measured over all schedules. It also includes the maximum efficiency
and average VRF over all schedules (these will be the same as the
best schedule if the eff or vrfavg cost functions were chosen).
Each schedule is summarized in a table with the following columns:
(1) Rank, (2) Cost, (3) ZCost, (4) Iteration Number (NthIter), (5)
Efficiency (Eff), (6) FOBC Error (CB1Err), (7) Average VRF (VRFAvg),
(7) StdDev VRF (VRFStd),
(8) Minimum VRF (VRFMin), (9) Maximum VRF (VRFMax), and (10) VRF
Range (VRFRng). Many of these measures have been described above.
ZCost is the number of standard deviations from the average cost
(over all schedules). The Iteration Number is the search iteration that
that schedule was found on. The first-order counter-balancing
measures come after this table. First, the ideal FOCB probability
matrix is printed followed by the actual matrix for each of the
schedules. Note: the printed ideal matrix is based on the nominal number
of repetitions. See BUGS.
CHOOSING PARAMETERS SETS
There are several parameters that must be chosen as a group because
they rely and/or relate to each other. These parameters are: (1) the
number of time points (Ntp), (2) the TR, (3) the prescan window (tPreScan),
(4) the duration of each event type (tEv), and (5) the number of repetitions
of each event type (nReps). The most basic relationship requires that
the total amount of stimulation time (tStimTot) be less than or equal to
the total amount of scan time (tScanTot), where tStimTot = sum(tEv*nReps)
(summed over all conditions), and tScanTot = Ntp*TR+tPreScan, so
sum(tEv*nReps) <= Ntp*TR+tPreScan (1)
If this constraint is not met, you will receive a 'Time Constraint
Violation' Error. The total amount of time dedicated to the Null stimulus
(tNullTot) is equal to the difference between the total scan time and
the total stimulation time:
tNullTot = Ntp*TR+tPreScan - sum(tEv*nReps) (2)
If the parameters are chosen such that equality results in equation (1),
then there will not be any time for the Null stimulus, which is generally
not good because inter-stimulus jitter is dependent upon inserting
random amounts of Null between non-Null stimuli.
A rule of thumb is to allocate as much time for the Null as one would for
any other stimulus. This can be done by choosing parameters such that
sum(tEv*nReps)(nEv+1)/nEv = Ntp*TR+tPreScan (3)
where nEv is the number of event types. The schedule can be optimized
around this point by allowing the number of repetitions to vary around
this nominal value.
There is also a DOF constraint which requires that the number of parameters
estimated be less than the number of time points, ie
Nbeta = nPSD*nEv+(PolyOrder+1) < Ntp (4)
where Nbeta is the number of parameters, nPSD is the number of elements
in the post-stimulus time window (ie, (PSDMax-PSDMin)/dPSD), and PolyOrder
is the order of the nuisance polynomial specified with -polyfit. If this
constraint is not met, you will receive a 'DOF Constraint Violation' Error.
CHOOSING THE SEARCH TERMINATION CRITERIA
The search is terminated when either the maximum number of iterations
has been reached or the maximum search time has been reached. It is
impossible to determine how many iterations to search over because
it is not possible to globally determine what the best schedule is nor
would it be possible to determine how long it would take a random search
to get there. That said, there are some rules of thumb that can be followed,
the most basic being that if a 'large' number of schedules have been
searched and the best cost has not changed 'much' in a 'long time', then
you are done. Of course, you still have to define 'large', 'much', and
'long time'. The summary file can help with this. In particular, there is a
line with the number of iterations since the last substitution (ie,
the number of iterations since one of the best nKeep schedules changed).
This can be used to judge how long a 'long time' is. The same information
can be extracted from the NthIter column of the summary table. At a minimum
let it run for 10000 iterations.
PARADIGM FILE FORMAT
The schedules will be saved in 'paradigm file' format. This format has four
columns: time, numeric event id, event duration, and event label. A numeric
id of 0 indicates the Null Stimulus.
BUGS
Also see the Optseq Home page at http://surfer.nmr.mgh.harvard.edu/optseq
The vrfavgstd cost function does not work properly if the number of reps
is different for different event types. A prescan window should also be
specified
The ideal counter-balance matrix reported in the summary file will be
for the nominal number of reps when the user has selected to optimize
over the number of reps making comparisons between the actual and
ideal inappropriate (the FOCB error reported for each will be correct
however).
BUG REPORTING
optseq2 is offered with no guarantees and with no obligation or implication
that it will be supported in any way. Having said that, you can send
bug reports/questions to: analysis-bugs@nmr.mgh.harvard.edu. You must
include the following information: (1) optseq2 version, (2) computer
operating system, (3) command-line, (4) description of the problem, and
(5) log and/or summary files.
AUTHOR
optseq2 was written by Douglas N. Greve in the Summber of '02.
REFERENCES
Dale, A.M., Greve, D.N., and Burock, M.A. (1999) Optimal Stimulus
equences for Event-Related fMRI. 5th International Conference on
Functional Mapping of the Human Brain. Duesseldorf, Germany. June
11-16.
Dale, A.M. (1999). Optimal experimental design for event-related
fMRI. Human Brain Mapping 8:109-114.