Freesurfer Test Plan
WORK IN PROGRESS
This page documents the Freesurfer software test plan. A formal software test plan ([http://en.wikipedia.org/wiki/Test_plan see Wikipedia reference]) describes a systematic approach to testing a software application (or suite), and includes these elements:
- Scope of testing
- Test deliverables
- Release criteria
- Risks and contingencies
Also, tests should cover the following categories of testing:
- Functional - can the software execute its basic functionality under optimal conditions?
- Boundary - determine the breaking points of the software, and whether the software gracefully handles input near and beyond these boundaries.
- Stability - gauge long-term behavior of the software: whether it has a memory leak, or is prone to crashes that are not reproducible in any single run of the other tests.
- Coverage - what percentage of the code-base is exercised by the tests?
- Performance - produce benchmarks on the performance of the software.
The Freesurfer test plan is a work-in-progress. It was not developed top-down, but has grown from the bottom up as necessity and time have dictated. The goal is to build a test suite that meets the criteria of a formal test plan. This will take time.
The current test suite is an ad-hoc collection of test scripts and C/C++ code providing rudimentary testing of most of the freesurfer code-base, consisting of unit, module and system tests. The #1 aim of these tests is simple: the output files produced by the recon-all stream ([wiki:ReconAllDevTable as documented here]) must be 'correct' relative to reference files that are known to be 'correct', as determined by manual inspection or by some formal method (such as a table of precalculated results from another program). The word 'correct' is in quotes because Freesurfer, being a research tool, is constantly evolving, and because any complex scientific software application has inherent variability.
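The comparison against reference files can be sketched as follows. This is a minimal illustration, not the actual Freesurfer test code: it assumes the values of interest have already been read from the output and reference files into dictionaries, and it assumes a relative tolerance is an acceptable way to allow for the variability mentioned above. The function name, structure names, and numbers are all hypothetical.

```python
import math


def find_mismatches(output, reference, rel_tol=1e-6):
    """Return the names whose output value is missing or differs from the
    reference value by more than the relative tolerance."""
    mismatches = []
    for name, ref_value in reference.items():
        out_value = output.get(name)
        if out_value is None or not math.isclose(out_value, ref_value, rel_tol=rel_tol):
            mismatches.append(name)
    return mismatches


# Hypothetical values for illustration only (not real Freesurfer results).
reference = {"structure_a": 4000.0, "structure_b": 4100.0}
output = {"structure_a": 4000.001, "structure_b": 4300.0}

print(find_mismatches(output, reference, rel_tol=1e-5))  # prints ['structure_b']
```

An empty result means every reference value was reproduced within tolerance. How tight the tolerance should be is a judgment call that depends on the file type and the stage of the stream.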
The term 'unit test' is defined in our Freesurfer test plan to mean a test of a freesurfer binary (such as mri_ca_register) or of something smaller (a subroutine). The framework for these tests is the 'make check' mechanism, which the 'automake' tools provide on top of the 'make' utility. The 'check' target of 'make' builds and runs the tests that developers have written to test whatever is built by the 'all' target of a Makefile. In freesurfer, there are a number of 'make check' tests, and 'make check' is run after 'make' on each nightly build platform (see the section [wiki:DevelopersGuide/MartinosCenter "How the nightly build works"] for details).
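As a rough illustration of what a single 'make check' test does, the sketch below runs a command and reduces the result to an exit status. Under automake, a test program that exits with status 0 is counted as a pass and most other statuses as a failure (77 is reserved for "skipped"). The helper name run_check and the stand-in command are assumptions made for this example; an actual freesurfer test would invoke the binary under test and may be written quite differently.

```python
import subprocess
import sys

EXIT_PASS = 0  # automake counts exit status 0 as a passing test
EXIT_FAIL = 1  # a non-zero status (other than 77, "skipped") counts as a failure


def run_check(command, expected_stdout):
    """Run a command and return EXIT_PASS if it succeeds and prints the
    expected text, otherwise EXIT_FAIL."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        return EXIT_FAIL
    if result.stdout.strip() != expected_stdout:
        return EXIT_FAIL
    return EXIT_PASS


# Stand-in command for illustration: a real test would run the binary under test.
status = run_check([sys.executable, "-c", "print('ok')"], "ok")
print("exit status:", status)
# A script used as a 'make check' test would finish with sys.exit(status).
```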
Future - To formalize the unit tests, documentation (a wiki page) should be created which lists: 1. all the binaries used in recon-all; 2. other important binaries not in the stream; and 3. the critical subroutines, as determined by name (see Bruce Fischl and Doug Greve) and/or by profiling the binaries during a run of the recon-all stream. For each of these, the page should list the name of its test (as run by 'make check'). A table of this sort makes it possible to ascertain coverage and to identify the tests that still need to be developed.
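Such a table lends itself to a simple coverage summary. The sketch below assumes the table is available as a mapping from each binary or subroutine to the name of its test, with None where no test exists yet. Apart from mri_ca_register, which is mentioned above, every entry and every test name is a hypothetical placeholder.

```python
def summarize_coverage(test_table):
    """Return (percent of entries that have a test, sorted names lacking one)."""
    untested = sorted(name for name, test in test_table.items() if not test)
    if not test_table:
        return 0.0, untested
    tested_count = len(test_table) - len(untested)
    return 100.0 * tested_count / len(test_table), untested


# Hypothetical table contents for illustration only.
test_table = {
    "mri_ca_register": "test_mri_ca_register",  # test name is an assumption
    "binary_b": None,
    "binary_c": "test_binary_c",
    "subroutine_d": None,
}

percent, untested = summarize_coverage(test_table)
print(f"{percent:.0f}% of entries have a test; still untested: {untested}")
```

Note that this measures how many listed items have a test, which is coarser than the code-level coverage described in the Coverage category above.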
There are two
- aseg tests
- aparc tests
test_recon-all - see the section [wiki:DevelopersGuide/MartinosCenter "How the daily testing works"] for details.