fMRI Statistics
With a focus on task-based analysis and SPM12
Models
• Previously, we looked at GLM models for single subjects (first level models)
• We can also use a GLM to estimate group effects (second level models)
– Fixed effects analysis: include all data in a single model (less used option for group analysis)
– Random effects analysis*: use outputs of first level models as inputs to second level (more efficient, and generalizes)
*Strictly speaking, “mixed effects”: both random & fixed effects together
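A rough sketch of the two-stage idea for a single voxel: each subject gets a first level GLM, and the resulting betas become the data for a second level one-sample t-test. All names, sizes, and noise levels are made up for illustration; this is plain NumPy, not SPM code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_scans = 12, 100

# Hypothetical boxcar task regressor plus a constant (same design for everyone).
task = np.tile(np.r_[np.zeros(10), np.ones(10)], n_scans // 20)
X = np.column_stack([task, np.ones(n_scans)])

# First level: one GLM per subject, keep only the task beta.
true_effects = rng.normal(loc=2.0, scale=1.0, size=n_subjects)  # varies by subject
first_level_betas = np.empty(n_subjects)
for s in range(n_subjects):
    y = X @ np.array([true_effects[s], 100.0]) + rng.normal(scale=3.0, size=n_scans)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    first_level_betas[s] = beta[0]

# Second level: one-sample t-test on the subject betas
# (equivalent to a GLM whose design matrix is a single column of ones).
group_mean = first_level_betas.mean()
group_se = first_level_betas.std(ddof=1) / np.sqrt(n_subjects)
group_t = group_mean / group_se
print(f"group mean beta = {group_mean:.2f}, t({n_subjects - 1}) = {group_t:.2f}")
```

Note that the second level only sees one number per subject, which is why the within-subject details are "condensed" at that stage.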
Fixed vs. Random Effects
• In a fixed effects model: Which brain areas are activated on average across subjects?
• In a random effects model: Which brain areas are activated in the same way across subjects?
• If you only have one scan on one condition per subject, fixed is all you can do (e.g., FDG-PET)
Two Stage Modeling Limitations
• In theory, every first level design matrix should be identical (in practice, somewhat robust to variation: Penny, 2004)
• Assumes underlying error is identical across subjects (all details of first level analysis condensed to voxelwise betas/contrasts)
• Limited (in SPM at least) to voxels available for all subjects
Covariates & Estimability
• Covariates can be useful, especially in a second level model (e.g., age)
• If any variable (column of X) is a linear combination of others, some betas cannot be estimated uniquely
– As a result, some contrasts can be rejected as “inestimable” by SPM*
• Color coding: grey (not uniquely specified) vs. white
*strictly speaking they are estimable, but without a unique solution
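A small sketch of why a redundant column makes some contrasts non-unique. The design below (two group indicators plus an explicit constant) is a made-up example, and the row-space check is a standard linear algebra test, not a description of SPM's internal code.

```python
import numpy as np

# Hypothetical design: two group indicator columns plus an explicit constant.
# The constant equals group1 + group2, so X is rank deficient.
group1 = np.array([1, 1, 1, 0, 0, 0], dtype=float)
group2 = 1.0 - group1
constant = np.ones(6)
X = np.column_stack([group1, group2, constant])

rank = np.linalg.matrix_rank(X)
print(f"columns = {X.shape[1]}, rank = {rank}")  # rank < columns: betas not unique

def is_estimable(c, X, tol=1e-8):
    """A contrast has a unique estimate if it lies in the row space of X."""
    projector = np.linalg.pinv(X) @ X  # orthogonal projector onto the row space
    return bool(np.allclose(c @ projector, c, atol=tol))

print(is_estimable(np.array([1.0, -1.0, 0.0]), X))  # group difference -> True
print(is_estimable(np.array([1.0, 0.0, 0.0]), X))   # group 1 beta alone -> False
```

The group difference is still well defined even though the individual betas are not.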
Contrast Examples
• Contrasts = linear combinations of regressors, to be compared to zero
• Could compare one to zero: a > 0
• Or difference of two: a – b > 0, aka “a > b”
[Figure labels: contrast weights +1 and -1; Covariates]
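As an illustration, a contrast is just a weight vector applied to the estimated betas. The regressors a and b, the true effect sizes, and the noise level below are invented for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Hypothetical regressors a and b plus a constant.
a = rng.normal(size=n)
b = rng.normal(size=n)
X = np.column_stack([a, b, np.ones(n)])
y = 3.0 * a + 1.0 * b + 10.0 + rng.normal(scale=0.5, size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)

c_a = np.array([1.0, 0.0, 0.0])        # "a > 0"
c_a_vs_b = np.array([1.0, -1.0, 0.0])  # "a > b": weights +1 and -1

con_a = c_a @ beta            # estimate of a's effect (close to 3 here)
con_a_vs_b = c_a_vs_b @ beta  # estimate of a minus b (close to 2 here)
print("a     :", con_a)
print("a - b :", con_a_vs_b)
```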
The Constant Term
How the GLM constant term is included also affects what contrasts are estimable
Usually better to have it “implicit”
Source: Rik Henson “GLM & RFT”
Regressor Scaling/Centering
• Regressors in the design matrix X need to be scaled, or big ones will dominate small ones
– SPM automatically scales covariates as they are added
• Centering affects model error and interpretation
– Overall mean (default)
– No centering
– Factor-based (for covariate interpretation that differs across factor levels)
• Also, can specify interaction with a factor
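A minimal sketch of the interpretation point, assuming a model with one covariate and a constant term. The "age" covariate and all values are hypothetical. With overall-mean centering the slope is unchanged, but the constant becomes the mean of the data rather than the predicted value at age 0.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
age = rng.uniform(20, 60, size=n)  # hypothetical covariate
y = 5.0 + 0.1 * age + rng.normal(scale=0.5, size=n)

def fit(covariate):
    """Ordinary least squares with a constant and one covariate."""
    X = np.column_stack([np.ones(n), covariate])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

raw = fit(age)                    # no centering
centered = fit(age - age.mean())  # overall-mean centering

print("no centering : constant = %.2f, slope = %.3f" % tuple(raw))
print("mean centered: constant = %.2f, slope = %.3f" % tuple(centered))
print("mean of y    : %.2f" % y.mean())
```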
Regressor Orthogonalization
• “Orthogonal” regressors are independent
– Inner product of vectors = 0
• Not (quite) the same thing as “uncorrelated”!
– Inner product of de-meaned vectors = 0
Rodgers et al. (1984) Linearly independent, orthogonal and uncorrelated variables. The American Statistician, 38:133-134
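The two definitions can be checked directly. The vectors below are toy examples chosen so that each pair satisfies one definition but not the other.

```python
import numpy as np

def inner(u, v):
    """Raw inner product: zero means orthogonal."""
    return float(np.dot(u, v))

def demeaned_inner(u, v):
    """Inner product after removing each mean: zero means uncorrelated."""
    return float(np.dot(u - u.mean(), v - v.mean()))

# Orthogonal but correlated: raw inner product is 0, de-meaned one is not.
u = np.array([1.0, 1.0, 0.0, 0.0])
v = np.array([0.0, 0.0, 1.0, 1.0])
print(inner(u, v), demeaned_inner(u, v))  # 0.0 -1.0

# Uncorrelated but not orthogonal: de-meaned inner product is 0, raw one is not.
p = np.array([1.0, 2.0, 3.0, 4.0])
q = np.array([2.0, 1.0, 1.0, 2.0])
print(inner(p, q), demeaned_inner(p, q))  # 15.0 0.0
```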
Regressor Orthogonalization
• If regressors are not orthogonal*
– Attribution of effects becomes difficult
– Effect estimates (betas) are artificially reduced
• Variance not assignable to one source is “lost”
*or “uncorrelated”—with default de-meaning, these are functionally identical in SPM
Parcellation of Variance
Orthogonal regressors account for different parts of the variance.
Parcellation of Variance
Non-orthogonal regressors account for overlapping parts of the variance, and each ends up with only its unique portion.
[Figure label: Lost]
Parcellation of Variance
If the overlap is particularly severe the effects cannot be estimated reliably: inestimable
Avoid correlated regressors!
[Figure label: Lost]
Orthogonalization
How can we “orthogonalize” non-orthogonal regressors?
1) Avoid the issue in design (for experimental conditions)
2) Principal components analysis/factor analysis? (difficult to interpret)
3) Serial orthogonalization (used by SPM)
Serial Orthogonalization
When we have only one regressor, things are simple…
Y = β1X1
β1 = 1.5
This is the best estimate of Y using only X1
[Figure labels: Y, X1 (vector), β1 (length)]
Example from: Evina Chu, “Basis Functions” SPM MfD course (12/2007)
Serial Orthogonalization
Now consider adding a second regressor, one not orthogonal to the first…
Y = β1X1 + β2X2
β1 = 1
β2 = 1
We can now estimate Y perfectly using both Xs; however, note that β1 has dropped from 1.5 to 1.
X2 is explaining variance X1 could also explain.
[Figure labels: Y, X1, X2, β1 (length), β2 (length)]
Serial Orthogonalization
Let’s orthogonalize X2 with respect to X1. This will create a new variable “X2*”.
Y = β1X1 + β2*X2*
β1 = 1.5
β2* = 1
Orthogonalization (via Gram-Schmidt process) produces a new variable, based on the old one and existing variables
Serial process, so order matters!
β1 is back to 1.5
[Figure labels: Y, X1, X2, X2*, β1 (length), β2* (length)]
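The numbers in this example can be reproduced with two-element vectors. The specific vectors below are an assumption, picked so that the betas come out as on the slides; the Gram-Schmidt step subtracts from X2 its projection onto X1.

```python
import numpy as np

# Hypothetical 2-D vectors chosen to reproduce the numbers on the slides.
X1 = np.array([1.0, 0.0])
X2 = np.array([0.5, 1.0])  # not orthogonal to X1
Y = X1 + X2                # so the full model fits Y exactly with betas (1, 1)

def ols(columns, y):
    """Least-squares betas for the given list of regressors."""
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

print("X1 only    :", ols([X1], Y))      # beta1 = 1.5
print("X1 and X2  :", ols([X1, X2], Y))  # beta1 = 1, beta2 = 1

# Gram-Schmidt step: remove from X2 its projection onto X1.
X2_star = X2 - (X1 @ X2) / (X1 @ X1) * X1
print("X1 and X2* :", ols([X1, X2_star], Y))  # beta1 = 1.5, beta2* = 1

# Order matters: orthogonalizing X1 with respect to X2 instead hands the
# shared variance to X2 (beta1* stays 1, beta2 becomes 1.4).
X1_star = X1 - (X2 @ X1) / (X2 @ X2) * X2
print("X1* and X2 :", ols([X1_star, X2], Y))
```

In both orderings the regressor that was orthogonalized keeps its beta from the full model, and the one left alone picks up the shared variance.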
Serial Orthogonalization in SPM
• Design matrix is orthogonalized left-to-right, so order matters
• Tip: put the “most important” covariates first (the ones whose meaning you don’t want to change)
– If all are “nuisance regressors” that won’t be interpreted, order doesn’t matter
• Can always plot the final (orthogonalized) variables
Significance Testing
• The t-test is the basic unit of SPM
– Measures “signal” (here, contrast) relative to “noise” (variability)
– T-value generated for each contrast at each voxel
• Recall that the t-test is implemented in a GLM framework (allowing covariates)
• SPM outputs spmT (“t-map”) as well as con and beta files
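A sketch of the “signal relative to noise” computation for one voxel, using the textbook GLM formula t = c'β / sqrt(σ² c'(X'X)⁻¹c). The design, effect sizes, and noise are simulated, and the code ignores details such as temporal autocorrelation that a real fMRI analysis has to handle.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 120

# Hypothetical design: two regressors plus a constant.
X = np.column_stack([rng.normal(size=n), rng.normal(size=n), np.ones(n)])
y = X @ np.array([1.0, 0.2, 50.0]) + rng.normal(scale=2.0, size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta
dof = n - np.linalg.matrix_rank(X)
sigma2 = residuals @ residuals / dof  # residual variance ("noise")

def t_value(c):
    """t = contrast estimate / standard error of the contrast."""
    con = c @ beta  # the "signal": the contrast value
    se = np.sqrt(sigma2 * (c @ np.linalg.pinv(X.T @ X) @ c))
    return con / se

t_first = t_value(np.array([1.0, 0.0, 0.0]))
t_diff = t_value(np.array([1.0, -1.0, 0.0]))
print("t for [1 0 0]  :", t_first)
print("t for [1 -1 0] :", t_diff)
```

Repeating this at every voxel gives one t-value per voxel per contrast, which is what a t-map stores.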
F-tests
• SPM can also do F-tests
– In this context, can be thought of as a generalization of t-tests
– Tests multiple conditions but without the directionality of t-tests
– Tells “is any combination of these variables having a significant effect?” (but not which ones/which direction)
• Produce “F-maps”
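One common way to compute an F statistic is to compare a full model against a reduced model that drops the tested regressors (the “extra sum of squares” approach). The sketch below uses simulated data for one voxel; the regressors A and B are hypothetical, with effects of opposite sign to show that direction does not matter for F.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 120
A = rng.normal(size=n)
B = rng.normal(size=n)
const = np.ones(n)
y = 0.8 * A - 0.8 * B + 10.0 + rng.normal(size=n)

def rss(X):
    """Residual sum of squares after a least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

X_full = np.column_stack([A, B, const])
X_reduced = const[:, None]  # same model without A and B

# Equivalent to the F contrast [1 0 0; 0 1 0]: do A and B explain anything?
n_tested = 2
dof_error = n - X_full.shape[1]
F = ((rss(X_reduced) - rss(X_full)) / n_tested) / (rss(X_full) / dof_error)
print(f"F({n_tested}, {dof_error}) = {F:.1f}")
```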
T/F-test Comparison
Source: Rik Henson, “SPM GLM” talk
[Figure: three panels, each with axes A and B, showing T: [1 -1], F: [1 -1], and F: [1 0; 0 1]]
The Multiple Comparisons Problem
• Making “independent” assessments at each voxel leads to many, many statistical tests
• If we assume p < 0.05 threshold and complete independence, about 5% of tested voxels with no true effect will be false positives!
• ~100,000 voxels –> 5,000 false positives…
Multiple Comparisons Correction
• We can “correct” by setting a more stringent threshold
• This is called setting a “family wise” threshold, and leads to a “family wise error rate”
• Bonferroni correction: divide p threshold by number of tests
– p < 0.05 @ 100,000 voxels becomes p < 0.05/100,000, i.e., p < 0.0000005 family-wise threshold
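A quick simulation of the arithmetic, assuming 100,000 independent voxels with no true effect anywhere. Under the null hypothesis p-values are uniform on [0, 1], so about 5% fall below 0.05, while the Bonferroni threshold of 0.05/100,000 = 5e-7 lets almost none through.

```python
import numpy as np

rng = np.random.default_rng(5)
n_voxels = 100_000
alpha = 0.05

# Under the null hypothesis, p-values are uniform on [0, 1].
p_null = rng.uniform(size=n_voxels)

uncorrected_hits = int(np.sum(p_null < alpha))
bonferroni_threshold = alpha / n_voxels
bonferroni_hits = int(np.sum(p_null < bonferroni_threshold))

print(f"uncorrected p < {alpha}: {uncorrected_hits} false positives")
print(f"Bonferroni threshold  : {bonferroni_threshold:.1e}")
print(f"Bonferroni-corrected  : {bonferroni_hits} false positives")
```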
Multiple Comparisons Correction
• Bonferroni treats tests as independent (overly strict when neighboring voxels are correlated)
• More generally, can use a less strict FWE correction if that doesn’t hold
• SPM uses Random Field Theory (RFT)
– Estimates smoothness across the brain
– Smoothness implies loss of independence between neighbors, and reduces need for correction
– Less smoothness -> better spatial specificity, but in the extreme RFT can become even more conservative than Bonferroni!
Multiple Comparisons Correction
• Another option is False Discovery Rate (FDR) correction
– Instead of controlling the chance of any test being a false positive (p), control the expected fraction of false positives among the tests declared significant (q)
– Often easier to limit to, say, <5% false positives among detections than to a <5% chance of any false positive; thus, less conservative in those cases
– May or may not assume independence
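To make the p vs. q distinction concrete, here is a sketch of the Benjamini-Hochberg procedure, one common FDR method (chosen for illustration; the slides do not say which variant SPM uses), compared with Bonferroni on simulated z-scores. The mix of 9,000 null and 1,000 “active” voxels is an assumption.

```python
import math
import numpy as np

rng = np.random.default_rng(6)
n_null, n_active = 9_000, 1_000

# Hypothetical z-scores: nulls ~ N(0, 1), "active" voxels ~ N(3, 1).
z = np.concatenate([rng.normal(size=n_null), rng.normal(loc=3.0, size=n_active)])
is_active = np.r_[np.zeros(n_null, dtype=bool), np.ones(n_active, dtype=bool)]
p = np.array([0.5 * math.erfc(v / math.sqrt(2)) for v in z])  # one-sided p-values

def benjamini_hochberg(p, q=0.05):
    """Boolean mask of tests declared significant at FDR level q."""
    m = p.size
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    significant = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest rank meeting the bound
        significant[order[: k + 1]] = True
    return significant

fdr = benjamini_hochberg(p, q=0.05)
bonf = p < 0.05 / p.size

for name, mask in [("FDR (BH)", fdr), ("Bonferroni", bonf)]:
    false_pos = int(np.sum(mask & ~is_active))
    print(f"{name:10s}: {int(mask.sum())} declared, {false_pos} false positives")
```

FDR accepts a small fraction of false positives among its detections in exchange for finding more of the truly active voxels.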
Multiple Comparisons Correction
• One other option is to change the search area
• This only makes sense if done a priori!
• For example, a specific hypothesis might call for investigating only regions X and Y
• SPM’s results are always adjusted to the search area
– Note: SPM masks in two stages, and only the first (prior to result generation) affects statistics!
Multiple Comparisons Correction
• So far we’ve considered voxelwise correction (aka “peak level” statistics)
• Can also use the extent of activation (number of contiguous activated voxels) to assess significance (aka “cluster level” statistics)
– Can predict the expected number/size of clusters (using RFT)
– Clusters bigger than a certain size are “significant”
– SPM reports using FWE and FDR values (as with peak level)
Excursions, Peaks, Clusters
• Excursion set: supra-threshold voxels for some threshold
• Peaks: local maxima in the excursion set
• Clusters: sets of neighboring voxels in the excursion set
[Figure: “excursion set” in black. Source: Durnez et al. 2014]
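These three definitions can be illustrated on a toy 2-D map. The values, the threshold of 3.0, and the use of 4-connectivity are all assumptions for this sketch, and only the largest peak per cluster is reported. The flood fill is a simple stand-in for a proper connected-components routine.

```python
import numpy as np

# Hypothetical 2-D "statistic map" (one slice); values are made up.
stat = np.array([
    [0.1, 0.3, 0.2, 0.1, 0.0, 0.2],
    [0.2, 3.5, 4.2, 0.4, 0.1, 0.3],
    [0.1, 3.1, 5.0, 3.3, 0.2, 0.1],
    [0.0, 0.2, 3.2, 0.3, 0.1, 3.6],
    [0.3, 0.1, 0.2, 0.1, 3.4, 4.1],
])
threshold = 3.0
excursion = stat > threshold  # excursion set: supra-threshold voxels

def label_clusters(mask):
    """Label 4-connected clusters with a flood fill (0 = background)."""
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue
        current += 1
        stack = [(int(start[0]), int(start[1]))]
        while stack:
            r, c = stack.pop()
            if not (0 <= r < mask.shape[0] and 0 <= c < mask.shape[1]):
                continue
            if not mask[r, c] or labels[r, c]:
                continue
            labels[r, c] = current
            stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    return labels, current

labels, n_clusters = label_clusters(excursion)
for k in range(1, n_clusters + 1):
    in_cluster = labels == k
    size = int(in_cluster.sum())  # cluster extent in voxels
    flat = np.argmax(np.where(in_cluster, stat, -np.inf))
    peak = tuple(int(i) for i in np.unravel_index(flat, stat.shape))
    print(f"cluster {k}: {size} voxels, largest peak {stat[peak]} at {peak}")
```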
SPM Results Reporting
Cluster level (determined by “kE”, the cluster size)
Peak level (determined by t value)
Clusters, peaks
Sub-peaks (within each cluster)
SPM Results Visualization
• Results table gives the fundamentals
• Accompanied by a “glass brain” (maximum intensity projection) view
• SPM results can be viewed interactively (clicking in the glass brain or table)
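A maximum intensity projection simply keeps the largest value along each viewing direction. The sketch below applies this to a made-up thresholded 3-D map; the array size, the blob, and the mapping of axes to sagittal/coronal/axial views are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical thresholded 3-D t-map (x, y, z); zeros are sub-threshold voxels.
tmap = np.zeros((20, 24, 18))
tmap[5:8, 10:13, 6:9] = rng.uniform(3.0, 6.0, size=(3, 3, 3))  # one made-up blob

# Maximum intensity projection: keep the largest value along each viewing axis.
sagittal = tmap.max(axis=0)  # collapse x -> (y, z) view
coronal = tmap.max(axis=1)   # collapse y -> (x, z) view
axial = tmap.max(axis=2)     # collapse z -> (x, y) view

print(sagittal.shape, coronal.shape, axial.shape)
```

Because depth is collapsed, a blob visible in one projection has to be located by combining all three views.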
Additional SPM Plots
• Can also use several additional views in SPM:
– Slices: several contiguous slices
– Sections: several orthogonal slices
– Render: surface map showing near/at surface activations
• In general, want to plot what you are using to make your inferences (especially for publication)