if you liked it you should've put a p-value on it ...or not
DESCRIPTION
Statistical inference in neuroimaging
TRANSCRIPT
![Page 1: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/1.jpg)
If you liked it you should’ve put a p-value on it
… or not.
Chris Gorgolewski Max Planck Institute for Human Cognitive and Brain Sciences
![Page 2: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/2.jpg)
SIGNAL DETECTION THEORY
Signal and noise
False positive and false negative errors
Power
![Page 3: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/3.jpg)
Signal detection theory
![Page 4: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/4.jpg)
Types of errors
![Page 5: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/5.jpg)
Vocabulary
• Type I error – false positive
• Type II error – false negative
• False positive rate
• False negative rate
• Statistical power = 1 – false negative rate
• Sensitivity = Power
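As a minimal sketch of the last two definitions, the snippet below computes power as 1 – false negative rate for a one-sided z-test with unit variance. The function name, the effect size of 0.5 and the sample size of 20 are illustrative assumptions, not values from the talk.

```python
from scipy.stats import norm


def one_sided_z_power(effect_size, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test (unit variance assumed)."""
    z_crit = norm.isf(alpha)            # threshold on the z statistic
    shift = effect_size * n ** 0.5      # mean of the z statistic under H1
    false_negative_rate = norm.cdf(z_crit - shift)
    return 1.0 - false_negative_rate


# Hypothetical example values.
power = one_sided_z_power(effect_size=0.5, n=20)
print(f"power = {power:.3f}")
```

With no effect at all (effect size 0) the "power" collapses to the false positive rate, which is one way to see how the two error types are tied to the same threshold.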
![Page 6: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/6.jpg)
Inference = thresholding
![Page 7: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/7.jpg)
Inference = thresholding
![Page 8: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/8.jpg)
Signal to Noise ratio
![Page 9: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/9.jpg)
Looking in the wrong places
![Page 10: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/10.jpg)
Lower SNR = we miss more stuff
![Page 11: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/11.jpg)
Lower SNR = higher FDR threshold
![Page 12: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/12.jpg)
VOXELWISE TESTS
P-maps
Multiple comparison
FWE correction: Bonferroni, permutations
FDR correction: B-H, Local FDR
![Page 13: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/13.jpg)
Hypothesis testing
• Distinguish between two hypotheses
1. H0 – there is no difference between groups
2. H1 – there is a difference between groups
• Or…
1. H0 – there is no relation between two variables
2. H1 – there is some relation between the two variables
![Page 14: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/14.jpg)
From statistical values to p-values
• Various procedures give us statistical values
– T-tests (one sample, two sample, paired etc.)
– F-Tests
– Correlation tests (r values)
• What is a p value?
![Page 15: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/15.jpg)
P value
• P(z) = the probability that, if we repeated our experiment (with all the analyses) and there were no effect, we would get this or a greater statistical value.
![Page 16: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/16.jpg)
t, z, F to p
![Page 17: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/17.jpg)
OK back to neuroimaging
• Assuming that we are doing a massive univariate analysis (we look at each voxel independently) we have a t-map
• Now using a theoretical distribution (given the degrees of freedom) we can turn it into a p-map
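A rough sketch of that conversion, assuming a one-sample t-test per voxel and simulated noise in place of real data (the array shape and subject count are made up for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects = 20
# Hypothetical data: one small 3D volume of pure noise per subject.
data = rng.normal(size=(n_subjects, 4, 4, 4))

# Massive univariate analysis: one t-test per voxel, along the subject axis.
t_map, _ = stats.ttest_1samp(data, popmean=0.0, axis=0)

# Theoretical t distribution with n - 1 degrees of freedom.
# sf is the survival function, 1 - cdf: "this or a greater value" under H0.
df = n_subjects - 1
p_map = stats.t.sf(t_map, df)
```

This gives one-sided p-values; a two-sided p-map would use the absolute t values and double the result.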
![Page 18: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/18.jpg)
Inference!
• We take our p-map and discard all voxels with values > 0.05
– “The value for which P=0.05, or 1 in 20, is 1.96 or nearly 2; it is convenient to take this point as a limit in judging whether a deviation ought to be considered significant or not. Deviations exceeding twice the standard deviation are thus formally regarded as significant.”
• We are done – right?
![Page 19: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/19.jpg)
Not quite done yet…
• Let me generate two vectors of values and test using a t-test if they are different
• What is the probability that P(t) < 0.05?
– Well… 0.05
• Let me generate another set of values… and another… 100 pairs of vectors
• What is the probability that at least one of the tests gives P(t) < 0.05?
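The thought experiment can be run directly. The sketch below assumes 100 independent two-sample t-tests on pure noise with 20 values per vector (the vector length and the number of repetitions are arbitrary choices) and compares the simulated answer with the analytic one, 1 – 0.95^100, which is about 0.994.

```python
import numpy as np
from scipy import stats

n_tests = 100
alpha = 0.05

# Analytic answer, assuming the tests are independent.
analytic = 1 - (1 - alpha) ** n_tests

# Simulation: repeat the whole "100 pairs of vectors" experiment many times.
rng = np.random.default_rng(0)
n_experiments = 1000
a = rng.normal(size=(n_experiments, n_tests, 20))
b = rng.normal(size=(n_experiments, n_tests, 20))
_, p = stats.ttest_ind(a, b, axis=-1)

# Fraction of experiments with at least one "significant" test.
simulated = np.mean((p < alpha).any(axis=1))
print(f"analytic = {analytic:.3f}, simulated = {simulated:.3f}")
```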
![Page 20: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/20.jpg)
The Salmon of Doubt
![Page 21: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/21.jpg)
Correcting for multiple comparisons
• Bonferroni correction (based on Boole’s inequality)
– Divide your p-threshold by the number of tests you have performed
– Or multiply your p-values by the number of tests you have performed
![Page 22: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/22.jpg)
Bonferroni is a Family Wise Error correction
It guarantees that the chance of getting at least one false positive across all the tests is no greater than your
p-threshold
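Both forms of the Bonferroni correction fit in a few lines. This is a sketch with a hypothetical helper name; the adjusted p-values are capped at 1 so they remain valid probabilities.

```python
import numpy as np


def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and a significance mask."""
    p_values = np.asarray(p_values, dtype=float)
    n_tests = p_values.size
    # Form 1: multiply the p-values by the number of tests (capped at 1).
    adjusted = np.minimum(p_values * n_tests, 1.0)
    # Form 2: divide the threshold by the number of tests.
    significant = p_values < alpha / n_tests
    return adjusted, significant


adjusted, significant = bonferroni([0.001, 0.02, 0.04, 0.5])
```

With four tests the per-test threshold becomes 0.0125, so only the first p-value survives.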
![Page 23: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/23.jpg)
Permutation based FWE correction
• The assumptions behind the theoretical distributions are often not met
• There are many dependencies between voxels
– The tests are not independent, so Bonferroni correction can be overly conservative
• We can however establish an empirical distribution
![Page 24: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/24.jpg)
Permutation based FWE correction
1. Break the relation: shuffle the participants between the groups
2. Perform the test
3. Save the maximum statistical value across voxels
4. Repeat
![Page 25: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/25.jpg)
Permutation based FWE correction
Our FWE corrected p value is the percentage of permutations whose maximum statistical value was
higher than the original (unshuffled) one
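A minimal sketch of the max-statistic permutation procedure for a two-group comparison. It assumes data arranged as subjects x voxels, a one-sided test (group A > group B), and the common convention of counting the unshuffled labelling as one of the permutations (the +1 terms), which is a small departure from a plain percentage. The function name and defaults are illustrative.

```python
import numpy as np
from scipy import stats


def max_stat_permutation(group_a, group_b, n_permutations=1000, seed=0):
    """FWE-corrected p-value per voxel from the permutation max statistic."""
    rng = np.random.default_rng(seed)
    data = np.concatenate([group_a, group_b], axis=0)
    n_a = group_a.shape[0]

    observed, _ = stats.ttest_ind(group_a, group_b, axis=0)

    max_null = np.empty(n_permutations)
    for i in range(n_permutations):
        # 1. Break the relation: shuffle participants between the groups.
        order = rng.permutation(data.shape[0])
        # 2. Perform the test.
        t_perm, _ = stats.ttest_ind(data[order[:n_a]], data[order[n_a:]], axis=0)
        # 3. Save the maximum statistical value across voxels.
        max_null[i] = t_perm.max()

    # Fraction of permutations whose maximum reaches the observed value.
    exceed = (max_null[:, None] >= observed[None, :]).sum(axis=0)
    return (exceed + 1) / (n_permutations + 1)
```

Because every voxel is compared against the distribution of the maximum over the whole map, the dependencies between voxels are handled by the data rather than by an independence assumption.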
![Page 26: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/26.jpg)
False Discovery Rate
• Even conceptually FWE correction seems conservative
– At least one test out of 60 000?
• Is there a more intuitive way of looking at this?
![Page 27: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/27.jpg)
False Discovery Rate
I present a number of voxels that I think show a strong effect, but I admit that a certain
percentage of them might be false positives.
![Page 28: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/28.jpg)
False Discovery Rate
Percentage of false positive voxels among all significant voxels.
![Page 29: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/29.jpg)
FDR procedures
• Benjamini-Hochberg procedure
– With its variant for dependent tests (Benjamini-Yekutieli)
• Efron’s local FDR procedure
– Explicit modeling of the signal distribution
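The basic Benjamini-Hochberg step-up procedure can be sketched as follows. The helper name is hypothetical, and this is the standard version, not the variant for dependent tests.

```python
import numpy as np


def benjamini_hochberg(p_values, q=0.05):
    """Boolean mask of tests declared significant at FDR level q."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)                 # indices of p-values, smallest first
    ranked = p[order]
    # Compare the i-th smallest p-value with q * i / n.
    below = ranked <= q * np.arange(1, n + 1) / n
    significant = np.zeros(n, dtype=bool)
    if below.any():
        # Step-up: find the largest rank that passes and accept everything up to it.
        k = np.nonzero(below)[0].max()
        significant[order[:k + 1]] = True
    return significant


mask = benjamini_hochberg([0.04, 0.001, 0.03, 0.5])
```

Note the step-up behaviour: a p-value that fails its own rank threshold is still accepted if a larger p-value further down the sorted list passes.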
![Page 30: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/30.jpg)
Interim Summary
• FWE corrections
– Bonferroni – simple but struggles with dependencies (overly conservative)
– Permutations – less dependent on assumptions, but time consuming
• FDR corrections
– B-H – simple but also struggles with dependencies
– Local FDR – data driven, but can fail in case of low SNR
![Page 31: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/31.jpg)
CLUSTER EXTENT TESTS
Test how big are the blobs
Random field theory
Smoothness estimation
Permutation test
The problem of cluster forming threshold
Fun fact: FWE with RFT
![Page 32: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/32.jpg)
Intuition
If we are interested in contiguous regions of activation, why are we looking at voxels and not
blobs?
![Page 33: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/33.jpg)
Aww patterns!
![Page 34: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/34.jpg)
No wait… it’s just smooth noise…
![Page 35: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/35.jpg)
What contributes to expected cluster size?
How likely is it to get a cluster of this size from pure noise?
It depends… on:
1. cluster forming threshold
2. smoothness of the map
3. size of the map
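The effect of smoothness is easy to see in simulation. The sketch below thresholds 2D pure noise, with and without Gaussian smoothing, and reports the largest cluster. The map size, smoothing width and threshold are arbitrary illustrative choices, and the map is re-standardised after smoothing so the same threshold applies to both.

```python
import numpy as np
from scipy import ndimage


def largest_noise_cluster(shape, sigma, z_threshold, seed=0):
    """Size of the largest suprathreshold cluster in (smoothed) pure noise."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=shape)
    if sigma > 0:
        noise = ndimage.gaussian_filter(noise, sigma)
    # Smoothing shrinks the variance, so bring the map back to unit variance.
    noise = (noise - noise.mean()) / noise.std()
    labels, n_clusters = ndimage.label(noise > z_threshold)
    if n_clusters == 0:
        return 0
    sizes = np.bincount(labels.ravel())[1:]   # drop the background label 0
    return int(sizes.max())


rough = largest_noise_cluster((128, 128), sigma=0, z_threshold=2.0)
smooth = largest_noise_cluster((128, 128), sigma=3, z_threshold=2.0)
print(f"largest cluster: unsmoothed = {rough}, smoothed = {smooth}")
```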
![Page 36: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/36.jpg)
Where do we get those parameters?
1. cluster forming threshold
– Arbitrary decision
2. smoothness of the map
– Estimated from the residuals of the GLM
3. size of the map
– Calculated from the mask
![Page 37: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/37.jpg)
Permutation based cluster extent probability
1. Break the relation: shuffle the participants between the groups
2. Perform the test
3. Threshold the map to get clusters
4. Save the sizes of all clusters
5. Repeat
![Page 38: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/38.jpg)
Permutation based cluster extent probability
Our cluster extent p value is the percentage of permutations that yielded a cluster bigger
than the original (unshuffled) one
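A minimal sketch of this procedure, under a few assumptions: data arranged as subjects x spatial dimensions, a one-sided two-sample t-test, and only the largest cluster saved per permutation. Saving the largest cluster is equivalent to asking whether a permutation produced any cluster at least as big as the observed one. As is common, the unshuffled labelling is counted as one permutation (the +1 terms). Function names are hypothetical.

```python
import numpy as np
from scipy import ndimage, stats


def cluster_sizes(stat_map, threshold):
    """Sizes of connected suprathreshold clusters (empty array if none)."""
    labels, _ = ndimage.label(stat_map > threshold)
    return np.bincount(labels.ravel())[1:]    # drop the background label 0


def cluster_extent_p(group_a, group_b, threshold, n_permutations=500, seed=0):
    """Observed cluster sizes and their permutation-based p-values."""
    rng = np.random.default_rng(seed)
    data = np.concatenate([group_a, group_b], axis=0)
    n_a = group_a.shape[0]

    t_obs, _ = stats.ttest_ind(group_a, group_b, axis=0)
    observed = cluster_sizes(t_obs, threshold)

    max_null = np.zeros(n_permutations)
    for i in range(n_permutations):
        # Break the relation, perform the test, threshold, save cluster size.
        order = rng.permutation(data.shape[0])
        t_perm, _ = stats.ttest_ind(data[order[:n_a]], data[order[n_a:]], axis=0)
        sizes = cluster_sizes(t_perm, threshold)
        max_null[i] = sizes.max() if sizes.size else 0

    exceed = (max_null[:, None] >= observed[None, :]).sum(axis=0)
    return observed, (exceed + 1) / (n_permutations + 1)
```

The cluster forming threshold is still an input here: the permutation scheme removes the reliance on smoothness estimates, not the arbitrary decision.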
![Page 39: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/39.jpg)
Cluster forming threshold conundrum
![Page 40: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/40.jpg)
![Page 41: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/41.jpg)
HONORABLE MENTIONS
TFCE
Mixture models
![Page 42: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/42.jpg)
Threshold Free Cluster Enhancement
![Page 43: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/43.jpg)
Spatially Regularized Mixture Models
![Page 44: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/44.jpg)
IMPLEMENTATIONS
SPM
FSL
AFNI
![Page 45: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/45.jpg)
SPM
• RFT based voxelwise FWE correction
• Smoothness estimation
• Cluster extent p-values
• Peak height p-values
• Permutation tests through SnPM toolbox
![Page 46: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/46.jpg)
FSL
• RFT based voxelwise FWE correction
• Smoothness estimation
• Cluster extent p-values
• FDR
• Permutation tests through randomise
– Including TFCE
![Page 47: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/47.jpg)
AFNI
• Cluster extent p-values (3dClustSim)
– Simulations are not permutations
• Smoothness estimation (3dFWHMx)
![Page 48: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/48.jpg)
Interim summary
Clusterwise methods allow us to find surprising patterns in terms of spatially consistent clusters
instead of individual voxels.
![Page 49: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/49.jpg)
LIMITATIONS OF P-VALUES
![Page 50: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/50.jpg)
P-VALUES ARE MEANINGLESS
![Page 51: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/51.jpg)
FORGET ALL I SAID SO FAR
![Page 52: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/52.jpg)
WE ARE ALL DOOMED
![Page 53: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/53.jpg)
P-value paradox
• There are no two entities or groups that are truly identical
• There are no two variables that are completely unrelated
• We just fail to obtain enough samples to see it
– Or our tools are not sensitive enough
![Page 54: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/54.jpg)
More samples more “significance”
• The more subjects you have in your study, the more likely it is that you will find something significant
• The same applies to scan length, and field strength
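A small simulation makes the point. It assumes two groups that differ by a tiny, practically irrelevant amount (0.05 standard deviations, a made-up value) and shows the p-value of a two-sample t-test for growing sample sizes.

```python
import numpy as np
from scipy import stats


def two_sample_p(n_per_group, true_difference, rng):
    """p-value of a two-sample t-test on simulated groups."""
    a = rng.normal(loc=true_difference, size=n_per_group)
    b = rng.normal(loc=0.0, size=n_per_group)
    return stats.ttest_ind(a, b).pvalue


rng = np.random.default_rng(0)
true_difference = 0.05   # hypothetical tiny effect, in standard deviations
for n in (20, 200, 2000, 200000):
    p = two_sample_p(n, true_difference, rng)
    print(f"n per group = {n:6d}   p = {p:.4g}")
```

The effect is the same in every row; only the amount of data changes.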
![Page 55: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/55.jpg)
H0 is never true
we just fail to show that
![Page 56: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/56.jpg)
P-value failure
• P-values do not tell us much about actual size of the effect
• Nor do they tell us about the predictive power of the relation found
![Page 57: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/57.jpg)
The interesting question
Is PCC involved in autism?
vs.
Given the cortical thickness of a subject’s PCC, how well am I able to predict his or her diagnosis?
![Page 58: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/58.jpg)
Why does this matter
• More subjects, longer scans, stronger scanners – everything is significant
– We are getting there
• Lack of faith in science from the public
– Poor reproducibility
![Page 59: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/59.jpg)
What needs to be done
We need more replications
We need to start reporting null results
![Page 60: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/60.jpg)
What you can do
• Report effect sizes and their confidence intervals
– For all tests/voxels – not just the significant ones
• Share the unthresholded statistical maps
– It only takes 5 minutes on neurovault.org
• Report all the tests you have performed – not just the significant ones
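As one possible way to follow the first recommendation, the sketch below reports a group difference with its confidence interval and a standardised effect size (Cohen's d). It assumes two independent groups with equal variances; the function name and example values are hypothetical, and for a whole map the same computation would be applied to every voxel.

```python
import numpy as np
from scipy import stats


def effect_size_with_ci(a, b, confidence=0.95):
    """Mean difference, its confidence interval, and Cohen's d."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = a.size, b.size
    diff = a.mean() - b.mean()
    df = n_a + n_b - 2
    # Pooled variance (equal variances assumed).
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / df
    se = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_crit = stats.t.ppf(0.5 + confidence / 2, df)
    cohens_d = diff / np.sqrt(pooled_var)
    return diff, (diff - t_crit * se, diff + t_crit * se), cohens_d


diff, (ci_low, ci_high), d = effect_size_with_ci([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"difference = {diff:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}], d = {d:.2f}")
```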
![Page 62: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/62.jpg)
![Page 63: If you liked it you should've put a p-value on it ...or not](https://reader034.vdocument.in/reader034/viewer/2022052601/558df1c91a28ab2b438b45ea/html5/thumbnails/63.jpg)
If you liked it you should’ve convinced a skeptical researcher
to try to replicate your results.