correction for multiple comparisons - university of edinburgh · correction for multiple...
TRANSCRIPT
![Page 1: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/1.jpg)
Correction for multiple comparisons
Cyril Pernet, PhD
SBIRC/SINAPSE – University of Edinburgh
![Page 2: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/2.jpg)
Overview
- Multiple comparisons correction procedures
- Levels of inference (set, cluster, voxel)
- Circularity issues
![Page 3: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/3.jpg)
Multiple comparison correction
Avoiding false positives
![Page 4: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/4.jpg)
What Problem?
4-Dimensional Data
- 1,000 multivariate observations, each with > 100,000 elements
- 100,000 time series, each with 1,000 observations
Massively Univariate Approach
- 100,000 hypothesis tests
Massive MCP!
(Figure: stack of image volumes numbered 1, 2, 3, ..., 1,000)
Tom Nichols' intro
![Page 5: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/5.jpg)
What Problem?
A typical brain has ~ 130,000 voxels.
At p = .05, we expect 6,500 false positives!
At a more conservative value like p = .001 we still expect 130 false positives.
Using an extent threshold k without correction is not enough, because false positives can also cluster by chance.
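The arithmetic behind these numbers can be checked with a few lines of Python. This is only an illustrative sketch: the function name is made up for the example, and it assumes every voxel is a true null, so that the expected count is simply the number of tests times alpha.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of false positives when every null hypothesis is true."""
    return n_tests * alpha


n_voxels = 130_000  # "typical brain" size quoted on the slide
for alpha in (0.05, 0.001):
    count = expected_false_positives(n_voxels, alpha)
    print(f"alpha = {alpha}: about {count:.0f} false positives expected")
```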
![Page 6: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/6.jpg)
What Problem?
Bennett et al., 2009
- Task: take a decision about emotions on pictures
- Design: blocks of 12 sec activation/rest
- Analysis: standard data processing with SPM
- Subject: a dead salmon!
![Page 7: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/7.jpg)
What Problem?
The cluster was 81 mm³! After multiple comparison corrections, all false activations were removed.
![Page 8: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/8.jpg)
Solutions for MCP
Height Threshold
- Familywise Error Rate (FWER): chance of any false positives; controlled by Bonferroni & Random Field Methods
- False Discovery Rate (FDR): proportion of false positives among rejected tests
Bayes Statistics
![Page 9: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/9.jpg)
From single univariate to massive univariate

| Univariate stat | Functional neuroimaging |
| --- | --- |
| 1 observed data | Many voxels |
| 1 statistical value | Family of statistical values |
| Type 1 error rate (chance to be wrong rejecting H0) | Family-wise error rate |
| Null hypothesis | Family-wise null hypothesis |
![Page 10: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/10.jpg)
Height Threshold
Choose locations where a test statistic Z (T, F, ...) is large, i.e. threshold the image of Z at a height z.
The problem is how to choose this threshold z to exclude false positives with a high probability (e.g. 0.95).
To control the family-wise error, one must take into account the number of tests.
![Page 11: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/11.jpg)
Bonferroni
10000 Z-scores; alpha = 5%
alpha corrected = .000005; z-score = 4.42
(Figure: 100 × 100 voxel image)
![Page 12: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/12.jpg)
Bonferroni
10000 Z-scores; alpha = 5%
2D homogeneous smoothing: 100 independent observations
alpha corrected = .0005; z-score = 3.29
(Figure: 100 × 100 voxel image)
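The two Bonferroni thresholds quoted on these slides can be reproduced as follows. This is a minimal sketch using only the Python standard library; the helper name `bonferroni_z` is an assumption made for the example, and the z-score is taken as a one-sided threshold of the standard normal distribution, which is what matches the values 4.42 and 3.29.

```python
from statistics import NormalDist


def bonferroni_z(n_independent_tests: int, alpha: float = 0.05) -> tuple[float, float]:
    """Return (corrected alpha, one-sided z threshold) for a Bonferroni correction."""
    alpha_corrected = alpha / n_independent_tests
    # z such that P(Z > z) = alpha_corrected under a standard normal
    z = NormalDist().inv_cdf(1.0 - alpha_corrected)
    return alpha_corrected, z


# 10000 independent Z-scores (unsmoothed 100 x 100 image)
print(bonferroni_z(10_000))
# 100 independent observations (after homogeneous smoothing)
print(bonferroni_z(100))
```

The smoothed image needs a much lower threshold because only the number of independent observations enters the correction, not the number of voxels.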
![Page 13: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/13.jpg)
Solutions for MCP
An important feature of neuroimaging data is that we have a family of stat values that has topological features (Bonferroni, for instance, considers tests as independent).
Why consider data as a smooth lattice? (Chumbley et al., 2009, NeuroImage 44)
- fMRI/PET are projection methods of data points onto the whole space; MEEG forms continuous functions in time and is smooth across the scalp (space).
- Neural activity propagates locally through intrinsic/lateral connections and is distributed via extrinsic connections; hemodynamic correlates are initiated by diffusing signals (e.g. NO).
![Page 14: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/14.jpg)
Random Field Theory
10000 Z-scores; alpha = 5%
Gaussian kernel smoothing: how many independent observations?
(Figure: 100 × 100 voxel image)
![Page 15: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/15.jpg)
Random Field Theory
RFT relies on theoretical results for smooth statistical maps (hence the need for smoothing). It allows us to find a threshold in a set of data where it is not easy to find the number of independent variables. It uses the expected Euler characteristic (EC density).
1. Estimation of the smoothness = number of resels (resolution elements) = f(nb voxels, FWHM)
2. Expected Euler characteristic = number of clusters above the threshold
3. Calculation of the threshold
![Page 16: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/16.jpg)
Random Field Theory
The Euler characteristic can be seen as the number of blobs in an image after thresholding (the p value that you select in SPM).
At high threshold, EC = 0 or 1 per resel: $E[EC] \approx p_{FWE}$
$E[EC] = R \cdot (4 \ln 2) \cdot (2\pi)^{-3/2} \cdot Z_t \cdot e^{-Z_t^2/2}$ for a 2D image (R = number of resels, $Z_t$ = threshold); it is more complicated in 3D.
![Page 17: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/17.jpg)
Random Field Theory
For 100 resels, the equation gives E[EC] = 0.049 for a threshold Z of 3.8, i.e. the probability of getting one or more blobs where Z is greater than 3.8 is 0.049.
(Figure: 100 × 100 voxel image)
If the resel size is much larger than the voxel size, then E[EC] only depends on the number of resels; otherwise it also depends on the volume, surface and diameter of the search area (i.e. shape and volume matter).
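The 0.049 figure can be checked numerically. The sketch below evaluates the 2D expected Euler characteristic, E[EC] = R (4 ln 2) (2π)^(-3/2) Z exp(-Z²/2), with the exponent on 2π taken as -3/2, which is the value that reproduces the number quoted on the slide. The function names are hypothetical, and the resel count assumes a 100 × 100 voxel image smoothed with a FWHM of 10 voxels; that FWHM is an assumption for illustration, not stated in the slides.

```python
import math


def resel_count(side_voxels: int, fwhm_voxels: float, n_dims: int = 2) -> float:
    """Number of resels for a square/cubic search region (assumed shape)."""
    return (side_voxels / fwhm_voxels) ** n_dims


def expected_ec_2d(resels: float, z: float) -> float:
    """Expected Euler characteristic of a thresholded 2D Gaussian field."""
    return (
        resels
        * (4 * math.log(2))
        * (2 * math.pi) ** -1.5
        * z
        * math.exp(-0.5 * z * z)
    )


resels = resel_count(100, 10)  # assumed FWHM of 10 voxels gives 100 resels
print(round(expected_ec_2d(resels, 3.8), 3))
```

Compared with the Bonferroni threshold for 10000 independent voxels (4.42), the RFT threshold of about 3.8 is lower because it accounts for the smoothness of the map.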
![Page 18: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/18.jpg)
False discovery Rate
Whereas the family-wise approach corrects for any false positive, the FDR approach aims at correcting among positive results only.
1. Run an analysis with alpha = x%
2. Sort the resulting positive data
3. Threshold to remove the false positives
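One common way to implement the sort-and-threshold steps above is the Benjamini-Hochberg step-up procedure. The slides do not name a specific procedure, so treating it as Benjamini-Hochberg is an assumption here, as are the function name and the example p values. Note that this version sorts all p values rather than only the positive results.

```python
def benjamini_hochberg(p_values: list[float], q: float = 0.05) -> list[bool]:
    """Return a reject/keep flag per test, controlling the FDR at level q."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p value
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= (k / m) * q
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Made-up p values for illustration only
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p, q=0.05))
```

In this toy example, five p values are below the uncorrected .05 level but only the two smallest survive the FDR threshold.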
![Page 19: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/19.jpg)
False discovery Rate
(Figure panels: Signal+Noise; FWE correction; FDR correction)
![Page 20: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/20.jpg)
False discovery Rate
Takes the spatial structure into account.
Under H0 the number of voxels per cluster is known, which gives an uncorrected p value for each cluster; FDR is then applied on the clusters (volume-wise correction).
Assumes that the volume of each cluster is independent of the number of clusters.
![Page 21: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/21.jpg)
Levels of inference
Voxel, cluster and set
![Page 22: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/22.jpg)
Levels of inference
3 levels of inference can be considered:
- Voxel level (prob associated with each voxel)
- Cluster level (prob associated with a set of voxels)
- Set level (prob associated with a set of clusters)
The 3 levels are nested and based on a single probability of obtaining c or more clusters (set level) with k or more voxels (cluster level) above a threshold u (voxel level): $P_w(u, k, c)$
![Page 23: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/23.jpg)
Levels of inference
Set level: we can reject H0 for an omnibus test, i.e. there are some significant clusters of activation in the brain.
Cluster level: we can reject H0 for an area of size k, i.e. a cluster of 'activated' voxels is likely to be true for a given spatial extent.
Voxel level: we can reject H0 at each voxel, i.e. a voxel is 'activated' if it exceeds a given threshold.
![Page 24: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/24.jpg)
Levels of inference
Each level of inference is valid, but the inferences are different. E.g. a set might be enough to check that subjects activated regions selected a priori for a connectivity analysis; clusters might be good enough if hypotheses are about the use of different brain areas between groups.
Both voxel and cluster levels need to address the multiple comparison problem. If the activated region is predicted in advance, the use of corrected p values is unnecessary and inappropriately conservative; a correction for the number of predicted regions (Bonferroni) is enough.
![Page 25: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/25.jpg)
Level of inference
(Figure panels: Uncorrected (bad); after FDR correction; RFT)
Using p = .001 creates an excursion set, from which we get the prob of clusters of that size and the prob of peaks of that height.
![Page 26: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/26.jpg)
Circularity issues in fMRI
![Page 27: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/27.jpg)
Definition
Refers to the problem of selecting data for analysis.
How data (areas usually) are selected, analysed and sorted is key to avoiding circularity.
Put forward by Vul et al., 2009, Perspectives on Psychological Science 4.
Better explained in Kriegeskorte et al., 2009, Nat. Neuroscience 12.
![Page 28: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/28.jpg)
Circularity
Double dipping problem: "data are first analyzed to select a subset and then the subset is reanalyzed to obtain the results. In this context, assumptions and hypotheses determine the selection criterion and selection can, in turn, distort the results."
- Take a group of subjects and measure RTs, then take 2 subgroups from the same subjects and re-do some analysis?? This increases the difference.
- Take fMRI data and get activated areas, extract ROI and re-do some analyses??
![Page 29: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/29.jpg)
Circularity
Selection and tests must be independent: non-independence creates spurious effects.
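A small simulation makes the spurious effect concrete. This is an illustrative sketch on pure noise, not an analysis from the slides: the function name, the number of voxels, the selection fraction and the seed are all assumptions. Voxels are selected on one noise "run" and then measured either on that same run (circular) or on a second, independent run.

```python
import random


def double_dipping_demo(n_voxels: int = 2000, top_fraction: float = 0.05, seed: int = 0):
    """Compare a circular and an independent estimate on data with no true effect."""
    rng = random.Random(seed)
    # Two independent noise measurements per voxel: the true effect is zero everywhere
    run1 = [rng.gauss(0, 1) for _ in range(n_voxels)]
    run2 = [rng.gauss(0, 1) for _ in range(n_voxels)]
    # Selection: keep the voxels with the largest values in run 1
    n_selected = int(n_voxels * top_fraction)
    selected = sorted(range(n_voxels), key=lambda v: run1[v], reverse=True)[:n_selected]
    # Circular test: measure the effect on the data used for selection
    circular = sum(run1[v] for v in selected) / n_selected
    # Independent test: measure the effect on the other run
    independent = sum(run2[v] for v in selected) / n_selected
    return circular, independent


circular, independent = double_dipping_demo()
print(f"circular estimate:    {circular:.2f}")
print(f"independent estimate: {independent:.2f}")
```

The circular estimate is large even though the data are pure noise, whereas the independent estimate stays close to zero.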
![Page 30: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/30.jpg)
Circularity
Independence of the selection and tests
1. Anatomic ROI, analysis of fMRI
2. SPM: the minimal requirement is orthogonality of the contrasts (e.g. find regions using A+B>0, C=[1 1], and test A vs B, C=[1 -1]), but if $N_A$ and $N_B$ are different there is still a bias when testing A-B (across subjects, independence is ensured by $C_{selection}^T (X^T X)^{-1} C_{test} = 0$)
3. Select using a subset of data, test with another one
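The effect of unequal N_A and N_B on the independence term can be seen in a toy case. The sketch below assumes the simplest possible design matrix X, with one indicator column per condition and N_A and N_B rows respectively, so that X^T X is diagonal; this design and the function name are assumptions made for illustration.

```python
def contrast_dependence(n_a: int, n_b: int, c_selection=(1, 1), c_test=(1, -1)) -> float:
    """Compute C_selection^T (X^T X)^-1 C_test for a two-condition indicator design."""
    # With one indicator column per condition, X^T X = diag(n_a, n_b),
    # so its inverse is diag(1 / n_a, 1 / n_b).
    inv_diag = (1.0 / n_a, 1.0 / n_b)
    return sum(cs * d * ct for cs, d, ct in zip(c_selection, inv_diag, c_test))


print(contrast_dependence(20, 20))  # balanced design
print(contrast_dependence(20, 10))  # unbalanced design
```

With a balanced design the term is exactly zero, so selecting on A+B and testing A-B are independent; with unequal counts it is non-zero even though the contrast vectors [1 1] and [1 -1] are orthogonal.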
![Page 31: Correction for multiple comparisons - University of Edinburgh · Correction for multiple comparisons Cyril Pernet, PhD SBIRC/SINAPSE – University of Edinburgh](https://reader030.vdocument.in/reader030/viewer/2022040705/5e03ed781f7d712bfd538ceb/html5/thumbnails/31.jpg)
Enough for today ☺
Thanks for your attention