Statistical Inference and Random Field Theory


Page 1: Statistical Inference and Random Field Theory

Statistical Inference and Random Field Theory

Will Penny, SPM short course, London, May 2003

M. Brett et al., Introduction to Random Field Theory. To appear in Human Brain Function, 2nd Edition.

Page 2: Statistical Inference and Random Field Theory

[Figure: the SPM analysis pipeline. Image data undergo realignment & motion correction, smoothing (with a kernel), and normalisation (to an anatomical reference). The General Linear Model (design matrix, model fitting, parameter estimates) produces a statistic image; the Statistical Parametric Map is assessed with Random Field Theory to give corrected p-values.]

Page 3: Statistical Inference and Random Field Theory

Overview

1. Terminology
2. Theory
3. Imaging Data
4. Levels of Inference
5. SPM Results + FDR?

Page 4: Statistical Inference and Random Field Theory

Overview

1. Terminology
2. Theory
3. Imaging Data
4. Levels of Inference
5. SPM Results

Page 5: Statistical Inference and Random Field Theory

Inference at a single voxel

α = p(t > u | H)

NULL hypothesis, H: activation is zero

[Figure: t-distribution with threshold u = 2]

p-value: probability of getting a value of t at least as extreme as u. If α is small we reject the null hypothesis.
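A minimal sketch of this computation in Python/SciPy (not SPM code); the t-value and degrees of freedom are made-up example numbers:

```python
# Minimal sketch (Python/SciPy, not SPM code): the p-value for a single voxel's
# t-statistic. The t-value and degrees of freedom are illustrative.
from scipy import stats

t_value = 2.0   # observed t-statistic at this voxel
df = 20         # error degrees of freedom from the model fit

# alpha = p(t > u | H): upper-tail probability under the null hypothesis
p_value = stats.t.sf(t_value, df)
print(f"p(t > {t_value} | H) = {p_value:.3f}")   # roughly 0.030
```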

Page 6: Statistical Inference and Random Field Theory

Sensitivity and Specificity

TRUTH vs ACTION:

                Don't Reject   Reject
H True (o)      TN             FP
H False (x)     FN             TP

[Figure: t-scores from regions that truly do (x) and do not (o) activate, with two candidate thresholds u1 < u2]

At u1: Sens = 10/10 = 100%, Spec = 7/10 = 70%
At u2: Sens = 7/10 = 70%, Spec = 9/10 = 90%

Sensitivity = TP/(TP+FN) = power
Specificity = TN/(TN+FP) = 1 − α
FP = Type I error or 'α error'; α = p-value / FP rate / error rate / significance level
FN = Type II error
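A small sketch of the trade-off, assuming illustrative t-scores rather than the slide's exact data:

```python
# Illustrative t-scores (not the slide's data) for 10 truly inactive (o) and
# 10 truly active (x) regions, and the sensitivity/specificity at two thresholds.
import numpy as np

t_null   = np.array([0.2, 0.8, 1.1, 1.4, 1.6, 1.9, 2.1, 2.4, 2.7, 3.1])  # H true (o)
t_active = np.array([1.8, 2.2, 2.6, 2.9, 3.2, 3.5, 3.8, 4.1, 4.5, 5.0])  # H false (x)

def sens_spec(u):
    tp = np.sum(t_active > u)    # truly active and rejected
    fn = np.sum(t_active <= u)   # truly active but missed (Type II errors)
    fp = np.sum(t_null > u)      # truly inactive but rejected (Type I errors)
    tn = np.sum(t_null <= u)     # truly inactive and not rejected
    return tp / (tp + fn), tn / (tn + fp)

for u in (1.7, 2.8):             # a lenient and a strict threshold
    sens, spec = sens_spec(u)
    print(f"u = {u}: sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```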

Page 7: Statistical Inference and Random Field Theory

Inference at a single voxel

α = p(t > u | H)

NULL hypothesis, H: activation is zero

[Figure: t-distribution with threshold u = 2]

We can choose u to ensure a voxel-wise significance level of α.

This is called an 'uncorrected' p-value, for reasons we'll see later.

We can then plot a map of above-threshold voxels.
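A minimal sketch of choosing u for a given uncorrected α and mapping above-threshold voxels; the α, degrees of freedom, and the simulated statistic image are illustrative:

```python
# Minimal sketch: pick u for a voxel-wise (uncorrected) level alpha, then map
# the above-threshold voxels. alpha, df and the statistic image are illustrative.
import numpy as np
from scipy import stats

alpha = 0.001                                    # voxel-wise significance level
df = 20                                          # error degrees of freedom
u = stats.t.isf(alpha, df)                       # p(t > u | H) = alpha

t_map = np.random.standard_t(df, size=(64, 64))  # stand-in statistic image
above = t_map > u                                # map of above-threshold voxels
print(f"u = {u:.2f}; {above.sum()} voxels survive the uncorrected threshold")
```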

Page 8: Statistical Inference and Random Field Theory

Inference for Images

[Figure: simulated images showing Signal, Signal + Noise, and Noise]

Page 9: Statistical Inference and Random Field Theory

Use of an 'uncorrected' p-value, α = 0.1

Percentage of null pixels that are false positives across ten simulated null images: 11.3%, 11.3%, 12.5%, 10.8%, 11.5%, 10.0%, 10.7%, 11.2%, 10.2%, 9.5%.

Using an ‘uncorrected’ p-value of 0.1 will lead us to conclude on average that 10% of voxels are active when they are not.

This is clearly undesirable. To correct for this we can define a null hypothesis for images of statistics.

Page 10: Statistical Inference and Random Field Theory

Family-wise Null Hypothesis

• Family of hypotheses H_k, k = 1, …, K
  – H_Ω = H_1 ∩ H_2 ∩ … ∩ H_k ∩ … ∩ H_K

FAMILY-WISE NULL HYPOTHESIS: activation is zero everywhere.

If we reject the voxel null hypothesis at any voxel, we reject the family-wise null hypothesis.

A false positive anywhere gives a Family-Wise Error (FWE).

Family-wise error rate = ‘corrected’ p-value

Page 11: Statistical Inference and Random Field Theory

[Figure: thresholded null images using an 'uncorrected' p-value, α = 0.1, versus a 'corrected' (FWE) p-value, α = 0.1.]

Page 12: Statistical Inference and Random Field Theory

The Bonferroni correction

Given a family of N independent voxels and a voxel-wise error rate v, the Family-Wise Error rate (FWE) or 'corrected' error rate is

α = 1 − (1 − v)^N ≈ Nv

Therefore, to ensure a particular FWE we choose

v = α / N

A Bonferroni correction is appropriate for independent tests.

If v = 0.05 then over 100 voxels we'll get 5 voxel-wise Type I errors on average, but a much higher α. To ensure α = 0.05 we need v = 0.0005!

A correction for multiple comparisons
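A short sketch of the calculation, assuming a Gaussian (Z) statistic image and an illustrative voxel count:

```python
# Sketch of the Bonferroni correction, assuming a Gaussian (Z) statistic image
# and an illustrative number of voxels.
from scipy import stats

alpha_fwe = 0.05            # desired family-wise error rate
N = 100_000                 # number of voxels in the family (illustrative)

v = alpha_fwe / N           # per-voxel ('uncorrected') error rate
u = stats.norm.isf(v)       # corresponding Z threshold

fwe_exact = 1 - (1 - v) ** N          # alpha = 1 - (1 - v)^N
print(f"v = {v:.1e}, u = {u:.2f}, FWE = {fwe_exact:.4f} (~ N*v = {N * v:.4f})")
```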

Page 13: Statistical Inference and Random Field Theory

The Bonferroni correction

[Figure: independent voxels vs spatially correlated voxels]

Bonferroni is too conservative for brain images

Page 14: Statistical Inference and Random Field Theory

Overview

1. Terminology
2. Theory
3. Imaging Data
4. Levels of Inference
5. SPM Results

Page 15: Statistical Inference and Random Field Theory

Random Field Theory

• Consider a statistic image as a lattice representation of a continuous random field
• Use results from continuous random field theory

Lattice representation

Page 16: Statistical Inference and Random Field Theory

Euler Characteristic (EC)

• Topological measure:
  – threshold an image at u
  – excursion set A_u
  – χ(A_u) = # blobs − # holes
  – at high u, χ(A_u) = # blobs

• Reject H_Ω if the Euler characteristic is non-zero:
  α = Pr(χ(A_u) > 0)

• At high u, the expected Euler characteristic approximates the p-value:
  α ≈ E[χ(A_u)]

Page 17: Statistical Inference and Random Field Theory

Example – 2D Gaussian images

α = R (4 ln 2) (2π)^(−3/2) u exp(−u²/2)

Voxel-wise threshold, u

Number of Resolution Elements (RESELS), R

N=100x100 voxels, Smoothness FWHM=10, gives R=10x10=100

Page 18: Statistical Inference and Random Field Theory

Example – 2D Gaussian images

α = R (4 ln 2) (2π)^(−3/2) u exp(−u²/2)

For R=100 and α=0.05RFT gives u=3.8

Using R=100 in a Bonferroni correction gives u=3.3

Friston et al. (1991) J. Cer. Bl. Fl. M.
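A numerical sketch of solving this equation for u and comparing with the Bonferroni threshold quoted above; it assumes a SciPy root-finder and reproduces the two thresholds approximately:

```python
# Numerical sketch: solve alpha = R (4 ln 2) (2 pi)^(-3/2) u exp(-u^2/2) for u
# with R = 100 and alpha = 0.05, and compare with Bonferroni over R tests.
import numpy as np
from scipy import stats, optimize

R, alpha = 100, 0.05

def expected_ec(u):
    """Expected Euler characteristic of a thresholded 2D Gaussian field."""
    return R * (4 * np.log(2)) * (2 * np.pi) ** (-1.5) * u * np.exp(-u ** 2 / 2)

u_rft = optimize.brentq(lambda u: expected_ec(u) - alpha, 2.0, 6.0)
u_bonf = stats.norm.isf(alpha / R)      # Bonferroni treating the R resels as tests

print(f"RFT threshold        u = {u_rft:.2f}")   # about 3.8
print(f"Bonferroni threshold u = {u_bonf:.2f}")  # about 3.3
```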

Page 19: Statistical Inference and Random Field Theory

Developments

• Friston et al. (1991) J. Cer. Bl. Fl. M. – 2D Gaussian fields (not the EC method)
• Worsley et al. (1992) J. Cer. Bl. Fl. M. – 3D Gaussian fields
• Worsley et al. (1993) Quant. Brain Func. – 3D t-fields

Page 20: Statistical Inference and Random Field Theory

Restricted search regions

[Figure: a box and a frame. The box has 16 markers, the frame has 32 markers; box and frame have the same number of voxels.]

Page 21: Statistical Inference and Random Field Theory

Unified Theory

• General form for the expected Euler characteristic
• χ², F, and t fields
• Restricted search regions

α = Σ_d R_d(Ω) ρ_d(u)

R_d(Ω): RESEL count; depends on the search region – how big, how smooth, what shape?

ρ_d(u): EC density; depends on the type of field (e.g. Gaussian, t) and the threshold u.

[Figure: excursion set A_u]

Worsley et al. (1996), HBM

Page 22: Statistical Inference and Random Field Theory

Unified Theory

• General form for the expected Euler characteristic
• χ², F, and t fields
• Restricted search regions

α = Σ_d R_d(Ω) ρ_d(u)

R_d(Ω): RESEL count
  R_0(Ω) = χ(Ω), the Euler characteristic of Ω
  R_1(Ω) = resel diameter
  R_2(Ω) = resel surface area
  R_3(Ω) = resel volume

ρ_d(u): d-dimensional EC density – e.g. for a Gaussian RF:
  ρ_0(u) = 1 − Φ(u)
  ρ_1(u) = (4 ln 2)^(1/2) exp(−u²/2) / (2π)
  ρ_2(u) = (4 ln 2) u exp(−u²/2) / (2π)^(3/2)
  ρ_3(u) = (4 ln 2)^(3/2) (u² − 1) exp(−u²/2) / (2π)²
  ρ_4(u) = (4 ln 2)² (u³ − 3u) exp(−u²/2) / (2π)^(5/2)

[Figure: excursion set A_u]

Worsley et al. (1996), HBM
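A sketch that evaluates the unified formula for a Gaussian field using the EC densities above; the resel counts R_0..R_3 are illustrative values, not measurements from a real search region:

```python
# Sketch of the unified formula alpha ~ sum_d R_d(Omega) rho_d(u) for a Gaussian
# random field, using the EC densities listed above. The resel counts are
# illustrative values, not measurements from a real search region.
import numpy as np
from scipy import stats

def rho_gauss(d, u):
    """d-dimensional EC density of a Gaussian field at threshold u (d = 0..3)."""
    c = 4 * np.log(2)
    if d == 0:
        return stats.norm.sf(u)                                    # 1 - Phi(u)
    if d == 1:
        return np.sqrt(c) * np.exp(-u ** 2 / 2) / (2 * np.pi)
    if d == 2:
        return c * u * np.exp(-u ** 2 / 2) / (2 * np.pi) ** 1.5
    if d == 3:
        return c ** 1.5 * (u ** 2 - 1) * np.exp(-u ** 2 / 2) / (2 * np.pi) ** 2
    raise ValueError("only d = 0..3 implemented here")

resels = [1.0, 30.0, 300.0, 1000.0]   # illustrative R_0..R_3 (EC, diameter, area, volume)
u = 4.5
alpha = sum(R_d * rho_gauss(d, u) for d, R_d in enumerate(resels))
print(f"Expected EC (approximate corrected p-value) at u = {u}: {alpha:.4f}")
```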

Page 23: Statistical Inference and Random Field Theory

Resel Counts for Brain Structures

FWHM = 20mm. (1) The threshold depends on the search volume. (2) Surface area makes a large contribution.

Page 24: Statistical Inference and Random Field Theory

Overview

1. Terminology
2. Theory
3. Imaging Data
4. Levels of Inference
5. SPM Results

Page 25: Statistical Inference and Random Field Theory

Functional Imaging Data

• The random fields are the component fields, Y = Xw + E, e = E/σ
• We can only estimate the component fields, using estimates of w and σ
• To apply RFT we need the RESEL count, which requires smoothness estimates
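A minimal sketch of estimating the component fields voxel-wise with ordinary least squares (SPM's actual estimation handles serial correlations and more); all sizes and data below are made up:

```python
# Minimal sketch: ordinary least squares fit of Y = Xw + E at every voxel, then
# estimated component fields from the residuals. Sizes and data are made up;
# SPM's actual estimation (serial correlations, filtering) is more involved.
import numpy as np

n_scans, n_voxels = 100, 5000
X = np.column_stack([np.random.randn(n_scans), np.ones(n_scans)])  # design matrix
Y = np.random.randn(n_scans, n_voxels)                             # data matrix

w_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)       # parameter estimates, per voxel
resid = Y - X @ w_hat                               # residuals
df = n_scans - np.linalg.matrix_rank(X)             # error degrees of freedom
sigma_hat = np.sqrt((resid ** 2).sum(axis=0) / df)  # estimated std. dev. per voxel

e_hat = resid / sigma_hat   # each row is an estimated component field
print(e_hat.shape)          # (n_scans, n_voxels)
```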

Page 26: Statistical Inference and Random Field Theory

Component fields

[Figure: the GLM written voxel-wise over the image, Y = Xw + E. The data matrix (scans × voxels) equals the design matrix times the parameters plus the errors; dividing each error field by its standard deviation σ gives the component fields.]

Page 27: Statistical Inference and Random Field Theory

Estimated component fields

[Figure: data matrix = design matrix × parameter estimates + residuals. Dividing the residuals by the estimated standard deviation gives the estimated component fields; each row is an estimated component field.]

Page 28: Statistical Inference and Random Field Theory

Smoothness Estimation

• Roughness |Λ|, where Λ is the variance–covariance matrix of the spatial derivatives of the component fields:

  Λ = Var(∇e) =
      var(e_x)       cov(e_x, e_y)   cov(e_x, e_z)
      cov(e_y, e_x)  var(e_y)        cov(e_y, e_z)
      cov(e_z, e_x)  cov(e_z, e_y)   var(e_z)

• Point Response Function (PRF)

• Gaussian PRF with kernel widths f_x, f_y, f_z (diagonal kernel diag(f_x, f_y, f_z)):

  |Λ|^(1/2) = (4 ln 2)^(3/2) / (f_x f_y f_z)

• RESEL COUNT:

  R_3(Ω) = λ(Ω) / (f_x f_y f_z), where λ(Ω) is the volume of the search region Ω

  α = R_3(Ω) (4 ln 2)^(3/2) (u² − 1) exp(−u²/2) / (2π)²

Approximate the peak of the covariance function with a Gaussian.
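A simplified sketch of the idea behind smoothness estimation: measure the variance of the spatial derivatives of the standardized residual fields, convert to an FWHM per axis, and form the resel count. SPM's estimator includes bias corrections and pooling that this sketch omits:

```python
# Simplified sketch of smoothness estimation: variance of spatial derivatives of
# the standardized residuals -> FWHM per axis -> resel count. SPM's actual
# estimator is more careful; treat this as the idea only.
import numpy as np
from scipy.ndimage import gaussian_filter

def fwhm_and_resels(e_hat_vol):
    """e_hat_vol: standardized residuals, shape (n_scans, nx, ny, nz)."""
    c = 4 * np.log(2)
    fwhm = []
    for axis in (1, 2, 3):                       # x, y, z (in voxel units)
        de = np.diff(e_hat_vol, axis=axis)       # finite-difference derivative
        fwhm.append(np.sqrt(c / de.var()))       # Var(e_x) ~ 4 ln 2 / FWHM_x^2
    fx, fy, fz = fwhm
    n_voxels = np.prod(e_hat_vol.shape[1:])
    resels = n_voxels / (fx * fy * fz)           # R_3(Omega) = volume / (fx fy fz)
    return (fx, fy, fz), resels

# Illustration with smoothed white noise (the FWHM is recovered only roughly):
vol = np.random.randn(20, 32, 32, 32)
vol = gaussian_filter(vol, sigma=(0, 2, 2, 2))   # smooth in space only
vol /= vol.std(axis=0)                           # standardize per voxel
print(fwhm_and_resels(vol))
```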

Page 29: Statistical Inference and Random Field Theory

RFT Assumptions

• Model fit & assumptions
  – valid distributional results
• Multivariate normality
  – of component images
• Covariance function of component images must be
  – stationary (pre SPM99); can be nonstationary (SPM99 onwards)
  – twice differentiable

Smoothness
  – smoothness » voxel size
  – lattice approximation
  – smoothness estimation
  – practically, FWHM ≥ 3 × VoxDim; otherwise conservative

Typical applied smoothing:
  – Single subject: fMRI 6mm, PET 12mm
  – Multi subject: fMRI 8–12mm, PET 16mm

Page 30: Statistical Inference and Random Field Theory

Overview

1. Terminology
2. Theory
3. Imaging Data
4. Levels of Inference
5. SPM Results

Page 31: Statistical Inference and Random Field Theory

Cluster and Set-level Inference

• We can increase sensitivity by trading off anatomical specificity.

• CLUSTER-LEVEL INFERENCE: given a voxel-level threshold u, we can compute the likelihood (under the null hypothesis) of getting a connected component of n or more voxels in the excursion set, i.e. a cluster containing at least n voxels.

• SET-LEVEL INFERENCE: similarly, we can compute the likelihood of getting c clusters each having at least n voxels.

Weak vs Strong control over FWE
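A minimal sketch of the first step of cluster-level inference — thresholding and finding connected components — using scipy.ndimage; computing the corrected cluster p-values from RFT is not shown, and the statistic image is a stand-in:

```python
# Minimal sketch of the first step of cluster-level inference: threshold the
# statistic image at u and find the connected components (clusters) and their
# sizes. The RFT-corrected cluster p-values are not computed here.
import numpy as np
from scipy import ndimage

t_map = np.random.randn(64, 64, 30)        # stand-in statistic image
u = 3.09                                   # voxel-level threshold

excursion = t_map > u                      # excursion set A_u
labels, n_clusters = ndimage.label(excursion)
sizes = ndimage.sum(excursion, labels, index=range(1, n_clusters + 1))

print(f"{n_clusters} clusters; largest sizes (voxels): {sorted(sizes, reverse=True)[:5]}")
```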

Page 32: Statistical Inference and Random Field Theory

Levels of inference

set-level: P(c ≥ 3 | n ≥ 12, u ≥ 3.09) = 0.019
→ at least 3 clusters above threshold

cluster-level: P(c ≥ 1 | n ≥ 82, t ≥ 3.09) = 0.029 (corrected)
→ at least one cluster with at least 82 voxels above threshold

voxel-level: P(c ≥ 1 | n > 0, t ≥ 4.37) = 0.048 (corrected)
→ at least one cluster, with unspecified number of voxels, above threshold

[Figure: three clusters of n = 82, 32 and 12 voxels]

Page 33: Statistical Inference and Random Field Theory

Overview

1. Terminology
2. Theory
3. Imaging Data
4. Levels of Inference
5. SPM Results

Page 34: Statistical Inference and Random Field Theory

SPM99 results I

Activations significant at cluster level but not at voxel level.

Page 35: Statistical Inference and Random Field Theory
Page 36: Statistical Inference and Random Field Theory

SPM99 results II

Activations significant at voxel and cluster level.

Page 37: Statistical Inference and Random Field Theory

SPM results...

Page 38: Statistical Inference and Random Field Theory

False Discovery Rate

TRUTH vs ACTION:

                Don't Reject   Reject
H True (o)      TN             FP
H False (x)     FN             TP

[Figure: t-scores from regions that truly do (x) and do not (o) activate, with two candidate thresholds u1 < u2]

At u1: FDR = 3/13 = 23%, α = 3/10 = 30%
At u2: FDR = 1/8 = 13%, α = 1/10 = 10%

FDR = FP/(FP+TP)
α = FP/(FP+TN)
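The slides define FDR but do not spell out how it is controlled; the standard approach is the Benjamini–Hochberg step-up procedure, sketched here for illustration with stand-in p-values:

```python
# Sketch of the Benjamini-Hochberg step-up procedure for controlling the FDR,
# given the voxel-wise (uncorrected) p-values. Added for illustration; the
# slides themselves only define FDR.
import numpy as np

def bh_threshold(p_values, q=0.10):
    """Return the p-value cut-off controlling FDR at level q (0 if none pass)."""
    p = np.sort(np.asarray(p_values).ravel())
    n = p.size
    below = p <= q * np.arange(1, n + 1) / n      # p_(i) <= q * i / n
    if not below.any():
        return 0.0
    return p[np.nonzero(below)[0].max()]          # largest p_(i) meeting the bound

p_vals = np.random.uniform(size=10000)            # stand-in null p-values
p_crit = bh_threshold(p_vals, q=0.10)
print(f"Declare voxels with p <= {p_crit:.4g} 'active'")
```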

Page 39: Statistical Inference and Random Field Theory

False Discovery Rate: Illustration

[Figure: simulated images showing Signal, Signal + Noise, and Noise]

Page 40: Statistical Inference and Random Field Theory

[Figure: ten simulated images per condition. Control of Familywise Error Rate at 10% – occurrence of a familywise error (FWE). Control of False Discovery Rate at 10% – percentage of activated pixels that are false positives, per image: 6.7%, 10.4%, 14.9%, 9.3%, 16.2%, 13.8%, 14.0%, 10.5%, 12.2%, 8.7%.]

Page 41: Statistical Inference and Random Field Theory

Summary

• We should correct for multiple comparisons.

• We can use Random Field Theory (RFT).

• RFT requires (i) a good lattice approximation to the underlying multivariate Gaussian fields, and (ii) that these fields are continuous with a twice differentiable correlation function.

• To a first approximation, RFT is a Bonferroni correction using RESELs.

• We only need to correct for the volume of interest.

• Depending on the nature of the signal, we can trade off anatomical specificity for sensitivity by using cluster-level inference.