Multivariate models for fMRI data Klaas Enno Stephan (with 90% of slides kindly contributed by Kay H. Brodersen) Translational Neuromodeling Unit (TNU) Institute for Biomedical Engineering University of Zurich & ETH Zurich

Upload: oscar-watson

Post on 26-Dec-2015


Multivariate models for fMRI data

Klaas Enno Stephan (with 90% of slides kindly contributed by Kay H. Brodersen)

Translational Neuromodeling Unit (TNU), Institute for Biomedical Engineering, University of Zurich & ETH Zurich


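Why multivariate?

Univariate approaches are excellent for localizing activations in individual voxels.

[Figure: responses of two voxels (v1, v2) under reward vs. no reward; one comparison is marked significant (*), the other not significant (n.s.)]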


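Why multivariate?

Multivariate approaches can be used to examine responses that are jointly encoded in multiple voxels.

[Figure: responses of two voxels (v1, v2) to orange juice vs. apple juice; each voxel on its own shows no significant difference (n.s.); the two conditions are also plotted in the joint space spanned by v1 and v2]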


Why multivariate?

Multivariate approaches can utilize ‘hidden’ quantities such as coupling strengths.

[Figure: driving input and modulatory input over time t; hidden underlying neural activity and coupling strengths; observed BOLD signal in each region]

Friston, Harrison & Penny (2003) NeuroImage; Stephan & Friston (2007) Handbook of Brain Connectivity; Stephan et al. (2008) NeuroImage


Overview

1 Modelling principles

2 Classification

3 Multivariate Bayes

4 Generative embedding


1 Modelling principles


Encoding vs. decoding

[Figure: context $X_t$ (cause or consequence, e.g. condition, stimulus, response, prediction error) and BOLD signal $Y_t$]

encoding model: $g: X_t \to Y_t$

decoding model: $h: Y_t \to X_t$


Regression vs. classification

Regression model: a function $f$ maps independent variables (regressors) to a continuous dependent variable.

Classification model: a function $f$ maps independent variables (features) to a categorical dependent variable (label), e.g. one class vs. another.


Univariate vs. multivariate models

A univariate model considers a single voxel at a time. Spatial dependencies between voxels are only introduced afterwards, through random field theory.

A multivariate model considers many voxels at once. Multivariate models enable inferences on distributed responses without requiring focal activations.

[Figure: mapping between context and BOLD signal in the univariate and in the multivariate case]


Prediction vs. inference

The goal of prediction is to find a generalisable encoding or decoding function. The relevant quantity is the predictive density. Examples:
• predicting a cognitive state using a brain-machine interface
• predicting a subject-specific diagnostic status

The goal of inference is to decide between competing hypotheses. The relevant quantity is the marginal likelihood (model evidence). Examples:
• comparing a model that links distributed neuronal activity to a cognitive state with a model that does not
• weighing the evidence for sparse vs. distributed coding


Goodness of fit vs. complexity

Goodness of fit is the degree to which a model explains observed data.

Complexity is the flexibility of a model (including, but not limited to, its number of parameters).

We wish to find the model that optimally trades off goodness of fit and complexity.

[Figure: truth, data and model fit of $Y$ against $X$ for models with 1 parameter (underfitting), 4 parameters (optimal) and 9 parameters (overfitting); Bishop (2007) PRML]


2 Classification


Constructing a classifier

A principled way of designing a classifier would be to adopt a probabilistic approach: given features $Y_t$, the classifier $f$ returns that label $X_t$ which maximizes the posterior probability $p(X_t \mid Y_t)$.

In practice, classifiers differ in terms of how strictly they implement this principle.

Generative classifiers use Bayes’ rule to estimate $p(X_t \mid Y_t) \propto p(Y_t \mid X_t)\,p(X_t)$:
• Gaussian naïve Bayes
• Linear discriminant analysis

Discriminative classifiers estimate $p(X_t \mid Y_t)$ directly, without Bayes’ theorem:
• Logistic regression
• Relevance vector machine
• Gaussian process classifier

Discriminant classifiers estimate $f(Y_t)$ directly:
• Fisher’s linear discriminant
• Support vector machine
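To make the generative route concrete, here is a minimal sketch of a Gaussian naïve Bayes classifier in plain Python. It is an illustration under stated assumptions, not code from the slides: the function names, the small variance floor and the toy two-voxel data are all made up for this example.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(features, labels):
    """Estimate a log prior and per-feature Gaussian parameters for each class."""
    by_class = defaultdict(list)
    for x, c in zip(features, labels):
        by_class[c].append(x)
    model = {}
    n_total = len(labels)
    for c, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # The 1e-6 floor keeps variances strictly positive (an assumption of this sketch).
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                     for col, m in zip(columns, means)]
        model[c] = (math.log(n / n_total), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest unnormalized log posterior (Bayes' rule)."""
    def log_posterior(c):
        log_prior, means, variances = model[c]
        log_lik = sum(-0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                      for v, m, var in zip(x, means, variances))
        return log_prior + log_lik
    return max(model, key=log_posterior)

# Toy two-voxel patterns (made-up numbers) for two conditions.
train_x = [[1.0, 0.2], [1.2, 0.1], [0.9, 0.3], [0.1, 1.1], [0.2, 0.9], [0.0, 1.0]]
train_y = ["reward", "reward", "reward", "no reward", "no reward", "no reward"]
nb_model = fit_gaussian_nb(train_x, train_y)
predicted = predict_gaussian_nb(nb_model, [1.1, 0.2])
```

The classifier models the class-conditional likelihood and the class prior separately and combines them only at prediction time, which is what distinguishes it from the discriminative and discriminant families.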


Common types of fMRI classification studies

Searchlight approach
A sphere is passed across the brain. At each location, the classifier is evaluated using only the voxels in the current sphere. Result: a map of t-scores.
Nandy & Cordes (2003) MRM; Kriegeskorte et al. (2006) PNAS

Whole-brain approach
A constrained classifier is trained on whole-brain data. Its voxel weights are related to their empirical null distributions using a permutation test. Result: a map of t-scores.
Mourao-Miranda et al. (2005) NeuroImage


Support vector machine (SVM)

Vapnik (1999) Springer; Schölkopf et al. (2002) MIT Press

[Figure: decision boundaries of a linear SVM and a nonlinear SVM in a two-voxel feature space (axes: v1, v2)]


Stages in a classification analysis

1 Feature extraction

2 Classification using cross-validation

3 Performance evaluation (Bayesian mixed-effects inference)


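Feature extraction for trial-by-trial classification

We can obtain trial-wise estimates of neural activity by filtering the data with a GLM:

$Y = X\beta + e$, with $\beta = (\beta_1, \beta_2, \dots, \beta_p)^\top$

where $Y$ is the data, $X$ the design matrix and $\beta$ the coefficients.

[Figure: design matrix with one boxcar regressor per trial, e.g. the boxcar regressor for trial 2; the estimate of the corresponding coefficient reflects activity on trial 2]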
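The trial-wise GLM can be sketched in a few lines of plain Python. This is a simplified illustration, not the SPM implementation: the boxcars are not convolved with a haemodynamic response function, the data are noiseless, and all names (`trial_design`, `ols`, the onsets and amplitudes) are assumptions made for the example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def ols(design, data):
    """Ordinary least squares via the normal equations X'X beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, data)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def trial_design(n_scans, onsets, duration):
    """One boxcar regressor per trial plus a constant column."""
    rows = []
    for t in range(n_scans):
        boxcars = [1.0 if onset <= t < onset + duration else 0.0 for onset in onsets]
        rows.append(boxcars + [1.0])
    return rows

onsets = [1, 5, 9]                    # hypothetical trial onsets, in scans
X = trial_design(n_scans=13, onsets=onsets, duration=2)
true_beta = [2.0, -1.0, 0.5, 10.0]    # three trial amplitudes plus baseline
y = [sum(b * x for b, x in zip(true_beta, row)) for row in X]
beta_hat = ols(X, y)                  # one estimate per trial, then the constant
```

Each entry of `beta_hat` except the last is the activity estimate for one trial; repeating the fit for every voxel yields the trial-by-trial feature vectors that are passed to the classifier.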


Cross-validation

The generalization ability of a classifier can be estimated using a resampling procedure known as cross-validation. One example is 2-fold cross-validation:

[Figure: examples 1, 2, 3, ..., 99, 100 divided into training examples and test examples in each of 2 folds; the predictions for the test examples enter the performance evaluation]
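A minimal sketch of k-fold cross-validation in plain Python follows. The interleaved fold assignment, the nearest-mean toy classifier and the one-dimensional data are assumptions chosen to keep the example short; setting the number of folds to the number of examples gives the leave-one-out variant.

```python
def k_fold_splits(n_examples, n_folds):
    """Yield (train_indices, test_indices); every example is tested exactly once."""
    indices = list(range(n_examples))
    for fold in range(n_folds):
        test = indices[fold::n_folds]
        test_set = set(test)
        train = [i for i in indices if i not in test_set]
        yield train, test

def nearest_mean_classifier(train_x, train_y):
    """Toy classifier for 1-D features: pick the class whose training mean is closest."""
    means = {}
    for label in set(train_y):
        values = [x for x, y in zip(train_x, train_y) if y == label]
        means[label] = sum(values) / len(values)
    return lambda x: min(means, key=lambda label: abs(x - means[label]))

def cross_validated_accuracy(xs, ys, n_folds):
    """Train on each fold's training set and count correct test predictions."""
    correct = 0
    for train, test in k_fold_splits(len(xs), n_folds):
        classify = nearest_mean_classifier([xs[i] for i in train],
                                           [ys[i] for i in train])
        correct += sum(classify(xs[i]) == ys[i] for i in test)
    return correct / len(xs)

# Made-up, well-separated data: class 0 near 0, class 1 near 1.
xs = [0.1, 0.2, 0.9, 1.1, 0.0, 0.3, 0.8, 1.0]
ys = [0, 0, 1, 1, 0, 0, 1, 1]
acc_2fold = cross_validated_accuracy(xs, ys, n_folds=2)
acc_loo = cross_validated_accuracy(xs, ys, n_folds=len(xs))
```

The classifier never sees a test example during training, which is the property that makes the resulting accuracy an estimate of generalization ability.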


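Cross-validation

A more commonly used variant is leave-one-out cross-validation.

[Figure: examples 1, 2, 3, ..., 99, 100; in each of the 100 folds a single example serves as the test example and all remaining examples serve as training examples; the predictions for the test examples enter the performance evaluation]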


Performance evaluation

Single-subject study with $n$ trials

The most common approach is to assess how likely it is that the obtained number of correctly classified trials could have occurred by chance.

Binomial test: $p = P(K \ge k \mid H_0) = 1 - B(k - 1 \mid n, \pi_0)$

where $k$ is the number of correctly classified trials, $n$ the total number of trials, $\pi_0$ the chance level (typically 0.5) and $B$ the binomial cumulative distribution function.

In MATLAB: p = 1 - binocdf(k-1,n,pi_0)

[Figure: one subject with trial-wise classification outcomes, e.g. [0 1 1 1 0]]
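For readers without MATLAB, the same one-sided binomial test can be sketched with the Python standard library. The function name and the example counts (60 correct out of 100 trials) are assumptions for illustration; the p-value is the probability of observing at least k correct classifications under the chance level.

```python
from math import comb

def binomial_test_p_value(k, n, chance=0.5):
    """One-sided p-value: P(K >= k) when K ~ Binomial(n, chance)."""
    return sum(comb(n, i) * chance ** i * (1 - chance) ** (n - i)
               for i in range(k, n + 1))

# Hypothetical result: 60 of 100 trials classified correctly.
p_value = binomial_test_p_value(k=60, n=100)
```

Note that the sum starts at k itself, so the observed count is included in the tail.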


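Performance evaluation

[Figure: hierarchical structure of a group study. Subjects are drawn from a population, and each subject contributes trial-wise classification outcomes: subject 1: [0 1 1 1 0], subject 2: [1 1 0 1 1], subject 3: [0 1 1 0 0], subject 4: [1 1 1 1 1], ..., last subject: [0 1 1 1 0]]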


Performance evaluation

Group study with $m$ subjects, $n$ trials each

In a group setting, we must account for both within-subjects (fixed-effects) and between-subjects (random-effects) variance components.

• Binomial test on concatenated data (fixed effects)
• Binomial test on averaged data (fixed effects)
• t-test on summary statistics (random effects): $t = \sqrt{m}\,\frac{\bar{\pi} - \pi_0}{\hat{\sigma}_{m-1}}$ and $p = 1 - t_{m-1}(t)$, where $\bar{\pi}$ is the sample mean of sample accuracies, $\pi_0$ the chance level (typically 0.5), $\hat{\sigma}_{m-1}$ the sample standard deviation and $t_{m-1}$ the cumulative Student’s $t$-distribution
• Bayesian mixed-effects inference (mixed effects); available for MATLAB and R

Brodersen, Mathys, Chumbley, Daunizeau, Ong, Buhmann, Stephan (2012) JMLR; Brodersen, Daunizeau, Mathys, Chumbley, Buhmann, Stephan (2013) NeuroImage
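As an illustration of the random-effects route, the sketch below computes the one-sample t statistic on subject-wise accuracies with the Python standard library. The accuracies are made-up numbers and the function name is an assumption. The standard library has no Student t distribution, so the p-value is left to a statistics package (for instance `scipy.stats.t.sf(t_value, dof)`).

```python
import math
import statistics

def accuracy_t_statistic(sample_accuracies, chance=0.5):
    """One-sample t statistic of subject-wise accuracies against the chance level."""
    m = len(sample_accuracies)
    mean_acc = statistics.mean(sample_accuracies)
    sd_acc = statistics.stdev(sample_accuracies)  # sample SD, denominator m - 1
    t = math.sqrt(m) * (mean_acc - chance) / sd_acc
    return t, m - 1  # t statistic and degrees of freedom

# Hypothetical accuracies of five subjects.
accuracies = [0.6, 0.7, 0.8, 0.6, 0.8]
t_value, dof = accuracy_t_statistic(accuracies)
```

This test uses only the between-subjects variability of the accuracies and ignores how many trials each accuracy is based on, which is one motivation for mixed-effects inference.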


Research questions for classification

Overall classification accuracy
[Figure: accuracy (50 % to 100 %) by classification task: Truth or lie? Left or right button? Healthy or ill?]

Spatial deployment of discriminative regions
[Figure: brain map with regional accuracies, e.g. 80% and 55%]

Temporal evolution of discriminability
[Figure: accuracy (50 % to 100 %) over within-trial time, with markers "Accuracy rises above chance" and "Participant indicates decision"]

Model-based classification
[Figure: classification into { group 1, group 2 }]

Pereira et al. (2009) NeuroImage, Brodersen et al. (2009) The New Collection


Potential problems

Multivariate classification studies conduct group tests on single-subject summary statistics that
• discard the sign or direction of underlying effects
• do not necessarily take into account confounding effects (e.g. correlation of task conditions with difficulty)

Therefore, in some analyses confounds rather than distributed representations may have produced positive results.

Todd et al. 2013, NeuroImage


Potential problems

Simulation: Experiment condition (rule A vs. rule B) does not affect voxel activity, but difficulty does. Moreover, experiment condition and difficulty are confounded at the individual-subject level in random directions across 500 subjects.

Todd et al. 2013, NeuroImage
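The logic of such a simulation can be sketched as follows. This is not the code of Todd et al.; it is a single-voxel toy version in which the effect size, trial counts, split-half decoder and random seed are all assumptions. Difficulty shifts the voxel's activity, and which rule is harder is chosen at random for every subject.

```python
import random
import statistics

def simulate_subject(rng, n_trials=40, effect=0.5):
    """One voxel, two rules; only difficulty drives the voxel, with a random sign."""
    direction = rng.choice([-1, 1])  # which rule happens to be harder for this subject
    rule_a = [direction * effect + rng.gauss(0, 1) for _ in range(n_trials)]
    rule_b = [-direction * effect + rng.gauss(0, 1) for _ in range(n_trials)]
    half = n_trials // 2
    # Signed effect, as a univariate contrast (A minus B) would estimate it.
    signed_effect = statistics.mean(rule_a) - statistics.mean(rule_b)
    # Split-half decoding: learn the class means on the first half, test on the second.
    mean_a = statistics.mean(rule_a[:half])
    mean_b = statistics.mean(rule_b[:half])
    threshold = (mean_a + mean_b) / 2
    a_is_higher = mean_a > mean_b
    correct = sum((x > threshold) == a_is_higher for x in rule_a[half:])
    correct += sum((x > threshold) != a_is_higher for x in rule_b[half:])
    accuracy = correct / (2 * (n_trials - half))
    return signed_effect, accuracy

rng = random.Random(0)
results = [simulate_subject(rng) for _ in range(500)]
mean_signed_effect = statistics.mean([effect for effect, _ in results])
mean_accuracy = statistics.mean([acc for _, acc in results])
```

Because the sign of the confound varies across subjects, the signed effects cancel at the group level, whereas decoding accuracy discards the sign and stays above chance in nearly every subject.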


Potential problems

Empirical example: searchlight analysis of an fMRI study on rule representations (flexible stimulus–response mappings)
• Standard MVPA: rule representations in prefrontal regions.
• GLM: no significant results.
• Controlling for a variable that is confounded with rule at the individual-subject level but not the group level (reaction time differences across rules) eliminates the MVPA results.

Todd et al. 2013, NeuroImage


3 Multivariate Bayes


Multivariate Bayes

Mike West

SPM brings multivariate analyses into the conventional inference framework of hierarchical Bayesian models and their inversion.


Multivariate Bayes

Multivariate analyses in SPM rest on the central notion that inferences about how the brain represents things can be reduced to model comparison.

[Figure: a decoding model linking brain activity to some cause or consequence. Example comparison: sparse coding in orbitofrontal cortex vs. distributed coding in prefrontal cortex]


From encoding to decoding

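Encoding model (GLM): $A = X\beta$ and $Y = TA + G\gamma + \varepsilon$, so that $Y = TX\beta + G\gamma + \varepsilon$

Decoding model (MVB): $X = A\beta$ and $Y = TA + G\gamma + \varepsilon$, so that $TX = Y\beta - G\gamma\beta - \varepsilon\beta$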


Multivariate Bayes

Friston et al. 2008 NeuroImage


MVB: details

Encoding model: neuronal activity is a linear mixture of causes:
$A = X\beta$

The BOLD data matrix is a temporal convolution of underlying neuronal activity:
$Y = TA + G\gamma + \varepsilon$

Decoding model: mental state is a linear mixture of voxel-wise activity:
$X = A\beta$, thus $A = X\beta^{-1}$

Substitution into $Y$ gives:
$Y = TX\beta^{-1} + G\gamma + \varepsilon \;\Rightarrow\; Y\beta = TX + G\gamma\beta + \varepsilon\beta \;\Rightarrow\; TX = Y\beta - G\gamma\beta - \varepsilon\beta$

Importantly, we can remove the influence of confounds by premultiplying with the appropriate residual forming matrix $R$:
$RTX = RY\beta + \varsigma$
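The effect of a residual forming matrix can be illustrated with a short sketch. It assumes the usual definition $R = I - GG^{+}$, which projects onto the space orthogonal to the confound columns $G$; the helper name, the two confounds (a constant and a linear drift) and the toy signal are made up for the example, and the projection is computed by Gram-Schmidt rather than by forming $R$ explicitly.

```python
def project_out(confounds, vector):
    """Return R v, where R = I - G G^+ removes the span of the confound columns G."""
    # Orthonormalize the confound columns (Gram-Schmidt).
    basis = []
    for g in confounds:
        g = list(g)
        for q in basis:
            coef = sum(a * b for a, b in zip(q, g))
            g = [a - coef * b for a, b in zip(g, q)]
        norm = sum(a * a for a in g) ** 0.5
        if norm > 1e-12:  # skip linearly dependent columns
            basis.append([a / norm for a in g])
    # Subtract the projection of the vector onto each basis direction.
    residual = list(vector)
    for q in basis:
        coef = sum(a * b for a, b in zip(q, residual))
        residual = [a - coef * b for a, b in zip(residual, q)]
    return residual

n_scans = 6
constant = [1.0] * n_scans
linear_drift = [float(t) for t in range(n_scans)]
G = [constant, linear_drift]  # each entry is one confound column

signal = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
contaminated = [s + 3.0 + 0.5 * t for s, t in zip(signal, linear_drift)]

clean_from_contaminated = project_out(G, contaminated)
clean_from_signal = project_out(G, signal)
```

Both calls return the same vector: any contribution that lies in the span of the confounds is annihilated, which is why the confound term drops out of the decoding equation after premultiplication with $R$.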


MVB: details

To make inversion tractable: impose priors on $\beta$ (i.e., assumptions about how mental states are represented by voxel-wise activity).

Scientific inquiry through model selection: compare different priors to identify which neuronal representation of mental states is most plausible.


Specifying the prior for MVB

To make the ill-posed regression problem tractable, MVB uses a prior on voxel weights. Different priors reflect different anatomical and/or coding hypotheses.

For example:

[Figure: prior matrix with voxels as rows and patterns as columns]
• Voxel 2 is allowed to play a role.
• Voxel 3 is allowed to play a role, but only if its neighbours play similar roles.

Friston et al. 2008 NeuroImage


Example: decoding motion from visual cortex

MVB can be illustrated using SPM’s attention-to-motion example dataset.

This dataset is based on a simple block design. There are three experimental factors:
• photic – display shows random dots
• motion – dots are moving
• attention – subjects asked to pay attention

[Figure: design matrix with scans as rows and the columns photic, motion, attention, const. Annotation: During these scans, for example, subjects were passively viewing moving dots.]

Büchel & Friston 1999 Cerebral Cortex; Friston et al. 2008 NeuroImage


Multivariate Bayes in SPM

Step 1: After having specified and estimated a model, use the Results button.

Step 2: Select the contrast to be decoded.


Multivariate Bayes in SPM

Step 3: Pick a region of interest.


Multivariate Bayes in SPM

Step 4: Multivariate Bayes can be invoked from within the Multivariate section.

Step 5: Here, the region of interest is specified as a sphere around the cursor (anatomical hypothesis). The spatial prior implements a sparse coding hypothesis (coding hypothesis).


Multivariate Bayes in SPM

Step 6: Results can be displayed using the BMS button.


Model evidence and voxel weights

log BF = 3
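To put a log Bayes factor of 3 into perspective, the following sketch converts it to a Bayes factor and to a posterior model probability. It assumes a natural logarithm and equal prior probabilities for the two models being compared; the function name is made up for the example.

```python
import math

def bayes_factor_summary(log_bf):
    """Convert a natural-log Bayes factor into a Bayes factor and a posterior probability."""
    bayes_factor = math.exp(log_bf)
    # With equal prior model probabilities: p(m1 | y) = BF / (1 + BF).
    posterior_prob = 1.0 / (1.0 + math.exp(-log_bf))
    return bayes_factor, posterior_prob

bf, prob = bayes_factor_summary(3.0)
```

Under these assumptions a log Bayes factor of 3 corresponds to a Bayes factor of about 20 and a posterior probability of about 0.95 for the favoured model.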


Summary: research questions for MVB

How does the brain represent things? Evaluating competing coding hypotheses

Where does the brain represent things? Evaluating competing anatomical hypotheses


4 Generative embedding


Model-based classification

step 1 (modelling): measurements from an individual subject are explained by a subject-specific generative model

step 2 (embedding): subject representation in model-based feature space, e.g. the connection strengths A → B, A → C, B → B, B → C among regions A, B and C

step 3 (classification): classification model

step 4 (evaluation): discriminability of groups? (accuracy between 0 and 1)

step 5 (interpretation): jointly discriminative connection strengths?

Brodersen, Haiss, Ong, Jung, Tittgemeyer, Buhmann, Weber, Stephan (2011) NeuroImage; Brodersen, Schofield, Leff, Ong, Lomakina, Buhmann, Stephan (2011) PLoS Comput Biol


Model-based classification: model specification

[Figure: model with three regions in each hemisphere (L, R): medial geniculate body, Heschl’s gyrus (A1) and planum temporale, with stimulus input; anatomical regions of interest shown at y = –26 mm]


Model-based classification

[Figure: balanced accuracy (50% to 100%) for classifying patients vs. controls using activation-based, correlation-based and model-based features (feature sets labelled a, c, s, p, m, e, z, o, f, l, r); significance markers: n.s., n.s., *]


Generative embedding

[Figure: patients and controls plotted in voxel-based activity space (axes: Voxel 1, Voxel 2, Voxel 3; classification accuracy 75%) and, after generative embedding, in model-based parameter space (axes: Parameter 1, Parameter 2, Parameter 3; classification accuracy 98%)]


Model-based classification: interpretation

[Figure: model of MGB, HG (A1) and PT in the left (L) and right (R) hemisphere, with stimulus input]


Model-based classification: interpretation

[Figure: model of MGB, HG (A1) and PT in the left (L) and right (R) hemisphere, with stimulus input; connections are marked as highly discriminative, somewhat discriminative or not discriminative]


Model-based clustering

Deserno, Sterzer, Wüstenberg, Heinz, & Schlagenhauf (2012) J Neurosci

fMRI data acquired during working-memory task & modelled using a three-region DCM

[Figure: DCM with regions PC, dLPFC and VC, stimulus input and WM]

• 42 patients diagnosed with schizophrenia
• 41 healthy controls


Model-based clustering

[Figure panels: model selection, interpretation, validation]

Brodersen et al. 2014, NeuroImage: Clinical


Summary

Classification
• to assess whether a cognitive state is linked to patterns of activity
• to visualize the spatial deployment of discriminative activity

Multivariate Bayes
• to evaluate competing anatomical hypotheses
• to evaluate competing coding hypotheses

Generative embedding
• to assess whether groups differ in terms of model parameter estimates (connectivity)
• to generate mechanistic subgroup hypotheses


Thank you


Further reading

Classification
• Pereira, F., Mitchell, T., & Botvinick, M. (2009). Machine learning classifiers and fMRI: A tutorial overview. NeuroImage, 45(1, Supplement 1), S199-S209.
• O'Toole, A. J., Jiang, F., Abdi, H., Penard, N., Dunlop, J. P., & Parent, M. A. (2007). Theoretical, Statistical, and Practical Perspectives on Pattern-based Classification Approaches to the Analysis of Functional Neuroimaging Data. Journal of Cognitive Neuroscience, 19(11), 1735-1752.
• Haynes, J., & Rees, G. (2006). Decoding mental states from brain activity in humans. Nature Reviews Neuroscience, 7(7), 523-534.
• Norman, K. A., Polyn, S. M., Detre, G. J., & Haxby, J. V. (2006). Beyond mind-reading: multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences, 10(9), 424-30.
• Brodersen, K. H., Haiss, F., Ong, C., Jung, F., Tittgemeyer, M., Buhmann, J., Weber, B., et al. (2010). Model-based feature construction for multivariate decoding. NeuroImage (2010).
• Brodersen, K. H., Ong, C., Stephan, K. E., Buhmann, J. (2010). The balanced accuracy and its posterior distribution. ICPR, 3121-3124.

Multivariate Bayes
• Friston, K., Chu, C., Mourao-Miranda, J., Hulme, O., Rees, G., Penny, W., et al. (2008). Bayesian decoding of brain images. NeuroImage, 39(1), 181-205.


Why multivariate?

Multivariate approaches can exploit a sampling bias in voxelized images to reveal interesting activity on a subvoxel scale.

Boynton (2005) Nature Neuroscience


Why multivariate?

The last 10 years have seen a notable increase in decoding analyses in neuroimaging.

[Figure labels: PET, prediction]

Haxby et al. (2001) Science; Lautrup et al. (1994) Supercomputing in Brain Research


Spatial deployment of informative regions

Which brain regions are jointly informative of a cognitive state of interest?

Searchlight approach
A sphere is passed across the brain. At each location, the classifier is evaluated using only the voxels in the current sphere. Result: a map of t-scores.
Nandy & Cordes (2003) MRM; Kriegeskorte et al. (2006) PNAS

Whole-brain approach
A constrained classifier is trained on whole-brain data. Its voxel weights are related to their empirical null distributions using a permutation test. Result: a map of t-scores.
Mourao-Miranda et al. (2005) NeuroImage


Issues to be aware of (as researcher or reviewer)

Classification induces constraints on the experimental design.
• When estimating trial-wise Beta values, we need longer ITIs (typically 8 – 15 s). At the same time, we need many trials (typically 100+).
• Classes should be balanced. If they are imbalanced, we can resample the training set, constrain the classifier, or report the balanced accuracy.

Construction of examples
• Estimation of Beta images is the preferred approach.
• Covariates should be included in the trial-by-trial design matrix.

Temporal autocorrelation
• In trial-by-trial classification, exclude trials around the test trial from the training set.

Avoiding double-dipping
• Any feature selection and tuning of classifier settings should be carried out on the training set only.

Performance evaluation
• Do random-effects or mixed-effects inference.
• Correct for multiple tests.
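The recommendation on temporal autocorrelation can be sketched as a leave-one-out scheme that drops the neighbours of each test trial from the training set. The function name and the `gap` parameter (the number of excluded trials on either side of the test trial) are assumptions made for this illustration.

```python
def loo_splits_with_gap(n_trials, gap):
    """Leave-one-out splits that exclude trials within `gap` positions of the test trial."""
    for test in range(n_trials):
        train = [i for i in range(n_trials) if abs(i - test) > gap]
        yield train, test

# Ten trials, excluding one neighbour on either side of each test trial.
splits = list(loo_splits_with_gap(n_trials=10, gap=1))
```

A suitable value for `gap` depends on the inter-trial interval and on how far the haemodynamic response of one trial extends into the next.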


Performance evaluation

Evaluating the performance of a classification algorithm critically requires a measure of the degree to which unseen examples have been identified with their correct class labels.

The procedure of averaging across accuracies obtained on individual cross-validation folds is flawed in two ways. First, it does not allow for the derivation of a meaningful confidence interval. Second, it leads to an optimistic estimate when a biased classifier is tested on an imbalanced dataset.

Both problems can be overcome by replacing the conventional point estimate of accuracy by an estimate of the posterior distribution of the balanced accuracy.

Brodersen, Ong, Buhmann, Stephan (2010) ICPR
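The difference between accuracy and balanced accuracy shows up in a small sketch. The confusion-matrix counts are made up, the function names are assumptions, and the posterior is approximated by Monte Carlo sampling: with flat priors, the accuracies on positive and negative examples are taken to follow Beta(TP + 1, FN + 1) and Beta(TN + 1, FP + 1) distributions, and the balanced accuracy is their average.

```python
import random

def accuracy(tp, fn, tn, fp):
    """Fraction of all examples classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)

def balanced_accuracy(tp, fn, tn, fp):
    """Average of the accuracies obtained on each class separately."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))

def sample_balanced_accuracy_posterior(tp, fn, tn, fp, n_samples=10000, seed=0):
    """Monte Carlo samples from the posterior of the balanced accuracy (flat priors)."""
    rng = random.Random(seed)
    return [0.5 * (rng.betavariate(tp + 1, fn + 1) + rng.betavariate(tn + 1, fp + 1))
            for _ in range(n_samples)]

# A classifier that always answers "positive" on 90 positive and 10 negative examples.
tp, fn, tn, fp = 90, 0, 0, 10
plain = accuracy(tp, fn, tn, fp)              # 0.9
balanced = balanced_accuracy(tp, fn, tn, fp)  # 0.5
samples = sorted(sample_balanced_accuracy_posterior(tp, fn, tn, fp))
interval_95 = (samples[249], samples[9749])   # central 95% posterior interval
```

The plain accuracy rewards the biased classifier for the class imbalance, whereas the balanced accuracy stays at chance, and the posterior samples additionally provide an interval rather than a single point estimate.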


Pattern characterization

Example – decoding the identity of the person speaking to the subject in the scanner

Formisano et al. (2008) Science

[Figure: fingerprint plot over voxel 1, voxel 2, ... (one plot per class)]


Lessons from the Neyman-Pearson lemma

Is there a link between $X$ and $Y$?

To test for a statistical dependency between a contextual variable $X$ and the BOLD signal $Y$, we compare
• $H_0$: there is no dependency
• $H_1$: there is some dependency

Which statistical test?

1. define a test size $\alpha$ (the probability of falsely rejecting $H_0$, i.e., 1 − specificity),

2. choose the test with the highest power (the probability of correctly rejecting $H_0$, i.e., sensitivity).

The Neyman-Pearson lemma

The most powerful test of size $\alpha$ is to reject $H_0$ when the likelihood ratio $\Lambda$ exceeds a critical value $u$, with $u$ chosen such that $P(\Lambda \ge u \mid H_0) = \alpha$.

The null distribution of the likelihood ratio can be determined non-parametrically or under parametric assumptions.

This lemma underlies both classical statistics and Bayesian statistics (where $\Lambda$ is known as a Bayes factor).

Neyman & Pearson (1933) Phil Trans Roy Soc London


Lessons from the Neyman-Pearson lemma

In summary

1. Inference about how the brain represents things reduces to model comparison.

2. To establish that a link exists between some context $X$ and activity $Y$, the direction of the mapping is not important.

3. Testing the accuracy of a classifier is not based on the likelihood ratio $\Lambda$ and is therefore suboptimal.

Neyman & Pearson (1933) Phil Trans Roy Soc London; Kass & Raftery (1995) J Am Stat Assoc; Friston et al. (2009) NeuroImage


Temporal evolution of discriminability

Example – decoding which button the subject pressed

[Figure: classification accuracy over time in motor cortex and frontopolar cortex, with markers for decision and response]

Soon et al. (2008) Nature Neuroscience


Identification / inferring a representational space

Approach
1. estimation of an encoding model
2. nearest-neighbour classification or voting

Mitchell et al. (2008) Science


Reconstruction / optimal decoding

Approach
1. estimation of an encoding model
2. model inversion

Paninski et al. (2007) Progr Brain Res; Pillow et al. (2008) Nature; Miyawaki et al. (2009) Neuron


Recent MVB studies


Specifying the prior for MVB

1st level – spatial coding hypothesis $U$

2nd level – pattern covariance structure

[Figure: matrix $U$ with voxels as rows and patterns as columns, multiplied by pattern weights $\eta$]
• Voxel 2 is allowed to play a role.
• Voxel 3 is allowed to play a role, but only if its neighbours play weaker roles.


Inverting the model

Model inversion involves finding the posterior distribution over voxel weights $\beta$.

In MVB, this includes a greedy search for the optimal covariance structure that governs the prior over these weights.

Partition #1: $\Sigma = \lambda_1 \times [\text{subset}]$

Partition #2: $\Sigma = \lambda_1 \times [\text{subset}] + \lambda_2 \times [\text{subset}]$

Partition #3 (optimal): $\Sigma = \lambda_1 \times [\text{subset}] + \lambda_2 \times [\text{subset}] + \lambda_3 \times [\text{subset}]$

[Figure annotations: Pattern 8 is allowed to make some contribution, independently of the contributions of other patterns. Pattern 3 makes an important positive contribution / is activated.]


Observations vs. predictions

[Figure: observed vs. predicted values of $RXc$ (motion)]


MVB may outperform conventional point classifiers when using a more appropriate coding hypothesis.

Using MVB for point classification

Support Vector Machine


Model-based analyses by data representation

Structure-based analyses
Which anatomical structures allow us to separate patients and healthy controls?

Activation-based analyses
Which functional differences allow us to separate groups?

Model-based analyses
How do patterns of hidden quantities (e.g., connectivity among brain regions) differ between groups?


Model-based clustering

[Figure, supervised learning (SVM classification): balanced accuracy (0 to 1) for features based on regional activity, functional connectivity and effective connectivity; values shown: 55%, 62%, 78%]

[Figure, unsupervised learning (GMM clustering using effective connectivity): log model evidence (model selection) and balanced purity (model validation) plotted against 1 to 8 on the x-axis; best model: 71%]

Brodersen, Deserno, Schlagenhauf, Penny, Lin, Gupta, Buhmann, Stephan (in preparation)


Generative embedding and DCM

Question 1 – What do the data tell us about hidden processes in the brain? Compute the posterior.

Question 2 – Which model is best w.r.t. the observed fMRI data? Compute the model evidence.

Question 3 – Which model is best w.r.t. an external criterion, e.g. { patient, control }? Compute the classification accuracy.


Model-based classification using DCM

[Figure: structure-based classification, activation-based classification and model-based classification into { group 1, group 2 }, shown alongside model selection vs. inference on model parameters]