TRANSCRIPT
[Slide 1]
An introduction to Bayesian inference and model comparison
J. Daunizeau
ICM, Paris, France · TNU, Zurich, Switzerland
[Slide 2]
Overview of the talk
✓ An introduction to probabilistic modelling
✓ Bayesian model comparison
✓ SPM applications
[Slide 4]
Probability theory: basics

Degrees of plausibility desiderata:
- should be represented using real numbers (D1)
- should conform with intuition (D2)
- should be consistent (D3)

• normalization: $\int p(x)\,dx = 1$
• marginalization: $p(x) = \int p(x,y)\,dy$
• conditioning (Bayes rule): $p(x|y) = \dfrac{p(x,y)}{p(y)}$
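These three rules are easiest to check on a small discrete example. A minimal sketch in Python; the 2×2 joint table is invented for illustration:

```python
import numpy as np

# Hypothetical joint distribution p(x, y) over two binary variables,
# laid out as a 2x2 table: rows index x, columns index y.
p_xy = np.array([[0.3, 0.1],
                 [0.2, 0.4]])

# Normalization: the joint must sum to one.
assert np.isclose(p_xy.sum(), 1.0)

# Marginalization: p(x) = sum_y p(x, y), and likewise for p(y).
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

# Conditioning (Bayes rule): p(x|y) = p(x, y) / p(y).
p_x_given_y = p_xy / p_y  # broadcasting divides each column by p(y)

print(p_x, p_y, p_x_given_y[:, 0])  # e.g., p(x | y=0)
```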
[Slide 5]
Deriving the likelihood function

- Model of data with unknown parameters: $y = f(\theta)$, e.g., GLM: $f(\theta) = X\theta$
- But data are noisy: $y = f(\theta) + \varepsilon$
- Assume noise/residuals are 'small': $p(\varepsilon) \propto \exp\!\left(-\tfrac{1}{2}\,\varepsilon^2/\sigma^2\right)$

[figure: Gaussian density of $\varepsilon$, annotated with the tail probability $P(\varepsilon > 4\sigma) \approx 0.05$; curve of $f(\theta)$ against $\theta$]

→ Distribution of the data, given fixed parameters (the likelihood):
$p(y|\theta) \propto \exp\!\left(-\tfrac{1}{2\sigma^2}\,\big(y - f(\theta)\big)^2\right)$
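Under these Gaussian-noise assumptions the GLM log-likelihood can be written down directly. A minimal sketch with an invented design matrix, parameters and noise level:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical GLM: y = X @ theta + eps, with eps ~ N(0, sigma^2 I).
n, p = 50, 2
X = rng.standard_normal((n, p))        # assumed design matrix
theta_true = np.array([1.0, -0.5])
sigma = 0.3
y = X @ theta_true + sigma * rng.standard_normal(n)

def log_likelihood(theta, y, X, sigma):
    """ln p(y | theta): sum of log N(y_i; (X theta)_i, sigma^2)."""
    resid = y - X @ theta
    return (-0.5 * np.sum(resid**2) / sigma**2
            - len(y) * np.log(sigma * np.sqrt(2 * np.pi)))

# The likelihood peaks near the true parameters.
print(log_likelihood(theta_true, y, X, sigma))
print(log_likelihood(np.zeros(2), y, X, sigma))  # worse fit, lower value
```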
[Slide 6]
Forward and inverse problems

forward problem: the likelihood $p(y|\theta, m)$ maps from model to data
inverse problem: the posterior distribution $p(\theta|y, m)$ maps from data back to model
[Slide 7]
Likelihood, priors and the model evidence

Likelihood: $p(y|\theta, m)$
Prior: $p(\theta|m)$
Bayes rule: $p(\theta|y, m) = \dfrac{p(y|\theta, m)\,p(\theta|m)}{p(y|m)}$

[figure: generative model $m$, linking parameters $\theta$ to data $y$]
[Slide 8]
Bayesian model comparison

Principle of parsimony: "plurality should not be assumed without necessity"

"Occam's razor": [figure: model evidence p(y|m) plotted over the space of all data sets; a flexible model y = f(x) spreads its evidence more thinly than a simple one]

Model evidence: $p(y|m) = \int p(y|\theta, m)\,p(\theta|m)\,d\theta$
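The Occam effect in the model evidence can be reproduced numerically: a model with a broad prior must spread p(y|m) over many possible data sets, so it loses to a simpler model when both fit the data. A sketch under assumed Gaussian likelihood and priors, with an invented observation:

```python
import numpy as np

# Invented observed data point and known noise level.
y, sigma = 1.0, 1.0

# Evidence p(y|m) = integral of p(y|theta) p(theta|m) dtheta, on a grid.
theta = np.linspace(-20, 20, 4001)
dtheta = theta[1] - theta[0]
lik = np.exp(-0.5 * (y - theta)**2 / sigma**2) / (sigma * np.sqrt(2 * np.pi))

def evidence(prior_sd):
    prior = (np.exp(-0.5 * theta**2 / prior_sd**2)
             / (prior_sd * np.sqrt(2 * np.pi)))
    return np.sum(lik * prior) * dtheta

# A 'simple' model (tight prior) vs a 'complex' one (broad prior):
print(evidence(1.0))    # higher evidence: y lies in the model's typical range
print(evidence(10.0))   # lower evidence: mass spread over many data sets
```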
[Slide 9]
Hierarchical models

[figure: a multi-level hierarchical model; causality runs down the hierarchy, inference runs back up]
[Slide 10]
Directed acyclic graphs (DAGs)
[Slide 11]
Variational approximations (VB, EM, ReML)
→ VB : maximize the free energy F(q) w.r.t. the approximate posterior q(θ) under some (e.g., mean field, Laplace) simplifying constraint
[figure: approximate marginal posteriors $q(\theta_1)$, $q(\theta_2)$ versus the true marginals $p(\theta_1|y,m)$, $p(\theta_2|y,m)$ and the joint posterior $p(\theta_1, \theta_2|y, m)$]

Free energy:
$F(q) = \big\langle \ln p(y, \theta|m) \big\rangle_q + S(q) = \ln p(y|m) - KL\big[q(\theta);\, p(\theta|y, m)\big]$
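For a conjugate Gaussian model both sides of this decomposition have closed forms, so the identity can be checked directly. A sketch with an assumed toy model (prior, likelihood and test values of q are all invented):

```python
import numpy as np

# Assumed toy model: theta ~ N(0, s0^2), y | theta ~ N(theta, sigma^2).
y, s0, sigma = 2.0, 1.0, 0.5

# Exact posterior and log evidence (standard conjugate Gaussian results).
sp2 = 1.0 / (1.0 / s0**2 + 1.0 / sigma**2)   # posterior variance
mp = sp2 * y / sigma**2                       # posterior mean
log_evidence = (-0.5 * np.log(2 * np.pi * (s0**2 + sigma**2))
                - y**2 / (2 * (s0**2 + sigma**2)))

def free_energy(mq, sq):
    """F(q) = <ln p(y, theta)>_q + S(q) for q = N(mq, sq^2)."""
    e_loglik = (-0.5 * np.log(2 * np.pi * sigma**2)
                - ((y - mq)**2 + sq**2) / (2 * sigma**2))
    e_logprior = (-0.5 * np.log(2 * np.pi * s0**2)
                  - (mq**2 + sq**2) / (2 * s0**2))
    entropy = 0.5 * np.log(2 * np.pi * np.e * sq**2)
    return e_loglik + e_logprior + entropy

def kl_to_posterior(mq, sq):
    """KL[ N(mq, sq^2) || N(mp, sp2) ]."""
    return (np.log(np.sqrt(sp2) / sq)
            + (sq**2 + (mq - mp)**2) / (2 * sp2) - 0.5)

# The identity F(q) = ln p(y|m) - KL[q; p(theta|y,m)] holds for any q ...
for mq, sq in [(0.0, 1.0), (mp, np.sqrt(sp2)), (1.0, 0.2)]:
    assert np.isclose(free_energy(mq, sq),
                      log_evidence - kl_to_posterior(mq, sq))

# ... and F(q) touches ln p(y|m) exactly when q is the true posterior.
print(free_energy(mp, np.sqrt(sp2)), log_evidence)
```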
[Slide 12]
Overview of the talk
✓ An introduction to probabilistic modelling
✓ Bayesian model comparison
✓ SPM applications
[Slide 13]
Frequentist versus Bayesian inference

classical (null) hypothesis testing:
• define the null, e.g.: $H_0: \theta = 0$
• estimate parameters (obtain the test statistic $t \equiv t(Y)$)
• apply the decision rule, i.e.: if $P(t > t^*|H_0) \leq \alpha$, then reject $H_0$

[figure: the null density $p(t|H_0)$, with the tail probability $P(t > t^*|H_0)$ shaded]

Bayesian model comparison:
• define two alternative models, e.g.: $m_0$: $p(\theta|m_0) = 1$ if $\theta = 0$, $0$ otherwise; $m_1$: $p(\theta|m_1) = N(0, \Sigma)$
• apply the decision rule, e.g.: if $\dfrac{P(m_0|y)}{P(m_1|y)} \geq \alpha$, then accept $m_0$

[figure: the model evidences $p(Y|m_0)$ and $p(Y|m_1)$ over the space of all datasets, with the observed data $y$ marked]
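For this particular pair of models the evidences have closed forms: with a scalar observation y = θ + ε and ε ~ N(0, σ²), the point null gives p(y|m0) = N(y; 0, σ²), while marginalizing θ ~ N(0, Σ) under m1 gives p(y|m1) = N(y; 0, σ² + Σ). A minimal sketch of the Bayesian decision rule (all numbers invented):

```python
import numpy as np

def gauss_pdf(y, var):
    return np.exp(-0.5 * y**2 / var) / np.sqrt(2 * np.pi * var)

# Assumed scalar setup: y = theta + eps, eps ~ N(0, sigma2).
sigma2, Sigma = 1.0, 4.0   # noise variance; prior variance under m1
y = 1.5                    # invented observation

# Model evidences: theta is fixed at 0 under m0; under m1 it
# marginalizes out, widening the predictive distribution.
p_y_m0 = gauss_pdf(y, sigma2)
p_y_m1 = gauss_pdf(y, sigma2 + Sigma)

# Posterior model probabilities with a flat model prior P(m0) = P(m1).
post_m0 = p_y_m0 / (p_y_m0 + p_y_m1)
print(post_m0)  # accept m0 if post_m0 / (1 - post_m0) exceeds the threshold
```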
[Slide 14]
Family-level inference

[figure: four candidate models of regions A and B, differing in their connections and in the driving input u; posterior probabilities P(m1|y) = 0.04, P(m2|y) = 0.25, P(m3|y) = 0.7, P(m4|y) = 0.01]

model selection error risk: $P(e|y) = 1 - \max_m P(m|y) = 0.3$
[Slide 15]
Family-level inference

[figure: the same four models, now partitioned into two families (e.g., models without vs with the input u); P(m1|y) = 0.04, P(m2|y) = 0.25, P(m3|y) = 0.7, P(m4|y) = 0.01]

model selection error risk: $P(e|y) = 1 - \max_m P(m|y) = 0.3$

family inference (pool statistical evidence): $P(f|y) = \sum_{m \in f} P(m|y)$

P(f1|y) = 0.05, P(f2|y) = 0.95

family selection error risk: $P(e|y) = 1 - \max_f P(f|y) = 0.05$
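Pooling is just a sum of posterior model probabilities within each family, after which the selection error risk applies at the family level. A sketch using the probabilities from the slide; the partition into families is the one suggested by the figure and is an assumption here:

```python
# Posterior model probabilities from the slide.
post = {"m1": 0.04, "m2": 0.25, "m3": 0.70, "m4": 0.01}

# Assumed partition into two families (e.g., without vs with input u).
families = {"f1": ["m1", "m4"], "f2": ["m2", "m3"]}

# Family posteriors: P(f|y) = sum of P(m|y) over the members of f.
post_fam = {f: sum(post[m] for m in ms) for f, ms in families.items()}
print(post_fam)  # {'f1': 0.05, 'f2': 0.95}

# Selection error risk: P(e|y) = 1 - max posterior.
print(1 - max(post.values()))      # 0.30 at the model level
print(1 - max(post_fam.values()))  # 0.05 at the family level
```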
[Slide 16]
Sampling subjects as marbles in an urn

$r$ = proportion of blue marbles in the urn

$m_i = 1$ → ith marble is blue; $m_i = 0$ → ith marble is purple

→ (binomial) probability of drawing a set of n marbles:
$p(m|r) = \prod_{i=1}^{n} r^{m_i}\,(1-r)^{1-m_i}$

Thus, our belief about the proportion of blue marbles is:
$p(r|m) \propto p(r)\,\prod_{i=1}^{n} r^{m_i}\,(1-r)^{1-m_i}$, with posterior mean $E[r|m] = \dfrac{1}{n+2}\Big(\sum_i m_i + 1\Big)$ under a flat prior $p(r)$.

[figure: urn with proportion r of blue marbles; drawn marbles m1, m2, …, mn]
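With a flat prior on r this posterior is a Beta distribution, which gives the closed-form mean above. A quick check by sampling; the draws are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented draws: 1 = blue marble, 0 = purple marble.
m = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])
n, k = len(m), int(m.sum())

# Flat prior p(r) = Beta(1, 1); the binomial likelihood gives the
# conjugate posterior p(r|m) = Beta(k + 1, n - k + 1).
a, b = k + 1, n - k + 1
posterior_mean = a / (a + b)   # = (sum(m) + 1) / (n + 2)
print(posterior_mean)          # 8/12 ≈ 0.667 for these draws

# Posterior samples, e.g. for a credible interval on r.
samples = rng.beta(a, b, size=100_000)
print(np.quantile(samples, [0.025, 0.975]))
```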
[Slide 17]
Group-level model comparison

At least, we can measure how likely the ith subject's data are under each model!

$p(r, m|y) \propto p(r)\,\prod_{i=1}^{n} p(y_i|m_i)\,p(m_i|r)$

[figure: hierarchical model — r generates subject-wise model labels m1, m2, …, mn, which generate data y1, y2, …, yn with likelihoods p(y1|m1), p(y2|m2), …, p(yn|mn)]

Our belief about the proportion of models is:
$p(r|y) = \sum_m p(r, m|y)$

Exceedance probability: $\varphi_k = P\big(r_k > r_{k'},\ \forall k' \neq k \,\big|\, y\big)$
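Given a posterior over the model frequencies r, exceedance probabilities are easy to estimate by Monte Carlo. A sketch assuming the posterior is a Dirichlet (as in the random-effects BMS scheme implemented in SPM), with invented concentration parameters:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed Dirichlet posterior over model frequencies r = (r1, r2, r3),
# with invented concentration parameters (prior + data counts).
alpha = np.array([8.0, 3.5, 1.5])

# Monte Carlo estimate of the exceedance probabilities:
# phi_k = P(r_k > r_k' for all k' != k | y).
samples = rng.dirichlet(alpha, size=200_000)
phi = (np.bincount(samples.argmax(axis=1), minlength=len(alpha))
       / len(samples))
print(phi)  # phi sums to 1; the first model dominates here
```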
[Slide 18]
Overview of the talk
✓ An introduction to probabilistic modelling
✓ Bayesian model comparison
✓ SPM applications
[Slide 19]
[figure: the SPM analysis pipeline — realignment, smoothing, normalisation (against a template), general linear model, Gaussian field theory, statistical inference at p < 0.05 — together with its Bayesian components: segmentation and normalisation, posterior probability maps (PPMs), dynamic causal modelling, multivariate decoding]
[Slide 20]
aMRI segmentation: mixture of Gaussians (MoG) model

- $y_i$: ith voxel value
- $c_i$: ith voxel label (grey matter, white matter, CSF)
- $\lambda$: class frequencies
- $\mu_1, \mu_2, \ldots, \mu_k$: class means
- $\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2$: class variances
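Inference in this model amounts to computing, per voxel, the posterior probability of each tissue class. A sketch of that step under assumed (fixed) class parameters; all values are invented:

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed class parameters for K = 3 tissue classes (invented values):
lam = np.array([0.3, 0.5, 0.2])    # class frequencies lambda
mu = np.array([0.2, 0.6, 0.9])     # class means
sd = np.array([0.05, 0.08, 0.05])  # class standard deviations

y = rng.uniform(0, 1, size=5)      # invented voxel intensities

# Posterior over labels: p(c_i = k | y_i) ∝ lambda_k N(y_i; mu_k, sd_k^2).
lik = (np.exp(-0.5 * ((y[:, None] - mu) / sd)**2)
       / (sd * np.sqrt(2 * np.pi)))
resp = lam * lik
resp /= resp.sum(axis=1, keepdims=True)
print(resp)  # each row: posterior class probabilities for one voxel
```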
[Slide 21]
Decoding of brain images: recognizing brain states from fMRI

[figure: paradigm — a fixation cross (+) versus a pace-response cue (>>)]

log-evidence of X-Y sparse mappings: effect of lateralization
log-evidence of X-Y bilateral mappings: effect of spatial deployment
[Slide 22]
fMRI time series analysis: spatial priors and model comparison

[figure: hierarchical generative model of the fMRI time series — GLM coefficients, prior variance of the GLM coefficients, prior variance of the data noise, AR coefficients (correlated noise) — fitted under a short-term memory design matrix (X) and a long-term memory design matrix (X)]

PPM: regions best explained by the short-term memory model
PPM: regions best explained by the long-term memory model
[Slide 23]
Dynamic Causal Modelling: network structure identification

[figure: four candidate models m1–m4, each connecting V1, V5 and PPC, driven by the stimulus (stim) and modulated by attention; bar plot of the marginal likelihoods $\ln p(y|m)$ favouring m4; estimated effective synaptic strengths for the best model (m4): 1.25, 0.46, 0.39, 0.26, 0.26, 0.13, 0.10]
[Slide 24]
SPM: frequentist vs Bayesian RFX analysis

[figure: subjects' parameter estimates in four scenarios testing θ = 0, contrasting the classical verdict (p < 0.05 or p > 0.05) with the Bayesian model comparison M0 versus M1]
[Slide 25]
I thank you for your attention.
[Slide 26]
A note on statistical significance: lessons from the Neyman-Pearson lemma

• Neyman-Pearson lemma: the likelihood ratio (or Bayes factor) test

$\Lambda = \dfrac{p(y|H_1)}{p(y|H_0)} \geq u$

is the most powerful test of size $\alpha = P(\Lambda \geq u \,|\, H_0)$ to test the null.

• What is the threshold u above which the Bayes factor test yields a type I error rate of 5%?

[figure: ROC analysis (type I error rate vs 1 − type II error rate) — MVB (Bayes factor): u = 1.09, power = 56%; CCA (F-statistics): F = 2.20, power = 20%]
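That question can be answered by simulating the test statistic under the null: draw data from H0, compute the likelihood ratio, and take its 95th percentile as u. A sketch for a simple Gaussian pair of hypotheses; the means and variance are assumptions, not the slide's MVB/CCA setting:

```python
import numpy as np

rng = np.random.default_rng(4)

def gauss_pdf(y, mean, var):
    return np.exp(-0.5 * (y - mean)**2 / var) / np.sqrt(2 * np.pi * var)

# Assumed simple hypotheses: H0: y ~ N(0, 1), H1: y ~ N(1, 1).
def lam(y):
    """Likelihood ratio Lambda = p(y|H1) / p(y|H0)."""
    return gauss_pdf(y, 1.0, 1.0) / gauss_pdf(y, 0.0, 1.0)

# Null distribution of Lambda; threshold u for a 5% type I error rate.
y0 = rng.normal(0.0, 1.0, size=200_000)
u = np.quantile(lam(y0), 0.95)

# Power of the resulting test: P(Lambda >= u | H1).
y1 = rng.normal(1.0, 1.0, size=200_000)
print(u, np.mean(lam(y1) >= u))
```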