multitreegeneralized steppingstonesampling–anew...

48
Multitree Generalized Steppingstone Sampling – A New MCMC Method for Estimating the Marginal Likelihood of a Models Mark T. Holder, Paul O. Lewis, David L. Swofford, and David Bryant KU, UConn, Duke, U. Otago (NZ) Feb 16, 2013 – Austin, TX

Upload: others

Post on 10-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Multitree GeneralizedSteppingstone Sampling – A NewMCMC Method for Estimatingthe Marginal Likelihood of a

Models

Mark T. Holder, Paul O. Lewis, David L. Swofford, andDavid Bryant

KU, UConn, Duke, U. Otago (NZ)

Feb 16, 2013 – Austin, TX

Page 2: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Context

• We rarely know the “true” model to use for analyses.• Model-averaging methods can be difficult to implementand use.

• We can use the marginal likelihood to choose betweenmodels.

Page 3: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

The marginal likelihood is thedenominator of Bayes’ Rule

p(θ | D,M) =P(D | θ,M)p(θ |M)

P(D |M)

θ is a set of parameter values in the model M

D is the data

P(D | θ,M) is the likelihood of θ.

p(θ,M) is the prior of θ.

Page 4: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

The marginal likelihood assess the fit ofthe model to the data by considering allparameter values

P(D |M) =

∫P(D | θ,M)p(θ |M)dθ

Page 5: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

MCMC avoids the marginal like. calc.

p(θ∗|D,M)

p(θ | D,M)=

P(D|θ∗,M)p(θ∗|M)P(D|M)

P(D|θ,M)p(θ|M)P(D|M)

p(θ∗|D,M)

p(θ | D,M)=

P(D | θ∗,M)p(θ∗ |M)

P(D | θ,M)p(θ |M)

Page 6: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Bayesian model selection

Bayes Factor between two models:

B10 =P(D |M1)

P(D |M0)

We could estimate P(D|M1) by drawingpoints from the prior on θ and calculatingthe mean likelihood.

Page 7: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

−2 −1 0 1 2

010

2030

40

Sharp posterior (black) and prior (red)

x

dens

ity

Page 8: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Drawing from the prior will often miss any of the trees withhigh posterior density – massive sample sizes would beneeded.

Perhaps we can use samples from the posterior?

Page 9: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

We can use the harmonic mean of the likelihoods of MCMCsamples to estimate P(D|M1).

However. . .

Page 10: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

“The Harmonic Mean of the Likelihood:Worst Monte Carlo Method Ever”

A post on Dr. Radford Neal’s bloghttp://radfordneal.wordpress.com/2008/08/17/

the-harmonic-mean-of-the-likelihood-worst-monte-carlo-method-ever

“The total unsuitability of the harmonic mean estimatorshould have been apparent within an hour of its discovery.”

Page 11: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Harmonic mean estimator of themarginal likelihood

• appealing because it comes “for free” after we havesampled the posterior using MCMC,

• unfortunately the estimator can have a huge varianceassociated with it in some (very common) cases. Forexample if:• the vast majority of parameter space has very low

likelihood, and• a very small region has high likelihoods.

Page 12: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Importance sampling to approximate adifficult target distribution

1 Simulate points from an easy distribution.2 Reweight the points by the ratio of densities betweenthe easy and target distribution.

3 Treat the reweighted samples as draws from the targetdistribution.

Page 13: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

−3 −2 −1 0 1 2 3

0.0

0.5

1.0

1.5

2.0

2.5

Importance and target densities

x

dens

ity

Page 14: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

−3 −2 −1 0 1 2 3

0.0

0.5

1.0

1.5

2.0

2.5

Importance and target densities

x

dens

ity

−3 −2 −1 0 1 2 3

02

46

810

12

Importance weights

x

wei

ght

Page 15: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

−3 −2 −1 0 1 2 3

0.0

0.5

1.0

1.5

2.0

2.5

Importance and target densities

x

dens

ity

−3 −2 −1 0 1 2 3

02

46

810

12

Importance weights

x

wei

ght

−3 −2 −1 0 1 2 3

02

46

810

12

Samples from importance distribution

x

−3 −2 −1 0 1 2 3

02

46

810

12

Weighted samples

x

wei

ght

Page 16: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Importance sampling

The method works well if the importance distribution is:• fairly similar to the target distribution, and• not “too tight” to allow sampling the full range of thetarget distribution

Page 17: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

In phylogenetics our posterior distribution is too peakedand our prior is too vague to allow us to use them inimportance sampling:

−2 −1 0 1 2

010

2030

40

Sharp posterior (black) and prior (red)

x

dens

ity

Page 18: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Steppingstone sampling uses a series ofimportance sampling runs

−2 −1 0 1 2

010

2030

40

Steppingstone densities

x

dens

ity

Page 19: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Steppingstone sampling (Xie et al. 2011, Fan et al. 2011)blends two distributions:• the posterior, P(D | θ,M1)P(θ,M1)

• a tractable reference distribution, π(θ)

pβ(θ | D,M1) =[P(D | θ,M1)P(θ,M1)]

β [π(θ)](1−β)

p1(θ | D,M1) is the posterior.c1 is the marginal likelihood of the model.

p0(θ | D,M1) is the reference distribution.c0 is 1.

Page 20: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Steppingstone sampling (Xie et al. 2011, Fan et al. 2011)blends two distributions:• the posterior, P(D | θ,M1)P(θ,M1)

• a tractable reference distribution, π(θ)

pβ(θ | D,M1) =[P(D | θ,M1)P(θ,M1)]

β [π(θ)](1−β)

P(D |M1) =c1c0

=

(c1c0.38

)(c0.38

c0.1

)(c0.1c0.01

)(c0.01

c0

)=

(c1

���c0.38

)(���c0.38

��c0.1

)(��c0.1

���c0.01

)(���c0.01

c0

)

Page 21: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Run MCMC with different β values

−2 −1 0 1 2

010

2030

40

Steppingstone densities

x

dens

ity

Page 22: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

P(D |M) =(

P(D|M)c0.38

)(c0.38c0.1

)(c0.1c0.01

) (c0.01

1

)

Photo by Johan Nobel http://www.flickr.com/photos/43147325@N08/4326713557/ downloaded from Wikimedia

Page 23: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Reference distributions in Steppingstonesampling

In the original steppingstone (Xie et al 2011):

reference = prior

In generalized steppingstone (Fan et al 2011) it can be anydistribution:• Should be centered around the areas with highprobability,

• Must be a probability distribution with a knownnormalizing constant,

• Should be easy to draw sample from.

Page 24: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

The log Bayes Factor for a complex model compared to a simple(true) model. Estimated twice (by Fan et al, 2011)

Harmonic mean: Original Steppingstone:

●●

●●

●●

●●

●●

●●

● ●

●●

●●

● ●

●●

●●

● ●

● ●

●●

●●

●●

●●

●●

●●

● ●

−60 −40 −20 0 20

−60

−40

−20

020

Seed 1

Seed

2

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

−60 −40 −20 0 20

−60

−40

−20

020

Seed 1

Seed

2

●●

●●

●●

●●

●●

●●

●●●

●●

−60 −40 −20 0 20

−60

−40

−20

020

Seed 1

Seed

2

Page 25: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Original Steppingstone: Generalized Steppingstone:

●●

●●

●●

●●

●●

●●

●●●

●●

−60 −40 −20 0 20

−60

−40

−20

020

Seed 1

Seed

2

●●

●●

●●

●●

●●

●●

● ●

●●

●●

● ●

●●

●●

● ●

● ●

●●

●●

●●

●●

●●

●●

● ●

−60 −40 −20 0 20

−60

−40

−20

020

Seed 1

Seed

2

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

−60 −40 −20 0 20

−60

−40

−20

020

Seed 1

Seed

2

Figure from Fan et al., 2011

Page 26: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Steppingstone sampling when the tree isnot known

• generalized steppingstone assumes afixed tree,

• the original steppingstone can very slowwhen the tree is not known.

Page 27: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Tree-CenteredIndependent-Split-Probability (TCISP)distribution

Input: a tree with probabilities for eachsplit (from a pilot MCMC run).

Output: a probability distribution over alltree topologies.

Page 28: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

A

G

D

E

J L

HF

C

KI

0.9

0.8 0.6 0.5

0.4 0.8

0.3

0.9

Input: a focal treewith split probabilitiesto center the distribution

Page 29: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

A

G

D

E

J L

HF

C

KI

The tree we will drawwill display the blue splitsand will not display the red splits

Page 30: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

A G

D

EJ

LHF

CKI

Page 31: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

A

G

D

E

J LH

F

C

KI

One of the many resolutionswhich avoid the red splits

Page 32: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

A

G

D

E

J L

HF

C

KI

A

G

D

E

J LH

F

C

KI

Page 33: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Calculating a tree’s probability

A

G

D

E

J L

HF

C

KI

0.9

0.8 0.6 0.5

0.4 0.8

0.3

0.9

A

G

D

E

J LH

F

C

KI

focal tree tree to score

Page 34: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Calculating a tree’s probability

A

G

D

E

J L

HF

C

KI

A

G

D

E

J LH

F

C

KI

focal tree tree to score

Page 35: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Calculating a tree’s probability

A

G

D

E

J L

HF

C

KI

0.9

0.2 0.4 0.5

0.6 0.2

0.7

0.9

A

G

D

E

J LH

F

C

KI

focal tree tree to score

Page 36: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Calculating a tree’s probability

A

G

D

E

J L

HF

C

KI

0.9

0.2 0.4 0.5

0.6 0.2

0.7

0.9

A G

D

EJ

LHF

CKI

focal tree tree with shared splits

P = 0.2× 0.9× 0.4× 0.5× 0.6× 0.2× 0.7× 0.9

Page 37: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Counting trees the max. distance from atree

• Bryant and Steel(2009) provide anO(n5) algorithm.

• Bryant contributed an O(n2) algorithm.

Page 38: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Multitree steppingstone sampling• Has been validated on small data sets.• Minimum running time is unknown.• Appears to be a practical way to approximatea model’s marginal likelihood.

• The TCISP distribution could be useful inother context.

• The method assumes unrooted, fully resolvedtrees,

• implemented in phycas(http://www.phycas.org)

Holder, Lewis, Swofford, Bryant (in prep)

Page 39: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Marginal likelihoods• Harmonic mean estimator is not reliable• AIC is preferable to the harmonic meanestimator (Guy Baele et al MBE 2012)

• Thermodynamic Integration (aka PathSampling) is an accurate method (Lartillot andPhilippe, 2006; Rodrigue and Aris-Brousou)

• Model-jumping approaches avoid the need tocalculate marginal likelihoods.

• Steppingstone and/or Path Sampling areavailable in MrBayes, Beast, ∗Beast, Migrate,and other Bayesian software for evolutionaryanalyses

Page 40: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Thanks!

Thanks to the organizers, NSF, and to you(for listening to stats on a Saturdaymorning)

Page 41: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

This slide shows the math to demonstrate that we can usethe harmonic mean to estimate the marginal likelihood,P (D). Consider, a model that has a discrete parameter vthat can take two values (r and b). h in the harmonic meanof the likelihoods for parameters values sampled from theposterior distribution

h =1

P(v = r|D)[

1P(D|v=r)

]+ P(v = b|D)

[1

P(D|v=b)

]=

1[P(v=r)P(D|v=r)

P(D)

] [1

P(D|v=r)

]+[

P(v=b)P(D|v=b)P(D)

] [1

P(D|v=b)

]=

1[P(v=r)P(D)

]+[

P(v=b)P(D)

]=

P(D)

P(v = r) + P(v = b)

= P(D)

Page 42: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

The harmonic mean as an importancesampling estimator

The marginal likelihood is the expected value of thelikelihood over points drawn from the prior:

P(D) = Ep(θ)[P(D|θ)] =

∫P(D|θ)P(θ)dθ

In importance sampling, we draw parameter values θ froma different density, g(θ), but multiple the points by aweight, w:

Ep(θ)[P(D|θ)] =1n

∑ni=1 P(D|θi)wi

b

where b is a normalization constant.

Page 43: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

The appropriate weight is the ratio of the target densityand our importance density:

wi =P(θi)

g(θi)

And it turns out that the normalization constant is simplya sum of the importance weights:

b =1

n

n∑i=1

P(θi)

g(θi)

Ep(θ)[P(D|θ)] =

1n

∑ni=1

P(D|θi)P(θi)g(θi)

1n

∑ni=1

P(θi)g(θi)

Page 44: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Ep(θ)[P(D|θ)] =

1n

∑ni=1

P(D|θi)P(θi)g(θi)

1n

∑ni=1

P(θi)g(θi)

If the importance distribution, g(θ), is the prior then theimportance weights are all 1:

Ep(θ)[P(D|θ)] =

1n

∑ni=1

P(D|θi)P(θi)P(θi)

1n

∑ni=1

P(θi)P(θi)

=1

n

n∑i=1

P(D|θi)

Recall that the points, θ, are draw by sampling from theimportance distribution, so this sum of likelihoods isalready weighted by the prior.

Page 45: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Ep(θ)[P(D|θ)] =

1n

∑ni=1

P(D|θi)P(θi)g(θi)

1n

∑ni=1

P(θi)g(θi)

If the importance distribution, g(θ) is the posterior then:

Ep(θ)[P(D|θ)] =

1n

∑ni=1

P(D|θi)P(θi)P(D|θi)P(θi)

1n

∑ni=1

P(θi)P(D|θi)P(θi)

=1

1n

∑ni=1

P(θi)P(D|θi)P(θi)

=n∑n

i=11

P(D|θi)

This is the justification of the harmonic mean estimator ofthe marginal likelihood.

Page 46: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Lartillot and Philippe’s thermodynamicintegration

Like the steppingstone sampler, the thermodyanmicintegration method (or path sampling) use power posteriordensities with 0 ≤ β ≤ 1:

pβ(θ|D) =P(D|θ)βP(θ)

with c0 = 0 and c1 = P(D).Lartillot and Philippe showed that

∂ ln cβ∂β

= Epβ

[∂ ln

[P(D|θ)βP(θ)

]∂β

]Note that by definition of a definite integral:∫ 1

0

∂ ln cβ∂β

dβ = ln c1 − ln c0 = ln P(D)

Page 47: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

Thus,

ln P(D) =

∫ 1

0

Epβ

[∂ ln

[P(D|θ)βP(θ)

]∂β

]dβ

By differentiating we see that:

∂ ln[P(D|θ)βP(θ)

]∂β

=∂[ln P(θ) + β ln P(D|θ)]

∂β

= ln P(D|θ)

The integration is not analytically tractable, but we cancalculate Epβ [ln P(D|θ)] by conducting MCMC at the apower posterior distribution pβ and taking an average ofthe log-likelihoods.Then we can use standard numerical integration techniquesto estimate the integral.

Page 48: MultitreeGeneralized SteppingstoneSampling–ANew …phylo.bio.ku.edu/slides/HolderAustinSteppingstone.pdf · 2018-01-25 · MultitreeGeneralized SteppingstoneSampling–ANew MCMCMethodforEstimating

This integration is not analytically tractable, but we cancalculate Epβ [ln P(D|θ)] by conducting MCMC at the apower posterior distribution pβ and taking an average ofthe log-likelihoods.Then we can use standard numerical integration techniquesto estimate the integral.