Paul O. Lewis (2015 Woods Hole Molecular Evolution Workshop) 1 An Introduction to Bayesian Phylogenetics Paul O. Lewis Department of Ecology & Evolutionary Biology 22 July 2015 Workshop on Molecular Evolution Woods Hole, Mass.


An Introduction to Bayesian Phylogenetics

Paul O. Lewis Department of Ecology & Evolutionary Biology

22 July 2015

Workshop on Molecular Evolution Woods Hole, Mass.


An Introduction to Bayesian Phylogenetics

• Bayesian inference in general
• Markov chain Monte Carlo (MCMC)
• Bayesian phylogenetics
• Prior distributions
• Bayesian model selection


I. Bayesian inference in general


Joint probabilities

B = Black, W = White, S = Solid, D = Dotted


Conditional probabilities


Bayes’ rule


Probability of "Dotted"

Bayes' rule (cont.)

Pr(D) is the marginal probability of being dotted. To compute it, we marginalize over colors:

$$\Pr(B|D) = \frac{\Pr(B)\,\Pr(D|B)}{\Pr(D)} = \frac{\Pr(D,B)}{\Pr(D,B) + \Pr(D,W)}$$


Bayes' rule (cont.)

It is easy to see that Pr(D) serves as a normalization constant, ensuring that Pr(B|D) + Pr(W|D) = 1.0

Joint probabilities

         B          W
D     Pr(D,B)    Pr(D,W)
S     Pr(S,B)    Pr(S,W)

Marginalizing over colors

         B          W
D     Pr(D,B)    Pr(D,W)
S     Pr(S,B)    Pr(S,W)

Marginal probability of being a dotted marble is the sum of all joint probabilities involving dotted marbles

Marginal probabilities

Pr(D) = Pr(D,B) + Pr(D,W) = marginal probability of being dotted

Pr(S) = Pr(S,B) + Pr(S,W) = marginal probability of being solid

Marginalizing over "dottedness"

         B          W
D     Pr(D,B)    Pr(D,W)
S     Pr(S,B)    Pr(S,W)

Marginal probability of being a white marble: Pr(W) = Pr(D,W) + Pr(S,W)


Bayes' rule (cont.)

Bayes' rule in Statistics

$$\Pr(\theta|D) = \frac{\Pr(D|\theta)\,\Pr(\theta)}{\sum_{\theta} \Pr(D|\theta)\,\Pr(\theta)}$$

D refers to the "observables" (i.e. the data)

θ refers to one or more "unobservables" (i.e. parameters of a model, or the model itself):

– tree model (i.e. tree topology)
– substitution model (e.g. JC, F84, GTR, etc.)
– parameter of a substitution model (e.g. a branch length, a base frequency, transition/transversion rate ratio, etc.)
– hypothesis (i.e. a special case of a model)
– a latent variable (e.g. ancestral state)

Bayes' rule in statistics

$$\Pr(\theta|D) = \frac{\Pr(D|\theta)\,\Pr(\theta)}{\sum_{\theta} \Pr(D|\theta)\,\Pr(\theta)}$$

• Pr(θ|D): posterior probability of hypothesis θ
• Pr(D|θ): likelihood of hypothesis θ
• Pr(θ): prior probability of hypothesis θ
• Denominator: marginal probability of the data (marginalizing over hypotheses)


Simple (albeit silly) paternity example

Possibilities         θ1     θ2     Row sum
Genotypes             AA     Aa     ---
Prior                 1/2    1/2    1
Likelihood            1      1/2    ---
Prior × Likelihood    1/2    1/4    3/4
Posterior             2/3    1/3    1

θ1 and θ2 are assumed to be the only possible fathers. The child has genotype Aa and the mother has genotype aa, so the child must have received allele A from the true father. Note: the data in this case is the child’s genotype (Aa).
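The arithmetic in the paternity table can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `posterior` and the variable names are made up here, and exact fractions are used so the results match the table entries.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Apply Bayes' rule to a discrete set of hypotheses (hypothetical helper)."""
    # Prior x likelihood for each hypothesis
    products = [p * l for p, l in zip(priors, likelihoods)]
    # Marginal probability of the data: the row sum of prior x likelihood
    marginal = sum(products)
    # Dividing by the marginal makes the posteriors sum to 1
    return [x / marginal for x in products]


# Two candidate fathers: theta1 has genotype AA, theta2 has genotype Aa
priors = [Fraction(1, 2), Fraction(1, 2)]
# Probability that each candidate passes allele A to the child
likelihoods = [Fraction(1, 1), Fraction(1, 2)]

paternity_posterior = posterior(priors, likelihoods)
print(paternity_posterior)  # [Fraction(2, 3), Fraction(1, 3)]
```

The marginal (3/4) is the "Row sum" entry of the Prior × Likelihood row.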

The prior can be your friend

Suppose the test for a rare disease is 99% accurate:

Pr(+|disease) = 0.99
Pr(+|healthy) = 0.01

In each expression, the test result (+) is the datum and the condition (disease or healthy) is the hypothesis. (Note that we do not need to consider the case of a negative test result.)

Suppose further I test positive for the disease. How worried should I be?

It is very tempting to (mis)interpret the likelihood as a posterior probability and conclude that there is a 99% chance that I have the disease.

We want to know Pr(disease|+), not Pr(+|disease)

The prior can be your friend

The posterior probability is 0.99 only if the prior probability of having the disease is 0.5:

$$\Pr(\text{disease}|+) = \frac{\Pr(+|\text{disease})\left(\frac{1}{2}\right)}{\Pr(+|\text{disease})\left(\frac{1}{2}\right) + \Pr(+|\text{healthy})\left(\frac{1}{2}\right)} = \frac{(0.99)\left(\frac{1}{2}\right)}{(0.99)\left(\frac{1}{2}\right) + (0.01)\left(\frac{1}{2}\right)} = 0.99$$

If, however, the prior odds against having the disease are 1 million to 1, then the posterior probability is much more reassuring:

$$\Pr(\text{disease}|+) = \frac{(0.99)\left(\frac{1}{1000000}\right)}{(0.99)\left(\frac{1}{1000000}\right) + (0.01)\left(\frac{999999}{1000000}\right)} \approx 0.0001$$
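The effect of the prior on the rare-disease calculation is easy to check numerically. The sketch below is illustrative only; the function name `posterior_disease` and its parameter names are assumptions introduced here, while the numbers (0.99, 0.01, and the two priors) come from the slides.

```python
def posterior_disease(prior_disease, p_pos_given_disease=0.99, p_pos_given_healthy=0.01):
    """Pr(disease | +) from Bayes' rule with two hypotheses (hypothetical helper)."""
    numerator = p_pos_given_disease * prior_disease
    # Marginal probability of a positive test, summing over both hypotheses
    marginal = numerator + p_pos_given_healthy * (1.0 - prior_disease)
    return numerator / marginal


# Prior probability of disease is 1/2
even_prior = posterior_disease(0.5)
# Prior probability of disease is 1 in a million, as on the slide
rare_prior = posterior_disease(1.0 / 1_000_000)

print(round(even_prior, 4))
print(round(rare_prior, 6))
```

Only the prior changes between the two calls; the likelihoods are identical.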

An important caveat

This (rare disease) example involves a tiny amount of data (one observation) and an extremely informative prior, and gives the impression that maximum likelihood (ML) inference is not very reliable.

However, in phylogenetics, we often have lots of data and use much less informative priors, so in phylogenetics ML inference is generally very reliable.


Discrete vs. Continuous

• So far, we've been dealing with discrete hypotheses (e.g. either this father or that father, have disease or don’t have disease)

• In phylogenetics, substitution models represent an infinite number of hypotheses (each combination of parameter values is in some sense a separate hypothesis)

• How do we use Bayes' rule when our hypotheses form a continuum?

Bayes’ rule: continuous case

$$f(\theta|D) = \frac{f(D|\theta)\,f(\theta)}{\int f(D|\theta)\,f(\theta)\,d\theta}$$

• f(θ|D): posterior probability density
• f(D|θ): likelihood
• f(θ): prior probability density
• Denominator: marginal probability of the data

If you had to guess...

Not knowing anything about my archery abilities, draw a curve representing your view of the chances of my arrow landing a distance d from the center of the target (if it helps, I'm standing 50 meters away from the target)

[Figure: archery target labeled "1 meter", above a horizontal axis for d running from 0.0 to ∞]

Case 1: assume I have talent

[Figure: prior density curve; horizontal axis is distance in centimeters from target center (ticks at 0.0, 20.0, 40.0, 60.0, 80.0); target labeled "1 meter"]

An informative prior (low variance) that says most of my arrows will fall within 20 cm of the center (thanks for your confidence!)

Case 2: assume I have a talent for missing the target!

[Figure: prior density curve; horizontal axis is distance in cm from target center (ticks at 0.0, 20.0, 40.0, 60.0); target labeled "1 meter"]

Also an informative prior, but one that says most of my arrows will fall within a narrow range just outside the entire target!

Case 3: assume I have no talent

[Figure: prior density curve; horizontal axis is distance in cm from target center (ticks at 0.0, 20.0, 40.0, 60.0, 80.0); target labeled "1 meter"]

This is a vague prior: its high variance reflects nearly total ignorance of my abilities, saying that my arrows could land nearly anywhere!

A matter of scale

Notice that I haven't provided a scale for the vertical axis.

What exactly does the height of this curve mean?

For example, does the height of the dotted line represent the probability that my arrow lands 60 cm from the center of the target?

No.

Probabilities are associated with intervals

Probabilities are attached to intervals (i.e. ranges of values), not individual values

The probability of any given point (e.g. d = 60.0) is zero!

However, we can ask about the probability that d falls in a particular range, e.g. 50.0 < d < 65.0


Probabilities vs. probability densities

Probability density function

Note: the height of this curve does not represent a probability (if it did, it would not exceed 1.0)

Densities of various substances

Substance    Density (g/cm³)
Cork         0.24
Aluminum     2.7
Gold         19.3

Density does not equal mass: mass = density × volume

Note: volume is appropriate for objects of dimension 3 or higher. For 2 dimensions, area takes the place of volume. For 1 dimension, linear distance replaces volume.

Density varies

[Figure: a bar with gold on one end and aluminum on the other; density is plotted against distance from the left end]


Integration of densities

The density curve is scaled so that the value of this integral (i.e. the total area) equals 1.0

Integration of a probability density yields a probability

Area under the density curve from 0 to 2 is the probability that θ is less than 2
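A short numerical sketch can make the density-versus-probability distinction concrete. The slide does not say which density is plotted, so the exponential density with rate 2 used below is purely an assumption for illustration, as are the function names.

```python
import math


def density(theta, rate=2.0):
    """Exponential probability density (an assumed example, not the slide's curve)."""
    return rate * math.exp(-rate * theta)


def integrate(f, lower, upper, steps=100_000):
    """Midpoint-rule approximation of the area under f between lower and upper."""
    width = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * width) for i in range(steps)) * width


# The height of a density can exceed 1.0, so it cannot be a probability
height_at_zero = density(0.0)

# The area from 0 to 2 is the probability that theta is less than 2
prob_below_2 = integrate(density, 0.0, 2.0)

# The total area is 1.0 (the upper limit 50 stands in for infinity)
total_area = integrate(density, 0.0, 50.0)

print(height_at_zero, round(prob_below_2, 4), round(total_area, 4))
```

Here the curve is 2.0 units high at θ = 0, yet every area under it is a valid probability between 0 and 1.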

Archery Priors Revisited

[Figure: three gamma density curves plotted over 0 to 60 on the horizontal axis and 0.00 to 0.30 on the vertical axis, labeled mean=1.732 variance=3; mean=60 variance=3; mean=200 variance=40000]

These density curves are all variations of a gamma probability distribution.

We could have used a gamma distribution to specify each of the prior probability distributions for the archery example.

Note that higher variance means less informative
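For readers who want to reproduce curves like these, the mean and variance of a gamma distribution determine its shape and scale parameters (mean = shape × scale, variance = shape × scale²). The sketch below assumes the shape/scale parameterization; the slide itself only gives means and variances, and the dictionary labels are informal names chosen here.

```python
def gamma_shape_scale(mean, variance):
    """Convert a gamma distribution's mean and variance to (shape, scale)."""
    shape = mean * mean / variance
    scale = variance / mean
    return shape, scale


# Means and variances taken from the figure labels
archery_priors = {
    "mean=1.732, variance=3": (1.732, 3.0),
    "mean=60, variance=3": (60.0, 3.0),
    "mean=200, variance=40000": (200.0, 40000.0),
}

gamma_parameters = {
    label: gamma_shape_scale(mean, variance)
    for label, (mean, variance) in archery_priors.items()
}

for label, (shape, scale) in gamma_parameters.items():
    print(f"{label}: shape={shape:.3f}, scale={scale:.3f}")
```

Under this parameterization, the first and third curves both have shape close to 1 (an exponential-like curve) but very different scales, while the second has a large shape and is tightly concentrated around 60.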

Coin-flipping

y = observed number of heads
n = number of flips (sample size)
p = (unobserved) proportion of heads

Note that the same formula serves as both the:
• probability of y (if p is fixed)
• likelihood of p (if y is fixed)
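The formula itself did not survive in this transcript; the sketch below assumes it is the usual binomial probability, C(n, y) p^y (1 − p)^(n − y). The data values (10 flips, 6 heads) and the grid of p values are hypothetical and chosen only to show the two readings of the same function.

```python
from math import comb


def binomial(y, n, p):
    """Binomial formula: a probability of y for fixed p, a likelihood of p for fixed y."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


# Reading 1: p is fixed, y varies. These are probabilities and sum to 1.
probabilities = [binomial(y, 10, 0.5) for y in range(11)]

# Reading 2: y is fixed (6 heads in 10 flips), p varies. These are likelihoods;
# they are not probabilities of p and need not sum to 1.
likelihoods = {p: binomial(6, 10, p) for p in (0.2, 0.4, 0.6, 0.8)}
best_p = max(likelihoods, key=likelihoods.get)

print(sum(probabilities))
print(likelihoods, best_p)
```

Among the four candidate values, the likelihood is highest at p = 0.6, the observed proportion of heads.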

The posterior is (almost always) more informative than the prior

[Figure: uniform prior density and posterior density plotted against p; the shaded area under the posterior density is a posterior probability (mass)]

Beta(2,2) prior is vague but not flat

[Figure: Beta(2,2) prior density and posterior density]

Posterior probability of p between 0.45 and 0.55 is 0.223
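A posterior interval probability of this kind can be computed by integrating the posterior density. The transcript does not give the coin-flipping data behind the 0.223 figure, so the sketch below uses hypothetical data (6 heads in 10 flips) and does not attempt to reproduce that number. It relies on the standard result that a Beta(a, b) prior combined with binomial data gives a Beta(a + y, b + n − y) posterior.

```python
import math


def beta_density(p, a, b):
    """Beta(a, b) probability density at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))


def interval_probability(a, b, lower, upper, steps=10_000):
    """Midpoint-rule integral of the Beta(a, b) density between lower and upper."""
    width = (upper - lower) / steps
    return sum(beta_density(lower + (i + 0.5) * width, a, b) for i in range(steps)) * width


prior_a, prior_b = 2, 2   # the Beta(2,2) prior from the slide
heads, flips = 6, 10      # hypothetical data, not taken from the slides

# Conjugate update: add heads to a and tails to b
post_a = prior_a + heads
post_b = prior_b + (flips - heads)

prior_mass = interval_probability(prior_a, prior_b, 0.45, 0.55)
posterior_mass = interval_probability(post_a, post_b, 0.45, 0.55)

print(round(prior_mass, 4), round(posterior_mass, 4))
```

With these made-up data the posterior puts more mass on the interval (0.45, 0.55) than the prior does, which is the sense in which the posterior is more informative.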

Usually there are many parameters...

A 2-parameter example:

$$f(\theta,\phi|D) = \frac{f(D|\theta,\phi)\,f(\theta)\,f(\phi)}{\int_{\theta}\int_{\phi} f(D|\theta,\phi)\,f(\theta)\,f(\phi)\,d\theta\,d\phi}$$

• f(θ,φ|D): posterior probability density
• f(D|θ,φ): likelihood
• f(θ) f(φ): prior probability density
• Denominator: marginal probability of data

An analysis of 100 sequences under the simplest model (JC69) requires 197 branch length parameters. The denominator is a 197-fold integral in this case!

Now consider summing over all possible tree topologies! It would thus be nice to avoid having to calculate the marginal probability of the data...


II. Markov chain Monte Carlo (MCMC)

Markov chain Monte Carlo (MCMC)

For more complex problems, we might settle for a good approximation to the posterior distribution


MCMC robot’s rules

Uphill steps are always accepted

Slightly downhill steps are usually accepted

Drastic “off the cliff” downhill steps are almost never accepted

With these rules, it is easy to see why the robot tends to stay near the tops of hills

(Actual) MCMC robot rules

Uphill steps are always accepted because R > 1
Currently at 1.0 m, proposed at 2.3 m: R = 2.3/1.0 = 2.3

Slightly downhill steps are usually accepted because R is near 1
Currently at 6.2 m, proposed at 5.7 m: R = 5.7/6.2 = 0.92

Drastic “off the cliff” downhill steps are almost never accepted because R is near 0
Currently at 6.2 m, proposed at 0.2 m: R = 0.2/6.2 = 0.03

The robot takes a step if it draws a Uniform(0,1) random deviate that is less than or equal to R
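The acceptance rule can be sketched in a few lines of Python. The function names and the random seed are assumptions introduced for illustration; the elevations are the three examples from the slide.

```python
import random


def acceptance_ratio(current_height, proposed_height):
    """R = proposed elevation divided by current elevation."""
    return proposed_height / current_height


def accept(current_height, proposed_height, rng):
    """Take the step if a Uniform(0,1) deviate is less than or equal to R."""
    return rng.random() <= acceptance_ratio(current_height, proposed_height)


rng = random.Random(1)  # fixed seed so the sketch is repeatable

uphill = acceptance_ratio(1.0, 2.3)           # R > 1: always accepted
slightly_downhill = acceptance_ratio(6.2, 5.7)  # R near 1: usually accepted
off_the_cliff = acceptance_ratio(6.2, 0.2)      # R near 0: almost never accepted

# Count how often the drastic downhill step is accepted in 10,000 attempts
cliff_accepts = sum(accept(6.2, 0.2, rng) for _ in range(10_000))

print(round(uphill, 2), round(slightly_downhill, 2), round(off_the_cliff, 2))
print(cliff_accepts / 10_000)
```

Because a Uniform(0,1) deviate is never greater than 1, any R of 1 or more guarantees acceptance, and the long-run acceptance frequency of a downhill step is approximately R itself.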

Cancellation of marginal likelihood

When calculating the ratio R of posterior densities, the marginal probability of the data cancels.

$$\underbrace{\frac{f(\theta^*|D)}{f(\theta|D)}}_{\text{posterior odds}} = \frac{\dfrac{f(D|\theta^*)\,f(\theta^*)}{f(D)}}{\dfrac{f(D|\theta)\,f(\theta)}{f(D)}} = \underbrace{\frac{f(D|\theta^*)}{f(D|\theta)}}_{\text{likelihood ratio}} \; \underbrace{\frac{f(\theta^*)}{f(\theta)}}_{\text{prior odds}}$$

Target vs. Proposal Distributions

Pretend this proposal distribution allows good mixing. What does good mixing mean?

Trace plots

[Figure: trace plot of file default2.TXT; horizontal axis is State (0 to 17500), vertical axis is log(posterior) (-10 to 0)]

“White noise” appearance is a sign of good mixing

I used the program Tracer to create this plot: http://tree.bio.ed.ac.uk/software/tracer/

AWTY (Are We There Yet?) is useful for investigating convergence:

http://king2.scs.fsu.edu/CEBProjects/awty/awty_start.php

https://github.com/danlwarren/RWTY


Target vs. Proposal Distributions

Proposal distributions with smaller variance...

Disadvantage: robot takes smaller steps, more time required to explore the same area

Advantage: robot seldom refuses to take proposed steps

If step size is too small, large-scale trends will be apparent

[Figure: trace plot of file smallsteps.TXT; horizontal axis is State (0 to 17500), vertical axis is log(posterior) (-6 to 0)]


Target vs. Proposal Distributions

Proposal distributions with larger variance...

Disadvantage: robot often proposes a step that would take it off a cliff, and refuses to move

Advantage: robot can potentially cover a lot of ground quickly

Chain is spending long periods of time “stuck” in one place

[Figure: trace plot of file bigsteps2.TXT; horizontal axis is State (0 to 17500), vertical axis is log(posterior) (-12 to -2)]

“Stuck” robot is indicative of step sizes that are too large (most proposed steps would take the robot “off the cliff”)


MCRobot (or "MCMC Robot")

Free app for Windows or iPhone/iPad available from http://mcmcrobot.org/

(note: version currently on Apple Store does not work under iOS 8 - will replace it soon)

Android: next year?

Mac version: not soon (but see John Huelsenbeck's iMCMC app for MacOS: http://cteg.berkeley.edu/software.html)


Metropolis-coupled Markov chain Monte Carlo (MCMCMC)

• MCMCMC involves running several chains simultaneously

• The cold chain is the one that counts, the rest are heated chains

• Chain is heated by raising densities to a power less than 1.0 (values closer to 0.0 are warmer)

Geyer, C. J. 1991. Markov chain Monte Carlo maximum likelihood for dependent data. Pages 156-163 in Computing Science and Statistics (E. Keramidas, ed.).
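The effect of heating on the acceptance ratio can be illustrated with a small sketch. Raising both the current and proposed densities to the same power is equivalent to raising their ratio to that power. The function name and the specific powers (0.5 and 0.1) are hypothetical; the ratio 0.03 is borrowed from the "off the cliff" example earlier in the slides.

```python
def heated_ratio(posterior_ratio, power):
    """Acceptance ratio when both densities are raised to `power` (power 1.0 = cold chain)."""
    return posterior_ratio ** power


downhill_ratio = 0.03  # a drastic downhill step for the cold chain

cold = heated_ratio(downhill_ratio, 1.0)  # cold chain: ratio unchanged
warm = heated_ratio(downhill_ratio, 0.5)  # heated chain
hot = heated_ratio(downhill_ratio, 0.1)   # power closer to 0.0 is warmer

print(round(cold, 3), round(warm, 3), round(hot, 3))
```

The same step that the cold chain would almost never accept becomes progressively easier for warmer chains, because heating flattens the landscape.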

Heated chains act as scouts for the cold chain

[Figure: two landscapes, one labeled cold and one labeled heated]

Cold chain robot can easily make this jump because it is uphill

Hot chain robot can also make this jump with high probability because it is only slightly downhill

Hot chain and cold chain robots swapping places

[Figure: two landscapes, one labeled cold and one labeled heated]

Swapping places means both robots can cross the valley, but this is more important for the cold chain because its valley is much deeper

Back to MCRobot...

The Hastings ratio

If the robot has a greater tendency to propose steps to the right as opposed to the left when choosing its next step, then the acceptance ratio must counteract this tendency.

Suppose the probability of proposing a spot to the right is twice that of proposing a spot to the left.

In this case, the Hastings ratio decreases the chance of accepting moves to the right by half, and increases the chance of accepting moves to the left (by a factor of 2), thus exactly compensating for the asymmetry in the proposal distribution.

Hastings, W. K. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97-109.

The Hastings ratio

Example where MCMC Robot proposed moves to the right 80% of the time, but Hastings ratio was not used to modify acceptance probabilities

Hastings Ratio

$$\underbrace{R}_{\text{acceptance ratio}} = \underbrace{\frac{f(\theta^*|D)}{f(\theta|D)}}_{\text{posterior ratio}} \times \underbrace{\frac{q(\theta|\theta^*)}{q(\theta^*|\theta)}}_{\text{Hastings ratio}}$$

Note that if q(θ|θ*) = q(θ*|θ), the Hastings ratio is 1
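The need for the Hastings ratio can be checked with a toy simulation. Everything specific below is an assumption made for illustration: a five-position landscape with made-up target probabilities, a robot that proposes a step to the right twice as often as a step to the left, and rejection of any proposal that would leave the landscape. The sketch compares a chain that multiplies the posterior ratio by q(θ|θ*)/q(θ*|θ) with one that does not.

```python
import random

TARGET = [0.1, 0.2, 0.4, 0.2, 0.1]  # hypothetical target over positions 0..4
P_RIGHT, P_LEFT = 2 / 3, 1 / 3      # proposing right is twice as likely as left


def run_chain(steps, use_hastings, seed):
    """Return the fraction of time spent at each position."""
    rng = random.Random(seed)
    position = 2
    counts = [0] * len(TARGET)
    for _ in range(steps):
        go_right = rng.random() < P_RIGHT
        proposed = position + (1 if go_right else -1)
        # Proposals off the edge have zero target probability and are rejected
        if 0 <= proposed < len(TARGET):
            ratio = TARGET[proposed] / TARGET[position]  # posterior ratio
            if use_hastings:
                # q(back to current) / q(current to proposed)
                ratio *= (P_LEFT / P_RIGHT) if go_right else (P_RIGHT / P_LEFT)
            if rng.random() <= ratio:
                position = proposed
        counts[position] += 1
    return [c / steps for c in counts]


def mean_position(frequencies):
    return sum(i * f for i, f in enumerate(frequencies))


corrected = run_chain(200_000, use_hastings=True, seed=42)
uncorrected = run_chain(200_000, use_hastings=False, seed=42)

print([round(f, 3) for f in corrected])
print([round(f, 3) for f in uncorrected])
```

With the correction, the visit frequencies should track the target closely; without it, the robot drifts toward the right-hand side of the landscape, which is the kind of distortion the uncorrected 80% example illustrates.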


III. Bayesian phylogenetics


So, what’s all this got to do with phylogenetics?

Imagine pulling out trees at random from a barrel. In the barrel, some trees are represented numerous times, while other possible trees are not present. Count 1 each time you see the split separating just A and C from the other taxa, and count 0 otherwise. Dividing by the total trees sampled approximates the true proportion of that split in the barrel.
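The counting procedure in the barrel analogy is simple to express in code. In the sketch below, each sampled tree is represented only by the set of splits it contains, and each split is represented by the set of taxa on one side. The sample of four trees is invented for illustration, as are all the names.

```python
def split_frequency(sampled_trees, split):
    """Proportion of sampled trees that contain the given split."""
    # Count 1 for each tree containing the split, 0 otherwise
    hits = sum(1 for tree in sampled_trees if split in tree)
    return hits / len(sampled_trees)


# Splits written as the taxa on one side (taxa A, B, C, D; hypothetical)
AC = frozenset({"A", "C"})  # separates just A and C from the other taxa
AB = frozenset({"A", "B"})

# A made-up sample of four trees drawn from the "barrel"
sample = [{AC}, {AC}, {AB}, {AC}]

ac_support = split_frequency(sample, AC)
print(ac_support)  # 0.75
```

In an MCMC analysis the sample would come from the chain, and this proportion would serve as the estimate of the split's posterior probability.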

Moving through treespace

The Larget-Simon move*

Step 1: Pick 3 contiguous edges randomly, defining two subtrees, X and Y

Step 2: Shrink or grow selected 3-edge segment by a random amount

Step 3: Choose X or Y randomly, then reposition randomly

Proposed new tree: 3 edge lengths have changed and the topology differs by one NNI rearrangement

*Larget, B., and D. L. Simon. 1999. Markov chain Monte Carlo algorithms for the Bayesian analysis of phylogenetic trees. Molecular Biology and Evolution 16: 750-759. See also: Holder et al. 2005. Syst. Biol. 54: 961-965.

Moving through treespace

Current tree: log-posterior = -34256

Proposed tree: log-posterior = -32519 (better, so accept)
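
The accept/reject decision for a proposed tree can be sketched on the log scale as follows. This is a minimal illustration, not code from any particular program: the function name `accept_move` and the optional `log_hastings_ratio` argument are assumptions made for the example. An uphill move (as on this slide) is always accepted; a downhill move is accepted with probability equal to the ratio of posterior densities.

```python
import math
import random


def accept_move(log_post_current, log_post_proposed,
                log_hastings_ratio=0.0, rng=random):
    """Metropolis-Hastings accept/reject decision on the log scale."""
    log_r = log_post_proposed - log_post_current + log_hastings_ratio
    if log_r >= 0.0:
        return True  # proposed state is at least as good: always accept
    # Otherwise accept with probability r = exp(log_r).
    return rng.random() < math.exp(log_r)


# Values from the slide: the proposed tree has the higher log-posterior.
print(accept_move(-34256.0, -32519.0))
```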

Moving through parameter space

Using κ (ratio of the transition rate to the transversion rate) as an example of a model parameter.

Proposal distribution is the uniform distribution on the interval (κ-δ, κ+δ)

The “step size” of the MCMC robot is defined by δ: a larger δ means that the robot will attempt to make larger jumps on average.
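
A sliding-window proposal of this kind might look like the sketch below. The function name `propose_kappa` is hypothetical, and the reflection of negative proposals back across zero is an assumption added here (the slide does not say how the boundary at zero is handled).

```python
import random


def propose_kappa(kappa, delta, rng=random):
    """Draw a proposal uniformly from (kappa - delta, kappa + delta)."""
    proposed = rng.uniform(kappa - delta, kappa + delta)
    # Assumption (not on the slide): reflect negative proposals back across
    # zero so that kappa stays non-negative; the proposal remains symmetric.
    return abs(proposed)
```

A larger `delta` widens the window, so the robot attempts bigger jumps but, typically, has more of them rejected.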

Putting it all together

• Start with random tree and arbitrary initial values for branch lengths and model parameters

• Each generation consists of one of these (chosen at random):
  – Propose a new tree (e.g. Larget-Simon move) and either accept or reject the move
  – Propose (and either accept or reject) a new model parameter value

• Every k generations, save tree topology, branch lengths and all model parameters (i.e. sample the chain)

• After n generations, summarize sample using histograms, means, credible intervals, etc.
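
The overall loop can be sketched with a toy one-parameter chain. Everything here is an illustrative assumption: there is no tree, and the "posterior" is a stand-in Gamma(shape 4, scale 1) kernel chosen only so that the answer is known (mean 4). The structure (propose, accept or reject, sample every k generations, summarize) follows the steps above.

```python
import math
import random


def log_target(x):
    """Unnormalized log density of a Gamma(shape=4, scale=1) stand-in posterior."""
    return 3.0 * math.log(x) - x if x > 0.0 else float("-inf")


def run_chain(n_generations, sample_every, delta, start=1.0, seed=1):
    rng = random.Random(seed)
    x, log_p = start, log_target(start)
    samples = []
    for generation in range(1, n_generations + 1):
        proposed = x + rng.uniform(-delta, delta)  # symmetric sliding window
        log_p_new = log_target(proposed)
        # Accept with probability min(1, ratio of target densities).
        if log_p_new >= log_p or rng.random() < math.exp(log_p_new - log_p):
            x, log_p = proposed, log_p_new
        if generation % sample_every == 0:  # "every k generations"
            samples.append(x)
    return samples


def summarize(samples):
    """Return the sample mean and an equal-tailed 95% credible interval."""
    ordered = sorted(samples)
    n = len(ordered)
    mean = sum(ordered) / n
    lower = ordered[int(0.025 * n)]
    upper = ordered[min(n - 1, int(0.975 * n))]
    return mean, lower, upper


samples = run_chain(n_generations=50000, sample_every=10, delta=2.0)
mean, lower, upper = summarize(samples)
print(f"mean = {mean:.3f}, 95% credible interval = ({lower:.3f}, {upper:.3f})")
```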

Marginal Posterior Distribution of κ

Histogram created from a sample of 1000 kappa values.

mean = 3.234

95% credible interval: lower = 2.907, upper = 3.604

Data from Lewis, L., and Flechtner, V. 2002. Taxon 51: 443-451.


IV. Prior distributions

Common Priors

• Discrete uniform for topologies
  – exceptions becoming more common

• Beta for proportions

• Gamma or Log-normal for branch lengths and other parameters with support [0,∞)
  – Exponential is common special case of the gamma distribution

• Dirichlet for state frequencies and GTR relative rates

Discrete Uniform distribution for topologies

[Figure: the 15 possible unrooted tree topologies for the five taxa A, B, C, D and E; the discrete uniform prior gives each of them probability 1/15]
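
The number of topologies, and hence the discrete uniform prior probability of each, can be computed with the standard double-factorial count of unrooted binary trees. The function names below are chosen for this illustration only.

```python
def num_unrooted_topologies(n_taxa):
    """Number of unrooted binary topologies: 1 * 3 * 5 * ... * (2n - 5)."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


def discrete_uniform_topology_prior(n_taxa):
    """Prior probability of any single topology under the discrete uniform."""
    return 1.0 / num_unrooted_topologies(n_taxa)
```

The count grows very quickly with the number of taxa, so the prior probability of any one topology becomes tiny even for modest data sets.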

Yule model provides joint prior for both topology and divergence times

The rate of speciation under the Yule model (λ) is constant and applies equally and independently to each lineage. Thus, speciation events get closer together in time as the tree grows because more lineages are available to speciate.
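
A small sketch of why events get closer together: while k lineages exist, each speciating independently at rate λ, the waiting time to the next speciation is exponential with rate kλ. The function names are hypothetical, and the process is assumed to start from a single lineage.

```python
import random


def expected_waiting_times(lam, n_tips):
    """Expected time to the next speciation while k = 1 .. n_tips-1 lineages exist."""
    return [1.0 / (k * lam) for k in range(1, n_tips)]


def simulate_waiting_times(lam, n_tips, rng=random):
    """One random realization of the same waiting times."""
    return [rng.expovariate(k * lam) for k in range(1, n_tips)]
```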

Gamma(a,b) distributions

[Figure: probability density (y-axis, 0 to 2) for parameter values from 0 to 5 (x-axis), showing three curves: Exponential(1) = Gamma(1,1), Gamma(0.1, 10) and Gamma(400, 0.01)]

peak > 0 if a > 1

shoots off to infinity if a < 1

hits y-axis at 1/b if a = 1 (for the Exponential(1) = Gamma(1,1) curve shown this is 1)

Gamma distributions are ideal for parameters that range from 0 to infinity (e.g. branch lengths)

a = shape, b = scale, mean* = ab, variance* = ab²

*Note: be aware that in many papers the Gamma distribution is defined such that the second parameter is a rate, i.e. the inverse of the scale value b used in this slide! In this case, the mean and variance would be a/b and a/b², respectively.
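
The two conventions can be compared directly. The helper names below are made up for this sketch; `random.gammavariate(alpha, beta)` from the Python standard library uses the shape and scale convention of this slide.

```python
import random


def gamma_mean_var_scale(a, b):
    """Mean and variance when b is a scale parameter (this slide's convention)."""
    return a * b, a * b * b


def gamma_mean_var_rate(a, rate):
    """Mean and variance when the second parameter is a rate (= 1/scale)."""
    return a / rate, a / rate ** 2


# Quick simulation check for Gamma(shape=4, scale=1).
rng = random.Random(0)
draws = [rng.gammavariate(4.0, 1.0) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
```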

Log-normal distribution

If X is log-normal with parameters µ and σ, then log(X) is normal with mean µ and standard deviation σ.

Important: µ and σ do not represent the mean and standard deviation of X: they are the mean and standard deviation of log(X)!

For X: mode = $e^{\mu - \sigma^2}$, median = $e^{\mu}$, mean = $e^{\mu + \sigma^2/2}$, variance = $e^{2\mu + \sigma^2}(e^{\sigma^2} - 1)$

For log(X): mode = $\mu$, median = $\mu$, mean = $\mu$, variance = $\sigma^2$

To choose µ and σ to yield a particular mean (m) and variance (v) for X, use these formulas:

$\sigma^2 = \log(v + m^2) - \log(m^2)$

$\mu = \log(m^2) - \log(m) - \frac{\log(v + m^2) - \log(m^2)}{2}$
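
The conversion from a desired mean and variance to µ and σ, and back again, can be checked numerically. The function names are illustrative; note that log(m²) − log(m) simplifies to log(m), which is what the code uses.

```python
import math


def lognormal_params(m, v):
    """Return (mu, sigma) giving a log-normal with mean m and variance v."""
    sigma2 = math.log(v + m * m) - math.log(m * m)
    mu = math.log(m) - sigma2 / 2.0  # log(m^2) - log(m) = log(m)
    return mu, math.sqrt(sigma2)


def lognormal_mean_var(mu, sigma):
    """Mean and variance of X when log(X) ~ Normal(mu, sigma)."""
    s2 = sigma * sigma
    mean = math.exp(mu + s2 / 2.0)
    var = math.exp(2.0 * mu + s2) * (math.exp(s2) - 1.0)
    return mean, var
```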

Beta(a,b) gallery

[Figure: probability density (y-axis, 0 to 4) over the interval 0 to 1 (x-axis), showing four curves: Beta(10,10), Beta(1,1), Beta(1.2,2) and Beta(0.8,2)]

leans left if a < b (for Beta(0.8,2), mean = a/(a+b) = 0.286)

symmetric if a = b (mean = a/(a+b) = 0.5)

flat if a = b = 1

Beta distributions are appropriate for proportions, which are constrained to the interval [0,1].

mean = a/(a+b), variance = ab/[(a+b)²(a+b+1)]
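
The mean and variance formulas are easy to evaluate for the curves in the gallery; the function names below are chosen for this sketch.

```python
def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    return a * b / ((a + b) ** 2 * (a + b + 1.0))


for a, b in [(10, 10), (1, 1), (1.2, 2), (0.8, 2)]:
    print(f"Beta({a},{b}): mean = {beta_mean(a, b):.3f}, "
          f"variance = {beta_variance(a, b):.4f}")
```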

Dirichlet(a,b,c,d) distribution

Used for nucleotide relative frequencies: a→πA, b→πC, c→πG, d→πT

Flat prior: a = b = c = d = 1 (no scenario discouraged)

Informative prior: a = b = c = d = 300 (equal frequencies strongly encouraged)

[Figure: each prior plotted inside a tetrahedron, shown as stereo pairs]

Dirichlet(a,b,c,d,e,f) used for GTR exchangeability parameters.

(Thanks to Mark Holder for suggesting the use of a tetrahedron)
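
The difference between the flat and informative priors can be seen by drawing from each. This sketch uses the standard construction of a Dirichlet draw from independent gamma draws; the function name `sample_dirichlet` is an assumption of the example.

```python
import random


def sample_dirichlet(params, rng=random):
    """One draw from Dirichlet(params) via normalized Gamma(a, 1) variates."""
    draws = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(42)
flat = sample_dirichlet([1, 1, 1, 1], rng)                 # anything goes
informative = sample_dirichlet([300, 300, 300, 300], rng)  # close to 0.25 each
print(flat, informative)
```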

Prior Miscellany

- priors as rubber bands
- running on empty
- hierarchical models
- empirical Bayes

[Figure: density of the Gamma(4,1) prior for parameter values from 0 to 10]

This Gamma(4,1) prior ties down its parameter at the mode, which is at 3, and discourages it from venturing too far in either direction. For example, a parameter value of 10 would be stretching the rubber band fairly tightly.

The mode of a Gamma(a,b) distribution is (a-1)b (assuming a > 1)

[Figure: density of a more concentrated Gamma prior for parameter values from 0 to 10]

This Gamma prior also has a mode at 3, but has a variance 40 times smaller. Decreasing the variance is tantamount to increasing the strength of the metaphorical rubber band.

Now you (or the likelihood) would have to tug on the parameter fairly hard for it to have a value as large as 4.

This gamma distribution has shape 91.989 and scale 0.032971
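
The stated shape and scale can be checked against the formulas mode = (a-1)b and variance = ab². The helper names are illustrative only.

```python
def gamma_mode(a, b):
    """Mode of Gamma(shape a, scale b); defined here only for a > 1."""
    if a <= 1.0:
        raise ValueError("mode formula (a-1)b assumes a > 1")
    return (a - 1.0) * b


def gamma_variance(a, b):
    """Variance of Gamma(shape a, scale b)."""
    return a * b * b


loose = (4.0, 1.0)            # Gamma(4,1) from the previous slide
tight = (91.989, 0.032971)    # shape and scale quoted on this slide
print(gamma_mode(*loose), gamma_mode(*tight))
print(gamma_variance(*loose) / gamma_variance(*tight))
```

Both priors have their mode at (approximately) 3, and the ratio of variances is (approximately) 40, as the slide states.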

Example: Internal Branch Length Priors

Separate priors applied to internal and external branches

Internal branch length prior is exponential with mean 0.1

This is a reasonably vague internal branch length prior

External branch length prior is exponential with mean 0.1

[Figure: tree for 40 taxa under these priors; scale bar = 0.1]

Taxa (numbered as in the figure): 1 Arabidopsis thaliana, 2 Taxus baccata, 3 Huperzia lucidula, 4 Psilotum nudum, 5 Dicksonia antarctica, 6 Sphagnum palustre, 7 Anthoceros formosae, 8 Marchantia polymorpha, 9 Chara connivens, 10 Lamprothamnium macropogon, 11 Lychnothamnus barbatus, 12 Nitellopsis obtusa, 13 Nitella opaca, 14 Tolypella int prolifera, 15 Coleochaete orbicularis, 16 Coleochaete soluta 32d1, 17 Coleochaete irregularis, 18 Coleochaete sieminskiana, 19 Chaet globosum SAG2698, 20 Chaet oval, 21 Onychonema sp, 22 Cosmocladium perissum, 23 Gonatozygon monotaenium, 24 Spirogyra maxima 2495, 25 Mesotaenium caldariorum, 26 Zygnema peliosporum 45, 27 Mougeotia sp 758, 28 Klebsormidium flaccidum, 29 Klebsormidium subtilissimum, 30 Klebsormidium nitens, 31 Entransia fimbriata, 32 Chlorokybus atmosphyticus, 33 Mesostigma viride, 34 Mesostigma viride NIES, 35 Volvox carteri, 36 Chlamydomonas reinhardtii, 37 Paulschulzia pseudovolvox, 38 Pteromonas angulos, 39 Nephroselmis olivacea, 40 Cyanophora paradoxa

Internal branch length prior mean 0.01

(external branch length prior mean always 0.1)

[Figure: tree for the same 40 taxa under this prior; scale bar = 0.1]

Internal branch length prior mean 0.001

[Figure: tree for the same 40 taxa under this prior; scale bar = 0.1]

Internal branch length prior mean 0.0001

[Figure: tree for the same 40 taxa under this prior; scale bar = 0.1]

Internal branch length prior mean 0.00001

[Figure: tree for the same 40 taxa under this prior; scale bar = 0.1]

Internal branch length prior mean 0.000001

[Figure: tree for the same 40 taxa under this prior; scale bar = 0.1]

The internal branch length prior is calling the shots now, and the likelihood must obey.
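
To see how strongly each of these priors pulls, one can evaluate the log density of an exponential prior at a fixed branch length. The branch length 0.05 used below is an arbitrary illustrative value, not taken from the analysis, and the function name is hypothetical.

```python
import math


def exponential_log_prior(length, prior_mean):
    """Log density of an exponential prior with the given mean."""
    return -math.log(prior_mean) - length / prior_mean


branch_length = 0.05  # hypothetical internal branch length
for prior_mean in [0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001]:
    value = exponential_log_prior(branch_length, prior_mean)
    print(f"prior mean {prior_mean:g}: log prior density = {value:.1f}")
```

With a prior mean of 0.000001, a branch of length 0.05 costs roughly fifty thousand log units per branch, a penalty that differences in log-likelihood can rarely offset.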

Prior Miscellany

- priors as rubber bands
- running on empty
- hierarchical models
- empirical Bayes

Running on empty

#NEXUS

begin data;
  Dimensions ntax=4 nchar=1;
  Format datatype=dna missing=?;
  [every site is missing data, so the likelihood carries no information]
  matrix
    taxon1 ?
    taxon2 ?
    taxon3 ?
    taxon4 ?
  ;
end;

begin mrbayes;
  set autoclose=yes;
  lset rates=gamma;
  [prior on the gamma shape parameter, the quantity examined below]
  prset shapepr=exponential(10.0);
  mcmcp nruns=1 nchains=1 printfreq=1000;
  mcmc ngen=10000000 samplefreq=1000;
end;

You can use the program Tracer to show the estimated density: http://tree.bio.ed.ac.uk/software/tracer/

Solid line: prior density estimated from MrBayes output

Dotted line: exponential(10) density for comparison

BEAST and MrBayes 3.2 both make it easy to ignore the data without having to resort to creating fake data sets
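
The same idea can be sketched outside MrBayes: if the likelihood is constant, the chain samples the prior, so the sampled values should match the intended prior. This toy chain is an illustration only. It assumes that exponential(10) denotes an exponential distribution with rate 10 (mean 0.1), and the function name and tuning values are made up for the example.

```python
import math
import random


def sample_prior_by_mcmc(rate, n_generations, sample_every, delta, seed=1):
    """Metropolis sampling of an Exponential(rate) prior with no data."""
    rng = random.Random(seed)
    x = 1.0 / rate
    samples = []
    for generation in range(1, n_generations + 1):
        # Sliding window, reflected at zero to keep the parameter positive.
        proposed = abs(x + rng.uniform(-delta, delta))
        # Likelihood is constant, so the posterior kernel is just the prior.
        log_ratio = -rate * proposed + rate * x
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = proposed
        if generation % sample_every == 0:
            samples.append(x)
    return samples
```

Comparing a histogram of the returned samples with the intended density plays the same role as the Tracer comparison described above.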

Prior Miscellany

- priors as rubber bands
- running on empty
- hierarchical models
- empirical Bayes

[Figure: tree labeled with the nucleotide states A, A, A, T, C and C; each edge length has the prior below]

Prior: Exponential, mean=0.1

In a non-hierarchical model, all parameters are present in the likelihood function

Hierarchical models add hyperparameters not present in the likelihood function

Prior: Exponential, mean µ

µ is a hyperparameter governing the mean of the edge length prior; the prior placed on µ itself is the hyperprior.

During an MCMC analysis, µ will hover around a reasonable value, sparing you from having to decide what value is appropriate. You still have to specify a hyperprior, however.

For example, see Suchard, Weiss and Sinsheimer. 2001. MBE 18(6): 1001-1013.
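
In a hierarchical model the prior part of the posterior kernel has two layers: the edge length prior given µ, and the hyperprior on µ. The sketch below assumes, purely for illustration, an exponential hyperprior with mean 1 and three made-up edge lengths; neither comes from the slide.

```python
import math


def log_edge_prior(edge_lengths, mu):
    """Sum of Exponential(mean mu) log densities over all edge lengths."""
    return sum(-math.log(mu) - v / mu for v in edge_lengths)


def log_hyperprior(mu, hyper_mean=1.0):
    """Assumed hyperprior for this sketch: Exponential with mean hyper_mean."""
    return -math.log(hyper_mean) - mu / hyper_mean


def log_joint_prior(edge_lengths, mu, hyper_mean=1.0):
    """Edge length prior given mu, plus the hyperprior on mu."""
    return log_edge_prior(edge_lengths, mu) + log_hyperprior(mu, hyper_mean)


edges = [0.05, 0.10, 0.15]  # hypothetical edge lengths (average 0.1)
for mu in [0.01, 0.1, 1.0]:
    print(mu, round(log_joint_prior(edges, mu), 3))
```

The edge prior term is largest when µ is near the average edge length, which is why µ tends to hover around a sensible value during the run.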

Prior Miscellany

- priors as rubber bands
- running on empty
- hierarchical models
- empirical Bayes

Empirical Bayes

Prior: Exponential, mean=MLE

An empirical Bayesian would use the maximum likelihood estimate (MLE) of the length of an average branch here

Empirical Bayes uses the data to determine some aspects of the prior, such as the prior mean.

Pure Bayesian approaches choose priors without reference to the data.


V. Bayesian model selection

AIC is not Bayesian. Why?

$\mathrm{AIC} = 2k - 2\log(\max L)$

where k is the number of free (estimated) parameters and log(max L) is the maximized log likelihood.

AIC is not Bayesian because the prior is not considered (and the prior is an important component of a Bayesian model)

$f(\theta|D) = \frac{f(D|\theta)\, f(\theta)}{\int f(D|\theta)\, f(\theta)\, d\theta}$

The marginal likelihood (denominator in Bayes’ Rule) is commonly used for Bayesian model selection

Represents the (weighted) average fit of the model to the observed data (weights provided by the prior)
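
For contrast with the marginal likelihood, the AIC needs only the maximized log likelihood and the parameter count. The log-likelihood values below are hypothetical numbers used solely to show the arithmetic.

```python
def aic(max_log_likelihood, n_free_parameters):
    """AIC = 2k - 2 log(max L); no prior enters the calculation."""
    return 2.0 * n_free_parameters - 2.0 * max_log_likelihood


# Hypothetical maximized log likelihoods for a 1-parameter and a
# 2-parameter model (illustrative values only).
print(aic(-1010.0, 1), aic(-1000.0, 2))
```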

An evolutionary distance example

– Let's compare models JC69 vs. K80
– Parameters:
  • ν is edge length (expected no. substitutions/site)
    – free in both JC69 and K80 models
  • κ is transition/transversion rate ratio
    – free in K80, set to 1.0 in JC69

[Figure: two sequences, X and Y]

Likelihood Surface when K80 true

Based on simulated data: sequence length = 500 sites, true branch length = 0.15, true kappa = 5.0

[Figure: likelihood surface over branch length and trs/trv rate ratio. K80 model: entire 2d space. JC69 model: just the 1d line on which the rate ratio is fixed]

K80 wins

Assume joint prior is flat over the area shown.

Likelihood Surface when JC true

Based on simulated data: sequence length = 500 sites, true branch length = 0.15, true kappa = 1.0

[Figure: likelihood surface over branch length and trs/trv rate ratio. K80 model: entire 2d space. JC69 model: just the 1d line on which the rate ratio is fixed]

JC69 wins

Assume joint prior is flat over the area shown.
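
The two comparisons can be imitated numerically. With a flat prior, the marginal likelihood is simply the average likelihood over the prior's support: a line for JC69, a rectangle for K80. The sketch below is built on assumptions: the site counts are hypothetical (not the simulated data behind the figures), the prior box (ν up to 0.5, κ up to 10) is arbitrary, and the function names are invented for the example. The likelihood uses the standard K80 transition probabilities for a pair of sequences.

```python
import math


def k80_log_likelihood(n_same, n_ts, n_tv, nu, kappa):
    """Two-sequence log-likelihood under K80 (kappa = 1 gives JC69)."""
    e1 = math.exp(-4.0 * nu / (kappa + 2.0))
    e2 = math.exp(-2.0 * nu * (kappa + 1.0) / (kappa + 2.0))
    p_ts = 0.25 + 0.25 * e1 - 0.5 * e2  # site shows a transition
    p_tv = 0.5 - 0.5 * e1               # site shows either transversion
    p_same = 1.0 - p_ts - p_tv
    return (n_same * math.log(0.25 * p_same)
            + n_ts * math.log(0.25 * p_ts)
            + n_tv * math.log(0.25 * p_tv / 2.0))


def log_mean_exp(values):
    """log of the average of exp(values), computed stably."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


def log_marginal_likelihoods(n_same, n_ts, n_tv, nu_max=0.5, kappa_max=10.0,
                             n_nu=100, n_kappa=200):
    """Grid approximation of the JC69 and K80 log marginal likelihoods."""
    nus = [(i + 0.5) * nu_max / n_nu for i in range(n_nu)]
    kappas = [(j + 0.5) * kappa_max / n_kappa for j in range(n_kappa)]
    jc69 = log_mean_exp([k80_log_likelihood(n_same, n_ts, n_tv, nu, 1.0)
                         for nu in nus])
    k80 = log_mean_exp([k80_log_likelihood(n_same, n_ts, n_tv, nu, k)
                        for nu in nus for k in kappas])
    return jc69, k80


# Hypothetical counts for 500 sites: (identical, transitions, transversions).
print(log_marginal_likelihoods(433, 46, 21))  # transition-rich data
print(log_marginal_likelihoods(432, 23, 45))  # about 1 transition per 2 transversions
```

When the data show no excess of transitions, K80 fits no better at its best point but spreads its prior over many poorly fitting values of κ, so its average fit is lower.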

Estimating the marginal likelihood

f(D) is the marginal likelihood

Estimating f(D) is equivalent to estimating the area under the curve whose height is, for every value of θ, equal to f(D|θ) f(θ)

Remember that f(D) is the normalizing constant that turns the posterior kernel into a posterior density.

posterior kernel: $f(D|\theta)\, f(\theta)$

posterior density: $f(\theta|D) = f(D|\theta)\, f(\theta) / f(D)$

Estimating the marginal likelihood

Sample evenly from a box with known area A that completely encloses the curve.

Area under the curve is just A times the fraction of sampled points that lie under the curve.

While not a box, the prior f(θ) does have area 1.0 and completely encloses the curve f(D|θ) f(θ).

Note: the curve is obtained by multiplying each f(θ) by a number less than 1, namely the likelihood f(D|θ).

Estimating the marginal likelihood

[Figure: prior density and unnormalized posterior plotted for θ from 0 to 1 (y-axis 0 to 1.5), with orange points scattered under the prior curve]

MCMC provides a way to sample from any distribution. The orange points are values of θ drawn from the Beta(2,2) prior.

The fraction of dots inside the unnormalized posterior is an estimate of this ratio:

$\frac{c_1}{c_0} = \frac{\text{marginal likelihood (area under the unnormalized posterior)}}{1.0\ \text{(area under prior density)}}$

Would work better if unnormalized posterior represented a larger fraction of the area under the prior...
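
The dot-counting estimator can be tried on a toy problem where the exact answer is available. The binomial likelihood and the data (6 successes in 10 trials) are assumptions made for this sketch; only the Beta(2,2) prior comes from the slide. A dot at horizontal position θ and relative height u under the prior lies under the unnormalized posterior exactly when u < f(D|θ).

```python
import math
import random


def binomial_likelihood(theta, y, n):
    """f(D|theta) for y successes in n trials."""
    return math.comb(n, y) * theta ** y * (1.0 - theta) ** (n - y)


def exact_marginal_likelihood(y, n, a=2.0, b=2.0):
    """Closed-form f(D) for a binomial likelihood with a Beta(a, b) prior."""
    def log_beta(p, q):
        return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return math.comb(n, y) * math.exp(log_beta(y + a, n - y + b) - log_beta(a, b))


def fraction_under_curve(y, n, n_points, seed=1):
    """Fraction of dots under the prior that also lie under f(D|theta) f(theta)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_points):
        theta = rng.betavariate(2.0, 2.0)  # horizontal position from the prior
        u = rng.random()                   # relative height under the prior
        if u < binomial_likelihood(theta, y, n):
            hits += 1
    return hits / n_points


print(exact_marginal_likelihood(6, 10), fraction_under_curve(6, 10, 100000))
```

With real sequence data the likelihood is astronomically small, so essentially no dots would land under the curve, which is the weakness noted on the slide.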

Stepping-stone method

[Figure: a series of curves between the prior and the unnormalized posterior, plotted for θ from 0 to 1 (y-axis 0 to 1.5)]

$\frac{c_{1.0}}{c_{0.0}} = \left(\frac{c_{1.0}}{c_{0.9}}\right) \left(\frac{c_{0.9}}{c_{0.8}}\right) \left(\frac{c_{0.8}}{c_{0.7}}\right) \left(\frac{c_{0.7}}{c_{0.6}}\right) \left(\frac{c_{0.6}}{c_{0.5}}\right) \left(\frac{c_{0.5}}{c_{0.4}}\right) \left(\frac{c_{0.4}}{c_{0.3}}\right) \left(\frac{c_{0.3}}{c_{0.2}}\right) \left(\frac{c_{0.2}}{c_{0.1}}\right) \left(\frac{c_{0.1}}{c_{0.0}}\right)$

(estimates a series of ratios that each represent smaller jump)

For each ratio:

Sample from this distribution (the one in the denominator)

See what fraction of samples are under this density curve (the one in the numerator)

This fraction is an estimate of this ratio
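
A toy version of the product of ratios is sketched below. Several things are assumptions of the example rather than statements from the slide: the binomial data (6 successes in 10 trials) with a Beta(2,2) prior, the interpretation of c_β as the normalizing constant of f(D|θ)^β f(θ), and the use of the importance-sampling average of f(D|θ) raised to the power (β_k − β_{k−1}) for each ratio in place of dot counting. In this toy model each power posterior is itself a Beta distribution, so it can be sampled directly without MCMC.

```python
import math
import random


def log_likelihood(theta, y, n):
    """log f(D|theta) for y successes in n trials."""
    return (math.log(math.comb(n, y)) + y * math.log(theta)
            + (n - y) * math.log(1.0 - theta))


def stepping_stone_log_marginal(y, n, n_ratios=10, n_samples=2000, seed=1):
    """Sum of log ratio estimates log(c_next / c_prev) from beta = 0 to 1."""
    rng = random.Random(seed)
    betas = [k / n_ratios for k in range(n_ratios + 1)]
    log_marginal = 0.0
    for b_prev, b_next in zip(betas[:-1], betas[1:]):
        # Power posterior at b_prev: f(D|theta)^b_prev times the Beta(2,2)
        # prior, which here is Beta(b_prev*y + 2, b_prev*(n - y) + 2).
        draws = [rng.betavariate(b_prev * y + 2.0, b_prev * (n - y) + 2.0)
                 for _ in range(n_samples)]
        terms = [(b_next - b_prev) * log_likelihood(t, y, n) for t in draws]
        top = max(terms)
        log_ratio = top + math.log(sum(math.exp(v - top) for v in terms) / n_samples)
        log_marginal += log_ratio
    return log_marginal


print(math.exp(stepping_stone_log_marginal(6, 10)))
```

For these toy data the exact marginal likelihood is about 0.1224, so the printed estimate should land close to that value.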

Stepping-stone method

How many “stepping stones” (i.e. ratios) are needed?

[Figure: marginal likelihood estimates on the log scale (y-axis, -6900 to -6500) plotted against K, the number of ratios (x-axis: 2, 4, 8, 16, 32, 64); the value -6623 is marked]

• rbcL data • 10 green plants • GTR+G model • 1000 samples/steppingstone

Error bars based on 1 standard error computed using 30 independent analyses.

Xie, W.G., P.O. Lewis, Y. Fan, L. Kuo and M.-H. Chen. 2011. Improving Marginal Likelihood Estimation for Bayesian Phylogenetic Model Selection. Systematic Biology 60(2):150-160.

(see also followup paper describing more efficient generalized stepping-stone method: Fan, Y., Wu, R., Chen, M.-H., Kuo, L., and Lewis, P. O. 2011. Molecular Biology and Evolution 28(1):523-532)

Is steppingstone sampling accurate?

[Figure: marginal likelihood estimates on the log scale (y-axis, -6900 to -6500) plotted against K, the number of ratios (x-axis: 2, 4, 8, 16, 32, 64), for thermodynamic integration (also called path sampling) and the stepping-stone method; the value -6623 is marked]

• rbcL data • 10 green plants • GTR+G model • 1000 samples/steppingstone

Error bars based on 1 standard error computed using 30 independent analyses.

Lartillot, N., and H. Philippe. 2006. Computing bayes factors using thermodynamic integration. Systematic Biology 55(2): 195-207.

See for comparison of SS and TI: Baele G., Lemey P., Bedford T., Rambaut A., Suchard M.A., Alekseyenko A.V. 2012. Molecular Biology and Evolution 29(9):2157–2167

How about the harmonic mean method?

[Figure: marginal likelihood estimates on the log scale (y-axis, -6900 to -6500) plotted against K, the number of ratios (x-axis: 2, 4, 8, 16, 32, 64), for thermodynamic integration, the stepping-stone method and the harmonic mean method; the values -6589 and -6623 are marked]

• rbcL data • 10 green plants • GTR+G model • 1000 samples/steppingstone

Note that the harmonic mean method is biased, and the bias does not go away with increased sampling

Newton, M. A., and A. E. Raftery. 1994. Approximate Bayesian inference with the weighted likelihood bootstrap (with discussion). J. Roy. Statist. Soc. B 56:3-48.

The problem that Dirichlet Process models solve

[Figure: a gene drawn as a series of sites, some of them shown in red]

Red depicts sites with, for example:
- an unusually high or low rate
- unusual equilibrium base (or amino acid) frequencies
- an unusually high or low nonsynon./synon. rate ratio
- some other unusual feature

Desired: a prior model that:
- classifies sites into meaningful categories
- discourages large numbers of categories
- assigns reasonable parameter values to each of the categories
- does all this automatically

Imagine you have a collection of objects (e.g. sites, codons) labeled A, B, C, ...

B can either be added to A’s group or form its own group

[Figure: starting from the single group A, B joins A's group (AB) with probability $\frac{1}{\alpha + 1}$ or forms its own group (A | B) with probability $\frac{\alpha}{\alpha + 1}$]

The parameter α determines the propensity for forming a new group

The third object C can either be added to an existing group...

...or form its own group

[Figure: tree of configurations. First step: A to AB with probability $\frac{1}{\alpha + 1}$, A to A | B with probability $\frac{\alpha}{\alpha + 1}$. Second step from AB: ABC with probability $\frac{2}{\alpha + 2}$, AB | C with probability $\frac{\alpha}{\alpha + 2}$. Second step from A | B: AC | B with probability $\frac{1}{\alpha + 2}$, A | BC with probability $\frac{1}{\alpha + 2}$, A | B | C with probability $\frac{\alpha}{\alpha + 2}$]

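[Figure: tree of configurations for four objects A, B, C and D. At each step the next object joins an existing group of size n with probability n/(α + k), or forms a new group with probability α/(α + k), where k is the number of objects already placed. The probability of each final configuration is the product along its path:]

ABCD: $\frac{\alpha}{\alpha} \cdot \frac{1}{\alpha+1} \cdot \frac{2}{\alpha+2} \cdot \frac{3}{\alpha+3}$

ABC | D: $\frac{\alpha}{\alpha} \cdot \frac{1}{\alpha+1} \cdot \frac{2}{\alpha+2} \cdot \frac{\alpha}{\alpha+3}$

ABD | C: $\frac{\alpha}{\alpha} \cdot \frac{1}{\alpha+1} \cdot \frac{\alpha}{\alpha+2} \cdot \frac{2}{\alpha+3}$

AB | CD: $\frac{\alpha}{\alpha} \cdot \frac{1}{\alpha+1} \cdot \frac{\alpha}{\alpha+2} \cdot \frac{1}{\alpha+3}$

AB | C | D: $\frac{\alpha}{\alpha} \cdot \frac{1}{\alpha+1} \cdot \frac{\alpha}{\alpha+2} \cdot \frac{\alpha}{\alpha+3}$

ACD | B: $\frac{\alpha}{\alpha} \cdot \frac{\alpha}{\alpha+1} \cdot \frac{1}{\alpha+2} \cdot \frac{2}{\alpha+3}$

AC | BD: $\frac{\alpha}{\alpha} \cdot \frac{\alpha}{\alpha+1} \cdot \frac{1}{\alpha+2} \cdot \frac{1}{\alpha+3}$

AC | B | D: $\frac{\alpha}{\alpha} \cdot \frac{\alpha}{\alpha+1} \cdot \frac{1}{\alpha+2} \cdot \frac{\alpha}{\alpha+3}$

A | BCD: $\frac{\alpha}{\alpha} \cdot \frac{\alpha}{\alpha+1} \cdot \frac{1}{\alpha+2} \cdot \frac{2}{\alpha+3}$

AD | BC: $\frac{\alpha}{\alpha} \cdot \frac{\alpha}{\alpha+1} \cdot \frac{1}{\alpha+2} \cdot \frac{1}{\alpha+3}$

A | BC | D: $\frac{\alpha}{\alpha} \cdot \frac{\alpha}{\alpha+1} \cdot \frac{1}{\alpha+2} \cdot \frac{\alpha}{\alpha+3}$

AD | B | C: $\frac{\alpha}{\alpha} \cdot \frac{\alpha}{\alpha+1} \cdot \frac{\alpha}{\alpha+2} \cdot \frac{1}{\alpha+3}$

A | BD | C: $\frac{\alpha}{\alpha} \cdot \frac{\alpha}{\alpha+1} \cdot \frac{\alpha}{\alpha+2} \cdot \frac{1}{\alpha+3}$

A | B | CD: $\frac{\alpha}{\alpha} \cdot \frac{\alpha}{\alpha+1} \cdot \frac{\alpha}{\alpha+2} \cdot \frac{1}{\alpha+3}$

A | B | C | D: $\frac{\alpha}{\alpha} \cdot \frac{\alpha}{\alpha+1} \cdot \frac{\alpha}{\alpha+2} \cdot \frac{\alpha}{\alpha+3}$

After all objects have been considered, you can follow paths to determine the probability of different final configurations

Remember that this is a prior, so the data have a (usually big) say in how many clusters there are and what parameter values are assigned to each.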

α = 1

[Figure: the same tree of configurations, with the prior probability of each final configuration when α = 1:]

ABCD: 0.25

ABC | D, ABD | C, ACD | B, A | BCD: 0.08 each

AB | CD, AC | BD, AD | BC: 0.04 each

AB | C | D, AC | B | D, AD | B | C, A | BC | D, A | BD | C, A | B | CD: 0.04 each

A | B | C | D: 0.04

Small values favor few, large groups

α = 10

[Figure: the same tree of configurations, with the prior probability of each final configuration when α = 10:]

ABCD: 0.00

ABC | D, ABD | C, ACD | B, A | BCD: 0.01 each

AB | CD, AC | BD, AD | BC: 0.01 each

AB | C | D, AC | B | D, AD | B | C, A | BC | D, A | BD | C, A | B | CD: 0.06 each

A | B | C | D: 0.58

Larger values favor more groups
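
The probabilities shown for α = 1 and α = 10 can be reproduced by following the paths in code. The function names are chosen for this sketch; groups are written as strings of object labels, and objects are placed in alphabetical order as on the slides.

```python
def partition_probability(groups, alpha):
    """Prior probability of a final configuration, e.g. ["AB", "C", "D"]."""
    order = sorted(obj for group in groups for obj in group)
    group_of = {obj: i for i, group in enumerate(groups) for obj in group}
    sizes = {}
    prob = 1.0
    for k, obj in enumerate(order):  # k objects have already been placed
        g = group_of[obj]
        if g in sizes:
            prob *= sizes[g] / (alpha + k)  # join an existing group
            sizes[g] += 1
        else:
            prob *= alpha / (alpha + k)     # start a new group
            sizes[g] = 1
    return prob


def set_partitions(items):
    """Yield every way of splitting the string `items` into groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):
            yield partition[:i] + [first + partition[i]] + partition[i + 1:]
        yield [first] + partition


for alpha in (1.0, 10.0):
    for groups in set_partitions("ABCD"):
        print(alpha, " | ".join(groups),
              round(partition_probability(groups, alpha), 2))
```

Summing the probabilities over all 15 configurations gives 1 for any positive α, which is a useful check on the path products.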

Dirichlet Process Priors

• To encourage few, large groups, use a small alpha value

• To encourage lots of small groups, use a large alpha value

• In practice, hierarchical models are often used (i.e. alpha is a hyperparameter that is estimated, so you need not worry about choosing the appropriate value for alpha)

• Bottom line: DP models are very nice for automatically grouping sites into clusters that have some property in common


The End