maximum likelihood (cont.)

Midterm

• Relatedness is not simply sharing a common ancestor; it is sharing a relatively recent common ancestor (CA)

• Homoplasy includes convergence and/or reversal

• PTP (permutation tail probability) looks at tree length; g1 looks at the skewness of the distribution of tree lengths

• Both introgression and LGT result in a dominant and minor history

• The ILD (incongruence length difference) test looks at the sum of the lengths (or likelihoods) of the optimal trees from each partition

• Topology tests evaluate whether the data support one tree better than another. They can be used to evaluate support for a clade or to assess discordance, but those are applications of the test

• Parsimony criterion: Tree that can explain the data with the lowest number of character state changes (weighted by the inferred evidential value of each character state transition)

• Assumptions: Single; independence; branch lengths short and fairly even

[Figure: tree with tip labels Digitalis, Lophospermum, Cymbalaria, Asarina, L. canadensis, L. vulgaris, Chaenorhinum, Misopates, Antirrhinum]

Incomplete lineage sorting

The cause is the retention of a polymorphism; it does not depend on gene duplication

[Figure: taxa A, C, B with population splits (Split 1, Split 2) and the A-B and AB-C coalescence events marked]

Relationship among models

[Figure: diagram of the relationships among substitution models]

Site-to-site rate heterogeneity

• Two main methods:

– Some proportion of invariant sites

– Rates distributed as a discrete approximation to a gamma distribution

• Both use one parameter

– I = proportion of invariant sites

– α = shape parameter
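As a hedged sketch of how the two methods combine in the likelihood of a single site i, assuming K equally probable gamma rate categories with rates r_1, ..., r_K (the symbols L_i, K, and r_k are introduced here for illustration and are not from the slides):

```latex
L_i \;=\; I \cdot L_i(r = 0) \;+\; (1 - I)\,\frac{1}{K}\sum_{k=1}^{K} L_i(r_k)
```

Here L_i(r) is the likelihood of site i when its rate is r. The first term is nonzero only for constant sites, and the shape parameter α enters through the category rates r_k.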

Nested models

• Simpler models have fewer parameters than more complex models

• Two models are “nested” if the simpler model is a special case of the more complex model: all parameters in the simpler model are also in the more complex model

– Nested: GTR-HKY; HKY-JC; GTR-JC

– Not nested: F81-K2P; JC+I-HKY
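One rough way to make the definition above concrete is to describe each model by the set of constraints it relaxes relative to JC and then test for a proper subset. The labels below (`ti_tv`, `base_freqs`, `six_rates`) and the helper names are illustrative assumptions, not a standard encoding; "+I" and "+G" are treated as one extra freedom each.

```python
# Freedoms each base model has beyond JC (illustrative labels, assumed here).
FREEDOMS = {
    "JC": set(),
    "K2P": {"ti_tv"},                       # transitions differ from transversions
    "F81": {"base_freqs"},                  # unequal base frequencies
    "HKY": {"ti_tv", "base_freqs"},
    "GTR": {"ti_tv", "base_freqs", "six_rates"},
}


def freedoms(name):
    """Return the set of freedoms for a model name such as 'HKY+I' or 'GTR+G'."""
    base, *extras = name.split("+")
    return FREEDOMS[base] | set(extras)


def is_nested(simple, complex_):
    """True if every freedom of `simple` is also in `complex_`, which has more."""
    return freedoms(simple) < freedoms(complex_)


for pair in [("HKY", "GTR"), ("JC", "HKY"), ("K2P", "F81"), ("JC+I", "HKY")]:
    print(pair, is_nested(*pair))
```

The subset test is only a shorthand for "special case of"; it works for these models because each freedom here corresponds to relaxing one constraint.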

Which of these pairs are nested?

• HKY-K2P

• GTR+Γ-GTR

• HKY+I-HKY+Γ

• HKY+I-HKY

In the case of nested models

• Log-likelihood under the simpler model = Ls

• Log-likelihood under the complex model = Lc

• It will always be the case that Lc ≥ Ls

• But how much better can we explain the data under the more complex model?

Log-Likelihood ratios

• The likelihood ratio test statistic is LR = 2(Lc - Ls)

• For nested models, LR is expected to follow a χ² distribution with as many degrees of freedom as the number of extra parameters in the more complex model (kc - ks)

Hierarchical LR tests

• If the LR is significant under a chi-square then favor the more complex model

• If the LR is not significant then stick with the simpler model
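The test can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a model-selection program: the log-likelihood values are made up, and `chi2_sf` and `lrt` are hypothetical helper names. The chi-square tail probability is computed with the standard library only, for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting points: df = 1 (a = 1/2) or df = 2 (a = 1).
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def lrt(lnL_simple, lnL_complex, k_simple, k_complex):
    """Return (LR statistic, degrees of freedom, p-value) for nested models."""
    stat = 2.0 * (lnL_complex - lnL_simple)
    df = k_complex - k_simple
    return stat, df, chi2_sf(stat, df)


# Hypothetical example: HKY (4 parameters) against HKY+Γ (5 parameters).
stat, df, p = lrt(-2345.6, -2340.1, 4, 5)
print(f"LR = {stat:.2f}, df = {df}, p = {p:.4g}")
print("favor complex model" if p < 0.05 else "keep simpler model")
```

Branch lengths are estimated under both models, so they cancel in the difference kc - ks and are left out of the counts here.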

Akaike Information Criterion

• Another approach to choosing among models

• Can be used even among non-nested models

• Pick the model with the lowest AIC:

– AIC = -2 ln L + 2k

– k = number of parameters in model
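A small sketch of the comparison, assuming made-up log-likelihoods for three candidate models. The parameter counts cover the substitution model only (JC 0, HKY 4, GTR 8); branch lengths would add the same constant to every model fitted on the same tree, so they do not change the ranking.

```python
def aic(lnL, k):
    """Akaike Information Criterion: -2 ln L + 2k."""
    return -2.0 * lnL + 2.0 * k


# (ln L, number of free substitution-model parameters); ln L values are hypothetical.
candidates = {
    "JC": (-2400.0, 0),
    "HKY": (-2345.6, 4),
    "GTR": (-2343.9, 8),
}

scores = {name: aic(lnL, k) for name, (lnL, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:4s} AIC = {score:.1f}")
print("lowest AIC:", best)
```

In this made-up case GTR has the highest likelihood, but its four extra parameters cost more than the improvement in fit is worth, so HKY is preferred.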

Relationship between MP and ML

• One argument: MP is inherently nonparametric, so no direct comparison is possible

• Another argument: MP is an ML model that makes particular assumptions

Parsimony-like likelihood model (see Lewis 1998 for more)

• Estimate branch lengths independently for each character (a VERY complex model)

• Only sum over maximum likelihood ancestral states

Why use MP

• The model is less realistic, but:

– We can do more thorough searches and data exploration (computational efficiency)

– Robust results will usually still be supported

Why use ML

• The model is explicit

• We can statistically compare alternative models of molecular evolution

• We can conduct parametric statistical tests

Likelihood based topology test

• Kishino-Hasegawa test

• Likelihood ratio test of zero length branches
