Maximum likelihood (cont.)
Midterm
• Relatedness is not sharing a common ancestor but sharing a relatively recent common ancestor (CA)
• Homoplasy includes convergence and/or reversal
• PTP looks at tree length; g1 looks at skew in tree length
• Both introgression and LGT result in a dominant and minor history
Midterm
• ILD test looks at the sum of the lengths (or likelihood) of the optimal trees from each partition
• Topology tests evaluate whether the data support one tree better than another. They can be used to evaluate support for a clade or to assess discordance, but that is an application
Midterm
• Parsimony criterion: Tree that can explain the data with the lowest number of character state changes (weighted by the inferred evidential value of each character state transition)
• Assumptions: Single; independence; branch lengths short and fairly even
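The parsimony criterion above can be illustrated by counting the minimum number of state changes one character needs on a given tree. The sketch below uses Fitch's algorithm with all changes weighted equally (so it ignores the weighting mentioned in the slide); the nested-tuple tree encoding and the function name `fitch` are assumptions made for the illustration.

```python
def fitch(tree, states):
    """Fitch parsimony for one character on a rooted binary tree.

    tree   : a leaf name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping leaf name -> observed character state
    Returns (possible_states_at_this_node, minimum_changes_below_it).
    """
    if isinstance(tree, str):                 # leaf: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:                                # children agree: no extra change
        return shared, left_changes + right_changes
    # children disagree: one change is required on this part of the tree
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical character scored on four taxa
states = {"A": "G", "B": "G", "C": "T", "D": "T"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
length_1 = fitch(tree_1, states)[1]
length_2 = fitch(tree_2, states)[1]
```

Under the parsimony criterion the tree with the smaller length for the whole data set is preferred; summing `fitch(...)[1]` over characters gives the tree length.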
[Figure; taxon labels: Digitalis, Lophospermum, Cymbalaria, Asarina, L. canadensis, L. vulgaris, Chaenorhinum, Misopates, Antirrhinum]
Incomplete lineage sorting
The cause is the retention of a polymorphism – it does not depend on gene duplication
[Figure; taxa A, C, B; labels: Split 1, Split 2, A-B coalescence, AB-C coalescence]
Relationship among models
[Figure: diagram relating the models; numeric labels in the figure: 9, 4, 6, 6, 3, 5, 2, 1]
Site-to-site rate heterogeneity
• Two main methods:
– Some proportion of invariant sites
– Rates distributed as a discrete approximation to a gamma distribution
• Each uses one parameter:
– I = proportion of invariant sites
– α = shape parameter
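Both methods turn the likelihood of a site into a weighted average over rate classes. A minimal sketch, assuming the per-class site likelihoods have already been computed elsewhere (the function names and the numbers are hypothetical, chosen only to show the arithmetic):

```python
def site_likelihood_invariant(l_variable, is_constant, state_freq, p_inv):
    """Mix an invariant class (weight p_inv) with a variable class.

    l_variable  : site likelihood if the site is free to change
    is_constant : True if every taxon shows the same state at this site
    state_freq  : equilibrium frequency of that shared state
    p_inv       : I, the proportion of invariant sites
    """
    # An invariant site can only produce a constant pattern
    l_invariant = state_freq if is_constant else 0.0
    return p_inv * l_invariant + (1.0 - p_inv) * l_variable


def site_likelihood_discrete_gamma(l_by_rate_category):
    """Average the site likelihood over equally probable rate categories."""
    return sum(l_by_rate_category) / len(l_by_rate_category)


# Hypothetical values for one constant site and one four-category gamma
l_inv_site = site_likelihood_invariant(0.02, True, 0.25, p_inv=0.2)
l_gamma_site = site_likelihood_discrete_gamma([0.01, 0.02, 0.03, 0.04])
```

In the gamma case the shape parameter α enters through the rates assigned to the categories, which this sketch takes as already applied.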
Nested models
• Simpler models have fewer parameters than more complex models
• Two models are “nested” if all parameters in the simpler model are also in the more complex model
– Nested: GTR-HKY; HKY-JC; GTR-JC
– Not nested: F81-K2P; JC+I-HKY
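One way to check nestedness mechanically is to list the kinds of freedom each model allows and test whether one list is contained in the other. This is a simplified sketch: the labels (`ts_tv`, `freqs`, `six_rates`) and the function names are assumptions for illustration, and a real check would look at how parameters are constrained rather than at labels.

```python
# Kinds of freedom allowed by each base model (labels are illustrative)
BASE_MODELS = {
    "JC":  set(),
    "K2P": {"ts_tv"},                        # transitions differ from transversions
    "F81": {"freqs"},                        # unequal base frequencies
    "HKY": {"ts_tv", "freqs"},
    "GTR": {"ts_tv", "freqs", "six_rates"},  # all exchangeabilities free
}


def free_parameters(name):
    """Split e.g. 'HKY+I' into the base model plus rate-heterogeneity extras."""
    base, *extras = name.split("+")
    return BASE_MODELS[base] | {"+" + extra for extra in extras}


def is_nested(simpler, richer):
    """True if every freedom of the simpler model is also in the richer one."""
    return free_parameters(simpler) < free_parameters(richer)


checks = {
    ("HKY", "GTR"): is_nested("HKY", "GTR"),
    ("JC", "HKY"): is_nested("JC", "HKY"),
    ("K2P", "F81"): is_nested("K2P", "F81") or is_nested("F81", "K2P"),
    ("JC+I", "HKY"): is_nested("JC+I", "HKY"),
}
```

The first two pairs come out nested and the last two do not, matching the examples on the slide.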
Which of these pairs are nested?
• HKY-K2P
• GTR+Γ-GTR
• HKY+I-HKY+Γ
• HKY+I-HKY
In the case of nested models
• Log-likelihood under the simpler model = Ls
• Log-likelihood under the complex model = Lc
• It will always be the case that Lc ≥ Ls
• But how much better can we explain the data under the more complex model?
Log-Likelihood ratios
• The likelihood ratio statistic is LR = 2(Lc - Ls)
• For nested models the LR is expected to follow a χ² distribution with as many degrees of freedom as the number of extra parameters in the more complex model (kc - ks)
Hierarchical LR tests
• If the LR is significant under a chi-square then favor the more complex model
• If the LR is not significant then stick with the simpler model
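A single step of the hierarchical test can be sketched as follows, assuming the two log-likelihoods and parameter counts are already known. The χ² tail probability is computed with the standard library only; `chi2_sf`, `likelihood_ratio_test`, the 0.05 threshold and the example numbers are all illustrative assumptions.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)              # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))   # closed form for df = 1
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_simple, lnl_complex, k_simple, k_complex,
                          significance=0.05):
    """Return (LR, df, p, favor_complex) for two nested models."""
    lr = 2.0 * (lnl_complex - lnl_simple)
    df = k_complex - k_simple
    p = chi2_sf(lr, df)
    return lr, df, p, p < significance


# Hypothetical log-likelihoods: the complex model has one extra parameter
lr, df, p, favor_complex = likelihood_ratio_test(-1010.0, -1005.0, 4, 5)
```

With these numbers LR = 10 on one degree of freedom, so the more complex model would be favored; a smaller gain in log-likelihood would leave the simpler model in place.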
Akaike Information Criterion
• Another approach to choosing among models
• Can be used even among non-nested models
• Pick the model with the lowest AIC:
– k = number of parameters in model
– AIC = -2 ln L + 2k
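The AIC comparison is simple arithmetic and works the same way for nested and non-nested models. In this sketch the log-likelihoods and parameter counts are hypothetical, and `aic` and `best_by_aic` are names introduced for the example.

```python
def aic(log_likelihood, n_params):
    """AIC = -2 ln L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def best_by_aic(fits):
    """fits maps model name -> (ln L, k); return the name with the lowest AIC."""
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits: (ln L, number of parameters)
fits = {
    "JC":  (-1050.0, 0),
    "HKY": (-1010.0, 4),
    "GTR": (-1008.0, 8),
}
scores = {name: aic(*fit) for name, fit in fits.items()}
best = best_by_aic(fits)
```

Here GTR has the highest likelihood, but its extra parameters cost more than they gain, so HKY has the lowest AIC.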
Relationship between MP and ML
• One argument: MP is inherently nonparametric, so no direct comparison is possible
• Another argument: MP is an ML model that makes particular assumptions
Parsimony-like likelihood model (see Lewis 1998 for more)
• Estimate branch lengths independently for each character (a VERY complex model)
• Only sum over maximum likelihood ancestral states
Why use MP
• The model is less realistic, but:
– We can do more thorough searches and data exploration (computational efficiency)
– Robust results will usually still be supported
Why use ML
• The model is explicit
• We can statistically compare alternative models of molecular evolution
• We can conduct parametric statistical tests
Likelihood-based topology tests
• Kishino-Hasegawa test
• Likelihood ratio test of zero length branches