Post on 19-Dec-2015
Phylogeny reconstruction
How do we reconstruct the tree of life?
Outline:
Terminology
Methods
distance
parsimony
maximum likelihood
bootstrapping
Problems
homoplasy
hybridisation
Dr. Sean Graham, UBC.
Phylogenetic reconstruction
•Rooted trees
Outgroup:
[Figure: two trees. Tip order in the first: Birds, Crocodiles, Turtles, Amphibians, Mammals, Lizards, Snakes. Tip order in the second: Turtles, Amphibians, Mammals, Lizards, Snakes, Crocodiles, Birds.]
Understanding Trees
Do these phylogenies agree?
Figure 14.17
Branch lengths
[Figure: two trees of taxa A, B, C and D. Scale bar: 1 nt change]
Understanding Trees
Monophyletic
Paraphyletic
Polyphyletic
[Figure: each term is illustrated on a tree with tips A, B, C, D, E]
Trees can be used to describe taxonomic groups
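A minimal sketch of how a tree can be used to test whether a named group is monophyletic, that is, whether it consists of one ancestor and all of its descendants. The tree below is hypothetical (the slide figure is not recoverable), and the nested-tuple representation and function names are assumptions made for illustration. The check does not try to separate paraphyletic from polyphyletic groups.

```python
def tips(node):
    """Return the set of tip names below a node (a tip is a plain string)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tips(child) for child in node))


def clades(node):
    """Yield the tip set of every clade in the tree, including single tips."""
    yield tips(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """A group is monophyletic if it is exactly the tip set of some clade."""
    return frozenset(group) in set(clades(tree))


# Hypothetical rooted tree for taxa A-E: ((A,B),(C,(D,E)))
tree = (("A", "B"), ("C", ("D", "E")))

for group in [{"D", "E"}, {"C", "D", "E"}, {"C", "D"}, {"B", "C"}]:
    print(sorted(group), is_monophyletic(tree, group))
```

In this hypothetical tree, {C, D} fails the test because the smallest clade containing C and D also contains E.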
What is the relationship between taxonomic names and phylogenetic groups?
[Figure: tree with tips Birds, Crocodiles, Turtles, Amphibians, Mammals, Lizards, Snakes. Character marked: Amnion. Group: Amniotes]
[Figure: tree with tips Birds, Crocodiles, Turtles, Lizards, Snakes. Character marked: Cold Blooded. Group: Reptiles]
[Figure: tree with tips Birds, Crocodiles, Turtles, Amphibians, Rodents, Lizards, Snakes, Bats. Character marked: Wings]
Polyphyletic example: Amentiferae
Ancestor with separate flowers
Willows, Walnuts, Oaks
Evolution of catkins
Vertebrate Phylogeny
Are these groups monophyletic, paraphyletic or polyphyletic?
fish?
tetrapods? (= four limbed)
amphibians?
mammals?
endotherms (= warm blooded)?
Constructing Trees
Methods:
distance (UPGMA, Neighbor joining)
parsimony
maximum likelihood (Bayesian)
I. Distance Methods (phenetics)
Distance methods rely on clustering algorithms (e.g. UPGMA)
[Figure: scatter plot of taxa A, B, C and D. Axes: Trait 1, Trait 2]
Distance matrix
     A    B    C    D
A    -   1.0  3.0  4.9
B         -   3.3  3.0
C              -   3.0
D                   -
Example 1: morphology
UPGMA
[Figure: same scatter plot and distance matrix. First UPGMA step: A and B are joined]
[Figure: same scatter plot and distance matrix. Completed UPGMA tree with tips A, B, C, D]
Distance methods with sequence data
A: ATTGCAATCGG
B: ATTACGATCGG
C: GTTACAACCGG
D: CTCGTAGTCGA
Distance matrix
    A  B  C  D
A   -  1  3  5
B      -  3  7
C         -  7
D            -
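A minimal sketch of how such a matrix can be filled in from aligned sequences, assuming the distance is simply the number of differing sites (the slide does not say which distance measure was used). The function name is hypothetical. For the sequences as transcribed here, five of the six counts match the matrix; A and B differ at two sites (positions 4 and 6), whereas the matrix shows 1.

```python
from itertools import combinations

# Sequences as they appear in the transcript.
seqs = {
    "A": "ATTGCAATCGG",
    "B": "ATTACGATCGG",
    "C": "GTTACAACCGG",
    "D": "CTCGTAGTCGA",
}


def count_differences(x, y):
    """Number of aligned sites at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(x, y))


distances = {
    (p, q): count_differences(seqs[p], seqs[q])
    for p, q in combinations(seqs, 2)
}

for (p, q), d in distances.items():
    print(p, q, d)
```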
[Figure: A and B are joined first]
New distance matrix: take averages
     AB  C  D
AB   -   3  6
C        -  7
D           -
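The averaging step can be sketched as a small UPGMA loop. This is an illustration under stated assumptions, not the lecturer's code: the function name, the nested-tuple tree and the dictionary keyed by `frozenset` pairs are all hypothetical choices. It starts from the four-taxon matrix shown in the slides (A-B 1, A-C 3, A-D 5, B-C 3, B-D 7, C-D 7) and weights each average by cluster size.

```python
from itertools import combinations


def upgma(names, pair_distances):
    """Cluster taxa by repeatedly joining the closest pair of clusters.

    pair_distances maps frozenset({x, y}) -> distance.
    Returns (tree, all_distances, merge_log).
    """
    sizes = {name: 1 for name in names}  # cluster label -> number of tips
    d = dict(pair_distances)
    log = []
    while len(sizes) > 1:
        # Find the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        # Distance from the new cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        for c in sizes:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        log.append((merged, d[frozenset((a, b))]))
        sizes[merged] = size_a + size_b
    (tree,) = sizes
    return tree, d, log


matrix = {
    frozenset("AB"): 1, frozenset("AC"): 3, frozenset("AD"): 5,
    frozenset("BC"): 3, frozenset("BD"): 7, frozenset("CD"): 7,
}
tree, d, log = upgma("ABCD", matrix)
print(tree)
print(d[frozenset((("A", "B"), "C"))], d[frozenset((("A", "B"), "D"))])
```

After the first join, the AB-C and AB-D entries come out as 3 and 6, matching the reduced matrix above.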
[Figure: the tree is built up in steps from the two matrices above. A and B are joined first, then C, then D]
Assumptions of distance methods
Strengths and weaknesses of distance methods
II. Parsimony Methods (Cladistics)
Hennig (German entomologist) wrote in 1950
Translated into English in 1966: very influential
Applying parsimony
• Consider four taxa (1-4) and four characters (A-D)
• Ancestral state: abcd
        Trait
Taxon   A   B   C   D
1       a’  b   c   d
2       a’  b’  c’  d’
3       a’  b’  c’  d
4       a’  b’  c   d
[Figure: first candidate tree, rooted at the ancestral state abcd. Tips in the order 1 (a’bcd), 2 (a’b’c’d’), 3 (a’b’c’d), 4 (a’b’cd). Changes marked: a’, b’, c’, d’ and b. Legend: unique changes; convergences or reversals. 5 steps]
[Figure: second candidate tree, rooted at the ancestral state abcd. Tips in the order 1 (a’bcd), 4 (a’b’cd), 3 (a’b’c’d), 2 (a’b’c’d’). Changes marked: a’, b’, c’, d’. Legend: unique changes; convergences or reversals. 4 steps]
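The step counts for the two candidate trees can be checked with Fitch's small-parsimony algorithm. This is a sketch under stated assumptions: the tree shapes are inferred from the tip order and step counts on the slides (the drawings themselves are not recoverable), the ancestral state abcd is added as an outgroup tip named `anc`, and states are coded 0 for ancestral and 1 for derived. All names are hypothetical.

```python
def fitch(node, states):
    """Fitch pass for one character on a binary tree of nested tuples.

    Returns (possible_states_at_node, number_of_changes_below_node).
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    # No shared state: one change is needed on a branch below this node.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, matrix):
    """Total number of steps over all characters."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


# Characters A-D; 0 = ancestral (a, b, c, d), 1 = derived (a', b', c', d').
matrix = {"anc": "0000", "1": "1000", "2": "1111", "3": "1110", "4": "1100"}

# Assumed shapes: tree 1 groups 2 with 3, then 1, then 4;
# tree 2 is a ladder with 1 branching first, then 4, then 3 and 2.
tree1 = ("anc", (("1", ("2", "3")), "4"))
tree2 = ("anc", ("1", ("4", ("3", "2"))))

print(tree_length(tree1, matrix), tree_length(tree2, matrix))
```

Under these assumed shapes the counts are 5 and 4 steps, in line with the totals on the slides, so the second tree is the more parsimonious of the two.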
Strengths and weaknesses of parsimony
Strengths
Weaknesses
Parsimony practice
        Position
Taxon   1234567
K       AGTACCG
L       AAGACTA
M       AACCTTA
N       AAAGTTA
Which unrooted tree is most parsimonious?
[Figure: the three possible unrooted trees for taxa K, L, M and N, with the change at position 2 marked on each]
Plot each change on each tree. Positions 1 and 2 are done.
Which positions help to determine relationships?
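One way to approach the last question is to look for parsimony-informative positions: sites where at least two different states each occur in at least two taxa. Only such sites can favour one tree over another. The sketch below applies that standard definition to the practice data; the function name is hypothetical.

```python
from collections import Counter

alignment = {
    "K": "AGTACCG",
    "L": "AAGACTA",
    "M": "AACCTTA",
    "N": "AAAGTTA",
}


def informative_positions(alignment):
    """Return 1-based positions where two or more states are each shared
    by at least two taxa."""
    seqs = list(alignment.values())
    result = []
    for i in range(len(seqs[0])):
        counts = Counter(seq[i] for seq in seqs)
        shared_states = [state for state, n in counts.items() if n >= 2]
        if len(shared_states) >= 2:
            result.append(i + 1)
    return result


print(informative_positions(alignment))
```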
Inferring the direction of evolution
Chimp
Human
Gorilla
Bonobo
Orangutan
Mouse
ACGCTAGCTACG
ACGCTAGCTACG
ACGCTAGCTAGG
ACGCTAGCTAGG
ACGCTAGCTAGG
ACGCTAGCTAGG
Where did the mutation occur, and what was the change?
III. Maximum likelihood (and Bayesian)
Maximum likelihood: a starting sketch
• Probabilities – transition: 0.2 transversion: 0.1 no change 0.7
[Figure: the four nucleotides A, G, C and T, with transitions and transversions labelled, and a candidate tree with a nucleotide at each node]
Find the tree with the highest probability
[Figure: three candidate trees with a nucleotide at each node. The probability of each tree is the product of the probabilities along its branches]
P = (.7)(.1)(.2)(.7)(.7)
P = (.7)(.1)(.7)(.7)(.7)
P = (.1)(.2)(.7)(.7)(.2)
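The three products can be evaluated and compared directly. This is a small arithmetic sketch; the labels "tree 1" to "tree 3" are hypothetical names for the three candidate trees, taken in the order their probabilities are listed.

```python
from math import prod

# Branch probabilities as listed on the slide
# (0.7 = no change, 0.2 = transition, 0.1 = transversion).
branch_probabilities = {
    "tree 1": [0.7, 0.1, 0.2, 0.7, 0.7],
    "tree 2": [0.7, 0.1, 0.7, 0.7, 0.7],
    "tree 3": [0.1, 0.2, 0.7, 0.7, 0.2],
}

# The probability of each tree is the product over its branches.
likelihoods = {name: prod(ps) for name, ps in branch_probabilities.items()}
best = max(likelihoods, key=likelihoods.get)

for name, value in likelihoods.items():
    print(f"{name}: P = {value:.5f}")
print("highest probability:", best)
```

The second tree, which needs one change instead of two or three, has the highest probability.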
Assessment of Maximum Likelihood (also Bayesian)
• Strengths
• Weaknesses
Characters to use in phylogeny
• Morphology
• DNA sequence
Challenges of using DNA data
Alignment can be very challenging!
Taxon 1 AATGCGC
Taxon 2 AATCGCT
Taxon 1 AATGCGC
Taxon 2
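To illustrate what an alignment step does, here is a minimal sketch of global alignment by dynamic programming (Needleman-Wunsch) applied to the two sequences above. The slides do not name a method or a scoring scheme, so the algorithm choice and the scores (match +1, mismatch -1, gap -1) are assumptions, and the function name is hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # pair a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_1, aligned_2, best_score = needleman_wunsch("AATGCGC", "AATCGCT")
print("Taxon 1", aligned_1)
print("Taxon 2", aligned_2)
```

With these assumed scores, placing one gap in each sequence lines up six of the seven sites, whereas the ungapped comparison matches only the first three.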
Informative sequences evolve at moderate rates
• Too slow?
– not enough variation
– Taxon 1 AATGCGC
– Taxon 2 AATGCGC
– Taxon 3 AATGCGC
Polytomy
Example of insufficient evidence: metazoan phylogeny
Fungi
Metazoans
Challenges: sunflower phylogeny
[Figure: two groups labelled "= 15 spp!" and "= 12 spp!"]
• Recent radiation (200,000 years)
• Many species, much hybridization
• Need more rapidly evolving markers!!
Informative sequences evolve at moderate rates
• Too fast?
– homoplasy likely
– “saturation” – only 4 possible states for DNA
– Taxon 1 ATTCTGA
– Taxon 2 GTAGTGG
– Taxon 3 CGTGCTG
Polytomy
Saturation
• Imagine changing one nucleotide every hour to a random nucleotide
• Split the ancestral population in 2.
Ancestral sequence: ACGTGCT
One hour: ACTTGCT, ACGAGCT
Four hours: ACCTGAA, GCGATCC
8 hours: ACCAGAA, AGCCTCC
12 hours: AGCGGAA, GAGCTCC
24 hours?
Red indicates multiple mutations at a site
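The thought experiment above can be simulated. This is a sketch under stated assumptions: each "hour" one randomly chosen site in each lineage is replaced by a nucleotide drawn uniformly from A, C, G and T (so it may be the same as before), the sequence is much longer than the 7 sites on the slide to reduce noise, and the function names, checkpoints and seed are hypothetical.

```python
import random


def mutate(seq, rng):
    """Replace one randomly chosen site with a random nucleotide."""
    pos = rng.randrange(len(seq))
    seq[pos] = rng.choice("ACGT")


def proportion_different(x, y):
    """Fraction of sites at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(x, y)) / len(x)


def simulate(length=1000, hours=20000,
             checkpoints=(100, 1000, 5000, 20000), seed=1):
    """Split one ancestral sequence into two lineages and mutate each
    once per hour; record how different they are at each checkpoint."""
    rng = random.Random(seed)
    ancestor = [rng.choice("ACGT") for _ in range(length)]
    lineage1, lineage2 = ancestor[:], ancestor[:]
    results = {}
    for hour in range(1, hours + 1):
        mutate(lineage1, rng)
        mutate(lineage2, rng)
        if hour in checkpoints:
            results[hour] = proportion_different(lineage1, lineage2)
    return results


results = simulate()
for hour, p in results.items():
    print(f"{hour:>6} hours: {p:.2f} of sites differ")
```

The observed difference rises quickly at first and then levels off near 0.75: with only four possible states, two unrelated random sequences still agree at about a quarter of their sites, so further mutations leave no additional signal.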
Saturation: mammalian mitochondrial DNA
Forces of evolution and phylogeny reconstruction
How does each force affect the ability to reconstruct phylogeny?
mutation?
drift?
selection?
non-random mating?
migration?
Phylogeny case study I: whales
Are whales ungulates (hoofed mammals)? Figure 14.4
Whales: DNA sequence data
Hillis, D. A. 1999.
How reliable is this tree? Bootstrapping.
How consistent are the data?
• Take the dataset (5 taxa, 10 characters)
• Create a new data set by sampling characters at random, with replacement
Taxon     1  2  3  4  5  6  7  8  9 10
Human     A  C  G  T  T  G  T  A  C  T
Chimp     A  G  G  T  T  C  T  A  T  T
Bonobo    A  G  G  T  T  C  T  A  T  G
Gorilla   A  C  T  T  G  C  T  G  T  C
Orang     T  C  G  T  G  T  A  C  C  C
Taxon     3  8  2  6 10 10  5  8  8  7  3
Human     G  A  C  G  T  T  T  A  A  T  G
Chimp     G  A  G  C  T  T  T  A  A  T  G
Bonobo    G  A  G  C  G  G  T  A  A  T  G
Gorilla   T  G  C  C  C  C  G  G  G  T  T
Orang     G  C  C  T  C  C  G  C  C  A  G
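The resampling step can be sketched as follows. The first call reuses the column numbers shown in the second table, so its output can be compared with that table row by row; the second draws a fresh random set of columns. Function names and the seed are hypothetical, and building a tree from each resampled data set is left out.

```python
import random

# Original data set from the first table (characters 1-10).
data = {
    "Human":   "ACGTTGTACT",
    "Chimp":   "AGGTTCTATT",
    "Bonobo":  "AGGTTCTATG",
    "Gorilla": "ACTTGCTGTC",
    "Orang":   "TCGTGTACCC",
}


def resample(data, columns):
    """Build a new data set from the given 1-based character numbers."""
    return {taxon: "".join(seq[c - 1] for c in columns)
            for taxon, seq in data.items()}


def random_columns(n_chars, rng, n_draws=None):
    """Draw character numbers at random, with replacement."""
    n_draws = n_chars if n_draws is None else n_draws
    return [rng.randint(1, n_chars) for _ in range(n_draws)]


# Column numbers as shown in the second table.
slide_columns = [3, 8, 2, 6, 10, 10, 5, 8, 8, 7, 3]
slide_replicate = resample(data, slide_columns)
for taxon, seq in slide_replicate.items():
    print(f"{taxon:<8}{seq}")

# A fresh pseudo-replicate.
rng = random.Random(0)
new_replicate = resample(data, random_columns(10, rng))
```

In a full analysis this is repeated many times, a tree is built from each replicate, and the share of replicates in which a group appears is reported as its bootstrap support.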
Whales: DNA sequence data
Hillis, D. A. 1999.
Molecular clocks
Basic idea of molecular clocks
[Figure: tree of chimps, humans, whales and hippos. Labels: 56 mya, 60 substitutions, 6 substitutions]
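The basic calculation can be sketched as below. Which label belongs to which pair is an assumption here, because the figure itself is not recoverable: the sketch takes the whale-hippo split as the calibration point (56 mya, 60 substitutions) and the 6 substitutions as the difference between chimps and humans. It also assumes a constant substitution rate, which is the core assumption of a molecular clock. Variable names are hypothetical.

```python
# Calibration pair (assumed: whales vs hippos).
calibration_age_my = 56.0
calibration_substitutions = 60

# Pair to be dated (assumed: chimps vs humans).
observed_substitutions = 6

# Substitutions accumulated between a pair per million years.
rate_per_my = calibration_substitutions / calibration_age_my

# Under a constant rate, divergence time scales with substitutions.
estimated_age_my = observed_substitutions / rate_per_my

print(f"rate: {rate_per_my:.3f} substitutions per million years")
print(f"estimated divergence: {estimated_age_my:.1f} mya")
```

Under these assumptions, a tenth of the substitutions implies a tenth of the time, or about 5.6 million years.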
Challenges for phylogeny: gene flow
Sunflower annuals
Different genes may have different histories!
Phylogeny summary
Phylogeny study questions
1) Explain in words the difference between monophyletic, paraphyletic, and polyphyletic taxa. Draw a hypothetical phylogeny representing each type. Give an actual example of a commonly recognized paraphyletic taxon in both animals and in plants.
2) How can a reconstructed phylogeny be used to determine if a similar character in two taxa is due to homoplasy?
3) Whales are classified as cetaceans, not artiodactyl ungulates. This makes artiodactyls paraphyletic – why? What is the evidence that whales belong in the artiodactyls?
4) Phenetics (distance methods) and cladistics (parsimony) differ in the ways they recognize and use similarities among taxa to form phylogenetic groupings. What types of similarity does each school recognize, and how useful is each type of similarity considered to be for identifying groups?
5) What is “bootstrapping” in the context of phylogenetic analysis, and why is this procedure performed?
6) Why are maximum likelihood methods increasing in popularity for reconstructing phylogenies? In your answer, include a short description of how this method identifies the best phylogeny.
7) For what kinds of data can maximum likelihood methods of phylogeny construction be used? Why is this so? What types of data are typically not used, and why?
8) Would animal mitochondrial DNA provide a reasonable molecular tool for evaluating deep phylogenetic relationships between animal phyla? What about ribosomal DNA? Justify your answers.
9) Integrative question: Draw a pair of axes with “Time since divergence” on the x axis and “percent of sites that are the same” on the y axis. Draw a graph that shows the basic pattern for third codon sites: is your graph linear? Explain why or why not.
10) You are studying a group of species that lives in two very different environments. You build two phylogenies: one is based on a locus that is probably under divergent selection in the two environments, while the other phylogeny is based on a neutral locus. Which phylogeny would be more likely to represent the species history? Why?
11) It has been known for a number of years that Anolis lizards found in similar micro-habitats on many separate islands in the Caribbean are very similar to each other (for example, large lizards that feed on the ground, smaller lizards that feed on tree trunks, and very small lizards that feed at the tops of branches). Two different historical explanations have been proposed to explain this pattern: each morph has evolved repeatedly on each island, or each morph has evolved just once, then dispersed. Sketch a phylogeny that would support each hypothesis.
12) Integrative question: the Cameroon lake cichlid phylogeny, showing that the lake species were monophyletic, was based on mitochondrial DNA. Explain why this might not reflect the species history. How could you be more certain about the phylogeny?
13) Explain why allopolyploid taxa pose problems for phylogenies.