phylogenetics continued
tests by Tuesday because some problems with scantrons
TRANSCRIPT
consensus tree
• can also ask equally-supported trees (equally parsimonious, equal likelihood) how well they all support the same nodes
• doesn’t have to involve subset of data like in bootstrap
• may summarize the stable parts of tree across 2+ trees
[Figure: three trees, each with tips a, b, c, d, e]
CONSENSUS phylogeny is not fully resolved where there is disagreement among equally-ranked trees
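A minimal sketch of the consensus idea, assuming rooted trees written as nested Python tuples and a strict consensus (keep only the groupings found in every tree). The two example trees and the function names `clades` and `strict_consensus` are hypothetical, made up for illustration; they are not the trees drawn on the slide.

```python
def clades(tree):
    """Collect every clade (as a frozenset of tip names) in a nested-tuple tree."""
    found = set()

    def tips(node):
        if isinstance(node, str):          # a tip such as "a"
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)                 # record the clade under this internal node
        return members

    tips(tree)
    return found


def strict_consensus(trees):
    """Clades present in every tree; groupings that conflict are left unresolved."""
    return set.intersection(*(clades(t) for t in trees))


# Two hypothetical, equally-ranked rooted trees on tips a-e.
tree1 = ((("a", "b"), "c"), ("d", "e"))
tree2 = (("a", "b"), ("c", ("d", "e")))

shared = strict_consensus([tree1, tree2])
for clade in sorted(shared, key=lambda c: (len(c), sorted(c))):
    print(sorted(clade))
```

Both trees agree on (a,b) and (d,e), so those clades survive. They disagree about where c attaches, so the consensus leaves c unresolved.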
blue indicates dinosaurs with bifurcation of neural spine in vertebrae
http://svpow.com/papers-by-sv-powsketeers/wedel-and-taylor-2013-on-sauropod-neural-spine-bifurcation/
support for the method
•do we believe phylogeny reconstruction works? need to test it against a known history
•(fish,(salamander,(bird,(mouse,human)))) we feel pretty strongly about
•experimental phylogenetics uses virus evolution to go one step further
experimental evolution
growing T7 phage on E. coli plates; speed up mutation process by adding mutagen
[Figure: branching phage lineages; branches labeled 40 generations]
experimental evolution
•so phylogeny is known, and ancestral strains can be kept in freezer
•sequence part of DNA and use parsimony, likelihood, and other approaches
•consistently got the right (TRUE) answer!
•can also track “traits” on this tree, e.g. changes in growth rate and plaque size on E. coli plates (and check against actual ancestors)
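One way to picture the parsimony check, as a rough sketch: Fitch's algorithm counts the minimum number of changes a tree needs at a single DNA site. With a known history, the true tree should need no more changes than a wrong one. The strain names, the site data, and both trees below are hypothetical, not taken from the T7 experiment, and the code assumes strictly bifurcating trees.

```python
def fitch(tree, states):
    """Return (candidate ancestral states, minimum number of changes) for one site.

    tree   -- a tip name (str) or a pair (left, right) of subtrees
    states -- dict mapping each tip name to its nucleotide at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:                              # children can agree: no change needed here
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1   # one change required


# Hypothetical site pattern in four strains.
site = {"s1": "A", "s2": "A", "s3": "G", "s4": "G"}

true_tree = (("s1", "s2"), ("s3", "s4"))    # assumed known history
wrong_tree = (("s1", "s3"), ("s2", "s4"))   # an alternative grouping

print("changes on true tree: ", fitch(true_tree, site)[1])
print("changes on wrong tree:", fitch(wrong_tree, site)[1])
```

Summing this count over all sequenced sites gives each tree's parsimony score, and the tree with the lowest score is preferred.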
# DNA mutations on this branch
Text: “Because constructing phylogenies, and science more broadly, is often a process of evaluating evidence, scientists often test the effectiveness of the methodologies used to draw conclusions.”
Rem: each branch is 40 generations
case studies
•text goes through Origin of Tetrapods, Human phylogeny, Darwin’s finches, HIV
•show phylogeny, explain the likely mechanisms for pattern
well-supported phylogeny of rabies virus lineages, coded by host bat species
Phylogeny: how?
Methods from Streicker et al (2010 bat rabies phylogeny paper)
what gene region(s)?
PCR of gene with primers
how sequence data generated
sampling effort
Phylogeny: how?
Methods from Streicker et al (2010 bat rabies phylogeny paper)
tree criterion: uses statistical model of DNA evolution
every type of mutation happens at different rate, as observed
mutations happen at different rates across codons in protein-coding genes
are our data consistently supporting same phylogeny?
outgroup comparison ‘roots’ phylogeny at ancestral node
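To give a feel for what a "statistical model of DNA evolution" does, here is a sketch using the Kimura two-parameter (K2P) distance. K2P is a much simpler model than the one these slides describe: it allows only two rates (transitions vs. transversions) rather than a separate rate for every mutation type, and it ignores codon position. The sequences are made up, and the code assumes aligned sequences containing only A, C, G, T.

```python
import math

PURINES = {"A", "G"}   # the other two bases, C and T, are pyrimidines


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(seq1, seq2):
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):   # A<->G or C<->T
            transitions += 1
        else:                                  # purine <-> pyrimidine
            transversions += 1
    p = transitions / len(seq1)
    q = transversions / len(seq1)
    # Corrects the raw difference for multiple hits at the same site.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical aligned sequences: one transition (A->G), one transversion (C->A).
seq_a = "ACGTACGTAC"
seq_b = "GCGTACGTAA"
print(round(k2p_distance(seq_a, seq_b), 4))
```

The corrected distance comes out larger than the raw 20% difference, because the model accounts for changes that were overwritten by later changes.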
Phylogeny: how?
Methods from Streicker et al (2010 bat rabies phylogeny paper)
coalescent: statistical model of how different evolutionary histories of drift, selection, migration, and change in population size are associated with DATA
treat bat species as locations and ask how frequently migration of virus among species could explain pattern we see now?
oh, no. now it is getting gnarly.
For RNA viruses, rapid viral evolution and the biological similarity of closely related host species have been proposed as key determinants of the occurrence and long-term outcome of cross-species transmission. Using a data set of hundreds of rabies viruses sampled from 23 North American bat species, we present a general framework to quantify per capita rates of cross-species transmission and reconstruct historical patterns of viral establishment in new host species using molecular sequence data. These estimates demonstrate diminishing frequencies of both cross-species transmission and host shifts with increasing phylogenetic distance between bat species.
Evolutionary constraints on viral host range indicate that host species barriers may trump the intrinsic mutability of RNA viruses in determining the fate of emerging host-virus interactions.
analysis indicates rate of virus jumping from one host to another
so this study requires TWO phylogenies (virus and bats)
CST: cross-species transmission
neutrality
• neutral: doesn’t affect fitness of organism
• compare mutations in protein coding regions: synonymous mutations do not change amino acid, nonsynonymous do
• if much of diversity is neutral (or nearly so), mutations will accumulate and fix (become a substitution) in populations regularly through time
“molecular clock” works for many genome partitions
neutrality acts as our NULL HYPOTHESIS
• different homologous genome regions have different rates, slower rates when more functional constraints
• remember: fossil record, biogeography/geology, mutation accumulation studies help us estimate substitution rate µ
isthmus closes via volcanic uplift ~3.5 mya
two locations - are they two populations?
different allele frequencies, distinct clades on tree: yes
compare cytochrome oxidase mtDNA gene: 7% divergence
• d=2µt
• µ is the rate of mutations going to fixation (substitutions); under neutrality the mutation rate IS the substitution rate, because selection doesn’t accelerate, halt, or otherwise change the probability of fixation
• here we know t=3,500,000 years, d=0.07
• µ = d/(2t) = (0.07)/(7,000,000) = 1 × 10^-8 substitutions per site per year
• another way to put it, rate of divergence (2µ) ~2% per million years
time(t), rate µ along 2 branches
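The calibration arithmetic above can be checked with a few lines of code. This is only a sketch of d = 2µt, assuming a strict clock (the same rate on both branches). The function names are made up, and the 3.5% divergence in the last step is a hypothetical value used only to show the calibrated rate run in reverse.

```python
def substitution_rate(divergence, years_since_split):
    """Solve d = 2 * mu * t for mu (substitutions per site per year)."""
    return divergence / (2 * years_since_split)


def time_since_split(divergence, rate):
    """Solve d = 2 * mu * t for t, given a calibrated rate mu."""
    return divergence / (2 * rate)


# Values from the slide: 7% divergence, isthmus closed ~3.5 million years ago.
mu = substitution_rate(0.07, 3_500_000)
print(f"mu = {mu:.1e} per site per year")

# Divergence accumulates along both branches, so the pairwise rate is 2 * mu.
print(f"divergence rate = {2 * mu * 1_000_000:.1%} per million years")

# Hypothetical use of the calibrated clock: a pair of lineages differing by 3.5%.
print(f"estimated split: {time_since_split(0.035, mu):,.0f} years ago")
```

The factor of 2 is there because each lineage has been accumulating substitutions independently since the split.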
•what is our assumption in those slides about clock calibration?
•how would YOU test that?
•idea is any mutation is equally likely to become a substitution
•how have we divided (point) mutations up so far?
neutrality
• neutral: doesn’t affect fitness of organism
• compare mutations in protein coding regions: synonymous mutations do not change amino acid, nonsynonymous do
• if much of diversity is neutral (or nearly so), mutations will accumulate and fix (become a substitution) in populations regularly through time
synonymous is assumed neutral
• so we can ask if nonsynonymous substitutions happen at a different rate
• under neutrality: nonsynonymous divergence rate (dN) = synonymous divergence rate (dS)
• rate, not number of mutations - remember many more ways for a mutation to be nonsynonymous than synonymous
• does dN/dS = 1? (in the book, and often elsewhere, this is called kA/kS; the ratio adjusts for the “more ways” of nonsynonymy)
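The "many more ways" point can be made concrete by counting, for each codon, how many of its nine possible single-nucleotide changes keep the same amino acid. This sketch assumes the standard genetic code. The function name `classify_changes` is made up, and counting changes to a stop codon in their own category is a choice made here for clarity.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_changes(codon):
    """Classify all 9 single-nucleotide changes of a codon."""
    counts = {"synonymous": 0, "nonsynonymous": 0, "stop": 0}
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == "*":
                counts["stop"] += 1              # change creates a stop codon
            elif CODON_TABLE[mutant] == CODON_TABLE[codon]:
                counts["synonymous"] += 1        # same amino acid
            else:
                counts["nonsynonymous"] += 1     # different amino acid
    return counts


# Tally over all 61 amino-acid-coding codons.
totals = {"synonymous": 0, "nonsynonymous": 0, "stop": 0}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        for kind, n in classify_changes(codon).items():
            totals[kind] += n

print(classify_changes("TTT"))
print(totals)
```

Because nonsynonymous changes have so many more opportunities, raw counts alone would mislead. dN and dS are therefore expressed per nonsynonymous site and per synonymous site.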
This is the dN:dS or kA:kS approach we have been discussing
if kA:kS >> 1, change has been selected FOR
if kA:kS << 1, change is generally BAD
if kA:kS ~ 1, neutrality
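A small sketch of how the ratio might be computed and read, assuming the numbers of nonsynonymous and synonymous differences and sites have already been counted. All counts below are hypothetical, the rates are simple proportions with no correction for multiple hits, and the `tolerance` cutoff is an arbitrary stand-in for the slide's ">>", "<<" and "~".

```python
def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Ratio of per-site nonsynonymous to per-site synonymous divergence."""
    dn = nonsyn_diffs / nonsyn_sites   # nonsynonymous differences per nonsynonymous site
    ds = syn_diffs / syn_sites         # synonymous differences per synonymous site
    return dn / ds


def interpret(ratio, tolerance=0.5):
    """Rough reading of the ratio; the tolerance is an arbitrary illustrative cutoff."""
    if ratio > 1 + tolerance:
        return "positive selection: change has been selected FOR"
    if ratio < 1 - tolerance:
        return "purifying selection: change is generally BAD"
    return "consistent with neutrality"


# Hypothetical gene: 10 nonsynonymous differences over 700 nonsynonymous sites,
# 30 synonymous differences over 300 synonymous sites.
ratio = dn_ds(10, 700, 30, 300)
print(f"dN/dS = {ratio:.3f} -> {interpret(ratio)}")
```

In practice a formal statistical test, not a fixed cutoff, decides whether a ratio differs from 1.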
positive selection: amino acid change is favored
functional constraints lead to high levels of homology: change is generally bad (purifying selection)
region of high homology led to discovery of new functional region that influences mammalian heart disease