From basic Concepts to Advanced applications
Molecular Evolution and Phylogeny
By Ofir Cohen
The Bioinformatics Unit, G.S. Wise Faculty of Life Science
Tel Aviv University, Israel, 2011
http://ibis.tau.ac.il/twiki/bin/view/Bioinformatics/Phylogeny
Darwin’s teachings – common descent and tree-like evolution
Introduction – The tree concept
Common Descent – Modern evidence
"The unity of life is no less remarkable than its diversity" (Theodosius Dobzhansky)
What is a Phylogenetic Tree?
Phylogenetic tree: a (hypothetical) historical pattern of evolutionary relationships among organisms
[Figure: example tree of Homo, Bos, Mus, Rattus and Gallus with the root, a node, a leaf and a branch labeled; branch lengths range from 0.01 to 0.066]
(Greek: phylon = race and genetic = birth)
Horizontal branch length – proportional to evolutionary distance (unit = substitutions / site)
Molecular evidence of HIV transmission in a criminal case
Introduction - Anecdotes
Metzker, Michael L. et al. (2002) Proc. Natl. Acad. Sci. USA 99, 14292-14297
Criminal investigation
In August 1994, a nurse tests negative for HIV and breaks off a messy 10-year affair with a doctor. Three weeks later, the doctor gives his ex-mistress a vitamin B-12 shot.
In January 1995, the nurse tests positive for both HIV and hepatitis C.
The doctor’s office records from that day are missing (but are eventually found). The doctor had drawn blood samples from a known HIV patient and a known hepatitis C patient on the same day as the vitamin B-12 shot. The nurse had never had contact with either patient.
This is circumstantial evidence that the doctor injected blood from one of his patients into his ex-girlfriend.
How can this be proved using a phylogenetic approach?
HIV – short background
Extreme heterogeneity: within each patient there are many different viral strains ("quasi-species")
History of the virus: gp120
[Figure: gp120 tree with the PATIENT, VICTIM and CONTROLS sequences labeled]
©2002 National Academy of Sciences, U.S.A.
History of the virus: RT
[Figure: RT tree with the VICTIM and PATIENT sequences labeled]
Source sequences that are paraphyletic (other sequences are nested within them) with respect to the recipient sequences provide evidence for the direction of transmission.
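The paraphyly argument can be sketched in a few lines of Python. This is only an illustration: the tree is a made-up rooted toy topology written as nested tuples (it does not reproduce the published sequences), and the helper names are hypothetical. The test used here is that source and recipient together form a clade and the recipient alone forms a clade, but the source alone does not.

```python
# Toy rooted tree as nested tuples. The labels are invented for illustration
# and do not reproduce the sequences from the published case.
toy_tree = ("control1", ("control2", ("P1", ("P2", ("P3", ("V1", "V2"))))))


def leaf_set(tree):
    """Return the set of leaf names below a (sub)tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clade_sets(tree):
    """Return the leaf set of every subtree, i.e. every clade in the rooted tree."""
    sets = {leaf_set(tree)}
    if not isinstance(tree, str):
        for child in tree:
            sets |= clade_sets(child)
    return sets


def is_monophyletic(tree, group):
    """A group is monophyletic if some subtree contains exactly its members."""
    return frozenset(group) in clade_sets(tree)


def source_is_paraphyletic(tree, source, recipient):
    """True if recipient sequences are nested within the source sequences."""
    source, recipient = frozenset(source), frozenset(recipient)
    return (
        is_monophyletic(tree, source | recipient)
        and is_monophyletic(tree, recipient)
        and not is_monophyletic(tree, source)
    )


patient = {"P1", "P2", "P3"}
victim = {"V1", "V2"}
patient_to_victim = source_is_paraphyletic(toy_tree, patient, victim)
victim_to_patient = source_is_paraphyletic(toy_tree, victim, patient)
print(patient_to_victim, victim_to_patient)
```

In this toy tree the victim sequences sit inside the patient sequences, so only the patient-to-victim direction passes the check. The result depends on the rooting, which is why control sequences are included as an outgroup.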
Phylogenetic analysis: Not only among organisms - Cancer phylogeny
A phylogeny of acute myeloid leukemia (AML) subtypes
Riester et al. 2010; Liu et al. 2009
Phylogenetic analysis: Not only in biology – Language evolution
Gray and Atkinson, 2003
Researchers study the evolution of languages by treating them like genomes.
Instead of COGs (gene families), they analyze COGNATES (word families)
Comparative Genomics – "All life is one"
Compare homologous sequences
Newick format with branch lengths
(A:0.3,((B1:0.1,B2:0.1):0.3,(C1:0.1,C2:0.1):0.5):0.3);
[Figure: the same tree drawn in FigTree, with leaves A, B1, B2, C1 and C2 and a 0.1 scale bar]
FigTree: http://tree.bio.ed.ac.uk/software/figtree/
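To see how the Newick string encodes the tree, here is a minimal Python sketch that parses it and sums branch lengths from the root to each leaf. The parser and its function names are hypothetical and handle only simple strings like the one on this slide (no quoted labels or comments); a real analysis would use an established tree library.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts (name, length, children)."""
    text = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # skip "("
            children.append(parse_node())
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(parse_node())
            pos += 1  # skip ")"
        # Optional node label (empty for unnamed internal nodes).
        start = pos
        while pos < len(text) and text[pos] not in ":,()":
            pos += 1
        name = text[start:pos]
        # Optional branch length after ":"; defaults to 0.0 when absent.
        length = 0.0
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    return parse_node()


def root_to_leaf(node, dist=0.0, out=None):
    """Collect the summed branch length from the root to every leaf."""
    out = {} if out is None else out
    dist += node["length"]
    if not node["children"]:
        out[node["name"]] = dist
    for child in node["children"]:
        root_to_leaf(child, dist, out)
    return out


tree = parse_newick("(A:0.3,((B1:0.1,B2:0.1):0.3,(C1:0.1,C2:0.1):0.5):0.3);")
distances = root_to_leaf(tree)
for leaf in sorted(distances):
    print(leaf, round(distances[leaf], 3))
```

Each pair of parentheses is one internal node, each comma separates sibling subtrees, and the number after a colon is the length of the branch leading to that node.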
Alignment and phylogeny are mutually dependent
[Figure: diagram connecting unaligned sequences, sequence alignment, the MSA, phylogeny reconstruction and inaccurate tree building]
Multiple sequence alignment (MSA)
Several advanced MSA programs are available. Today we will use two:
MAFFT – fastest and one of the most accurate
PRANK – distinct from all other MSA programs because of its correct treatment of insertions/deletions
MAFFT
Web server & download: http://align.bmr.kyushu-u.ac.jp/mafft/online/server/
Efficiency-tuned variants: quick & dirty or slow but accurate
Kazutaka Katoh, Kazuharu Misawa, Kei-ichi Kuma and Takashi Miyata (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30(14), 3059-3066. © 2002 Oxford University Press
Choosing a MAFFT strategy
[Figure: MAFFT strategies arranged from quick & dirty to slow but accurate]
L-INS-i
ooooooooooooooooooooooooooooooooXXXXXXXXXXX-XXXXXXXXXXXXXXX------------------
--------------------------------XX-XXXXXXXXXXXXXXX-XXXXXXXXooooooooooo-------
------------------ooooooooooooooXXXXX----XXXXXXXX---XXXXXXXooooooooooo-------
--------ooooooooooooooooooooooooXXXXX-XXXXXXXXXX----XXXXXXXoooooooooooooooooo
--------------------------------XXXXXXXXXXXXXXXX----XXXXXXX------------------
G-INS-i
XXXXXXXXXXX-XXXXXXXXXXXXXXX
XX-XXXXXXXXXXXXXXX-XXXXXXXX
XXXXX----XXXXXXXX---XXXXXXX
XXXXX-XXXXXXXXXX----XXXXXXX
XXXXXXXXXXXXXXXX----XXXXXXX
E-INS-i
oooooooooXXX------XXXX---------------------------------XXXXXXXXXXX-XXXXXXXXXXXXXXXooooooooooooo
---------XXXXXXXXXXXXXooo------------------------------XXXXXXXXXXXXXXXXXX-XXXXXXXX-------------
-----ooooXXXXXX---XXXXooooooooooo----------------------XXXXX----XXXXXXXXXXXXXXXXXXooooooooooooo
---------XXXXX----XXXXoooooooooooooooooooooooooooooooooXXXXX-XXXXXXXXXXXX--XXXXXXX-------------
---------XXXXX----XXXX---------------------------------XXXXX---XXXXXXXXXX--XXXXXXXooooo--------
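For a local MAFFT installation, the three strategies above correspond to command-line options described in the MAFFT manual. The Python sketch below only builds and prints the command. The wrapper functions are hypothetical, and it assumes an executable named `mafft` on the PATH that writes the alignment to standard output.

```python
import shutil
import subprocess

# Option sets for each strategy, as described in the MAFFT manual.
MAFFT_STRATEGIES = {
    "auto": ["--auto"],
    "L-INS-i": ["--localpair", "--maxiterate", "1000"],
    "G-INS-i": ["--globalpair", "--maxiterate", "1000"],
    "E-INS-i": ["--genafpair", "--maxiterate", "1000"],
}


def mafft_command(strategy, fasta_path):
    """Build the argument list for one strategy (hypothetical helper)."""
    return ["mafft", *MAFFT_STRATEGIES[strategy], fasta_path]


def run_mafft(strategy, fasta_path, out_path):
    """Run MAFFT and save its standard output; needs a local install."""
    if shutil.which("mafft") is None:
        raise RuntimeError("mafft executable not found on PATH")
    with open(out_path, "w") as out:
        subprocess.run(mafft_command(strategy, fasta_path), stdout=out, check=True)


print(" ".join(mafft_command("L-INS-i", "fahA.fas")))
```

Choose L-INS-i when the sequences share one alignable domain with unalignable flanks, G-INS-i when they are alignable over their full length, and E-INS-i when several conserved blocks are separated by long unalignable regions, as the diagrams above show.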
MAFFT output
Saving the output:
Choose a format: Clustal, Fasta, or click "Reformat" to convert to a selection of other formats
Save the page as a text file
A colored view of the alignment
PRANK
Classical alignment errors for HIV env
PRANK Web server: http://www.ebi.ac.uk/goldman-srv/webPRANK/
PRANK output
If you need a different format – copy the results to the READSEQ sequence converter: http://www-bimas.cit.nih.gov/molbio/readseq/
Downloadable PRANK: http://www.ebi.ac.uk/goldman-srv/prank/prank/
PRANK: a command-line program
PRANKSTER: a program with a graphical user interface
1. Download and unzip the sequence files from my homepage (Google "Ofir Cohen" and look for the workshop materials under "Teaching"). Open "fahA.fas" in Notepad – these are 65 protein sequences in FASTA format.
2. Run PRANKSTER, open the "fahA.fas" file, and run "Alignment" > "Make alignment".
3. While you wait: copy the sequences into the MAFFT web server and run the "automatic" "moderate" strategy. Which strategy did MAFFT choose for you? Click "Reformat", choose "phylip|phylip4", and save as "fahA.mafft.phylip".
4. When PRANKSTER finishes, click File > Save and save the MSA in Phylip format under the name "fahA.prank.phylip".
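If you prefer to check the FASTA file with a script rather than in Notepad, a minimal Python sketch follows. The reader function and the toy records are hypothetical. It assumes plain FASTA with ">" header lines and takes the first word of each header as the sequence name.

```python
def read_fasta(lines):
    """Read FASTA lines into a dict of name -> sequence.

    Sequence lines that follow a header are joined; a repeated name
    overwrites the earlier record.
    """
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


# Toy input for illustration only (not the workshop data).
toy_fasta = """>seq1 hypothetical protein
MKV-LA
GGT
>seq2
MKVQLA
GGS
"""
records = read_fasta(toy_fasta.splitlines())
print(len(records), "sequences")
```

Passing an open file handle for "fahA.fas" instead of the toy lines should report the 65 sequences mentioned in step 1, assuming the file is plain FASTA.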
Phylogeny reconstruction
Different approaches (algorithms / programs):
Distance based methods (e.g. neighbor-joining, as in ClustalW): fast but inaccurate
Maximum parsimony (e.g. MEGA)
Maximum likelihood methods (e.g. PhyML, RAxML): accurate but slower
Bayesian methods (e.g. MrBayes): most accurate but very slow
[Figure: sequences A-E, a pairwise distance table, a guide tree and the MSA]
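Distance-based methods start from a pairwise distance table. As a rough illustration, the Python sketch below computes the simplest such distance, the p-distance (fraction of differing sites), from a toy alignment. The sequences and function names are invented, and real programs usually apply a substitution-model correction that is not shown here.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of differing sites, skipping columns where either row has a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def distance_table(msa):
    """Return {(name1, name2): p-distance} for every pair of aligned sequences."""
    return {
        (x, y): p_distance(msa[x], msa[y])
        for x, y in combinations(sorted(msa), 2)
    }


# Toy alignment for illustration only.
toy_msa = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACGAACGA",
    "D": "TCGAAC-A",
}
table = distance_table(toy_msa)
for pair, dist in table.items():
    print(pair, round(dist, 3))
```

A method such as neighbor-joining then builds the tree from this table alone, which is why it is fast and why it discards information that likelihood-based methods use.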
PhyML
One of the most widely used maximum likelihood (ML) programs
Web server & download: http://www.atgc-montpellier.fr/phyml/
Accepts input MSA in PHYLIP format only:
• Interleaved
• Sequential
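The two PHYLIP layouts differ only in how the rows are arranged. The Python sketch below writes a toy alignment both ways. The function names and the toy data are hypothetical. Names are padded to 10 characters and followed by a space, on the assumption that this suits both strict and relaxed PHYLIP readers when names contain no spaces.

```python
def _check(msa):
    """Return (names, alignment length); all rows must have equal length."""
    names = list(msa)
    lengths = {len(seq) for seq in msa.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return names, lengths.pop()


def to_phylip_sequential(msa):
    """Sequential: each sequence is written in full on its own line."""
    names, length = _check(msa)
    lines = [f"{len(names)} {length}"]
    for name in names:
        lines.append(f"{name[:10]:<10} {msa[name]}")
    return "\n".join(lines) + "\n"


def to_phylip_interleaved(msa, block=60):
    """Interleaved: the alignment is cut into blocks of columns; names
    appear only in the first block, and blocks are separated by a blank line."""
    names, length = _check(msa)
    lines = [f"{len(names)} {length}"]
    for start in range(0, length, block):
        for name in names:
            chunk = msa[name][start:start + block]
            lines.append(f"{name[:10]:<10} {chunk}" if start == 0 else chunk)
        lines.append("")  # blank line between blocks
    return "\n".join(lines).rstrip("\n") + "\n"


# Toy alignment for illustration only.
toy_msa = {"seqA": "ACGTACGTAC", "seqB": "ACGTTCGTAC"}
print(to_phylip_sequential(toy_msa))
print(to_phylip_interleaved(toy_msa, block=4))
```

Both layouts start with a header line giving the number of sequences and the alignment length, so the reader knows how many rows make up each block.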
1. Give "fahA.prank.phylip" or "fahA.mafft.phylip" as input to the PhyML web server (don't forget to choose "Amino-acids" and enter your email)
2. Run it with the local installation of "phyml.bat"
• You should end up with a file: "fahA.prank.phylip_phyml_tree.txt"
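For reference, a local run can also be started from a script. The Python sketch below builds a PhyML command line using the `-i` (input file) and `-d` (data type) options from the PhyML manual. The executable name `phyml` and the helper functions are assumptions, and the output name simply follows the pattern given in the exercise above.

```python
def phyml_command(phylip_path, datatype="aa"):
    """Build a PhyML argument list: -i input alignment, -d data type (aa or nt)."""
    return ["phyml", "-i", phylip_path, "-d", datatype]


def expected_tree_file(phylip_path):
    """Tree file name following the pattern shown in the exercise."""
    return phylip_path + "_phyml_tree.txt"


command = phyml_command("fahA.prank.phylip")
print(" ".join(command))
print(expected_tree_file("fahA.prank.phylip"))
```

On a machine with PhyML installed, the list can be passed to `subprocess.run(command, check=True)`. The exact output file name may differ between PhyML versions.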
RAxML
Web server: http://phylobench.vital-it.ch/raxml-bb/
Similar maximum likelihood (ML) methodology to PhyML, but much faster:
Faster results
Better results in the same run-time
1. Give "fahA.prank.phylip" or "fahA.mafft.phylip" as input to the RAxML webserver (don't forget to tick "Protein sequences" and enter your email)
• Save the resulting tree file as: "fahA.prank.phylip.raxml"
FigTree: tree visualization and figure creation
Manipulate a node
Manipulate a clade
Manipulate a taxon
1. Open "fahA.prank.phylip_phyml_tree.txt" in FigTree
2. Play around with the different options and make a pretty figure!
1. Find out how to color specific clades, as below
2. Try each of the three options under "Layout"
3. Export a figure in PDF format (File > Export Graphic…)
Final Questions…
Thanks for your attention