TUTORIAL: SOME START-UP
TOWARDS USING SOME “SIMPLE” PHYLOGENETIC
RECONSTRUCTION PACKAGES
Winthrop, June 28 – July 2, 2014
Terrell L. Hodge
Western Michigan University
RECALL: EXAMPLE OF PHYLOGENETIC TREE RECONSTRUCTION
Output from “simplest” method: parsimony, which aims to minimize the number of differences in the character states per site:
The output gave three trees, all with the minimum number, 97, of differing character states.
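To make the parsimony criterion concrete, here is a minimal sketch of how the number of character-state changes can be counted for a given tree, using Fitch's small-parsimony procedure. The four taxa, their short sequences, and the two candidate trees are hypothetical illustrations, not the data behind the trees above, and this is not necessarily how the “Phylogenetic Tree Constructor” works internally.

```python
# Fitch small-parsimony count on a rooted binary tree.
# A tree is either a leaf name (str) or a pair (left_subtree, right_subtree).

def fitch_site(tree, states):
    """Return (candidate state set, number of changes) for one alignment site."""
    if isinstance(tree, str):              # leaf: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    lset, lcost = fitch_site(left, states)
    rset, rcost = fitch_site(right, states)
    common = lset & rset
    if common:                             # children agree: no extra change
        return common, lcost + rcost
    return lset | rset, lcost + rcost + 1  # children disagree: one more change

def parsimony_score(tree, alignment):
    """Sum the per-site change counts over all sites of an alignment."""
    length = len(next(iter(alignment.values())))
    total = 0
    for i in range(length):
        site = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_site(tree, site)[1]
    return total

# Hypothetical aligned sequences for four made-up taxa.
alignment = {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCCA"}
tree1 = (("A", "B"), ("C", "D"))
tree2 = (("A", "C"), ("B", "D"))

score1 = parsimony_score(tree1, alignment)
score2 = parsimony_score(tree2, alignment)
print(score1, score2)  # the tree with the smaller count is the more parsimonious
```

A parsimony search compares such counts across candidate trees and reports every tree that reaches the minimum, which is why several equally good trees can be returned.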
AIM 1: CAN WE RECONSTRUCT THIS TREE OURSELVES, EVEN ELIMINATE THE AMBIGUITY, AS A WAY TO BUILD OUR SKILLS?
The trees were constructed by using “Phylogenetic Tree Constructor”, a BioQuest project outcome available at BEN (BiosciEdNet).
Try to do this now yourself. A screen shot appears on the next page. Go to:
http://www.securebio.umb.edu/cgi-bin/TreeConstructor.pl
and follow the easy instructions. Parsimony is the only tree reconstruction method used there.
AIM 2: CAN WE OBTAIN OUR OWN SEQUENCE DATA FROM GENBANK?
One entry point: http://www.ncbi.nlm.nih.gov/
From the “Popular Resources” box, you can choose:
“Nucleotide” (under “All Databases”) to find DNA sequences
“BLAST” to search for sequences similar to one you already have in hand
The next page has a screen shot.
NEXT: Go to “Nucleotide” and search for DNA sequences for “greenbottle fly”. Once there, add “and coI” to the search terms to pick out the COI (cytochrome oxidase subunit 1) gene, which is commonly used in animal studies.
Screen shots on the next pages.
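The same Nucleotide search can also be scripted. The sketch below only builds a search URL for NCBI's E-utilities “esearch” service and does not contact the server. The function name, the result limit of 20, and the uppercase “AND” in the query are choices made for this illustration, not part of the original exercise.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities search service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def nucleotide_search_url(term, retmax=20):
    """Build (but do not send) an esearch URL for the Nucleotide database."""
    params = {
        "db": "nuccore",      # the Nucleotide database
        "term": term,         # same text you would type in the search box
        "retmax": retmax,     # illustrative limit on returned record IDs
        "retmode": "json",
    }
    return ESEARCH + "?" + urlencode(params)

search_url = nucleotide_search_url("greenbottle fly AND coI")
print(search_url)
```

Opening the printed URL in a browser (or fetching it with a script) returns a list of record identifiers rather than the formatted results page shown in the screen shots.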
NEXT: Go to “GenBank” to view the accession entry for the selected sequence. Here it happens to be the first one listed in the preceding Nucleotide search output, and the link from that list was used.
Screen shots on the next pages.
NEXT: Go from the accession entry for the selected sequence to view or retrieve the sequence in FASTA format.
Screen shots on the next pages.
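Retrieving a record in FASTA format can likewise be expressed as a URL for NCBI's E-utilities “efetch” service. This is a sketch that only constructs the address. The function name is made up for illustration, and the accession FR719169.1 is the one used in this tutorial's example record.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities record-retrieval service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def fasta_url(accession):
    """Build (but do not send) an efetch URL asking for one record as FASTA text."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",   # ask for FASTA rather than the full GenBank entry
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)

fetch_url = fasta_url("FR719169.1")
print(fetch_url)
```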
NEXT: One can copy, paste, and save this FASTA formatted sequence, e.g., in a text file. For later processing, be sure to keep the “>” description line and the actual DNA sequence data together, with the description line first. Only this FASTA format will be wanted for downstream applications, so rearrange the data as needed, keeping any other identifying features elsewhere in your file, e.g.,
>gi|317444450|emb|FR719169.1| Lucilia cuprina partial COI gene for cytochrome oxidase subunit 1, isolate LU9
TAAATTTTACTTCAGCTACTATAATTATTGCTGTACCAACTGGAATTAAAATTTTCAGTTGATTAGCAACTCTTTATGGAACTCAATTAAACTACTCTCCTGCTACTTTATGAGCTTTAGGATTGTATTTTTATTTACTGTAGGAGGTTTAACTGGAGTTGTTTTAGCTAACTCTTCAATTGATATTATTCTACATGATACTTATTACGTAGTAGCTCACTTCCATTATGTTTTATCAATAGGAGCTGTATTTGCTATTATAGCAGGATTTGTTCATTGATACCCTTTATTTACAGGATTAACTTTAAATACTAAGATATTAAAAAGTCAATTTGCTATTATATTCATTGGAGTAAATTTAACATTTTTCCCCCAACATTTTTTAGGATTAGCAGGAATACCACGACGATATTCAGACTACCCAGATGCTTACACAACTTGAAATGTAATTTCTACAATTGGGTCAACAATTTCTTTATTAGGAATTTTATTCTTCTTCTTTATTATTTGAGAAAGTCTTGTATCTCAACGTCAAGTTTTATTCCCTATTCAATTAAATTCATCAATTGAATGATTACAAAATACTCCACCAGCTGAACATAGTTATTCTGAATTACCTTTATTAACTAA
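If you collect several sequences in one text file, a few lines of code can read them back in. The sketch below assumes the standard FASTA layout, in which each record is a description line starting with “>” followed by one or more lines of sequence. The two short records and their names are made up for illustration.

```python
def parse_fasta(text):
    """Return {description: sequence} for FASTA-formatted text."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                       # skip blank lines
        if line.startswith(">"):           # a new record begins
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:           # sequence line of the current record
            chunks.append(line.upper())
    if header is not None:                 # store the final record
        records[header] = "".join(chunks)
    return records

# Hypothetical two-record file content.
example = """>seq1 hypothetical fly COI fragment
TAAATTTTAC
TTCAGCTACT
>seq2 another hypothetical fragment
taaattttac
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

Sequence lines are joined and upper-cased so that records wrapped across several lines, or saved in lowercase, compare cleanly later.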
NEXT: To find other sequences which are similar to your selected one, you have two options:
1. Paste the FASTA formatted sequence data just acquired directly into BLAST.
2. Use the “BLAST” link from the accession page of the sequence just identified.
A screen shot of the outcome of choosing option 2 appears next.
NEXT: To continue the search for sequences similar to your selected one, either paste the FASTA formatted sequence data just acquired into the query box in BLAST, or proceed with the accession number, which is conveniently filled in for you if you arrived through a link from a “Nucleotide” search.
For this exercise (a nucleotide-nucleotide search), select the “nr/nt” database; under “Program Selection”, select “blastn” (this is optimized for “somewhat similar sequences”); and check the box “show results in a new window” (this is just handy). Other options are described in the accompanying handout, “BIOS 5460 Assignment 1”.
A screen shot of the final outcome of then hitting the “BLAST” button appears next.
MORE INFO
More information on reading and evaluating the output of the previous call is in the handout “BIOS 5460 Assignment 1”. (Thanks go to Dr. Todd Barkman of the Department of Biological Sciences at Western Michigan University.)
RECALL: EXAMPLE OF PHYLOGENETIC TREE RECONSTRUCTION
Output from “simplest” method: parsimony, which aims to minimize the number of differences in the character states per site:
The output gave three trees, all with the minimum number, 97, of differing character states.
CAN WE FIND DATA FOR THE OTHER SPECIES AND TRY TO RECONSTRUCT THE TREE OURSELVES?
Search Nucleotide for “flesh fly and coI” and “fruit fly and coI”.
What are the outcomes? In particular, see the box “Top Organisms” on the right-hand side of the respective Nucleotide pages.
AIM 3: ASSUMING WE COULD OBTAIN SEQUENCE DATA, COULD WE USE A MORE SOPHISTICATED PROGRAM TO RECONSTRUCT PHYLOGENETIC TREES, USING DIFFERENT RECONSTRUCTION METHODS?
ONE SINGLE-STOP SHOPPING SITE FOR PHYLOGENETIC TREE RECONSTRUCTION
http://www.phylogeny.fr/
Select the “one-click” option under “Phylogeny analysis”.
Upload FASTA formatted sequences in the box, or…
Just to see how it runs, click on “load example of sequences”, then the “Submit” button.
Phylogeny.fr: Robust Phylogenetic Analysis For The Non-Specialist
Click there to get preloaded sample sequences.
The “Submit” button is lower down on this screen.
EXPLORE FURTHER
See the intermediate processes executed step-by-step by restarting with the “advanced mode” instead of “one-click” under the “phylogenetic analysis” tab on the page-top toolbar.
Restart using “a la carte” on the “phylogenetic analysis” tab instead, and you can then choose to vary the tree reconstruction methods (some options and outcomes are shown on the next page).
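To get a feel for how reconstruction methods can differ from parsimony, note that distance-based methods, one common alternative family, start from a table of pairwise distances between aligned sequences instead of counting changes on a tree. The sketch below computes the simplest such distance, the proportion of differing sites (often called the p-distance). The three taxa and sequences are hypothetical, and this is an illustration of the idea rather than what any particular program on the site computes.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)

def distance_matrix(alignment):
    """Return {(name1, name2): p-distance} for every pair of sequences."""
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for m, n in combinations(sorted(alignment), 2)
    }

# Hypothetical aligned sequences for three made-up taxa.
alignment = {"A": "ACGT", "B": "ACGA", "C": "TCGA"}
dist = distance_matrix(alignment)
for pair, d in dist.items():
    print(pair, d)
```

A distance-based tree builder would then group the closest pairs first, so different distance measures or grouping rules can lead to different trees from the same data.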
AIM 4: DISCUSSION AND FURTHER RESOURCES – WHAT (ELSE) DO REAL BIOLOGISTS/MATHBIOLOGISTS USE?
Mega (e.g., Mega 5) (free)
GeneSpring (not free)
Magic (free, good for undergrads, out of Davidson)
GenePattern (free)
Phylip, PAUP, Clustal, more…
HOPE YOU HAD FUN!
There’s so much more to learn…