MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics


Upload: moses-morris

Post on 23-Dec-2015


Page 1: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

MBoMS Genomics of Model Microbes

Lab 6: Molecular phylogenetics

Page 2: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Back to TaxPlot

• Please hand in your final summary of your TaxPlot data

• Peg will look over it, provide a final summary of what the class data look like, and send it to you by email

Page 3: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Back to Alignments

• Your final alignments are due in class today
– Please make sure that Peg or Michelle has checked them and given you an okay
– Hand printouts to Peg or Michelle for their records

Page 4: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

On to Tree Building

• Today you will get your last tool - you will learn how to create phylogenetic trees

• We will start with an introduction to tree-building and then end with some lab exercises to give you some practice building trees

• Your homework for next Tuesday is to create and print out a tree for each of your proteins - you will make 6 trees in total and each tree will have 6 taxa (3 from one species and 3 from the second species)

Page 5: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Creating protein-based phylogenies

• The last step in our analysis is to create molecular phylogenies from each of our alignments
– We will use these data, combined with the similarity matrices for each protein, to determine whether all of our proteins have evolved in a similar, vertical, fashion

• To begin, you need to learn a bit about phylogenetics

Page 6: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Phylogenetic inference

• Phylogenetic inference is premised on
– The inheritance of ancestral characters
– The existence of an evolutionary history defined by changes in these characters

• Results in a tree-like model of evolution
– Except when there is paralogy or lateral transfer

Olsen, 2006

Page 7: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Character evolution

• Heritable changes in features (morphology, gene sequences, etc.) provide the basis for inferring phylogenies
– Such changes are usually referred to as the states of characters (presence or absence, nucleotide at a specific site, etc.)
– Their utility depends on how often the changes that produce different character states occur independently (homoplasy)

Page 8: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Unique & Un-reversed Characters

• Given a heritable evolutionary change that is unique and un-reversed (e.g. the origin of hair), the presence of the novelty in any taxa must be due to inheritance from the ancestor
– Similarly, absence in any taxa must be because the taxa are not descendants of that ancestor

• The novelty will be a homology acting as a marker for the descendants of the ancestor
– The taxa with the novelty will be a clade (e.g. Mammalia)

Page 9: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Hair

Because hair evolved only once and is un-reversed, it is homologous and provides unambiguous evidence for the clade Mammalia

[Figure: tree of Lizard, Frog, Human and Dog, marking the change in state of the character Hair (absent, present)]

Page 10: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Homoplasy

• Homoplasy is similarity that is not homologous, i.e. not due to inheritance from a common ancestor

• It is the result of independent evolution (convergence, parallelism, reversal)

• Homoplasy can provide misleading evidence of phylogenetic relationships

Page 11: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Homoplasy

A Fundamental Problem with Phylogenetic Inference

• If there were no homoplastic similarities, inferring phylogenies would be easy - all the pieces of the jig-saw would fit together neatly

Page 12: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Incongruence or Incompatibility

[Figure: two alternative trees of Lizard, Frog, Human and Dog, one supported by the character Hair and the other by the character Tail]

These trees and characters are incongruent

Both trees cannot be correct and at least one character must be homoplastic
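The incongruence described above can be checked with a pairwise compatibility test for binary characters: two such characters can both fit one tree without homoplasy only if the four state combinations (0,0), (0,1), (1,0) and (1,1) do not all occur among the taxa. The sketch below is illustrative only. The hair states follow the slides; the tail states, and the function name `compatible`, are assumptions made for the example.

```python
def compatible(char_a, char_b):
    """Return True if two binary characters can fit one tree without homoplasy.

    Two binary characters conflict exactly when all four state
    combinations (0,0), (0,1), (1,0), (1,1) are observed among the taxa.
    """
    taxa = char_a.keys() & char_b.keys()
    observed = {(char_a[t], char_b[t]) for t in taxa}
    return len(observed) < 4

# 1 = present, 0 = absent.
# Hair states follow the slides; tail states are ASSUMED for illustration.
hair = {"Lizard": 0, "Frog": 0, "Human": 1, "Dog": 1}
tail = {"Lizard": 1, "Frog": 0, "Human": 0, "Dog": 1}

print(compatible(hair, tail))  # False: at least one character is homoplastic
print(compatible(hair, hair))  # True: a character never conflicts with itself
```

The test says only that at least one of the two characters must be homoplastic. It does not say which one; that is where congruence with further characters comes in.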

Page 13: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Distinguishing Homology and Homoplasy

• Morphologists use a variety of techniques to distinguish homoplasy and homology
– Homologous characters are expected to display detailed similarity, in position, structure and development
– Homoplastic similarities are more likely to be superficial
– As recognized by Darwin, congruence with other characters provides the most compelling evidence for homology

Page 14: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

The Importance of Congruence

“The importance for classification of trifling characters, mainly depends on their being correlated with several other characters of more or less importance. The value indeed of an aggregate of characters is very evident… a classification founded on any single character, however important that may be, has always failed.”

Charles Darwin, Origin of Species

Page 15: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Homoplasy & Molecular Data

• Incongruence and therefore homoplasy can be common in molecular data
– One reason is that characters have a limited number of alternative states (e.g. A, G, C, T)
– In addition, these states are chemically identical, so that homology and homoplasy are equally similar and cannot be distinguished through detailed study of structure or development

Page 16: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Homology & Homoplasy

• Each nucleotide position can be considered homologous
– Although in some taxa the position is not present because of an insertion or deletion

• Example: The phylogeny of the four taxa is known
– The two trees illustrate character state homology (for character 2) and homoplasy (for character 4)
– Note that the sequence alignment includes characters in which a nucleotide is missing because of an insertion or deletion mutation

Page 17: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Homology & Homoplasy

• This example illustrates that definitions of homology will be contingent on the choice of taxa being compared
– For instance, if species C was not included, character 6 would not exist

• Examples of homologous characters
– In addition to nucleotides and amino acids, many other character states can be considered homologous
– For instance, the presence or absence of an intron, a transposable element, an insertion or deletion

Page 18: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Phylogenetic trees

• A phylogenetic tree is a statement about the evolutionary relationship between a set of homologous characters of one or several organisms

• It is composed of lines called branches that intersect and terminate at nodes
– The nodes at the tips of the branches represent taxa, or in the case of sequence data, the sequences, that exist today
– The internal nodes represent ancestral taxa, whose properties we can only infer from existing taxa

Page 19: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Unrooted trees

For 4 taxa there are only three possible unrooted trees

[Figure: the three unrooted trees for taxa A, B, C and D]
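The three unrooted trees can be listed by the way each one splits the four taxa into two pairs across its internal branch. This is a small sketch; the function name `quartet_topologies` is made up for the example.

```python
def quartet_topologies(taxa):
    """List the unrooted topologies of four taxa as splits (pair | pair)."""
    a, b, c, d = taxa
    # Fix the first taxon and choose its partner; the remaining two taxa
    # form the pair on the other side of the internal branch.
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]

for left, right in quartet_topologies("ABCD"):
    print("".join(left), "|", "".join(right))
# AB | CD
# AC | BD
# AD | BC
```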

Page 20: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Rooted tree

• A tree is rooted if there is a particular node, the root, from which a unique directional path leads to each extant taxon

• In this tree, the root is the only internal node from which all other nodes can be reached by moving forward, toward the tips

• The root is the common ancestor of all the taxa in the tree

[Figure: rooted tree with tips A to E, root R and internal nodes X, Y and Z]

Page 21: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Rooted trees

• Once a root is identified, 5 different rooted trees can be created for each of the three unrooted trees (one for each of the five branches on which the root can be placed), each with a distinctive branching pattern reflecting a different evolutionary history for the relationships shown in the unrooted trees… here are a few

[Figure: examples of rooted trees for taxa A, B, C and D]
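The counts on these slides follow from the standard formulas for bifurcating trees with n labelled taxa: (2n-5)!! unrooted trees and (2n-3)!! rooted trees, where !! is the double factorial. The sketch below uses function names invented for the example and also shows how quickly the numbers grow for the 6-taxon trees in the homework.

```python
def double_factorial(k):
    """Product k * (k-2) * (k-4) * ... down to 1 (returns 1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def n_unrooted(n):
    """Number of unrooted bifurcating trees for n >= 3 labelled taxa."""
    return double_factorial(2 * n - 5)

def n_rooted(n):
    """Number of rooted bifurcating trees for n >= 2 labelled taxa."""
    return double_factorial(2 * n - 3)

print(n_unrooted(4), n_rooted(4))  # 3 15  (3 unrooted trees x 5 root positions)
print(n_unrooted(6), n_rooted(6))  # 105 945
```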

Page 22: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Phylogenetic tree

• This is a rooted tree whose branch tips represent 5 taxa (A-E)

• The numbers on the branches indicate changes in a sequence that occurred along that branch

• e.g. between X and Y 3 changes occurred and between Y and D 1 change occurred

• This tree is additive because the distance between any two nodes equals the sum of the lengths of all the branches between them

• A node is bifurcating if it has only two immediate descendant lineages

[Figure: rooted tree with tips A to E, root R and internal nodes X, Y and Z; each branch is labelled with its number of changes]
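Additivity can be illustrated by summing branch lengths along the path between two nodes. The tree in the sketch below is hypothetical: only the X to Y (3 changes) and Y to D (1 change) branches come from the slide. The remaining branches, their lengths and the function name `path_length` are invented for illustration, because the figure is not available in the transcript.

```python
def path_length(tree, start, end):
    """Sum the branch lengths along the unique path between two nodes."""
    stack = [(start, None, 0)]
    while stack:
        node, parent, dist = stack.pop()
        if node == end:
            return dist
        for neighbour, length in tree[node].items():
            if neighbour != parent:  # never walk back along the same branch
                stack.append((neighbour, node, dist + length))
    raise ValueError("nodes are not connected")

# Only X-Y = 3 and Y-D = 1 are taken from the slide; the rest is ASSUMED.
branches = [("X", "A", 1), ("X", "B", 2), ("X", "Y", 3),
            ("Y", "D", 1), ("Y", "E", 2)]

tree = {}
for u, v, length in branches:
    tree.setdefault(u, {})[v] = length
    tree.setdefault(v, {})[u] = length

print(path_length(tree, "X", "D"))  # 4 = 3 (X to Y) + 1 (Y to D)
print(path_length(tree, "A", "D"))  # 5 = 1 + 3 + 1
```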

Page 23: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics
Page 24: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Different ways to view phylogenetic trees

Page 25: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Assessing Phylogenetic Hypotheses

• We use numerical phylogenetic methods because most data include potentially misleading evidence of relationships

• Thus, we need to assess the confidence we can place in our hypotheses

• This is not always simple

Page 26: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Reliability tests

Reliability refers to the probability that members of a clade will be part of the true tree

Page 27: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Bootstrapping

• Phylogenetic bootstrapping allows us to generate a series of pseudo-samples which we can use to estimate sampling variance
– Random resampling (with replacement) of characters from the original data to generate pseudoreplicate data matrices identical in size to the original matrix
– These replicates are subjected to the same phylogenetic searches as the original dataset

• Bootstrap support for a group of interest is calculated as the proportion of times the group is obtained in the replicates
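The resampling procedure can be sketched for the simplest case, four sequences. Instead of a full phylogenetic search, this example substitutes a quartet rule: the split with the smallest summed within-pair distance is taken as the "tree" for each pseudoreplicate. The toy alignment, the function names and the use of p-distances are all assumptions made for illustration; they are not the method used by any particular program.

```python
import random

def p_distance(s1, s2):
    """Fraction of aligned positions at which two sequences differ."""
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)

def best_split(aln):
    """Return the quartet split with the smallest summed within-pair distance.

    Returns None when two or more splits tie (replicate is unresolved).
    """
    a, b, c, d = sorted(aln)
    candidates = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    scores = [p_distance(aln[p], aln[q]) + p_distance(aln[r], aln[s])
              for (p, q), (r, s) in candidates]
    best = min(scores)
    if scores.count(best) > 1:
        return None
    return candidates[scores.index(best)]

def bootstrap_support(aln, split, n_reps=1000, seed=0):
    """Proportion of pseudoreplicates in which `split` is recovered."""
    rng = random.Random(seed)
    n_sites = len(next(iter(aln.values())))
    hits = 0
    for _ in range(n_reps):
        # Resample columns with replacement; same size as the original matrix.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        replicate = {name: "".join(seq[i] for i in cols)
                     for name, seq in aln.items()}
        if best_split(replicate) == split:
            hits += 1
    return hits / n_reps

# Hypothetical toy alignment (not from the lab data).
toy = {"A": "ACGTACGTAC", "B": "ACGTACGTAT",
       "C": "TCGAACGGAC", "D": "TCGAACGGTT"}
print(bootstrap_support(toy, (("A", "B"), ("C", "D"))))
```

Because the same columns can be drawn more than once and others not at all, each pseudoreplicate weights the characters differently. A group that depends on only a few characters will drop out of many replicates and receive low support.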

Page 28: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics
Page 29: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Exercise 1

• You will use the tree output from CLUSTALW as your “first pass” phylogenetic reconstruction tool
– CLUSTALW is not designed to produce publishable trees, but it does produce a reasonable representation of the relationships inferred from the multiple alignment it created

• We will use these cladograms and phylograms created in CLUSTALW to visualize the relationships implied from our multiple alignments

Page 30: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Exercise 1, cont

• Go into CLUSTALW and redo your final alignments for each protein
– This should be simple to do as you will be using the data files you have already created
– Each final alignment will include the protein sequences from 3 genomes from each of your 2 species (6 sequences)
– You should end up with a total of 6 final alignments

• Once in CLUSTALW, run the alignment program
– In the output, go to the bottom of the page, where the phylogram or cladogram is provided
– You will want to print out the phylogram for each protein; be sure the branch lengths are included (see the toggle switch to include or exclude branch lengths)

Page 31: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

CLUSTAL 2.0.5 multiple sequence alignment

sp1gen1   MLTTRYKLLLAAN 13
sp1gen3   MATTRYKLLLAAA 13
sp1gen2   MLTTRYKLLLAAA 13
sp2gen2   MLTTRAKLLLRRA 13
sp2gen3   MLTTRALLLLRRA 13
sp2gen1   MLTTRYKLLLRRA 13
          * ***  ***

Protein A (phylogram tip labels with branch lengths)

Sp1gen1 0.08
Sp1gen3 0.08
Sp2gen1 0.0
Sp2gen2 0.0
Sp2gen3 0.08
Sp1gen2 0.0
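To see where values such as 0.08 can come from, the pairwise differences in the Protein A alignment can be counted directly. The sketch below computes a simple p-distance (the fraction of differing sites); one difference in 13 residues gives 1/13, or about 0.08. Whether CLUSTALW derived the branch lengths on the slide in exactly this way is an assumption, and the function name is invented for the example.

```python
from itertools import combinations

# Protein A alignment as shown on the slide (no gaps, 13 residues each).
alignment = {
    "sp1gen1": "MLTTRYKLLLAAN",
    "sp1gen3": "MATTRYKLLLAAA",
    "sp1gen2": "MLTTRYKLLLAAA",
    "sp2gen2": "MLTTRAKLLLRRA",
    "sp2gen3": "MLTTRALLLLRRA",
    "sp2gen1": "MLTTRYKLLLRRA",
}

def p_distance(s1, s2):
    """Fraction of aligned positions that differ between two sequences."""
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)

# Print the distance for every pair of sequences.
for name1, name2 in combinations(alignment, 2):
    dist = p_distance(alignment[name1], alignment[name2])
    print(f"{name1} {name2} {dist:.2f}")
```

For example, sp1gen1 and sp1gen2 differ only at the last residue (N versus A), giving 0.08, while sp1gen2 and sp2gen1 differ at two residues (AA versus RR), giving 0.15.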

Page 32: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Protein B (phylogram tip labels with branch lengths)

Sp1gen1 0.0
Sp1gen3 0.0
Sp2gen1 0.4
Sp2gen2 0.0
Sp2gen3 0.08
Sp1gen2 0.08

CLUSTAL 2.0.5 multiple sequence alignment

sp1gen1   MAAASSSRHLYN 12
sp1gen3   MAAASSSRHLYN 12
sp1gen2   MAASSSSRHLYN 12
sp2gen1   MAAASSSRHLYY 12
sp2gen2   MAAASSSRHLLL 12
sp2gen3   MATASSSRHLLL 12
          **::******

Page 33: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Protein B (phylogram tip labels with branch lengths)

Sp1gen1 0.0
Sp1gen3 0.0
Sp2gen1 0.4
Sp2gen2 0.0
Sp2gen3 0.08
Sp1gen2 0.08

Protein A (phylogram tip labels with branch lengths)

Sp1gen1 0.08
Sp1gen3 0.08
Sp2gen1 0.0
Sp2gen2 0.0
Sp2gen3 0.08
Sp1gen2 0.0

Page 34: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Exercise 2

• Compare the trees produced by each protein for your two species
– Are the branching patterns the same?
– Are the branch lengths the same?

• Describe the general pattern that emerges and note particular exceptions to this pattern

• Do your data argue for the existence of a species concept for your species?

Page 35: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Exercise 3

• How are we going to merge all of these data to develop some sort of overview?
– Can you envision one figure that would represent all the species and all the proteins?
– Do we need a different figure for each protein?
– Do we need a table or graph of some sort?

Page 36: MBoMS Genomics of Model Microbes Lab 6: Molecular phylogenetics

Please BRING WITH YOU to our final lab

• Your TaxPlot figure/table and a one-paragraph description of what you learned about YOUR species' genomes from this exercise

• Your final alignments for each protein

• Your phylograms (with branch lengths) for each protein

PRINT THESE OUT BEFORE CLASS