characters.ppt

Introduction to characters and parsimony analysis

Upload: pammy98

Post on 11-May-2015


TRANSCRIPT

Page 1: Characters.ppt

Introduction to characters and parsimony analysis

Page 2: Characters.ppt

Genetic Relationships

• Genetic relationships exist between individuals within populations

• These include ancestor-descendant relationships and more indirect relationships based on common ancestry

• Within sexually reproducing populations there is a network of relationships

• Genetic relations within populations can be measured with a coefficient of genetic relatedness

Page 3: Characters.ppt
Page 4: Characters.ppt

Phylogenetic Relationships

• Phylogenetic relationships exist between lineages (e.g. species, genes)

• These include ancestor-descendant relationships and more indirect relationships based on common ancestry

• Phylogenetic relationships between species or lineages are (expected to be) tree-like

• Phylogenetic relationships are not measured with a simple coefficient

Page 5: Characters.ppt

Phylogenetic Relationships

• Traditionally phylogeny reconstruction was dominated by the search for ancestors, and ancestor-descendant relationships

• In modern phylogenetics there is an emphasis on indirect relationships

• Given that all lineages are related, closeness of phylogenetic relationships is a relative concept.

Page 6: Characters.ppt

Phylogenetic relationships

• Two lineages are more closely related to each other than to some other lineage if they share a more recent common ancestor - this is the cladistic concept of relationships

• Phylogenetic hypotheses are hypotheses of common ancestry

Page 7: Characters.ppt

Phylogenetic Trees

A CLADOGRAM

Page 8: Characters.ppt

CLADOGRAMS AND PHYLOGRAMS

[Figure: two trees relating taxa A to J, one drawn against an axis labelled ABSOLUTE TIME or DIVERGENCE, the other against an axis labelled RELATIVE TIME]

Page 9: Characters.ppt

Trees - Rooted and Unrooted

Page 10: Characters.ppt

Characters and Character States

• Organisms comprise sets of features

• When organisms/taxa differ with respect to a feature (e.g. its presence or absence or different nucleotide bases at specific sites in a sequence) the different conditions are called character states

• The collection of character states with respect to a feature constitutes a character

Page 11: Characters.ppt

Character evolution

• Heritable changes (in morphology, gene sequences, etc.) produce different character states

• Similarities and differences in character states provide the basis for inferring phylogeny (i.e. provide evidence of relationships)

• The utility of this evidence depends on how often the evolutionary changes that produce the different character states occur independently

Page 12: Characters.ppt

Unique and unreversed characters

• Given a heritable evolutionary change that is unique and unreversed (e.g. the origin of hair) in an ancestral species, the presence of the novel character state in any taxa must be due to inheritance from the ancestor

• Similarly, absence in any taxa must be because the taxa are not descendants of that ancestor

• The novelty is a homology acting as badge or marker for the descendants of the ancestor

• The taxa with the novelty are a clade (e.g. Mammalia)

Page 13: Characters.ppt

Unique and unreversed characters

• Because hair evolved only once and is unreversed (not subsequently lost) it is homologous and provides unambiguous evidence of relationships

[Figure: tree of Lizard, Frog, Human and Dog with the character HAIR (absent/present) mapped on it; a single change or step is marked]

Page 14: Characters.ppt

To distinguish between an ancestral and a derived character state:

(1) If a sequence has the same base as the common ancestor then it is the primitive or plesiomorphic state; otherwise it is a derived or apomorphic state.

[Figure labels: Plesiomorphy, Apomorphy]

Page 15: Characters.ppt

To distinguish between an ancestral and a derived character state:

(2) Unique derived character states are autapomorphies; shared derived states are synapomorphies.
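The terms on the last two slides can be sketched in a few lines of Python. This is only an illustration: the function name `classify_states` is hypothetical, the ancestral state is assumed to be known, and a state shared by several taxa is simply labelled a synapomorphy, which holds only if the shared state is homologous.

```python
from collections import Counter

def classify_states(states, ancestral):
    """Label each taxon's state relative to a known ancestral state.

    Assumption: shared derived states are homologous (no homoplasy).
    """
    derived_counts = Counter(s for s in states.values() if s != ancestral)
    labels = {}
    for taxon, state in states.items():
        if state == ancestral:
            labels[taxon] = "plesiomorphy"      # same as the ancestor
        elif derived_counts[state] == 1:
            labels[taxon] = "autapomorphy"      # derived, unique to one taxon
        else:
            labels[taxon] = "synapomorphy"      # derived, shared by several taxa
    return labels

# Hair example from the earlier slide: absent is taken as ancestral.
hair = {"Lizard": "absent", "Frog": "absent", "Human": "present", "Dog": "present"}
hair_labels = classify_states(hair, ancestral="absent")

# A hypothetical sequence site where only one sequence differs from the ancestor.
site = {"seq1": "A", "seq2": "A", "seq3": "G", "seq4": "A"}
site_labels = classify_states(site, ancestral="A")
```

Here `hair_labels` marks Human and Dog as sharing a synapomorphy, while `site_labels` marks the G in `seq3` as an autapomorphy.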

Page 16: Characters.ppt

Homoplasy - Independent evolution

• Homoplasy is similarity that is not homologous (not due to common ancestry)

• It is the result of independent evolution (convergence, parallelism, reversal)

• Homoplasy can provide misleading evidence of phylogenetic relationships (if mistakenly interpreted as homology)

Page 17: Characters.ppt

Homoplasy:

Homoplasy is a poor indicator of evolutionary relationships because the similarity does not reflect shared ancestry.

It is sometimes useful to distinguish between different types of homoplasy: convergence, parallel substitution and reversals (secondary loss)

Page 18: Characters.ppt

Homoplasy - independent evolution

[Figure: tree of Human, Lizard, Frog and Dog with the character TAIL (adult) (absent/present) mapped on it]

• Loss of tails evolved independently in humans and frogs - there are two steps on the true tree

Page 19: Characters.ppt

Homoplasy - misleading evidence of phylogeny

• If misinterpreted as homology, the absence of tails would be evidence for a wrong tree: grouping humans with frogs and lizards with dogs

[Figure: the wrong tree, grouping Human with Frog and Lizard with Dog, with the character TAIL (absent/present) mapped on it]

Page 20: Characters.ppt

Homoplasy - reversal

• Reversals are evolutionary changes back to an ancestral condition

• As with any homoplasy, reversals can provide misleading evidence of relationships

[Figure: a true tree and a wrong tree for taxa 1 to 10]

Page 21: Characters.ppt

Parallel evolution: the independent evolution of the same feature from the same ancestral condition.

Page 22: Characters.ppt

Convergent evolution: the independent evolution of the same feature from different ancestral conditions.

Page 23: Characters.ppt

Homoplasy - a fundamental problem of phylogenetic inference

• If there were no homoplastic similarities inferring phylogeny would be easy - all the pieces of the jig-saw would fit together neatly

• Distinguishing the misleading evidence of homoplasy from the reliable evidence of homology is a fundamental problem of phylogenetic inference

Page 24: Characters.ppt

Homoplasy and Incongruence

• If we assume that there is a single correct phylogenetic tree then:

• When characters support conflicting phylogenetic trees we know that there must be some misleading evidence of relationships among the incongruent or incompatible characters

• Incongruence between two characters implies that at least one of the characters is homoplastic and that at least one of the trees the characters support is wrong

Page 25: Characters.ppt

Incongruence or Incompatibility

• These trees and characters are incongruent - both trees cannot be correct, at least one is wrong and at least one character must be homoplastic

[Figure: two conflicting trees of Lizard, Frog, Human and Dog, one with the character HAIR (absent/present) mapped on it, the other with the character TAIL (absent/present)]
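For two-state characters, incongruence can be checked directly. A minimal sketch, assuming the standard pairwise compatibility test for binary characters (two characters can both fit some tree without homoplasy only if at most three of the four possible state combinations occur among the taxa). The function name and the 0/1 coding (0 = absent, 1 = present) are illustrative choices, not part of the slides.

```python
def compatible(char_a, char_b):
    """Pairwise compatibility test for two binary characters.

    Returns False when all four state combinations (0,0), (0,1), (1,0), (1,1)
    occur, in which case at least one character must be homoplastic.
    """
    shared_taxa = char_a.keys() & char_b.keys()
    combinations = {(char_a[t], char_b[t]) for t in shared_taxa}
    return len(combinations) < 4

# Coding taken from the slides: 0 = absent, 1 = present.
hair      = {"Lizard": 0, "Frog": 0, "Human": 1, "Dog": 1}
tail      = {"Lizard": 1, "Frog": 0, "Human": 0, "Dog": 1}
lactation = {"Lizard": 0, "Frog": 0, "Human": 1, "Dog": 1}

hair_vs_tail = compatible(hair, tail)            # False: incongruent
hair_vs_lactation = compatible(hair, lactation)  # True: congruent
```

Hair and tail are incompatible, so at least one of them is homoplastic; hair and lactation agree with each other.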

Page 26: Characters.ppt

Distinguishing homology and homoplasy

• Morphologists use a variety of techniques to distinguish homoplasy and homology

• Homologous features are expected to display detailed similarity (in position, structure, development) whereas homoplastic similarities are more likely to be superficial

• As recognised by Charles Darwin, congruence with other characters provides the most compelling evidence for homology

Page 27: Characters.ppt

The importance of congruence

• “The importance, for classification, of trifling characters, mainly depends on their being correlated with several other characters of more or less importance. The value indeed of an aggregate of characters is very evident ........ a classification founded on any single character, however important that may be, has always failed.”

• Charles Darwin: Origin of Species, Ch. 13

Page 28: Characters.ppt

Congruence

• We prefer the ‘true’ tree because it is supported by multiple congruent characters

[Figure: tree of Lizard, Frog, Human and Dog; the clade MAMMALIA is supported by hair, single bone in lower jaw, lactation, etc.]

Page 29: Characters.ppt

Homoplasy in molecular data

• Incongruence and therefore homoplasy can be common in molecular sequence data
– There are a limited number of alternative character states (e.g. only A, G, C and T in DNA)
– Rates of evolution are sometimes high

• Character states are chemically identical
– homology and homoplasy are equally similar
– cannot be distinguished by detailed study of similarity and differences

Page 30: Characters.ppt

Parsimony analysis

• Parsimony methods provide one way of choosing among alternative phylogenetic hypotheses

• The parsimony criterion favours hypotheses that maximise congruence and minimise homoplasy

• It depends on the idea of the fit of a character to a tree

Page 31: Characters.ppt

Character Fit

• Initially, we can define the fit of a character to a tree as the minimum number of steps required to explain the observed distribution of character states among taxa

• This is determined by parsimonious character optimization

• Characters differ in their fit to different trees

Page 32: Characters.ppt

Character Fit

Page 33: Characters.ppt

Parsimony Analysis

• Given a set of characters, such as aligned sequences, parsimony analysis works by determining the fit (number of steps) of each character on a given tree

• The sum over all characters is called Tree Length

• Most parsimonious trees (MPTs) have the minimum tree length needed to explain the observed distributions of all the characters

Page 34: Characters.ppt

Parsimony informative sites

• Not all sites are considered informative for tree construction

• The only sites considered parsimony-informative are those where at least 2 sequences share one character state and at least 2 other sequences share a different character state.
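The definition above translates into a short check. A minimal sketch: the function names and the toy alignment are hypothetical and only illustrate the rule that at least two different states must each occur in at least two sequences.

```python
from collections import Counter

def is_parsimony_informative(column):
    """True if at least two different states each occur in >= 2 sequences."""
    counts = Counter(column)
    states_seen_twice = sum(1 for n in counts.values() if n >= 2)
    return states_seen_twice >= 2

def informative_sites(sequences):
    """Return 1-based positions of parsimony-informative sites."""
    return [
        position
        for position, column in enumerate(zip(*sequences), start=1)
        if is_parsimony_informative(column)
    ]

# Hypothetical toy alignment of four sequences, three sites each.
toy_alignment = ["AAG", "AAG", "ACT", "ACT"]
toy_sites = informative_sites(toy_alignment)  # sites 2 and 3
```

A constant site (`AAAA`), a site where one sequence differs (`AAAC`) and a site where every sequence differs (`ACGT`) all fail the test, because no tree explains them in fewer steps than any other tree.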

Page 35: Characters.ppt

Most parsimonious tree construction

Sequence 1  G C T G A A C T C C
Sequence 2  G C T A A A C T G C
Sequence 3  G A G G A G C A G C
Sequence 4  G A G A A T T A C C
              * * *       * *

(* = parsimony-informative site)

[Figure: the three possible unrooted trees for sequences 1 to 4, with the states of one site mapped onto each]

Number of steps

Tree number     1   2   3
Character 2     1   2   2
Character 3     1   2   2
Character 4     2   2   1
Character 8     1   2   2
Character 9     2   1   2
Total           7   9   9
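The step counts in the table can be reproduced by brute force. The sketch below takes the slide's definition literally: for each site and each tree it tries every pair of states at the two internal nodes and keeps the minimum number of changes. The assignment of topologies to tree numbers is an assumption inferred from the step counts (tree 1 groups sequences 1+2, tree 2 groups 1+4, tree 3 groups 1+3); the function name is hypothetical.

```python
from itertools import product

seqs = {
    1: "GCTGAACTCC",
    2: "GCTAAACTGC",
    3: "GAGGAGCAGC",
    4: "GAGAATTACC",
}

# Assumed numbering of the three unrooted four-taxon trees.
trees = {
    1: ((1, 2), (3, 4)),
    2: ((1, 4), (2, 3)),
    3: ((1, 3), (2, 4)),
}

def quartet_steps(states, tree):
    """Minimum number of changes for one site on a four-taxon tree."""
    (a, b), (c, d) = tree
    return min(
        (x != states[a]) + (x != states[b]) + (x != y)
        + (y != states[c]) + (y != states[d])
        for x, y in product("ACGT", repeat=2)   # states at the two internal nodes
    )

steps_per_site = {}
for i in range(len(seqs[1])):
    column = {name: seq[i] for name, seq in seqs.items()}
    steps_per_site[i + 1] = tuple(quartet_steps(column, trees[t]) for t in (1, 2, 3))

# Sites whose step count differs between trees are the informative ones.
informative = [site for site, s in steps_per_site.items() if len(set(s)) > 1]
totals = tuple(sum(steps_per_site[site][k] for site in informative) for k in range(3))
```

With this numbering `informative` is `[2, 3, 4, 8, 9]` and `totals` is `(7, 9, 9)`, matching the table: tree 1 is the most parsimonious. The remaining sites cost the same on every tree, so they do not affect the choice.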

Page 36: Characters.ppt

The differences between Wagner and Fitch Parsimony

Wagner Parsimony

There is an ordered change from one character state to another. This does not prevent reversals of character state; it means only that not all character states are free to change directly into any other character state. Instead, transformation must proceed in an ordered manner through a progression of character states.

Fitch parsimony

Any character state is free to change into any other character state, and this process is reversible. There is no ordering of the character states and it is not necessary for one character state to pass through another in order to become transformed.

[Figure: transformation diagrams for states 1 to 4 under Wagner Parsimony and under Fitch Parsimony]

Page 37: Characters.ppt

Operation of the Fitch Algorithm

[Figure: a tree of Sequence 1 to Sequence 5 with a state or state set at each node; the labels recovered from the slide are A, G, C, T, G, CT, CTG, AG and G]
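A minimal sketch of the Fitch procedure for one site follows. Working from the tips down, each internal node gets the intersection of its children's state sets if that is non-empty, otherwise their union plus one step. The tree shape and the assignment of states to sequences are assumptions; they were chosen because they reproduce the node labels on the slide (AG, CT, CTG, G), not because the slide's tree could be recovered.

```python
def fitch(tree, states):
    """Return (state_set, steps) for a rooted binary tree given as nested tuples."""
    if not isinstance(tree, tuple):              # a tip: its observed state, no steps
        return {states[tree]}, 0
    left_set, left_steps = fitch(tree[0], states)
    right_set, right_steps = fitch(tree[1], states)
    common = left_set & right_set
    if common:                                   # children agree: no extra step
        return common, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1

# Assumed topology and tip states (hypothetical reconstruction of the figure).
tree = (("seq1", "seq2"), (("seq3", "seq4"), "seq5"))
states = {"seq1": "A", "seq2": "G", "seq3": "C", "seq4": "T", "seq5": "G"}

root_set, length = fitch(tree, states)
```

On this example the internal sets are {A, G}, {C, T} and {C, T, G}, the root set is {G}, and the site needs 3 steps.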

Page 38: Characters.ppt

Parsimony in practice

Of these two trees, Tree 1 has the shortest length and is the most parsimonious

Both trees require some homoplasy (extra steps)

Page 39: Characters.ppt

Class exercise in the operation of the Fitch Algorithm:

What is the total observed length of this tree?

A A G T C

Page 40: Characters.ppt

Results of parsimony analysis

• One or more most parsimonious trees

• Hypotheses of character evolution associated with each tree (where and how changes have occurred)

• Branch lengths (amounts of change associated with branches)

• Various tree and character statistics describing the fit between tree and data

• Suboptimal trees - optional

Page 41: Characters.ppt

Character types

• Characters may differ in the costs (contribution to tree length) made by different kinds of changes

• Wagner (ordered, additive)
  0 ↔ 1 ↔ 2 (morphology, unequal costs)

• Fitch (unordered, non-additive)
  A, G, T, C (morphology, molecules; equal costs for all changes)

[Figure labels: one step; two steps]

Page 42: Characters.ppt

Character types

• Sankoff (generalised)
  A, G, T, C (morphology, molecules; user specified costs)
• For example, differential weighting of transitions and transversions
• Costs are specified in a stepmatrix
• Costs are usually symmetric but can also be asymmetric (e.g. it costs more to gain than to lose a restriction site)

[Figure labels: one step; five steps]

Page 43: Characters.ppt

Stepmatrices

• Stepmatrices specify the costs of changes within a character

              To
          A   C   G   T
From  A   0   5   1   5
      C   5   0   5   1
      G   1   5   0   5
      T   5   1   5   0

[Figure: A and G are PURINES (Pu), C and T are PYRIMIDINES (Py); transitions are Py ↔ Py and Pu ↔ Pu changes, transversions are Py ↔ Pu changes]

Different characters (e.g. 1st, 2nd and 3rd codon positions) can also have different weights
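A stepmatrix can be applied to a tree with Sankoff-style dynamic programming: for every node and every possible state, keep the cheapest cost of the subtree below that node. The sketch uses the stepmatrix from the slide (transitions cost 1, transversions cost 5); the function names, the tree and the tip states are hypothetical.

```python
INF = float("inf")
BASES = "ACGT"

# Stepmatrix from the slide: COST[from][to].
COST = {
    "A": {"A": 0, "C": 5, "G": 1, "T": 5},
    "C": {"A": 5, "C": 0, "G": 5, "T": 1},
    "G": {"A": 1, "C": 5, "G": 0, "T": 5},
    "T": {"A": 5, "C": 1, "G": 5, "T": 0},
}

def sankoff(tree, states):
    """Return {state: minimum cost of the subtree if its root has that state}."""
    if not isinstance(tree, tuple):              # a tip: only its observed state is allowed
        return {s: (0 if s == states[tree] else INF) for s in BASES}
    child_costs = [sankoff(child, states) for child in tree]
    return {
        s: sum(min(COST[s][t] + costs[t] for t in BASES) for costs in child_costs)
        for s in BASES
    }

def weighted_length(tree, states):
    """Cost of the cheapest reconstruction for one site on the tree."""
    return min(sankoff(tree, states).values())

tree = (("s1", "s2"), ("s3", "s4"))
mixed = weighted_length(tree, {"s1": "A", "s2": "G", "s3": "C", "s4": "T"})
purines_only = weighted_length(tree, {"s1": "A", "s2": "G", "s3": "A", "s4": "G"})
```

The first site needs two transitions and one transversion, so `mixed` is 7 (it would be 3 with equal costs); the second site needs only two transitions, so `purines_only` is 2.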

Page 44: Characters.ppt

Weighted parsimony

• If all kinds of steps of all characters have equal weight then parsimony:
– Minimises homoplasy (extra steps)
– Maximises the amount of similarity due to common ancestry
– Minimises tree length

• If steps are weighted unequally parsimony minimises tree length - a weighted sum of the cost of each character

Page 45: Characters.ppt

Why weight characters?

• Many systematists consider weighting unacceptable, but weighting is unavoidable (unweighted = equal weights)
• Transitions may be more common than transversions
• Different kinds of transitions and transversions may be more or less common
• Rates of change may vary with codon positions
• The fit of different characters on trees may indicate differences in their reliabilities

• However, equal weighting is the commonest procedure and is the simplest (but probably not the best) approach

[Figure: histogram for the Ciliate SSUrDNA data; x-axis: number of steps, y-axis: number of characters]

Page 46: Characters.ppt

Different kinds of changes differ in their frequencies

[Figure: From/To matrix (A, C, G, T) of unambiguous changes on the most parsimonious tree of Ciliate SSUrDNA, distinguishing transitions and transversions]

Page 47: Characters.ppt

Parsimony - advantages

• is a simple method - easily understood operation

• does not seem to depend on an explicit model of evolution

• gives both trees and associated hypotheses of character evolution

• should give reliable results if the data are well structured and homoplasy is either rare or widely (randomly) distributed on the tree

Page 48: Characters.ppt

Parsimony - disadvantages

• May give misleading results if homoplasy is common or concentrated in particular parts of the tree, e.g.:
- thermophilic convergence
- base composition biases
- long branch attraction

• Underestimates branch lengths
• Model of evolution is implicit - behaviour of method not well understood
• Parsimony often justified on purely philosophical grounds - we must prefer simplest hypotheses - particularly by morphologists

• For most molecular systematists this is uncompelling

Page 49: Characters.ppt

Parsimony can be inconsistent

• Felsenstein (1978) developed a simple model phylogeny including four taxa and a mixture of short and long branches

• Under this model parsimony will give the wrong tree

• With more data the certainty that parsimony will give the wrong tree increases - so that parsimony is statistically inconsistent

• Advocates of parsimony initially responded by claiming that Felsenstein’s result showed only that his model was unrealistic

• It is now recognised that long-branch attraction (in the Felsenstein Zone) is one of the most serious problems in phylogenetic inference

Long branches are attracted but the similarity is homoplastic

Page 50: Characters.ppt

Finding optimal trees - exact solutions

• Exact solutions can only be used for small numbers of taxa

• Exhaustive search examines all possible trees

• Typically used for problems with fewer than 10 taxa

Page 51: Characters.ppt

Finding optimal trees - exhaustive search

[Figure: step 1 shows the starting tree of three taxa (A, B, C); step 2 shows trees 2a, 2b and 2c, each with D added in a different position; E marks the possible positions for the fifth taxon]

Starting tree, any 3 taxa

Add fourth taxon (D) in each of three possible positions -> three trees

Add fifth taxon (E) in each of the five possible positions on each of the three trees -> 15 trees, and so on ....
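The "and so on" can be made concrete. An unrooted, fully bifurcating tree with k - 1 taxa has 2k - 5 branches, so the k-th taxon can be attached in 2k - 5 places; multiplying these choices gives the number of trees an exhaustive search must examine. A minimal sketch with a hypothetical function name:

```python
def n_unrooted_trees(n_taxa):
    """Number of distinct unrooted, fully bifurcating trees for n_taxa >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1                                # one tree for the first three taxa
    for k in range(4, n_taxa + 1):
        count *= 2 * k - 5                   # branches available for the k-th taxon
    return count

tree_counts = {n: n_unrooted_trees(n) for n in (3, 4, 5, 10)}
```

This gives 1, 3 and 15 trees for 3, 4 and 5 taxa, as on the slide, and already 2,027,025 trees for 10 taxa.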

Page 52: Characters.ppt

Finding optimal trees - heuristics

• The number of possible trees increases faster than exponentially with the number of taxa, making exhaustive searches impractical for many data sets (an NP-complete problem)

• Heuristic methods are used to search tree space for most parsimonious trees by building or selecting an initial tree and swapping branches to search for better ones

• The trees found are not guaranteed to be the most parsimonious - they are best guesses

Page 53: Characters.ppt

Finding optimal trees - heuristics

• Stepwise addition
  Asis - the order in the data matrix
  Closest - starts with the shortest 3-taxon tree and adds taxa in the order that produces the least increase in tree length (greedy heuristic)
  Simple - the first taxon in the matrix is taken as a reference; taxa are added to it in order of their decreasing similarity to the reference
  Random - taxa are added in a random sequence; many different sequences can be used

• Recommend random with as many (e.g. 10-100) addition sequences as practical

Page 54: Characters.ppt

Finding most parsimonious trees - heuristics

• Branch Swapping:

Nearest neighbor interchange (NNI)

Subtree pruning and regrafting (SPR)

Tree bisection and reconnection (TBR)

Other methods ....

Page 55: Characters.ppt

Finding optimal trees - heuristics

• Nearest neighbor interchange (NNI)

[Figure: three trees of taxa A to G illustrating nearest neighbor interchange]
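As a rough illustration of NNI: every internal branch joins two pairs of subtrees, and an interchange swaps one subtree from each side, giving two alternative arrangements per internal branch. The sketch below handles a single branch; the function name and the example subtrees are hypothetical and do not reproduce the trees in the figure.

```python
def nni_neighbours(left, right):
    """Return the two rearrangements around one internal branch.

    `left` and `right` are the pairs of subtrees on either side of the branch.
    """
    a, b = left
    c, d = right
    return [
        ((a, c), (b, d)),   # swap b and c across the branch
        ((a, d), (b, c)),   # swap b and d across the branch
    ]

# Hypothetical starting arrangement around one internal branch.
start = (("A", "B"), ("C", ("D", "E")))
neighbours = nni_neighbours(*start)
```

A full NNI search would apply this to every internal branch of the current tree and keep any rearrangement that shortens the tree.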

Page 56: Characters.ppt

Finding optimal trees - heuristics

• Subtree pruning and regrafting (SPR)

[Figure: three trees of taxa A to G illustrating subtree pruning and regrafting]

Page 57: Characters.ppt

Finding optimal trees - heuristics

• Tree bisection and reconnection (TBR)

Page 58: Characters.ppt

Finding optimal trees - heuristics

• Branch Swapping
  Nearest neighbor interchange (NNI)
  Subtree pruning and regrafting (SPR)
  Tree bisection and reconnection (TBR)

• The nature of heuristic searches means we cannot know which method will find the most parsimonious trees or all such trees

• However, TBR is the most extensive swapping routine and its use with multiple random addition sequences should work well

Page 59: Characters.ppt

Tree space may be populated by local minima and islands of optimal trees

[Figure: tree length plotted across tree space, with a GLOBAL MINIMUM and several local minima; RANDOM ADDITION SEQUENCE REPLICATES followed by branch swapping end in SUCCESS at the global minimum or in FAILURE at a local minimum]

Page 60: Characters.ppt

Parsimonious Character Optimization

[Figure: tree of taxa A, B, C, D, E with character states 0 0 1 1 0; the distribution is explained either by an origin (0 => 1) and a reversal (1 => 0) (ACCTRAN), or by parallelism, 2 separate origins 0 => 1 (DELTRAN)]

Homoplastic characters often have alternative equally parsimonious optimizations

Commonly used varieties are:
ACCTRAN - accelerated transformation
DELTRAN - delayed transformation

Consequently, branch lengths are not always fully determined

PAUP reports minimum and maximum branch lengths
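The ambiguity can be shown by enumerating every assignment of states to the internal nodes and keeping those with the fewest changes. The tip states are the ones on the slide; the pectinate tree (A,(B,(C,(D,E)))) and the node names are assumptions, since the slide's topology is not recoverable from the transcript.

```python
from itertools import product

tips = {"A": 0, "B": 0, "C": 1, "D": 1, "E": 0}

# Assumed tree (A,(B,(C,(D,E)))), stored as child -> parent.
parent = {
    "A": "root", "n1": "root",
    "B": "n1", "n2": "n1",
    "C": "n2", "n3": "n2",
    "D": "n3", "E": "n3",
}
internal = ["root", "n1", "n2", "n3"]

def most_parsimonious_reconstructions():
    """Return (minimum steps, list of internal-state assignments achieving it)."""
    best, optimal = None, []
    for values in product((0, 1), repeat=len(internal)):
        assignment = dict(zip(internal, values))
        state = {**tips, **assignment}
        steps = sum(state[child] != state[par] for child, par in parent.items())
        if best is None or steps < best:
            best, optimal = steps, [assignment]
        elif steps == best:
            optimal.append(assignment)
    return best, optimal

min_steps, reconstructions = most_parsimonious_reconstructions()
```

On this tree two reconstructions tie at 2 steps. One keeps all internal nodes at 0 and places separate 0 => 1 changes on the branches to C and D (the DELTRAN choice); the other changes 0 => 1 on the branch leading to C, D and E and reverses 1 => 0 on the branch to E (the ACCTRAN choice). The branch lengths differ between the two, which is why only minimum and maximum values can be reported.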

Page 61: Characters.ppt

Multiple optimal trees

• Many methods can yield multiple equally optimal trees

• We can further select among these trees with additional criteria, but

• Typically, relationships common to all the optimal trees are summarised with consensus trees

Page 62: Characters.ppt

Consensus methods

• A consensus tree is a summary of the agreement among a set of fundamental trees

• There are many consensus methods that differ in:

1. the kind of agreement

2. the level of agreement

• Consensus methods can be used with multiple trees from a single analysis or from multiple analyses

Page 63: Characters.ppt

Strict consensus methods

• Strict consensus methods require agreement across all the fundamental trees

• They show only those relationships that are unambiguously supported by the parsimonious interpretation of the data

• The commonest method (strict component consensus) focuses on clades/components/full splits

• This method produces a consensus tree that includes all and only those full splits found in all the fundamental trees

• Other relationships (those in which the fundamental trees disagree) are shown as unresolved polytomies

• Implemented in PAUP
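The "all and only those full splits found in all the fundamental trees" rule can be sketched as a set intersection. This is a minimal illustration, not PAUP's implementation: trees are assumed to be rooted and written as nested tuples, and the two example trees and the helper names `clades` and `strict_consensus` are hypothetical.

```python
def clades(tree):
    """Return every clade of a rooted nested-tuple tree as a frozenset of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def strict_consensus(trees):
    """Keep only the clades present in every fundamental tree."""
    return set.intersection(*(clades(tree) for tree in trees))


# Two hypothetical fundamental trees that disagree about D and E.
tree_1 = (("A", ("B", "C")), (("D", "E"), ("F", "G")))
tree_2 = (("A", ("B", "C")), ("D", ("E", ("F", "G"))))

consensus = strict_consensus([tree_1, tree_2])
for clade in sorted(consensus, key=lambda c: (len(c), sorted(c))):
    print(sorted(clade))
```

The clade {D, E} and the clade {E, F, G} each occur in only one tree, so neither survives. In the consensus, D, E and (F, G) form a polytomy inside the shared clade {D, E, F, G}.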

Page 64: Characters.ppt

Strict consensus methods

[Figure: two fundamental trees, with leaf orders A B C D E F G and A B C E D F G, and their strict component consensus tree with leaf order A B C D E F G.]

Page 65: Characters.ppt

Majority-rule consensus methods

• Majority-rule consensus methods require agreement across a majority of the fundamental trees

• May include relationships that are not supported by the most parsimonious interpretation of the data

• The commonest method focuses on clades/components/full splits

• This method produces a consensus tree that includes all and only those full splits found in a majority (>50%) of the fundamental trees

• Other relationships are shown as unresolved polytomies

• Of particular use in bootstrapping

• Implemented in PAUP
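The majority (>50%) rule can be sketched by counting how often each clade occurs. As before, this is an illustration under assumptions rather than PAUP's implementation: trees are rooted nested tuples, and the three example trees and the helper names are hypothetical.

```python
from collections import Counter


def clades(tree):
    """Return every clade of a rooted nested-tuple tree as a frozenset of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def majority_rule(trees):
    """Map each clade found in more than half of the trees to its frequency."""
    counts = Counter(clade for tree in trees for clade in clades(tree))
    return {clade: n / len(trees)
            for clade, n in counts.items() if n / len(trees) > 0.5}


# Three hypothetical fundamental trees; the last two share one topology.
tree_1 = (("A", ("B", "C")), (("D", "E"), ("F", "G")))
tree_2 = (("A", ("B", "C")), ("D", ("E", ("F", "G"))))
tree_3 = (("A", ("B", "C")), ("D", ("E", ("F", "G"))))

support = majority_rule([tree_1, tree_2, tree_3])
for clade, frequency in sorted(support.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(clade), f"{frequency:.0%}")
```

Here the clade {E, F, G} is kept because it occurs in two of the three trees, even though one fundamental tree contradicts it. That is the sense in which a majority-rule tree may include relationships that are not supported by every most parsimonious tree.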

Page 66: Characters.ppt

Majority rule consensus

[Figure: three fundamental trees, with leaf orders A B C D E F G, A B C E D F G and A B C E D F G, and their majority-rule component consensus tree with leaf order A B C E F D G. Clade frequencies shown on the consensus tree: 100, 66, 66, 66, 66.]

Numbers indicate frequency of clades in the fundamental trees

Page 67: Characters.ppt

Reduced consensus methods

• Focuses upon any relationships (not just full splits)

• Reduced consensus methods occur in strict and majority-rule varieties

• Other relationships are shown as unresolved polytomies

• May be more sensitive than methods focusing only on clades/components/full splits

• Strict reduced consensus methods are implemented in RadCon
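The gain in sensitivity can be illustrated with a deliberately simplified sketch. It is not RadCon's algorithm: it only prunes one named taxon from every tree and then takes the strict component consensus of the pruned trees, whereas a real reduced consensus method works out which taxa to exclude. The trees, the choice of taxon G, and the helper names are hypothetical.

```python
def clades(tree):
    """Return every clade of a rooted nested-tuple tree as a frozenset of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def prune(tree, taxon):
    """Remove one leaf and collapse any internal node left with a single child."""
    if isinstance(tree, str):
        return None if tree == taxon else tree
    children = [c for c in (prune(child, taxon) for child in tree) if c is not None]
    if not children:
        return None
    return children[0] if len(children) == 1 else tuple(children)


def strict_consensus(trees):
    """Keep only the clades present in every tree."""
    return set.intersection(*(clades(tree) for tree in trees))


# Two hypothetical trees that differ only in the position of taxon G.
tree_1 = ("A", ("B", ("C", ("D", ("E", ("F", "G"))))))
tree_2 = (("A", "G"), ("B", ("C", ("D", ("E", "F")))))

full = strict_consensus([tree_1, tree_2])
reduced = strict_consensus([prune(tree_1, "G"), prune(tree_2, "G")])
print(len(full), len(reduced))
```

With G included, the only shared clade is the one containing all taxa, so the strict component consensus is completely unresolved. With G pruned, the two trees are identical and every remaining clade is shared.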

Page 68: Characters.ppt

Reduced consensus methods

[Figure: two fundamental trees, with leaf orders A B C D E F G and A G B C D E F. Their strict component consensus (A B C D E F G) is completely unresolved. Their strict reduced consensus tree (A B C D E F) excludes taxon G.]

Page 69: Characters.ppt

Consensus methods

Three fundamental trees

[Figure: three fundamental trees for the taxa Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Spirostomum, Euplotes, Tracheloraphis and Gruberia. The trees differ in the arrangement of Spirostomum, Euplotes and Tracheloraphis.]

From these 3 fundamental trees, construct:
(1) the strict component consensus tree
(2) the strict reduced cladistic consensus tree
(3) the majority-rule consensus tree

Page 70: Characters.ppt

Consensus methods

[Figure: the three fundamental trees for Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Spirostomum, Euplotes, Tracheloraphis and Gruberia, shown with their three consensus trees: majority-rule, strict (component), and strict reduced cladistic (Euplotes excluded). Clade frequencies shown: 100, 100, 100, 100, 66, 66.]

Page 71: Characters.ppt

Consensus methods

• Use strict methods to identify those relationships unambiguously supported by parsimonious interpretation of the data

• Use reduced methods where consensus trees are poorly resolved

• Use majority-rule methods in bootstrapping

• Avoid other methods which have ambiguous interpretations