Post on 22-Dec-2015
Distance between tree topologies
Splits
[Figure: unrooted tree with eight leaves, A to H]
{A}{BCDEFGH}, {B}{ACDEFGH}, {AB}{CDEFGH}, {C}{ABDEFGH}, {CD}{ABEFGH}, {ABCD}{EFGH}
Each split represents a branch, and there is a one-to-one correspondence between a tree topology and its list of splits.
Splits
Splits that correspond to external branches are trivial (they are found in all tree topologies): {A}{BCDEFGH}, {B}{ACDEFGH}, {C}{ABDEFGH}
Splits that correspond to internal branches are the ones that determine the topology: {AB}{CDEFGH}, {CD}{ABEFGH}, {ABCD}{EFGH}
Splits
For a fully resolved (binary) unrooted tree with n leaves, there are 2n-3 branches: n external branches and n-3 internal branches, hence n-3 non-trivial splits.
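The branch counts above can be checked with a small sketch. It assumes a tree is written as nested Python tuples of leaf names, and the eight-leaf topology used here is a hypothetical one that merely contains the splits {AB}, {CD} and {ABCD}{EFGH}; it is not necessarily the tree drawn on the slide. The function names are illustrative only.

```python
def collect_clades(tree, out):
    """Append the leaf set below every node of a nested-tuple tree to `out`."""
    if isinstance(tree, str):          # a leaf is just its name
        leaves = frozenset([tree])
    else:                              # an internal node is a tuple of children
        leaves = frozenset().union(*[collect_clades(child, out) for child in tree])
    out.append(leaves)
    return leaves


def tree_splits(tree):
    """Return every split of the tree as an unordered pair of leaf sets."""
    out = []
    all_leaves = collect_clades(tree, out)
    # The full leaf set has an empty complement, so it is not a split.
    return {frozenset([clade, all_leaves - clade])
            for clade in out if clade != all_leaves}


def is_trivial(split):
    """A split is trivial when one of its sides is a single leaf."""
    return min(len(side) for side in split) == 1


# Hypothetical fully resolved topology on n = 8 leaves (an assumption).
tree = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))
n = 8

splits = tree_splits(tree)
trivial = {s for s in splits if is_trivial(s)}
internal = splits - trivial

print(len(splits), len(trivial), len(internal))  # 13 8 5
```

Storing each split as an unordered pair of sets means that {R1}{R2} and {R2}{R1} compare equal, so the two branches next to the arbitrary root of the tuple encoding collapse into one split.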
Shared internal branches
[Figure: two unrooted trees on leaves A to H]
Internal branches exist in one tree but not in the other
[Figure: two unrooted trees on leaves A to H]
Robinson-Foulds distance = 6
Robinson-Foulds distance
•The distance was suggested in: Robinson DF and Foulds LR (1981) Comparison of phylogenetic trees. Math Biosci. 53:131-147.
•For unrooted trees with n taxa, the minimum distance is 0 and the maximum is 2(n-3).
•The distance ignores branch lengths.
•Zero-length branches are not treated as multifurcations.
•Note that the splits {R1}{R2} and {R2}{R1} are identical.
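As a rough illustration of the points above, the Robinson-Foulds distance can be sketched as the number of non-trivial splits found in exactly one of the two trees. The nested-tuple encoding, the function names and the three example topologies are assumptions made for this sketch; the trees are not the ones from the figures.

```python
def collect_clades(tree, out):
    """Append the leaf set below every node of a nested-tuple tree to `out`."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
    else:
        leaves = frozenset().union(*[collect_clades(child, out) for child in tree])
    out.append(leaves)
    return leaves


def nontrivial_splits(tree):
    """Splits with at least two leaves on each side (internal branches)."""
    out = []
    all_leaves = collect_clades(tree, out)
    n = len(all_leaves)
    return {frozenset([clade, all_leaves - clade])
            for clade in out if 1 < len(clade) < n - 1}


def rf_distance(tree1, tree2):
    """Count the internal-branch splits present in exactly one tree."""
    return len(nontrivial_splits(tree1) ^ nontrivial_splits(tree2))


# Hypothetical eight-leaf trees (assumptions, not the slide's trees).
t1 = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))
t2 = (((("A", "B"), "C"), "D"), (("E", "G"), ("F", "H")))
t3 = ((("A", "E"), ("B", "F")), (("C", "G"), ("D", "H")))

print(rf_distance(t1, t1))  # 0, the minimum
print(rf_distance(t1, t2))  # 6: {AB} and {ABCD}{EFGH} are shared
print(rf_distance(t1, t3))  # 10 = 2(n-3), no internal branch is shared
```

Branch lengths never enter the computation, which matches the remark that the distance ignores them.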
Kuhner-Felsenstein “branch score” distance
•The distance was suggested in: Kuhner MK and Felsenstein J (1994) A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates. Mol. Biol. Evol. 11:459-468.
•The motivation is to extend the RF distance so that it also accounts for differences in branch lengths.
•The distance was used to evaluate performance of ML, NJ, and MP in simulations (distance between inferred tree and “true” tree).
Branch-Score (Bs) distance
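$$Bs = \sum_{s} (x_s - y_s)^2$$
Here the sum runs over every split s found in either tree, and x_s and y_s are the lengths of the corresponding branch in the first and the second tree.
If a branch is found in both trees (shared split), its contribution to the distance is the squared difference between the lengths of that branch in the two trees.
If a branch is found in only one tree, a branch of length 0 is assumed to exist in the other tree.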
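[Figure: two four-taxon trees on A, B, C, D. The first has external branch lengths xa, xb, xc, xd and internal branch xab; the second has external branch lengths ya, yb, yc, yd and internal branch yac.]
$$Bs = (x_a - y_a)^2 + (x_b - y_b)^2 + (x_c - y_c)^2 + (x_d - y_d)^2 + (x_{ab} - 0)^2 + (0 - y_{ac})^2$$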
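The four-taxon computation can be sketched in a few lines, assuming each tree is stored as a dictionary from split to branch length. The numeric lengths are made up for illustration, and `split` and `branch_score` are hypothetical helper names.

```python
TAXA = frozenset("ABCD")


def split(side):
    """Unordered split of TAXA, given the leaves on one side."""
    side = frozenset(side)
    return frozenset([side, TAXA - side])


def branch_score(lengths1, lengths2):
    """Sum of squared branch-length differences; a missing split counts as 0."""
    total = 0.0
    for s in set(lengths1) | set(lengths2):
        total += (lengths1.get(s, 0.0) - lengths2.get(s, 0.0)) ** 2
    return total


# First tree: internal branch separates AB from CD (made-up lengths).
tree_x = {split("A"): 1.0, split("B"): 2.0, split("C"): 3.0,
          split("D"): 4.0, split("AB"): 0.5}
# Second tree: internal branch separates AC from BD (made-up lengths).
tree_y = {split("A"): 1.5, split("B"): 2.0, split("C"): 2.0,
          split("D"): 4.0, split("AC"): 1.0}

# 0.5^2 + 0^2 + 1^2 + 0^2 + (0.5 - 0)^2 + (0 - 1)^2
print(branch_score(tree_x, tree_y))  # 2.5
```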
Branch-Score (Bs) distance
Bs reduces to RF when all branch lengths are set to 1.
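[Figure: the same two four-taxon trees, with internal branches xab and yac]
$$RF = (1-1)^2 + (1-1)^2 + (1-1)^2 + (1-1)^2 + (1-0)^2 + (0-1)^2 = 2$$
$$Bs = (x_a - y_a)^2 + (x_b - y_b)^2 + (x_c - y_c)^2 + (x_d - y_d)^2 + (x_{ab} - 0)^2 + (0 - y_{ac})^2$$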
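A short check of this reduction, assuming splits are written as plain string labels (a hypothetical encoding chosen only for brevity): with every branch length set to 1, shared splits contribute 0 and each unshared split contributes 1.

```python
def branch_score(lengths1, lengths2):
    """Sum of squared branch-length differences; a missing split counts as 0."""
    keys = set(lengths1) | set(lengths2)
    return sum((lengths1.get(k, 0.0) - lengths2.get(k, 0.0)) ** 2 for k in keys)


def unit_lengths(splits):
    """Give every branch of a tree the length 1."""
    return {s: 1.0 for s in splits}


# The two four-taxon topologies, as sets of split labels (assumed encoding).
splits_x = {"A|BCD", "B|ACD", "C|ABD", "D|ABC", "AB|CD"}
splits_y = {"A|BCD", "B|ACD", "C|ABD", "D|ABC", "AC|BD"}

bs_unit = branch_score(unit_lengths(splits_x), unit_lengths(splits_y))
rf = len(splits_x ^ splits_y)   # splits found in exactly one tree

print(bs_unit, rf)  # 2.0 2
```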
Another look at the Bs distance
Consider an array of all possible splits for n taxa: (B1, B2, ..., BN).
Each tree can be represented by such an array, in which Bi = 0 if the split is not found in the tree, and Bi is the length of the relevant branch if the split is found.
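The Bs distance between (B1, B2, ..., BN) and (B1', B2', ..., BN') becomes
$$Bs = \sum_{i=1}^{N} (B_i - B_i')^2$$
Bs is the squared Euclidean distance between the two arrays. Its square root is the ordinary Euclidean distance and is therefore a true distance (e.g., the triangle inequality holds); the squared form by itself does not guarantee the triangle inequality.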
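The array view can be sketched for four taxa as follows. The sketch assumes each split is identified by the side that does not contain a reference taxon (here A), which gives N = 2^(n-1) - 1 possible splits; the branch lengths are made up and the function names are illustrative.

```python
from itertools import combinations


def all_splits(taxa):
    """Every split of the taxa, named by the side without the first taxon."""
    taxa = sorted(taxa)
    rest = taxa[1:]                       # the reference taxon is taxa[0]
    result = []
    for size in range(1, len(taxa)):      # non-empty subsets of `rest`
        for side in combinations(rest, size):
            result.append(frozenset(side))
    return result


def to_array(lengths, index):
    """Array of branch lengths, with 0 for splits absent from the tree."""
    return [lengths.get(s, 0.0) for s in index]


index = all_splits("ABCD")                # N = 7 for n = 4

# Trees keyed by the side that excludes A (made-up lengths).
tree_x = {frozenset("BCD"): 1.0, frozenset("B"): 2.0, frozenset("C"): 3.0,
          frozenset("D"): 4.0, frozenset("CD"): 0.5}   # AB | CD
tree_y = {frozenset("BCD"): 1.5, frozenset("B"): 2.0, frozenset("C"): 2.0,
          frozenset("D"): 4.0, frozenset("BD"): 1.0}   # AC | BD

b = to_array(tree_x, index)
b_prime = to_array(tree_y, index)
bs = sum((u - v) ** 2 for u, v in zip(b, b_prime))

print(len(index), bs)  # 7 2.5
```

In practice N grows exponentially with n, so implementations would more likely keep only the splits present in either tree, as a dictionary, rather than the full array.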
Are these distances true distances?
Formally, a distance must have 3 properties:
•D(a,a)=0 for all a.
•D(a,b)=D(b,a) for all a,b (symmetry).
•D(a,c)<=D(a,b)+D(b,c) for all a,b,c (the triangle inequality).
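Bs is the squared Euclidean distance between the split arrays of the two trees:
$$Bs = \sum_{i=1}^{N} (B_i - B_i')^2$$
It satisfies the first two properties. Its square root is the ordinary Euclidean distance and hence a true distance (e.g., the triangle inequality holds); the squared form by itself does not guarantee the triangle inequality.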
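A quick numeric check of the three properties, using three hypothetical trees that share one topology and differ only in the length of a single branch (arrays of two branch lengths, chosen for illustration). It shows the sum of squares failing the triangle inequality on this example while its square root satisfies it.

```python
import math


def bs(u, v):
    """Sum of squared differences between two branch-length arrays."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def bs_root(u, v):
    """Square root of the sum of squares, i.e. the Euclidean distance."""
    return math.sqrt(bs(u, v))


# Same topology, second branch has length 1, 2 or 3 (made-up values).
ta = [1.0, 1.0]
tb = [1.0, 2.0]
tc = [1.0, 3.0]

print(bs(ta, ta), bs(ta, tb) == bs(tb, ta))      # 0.0 True
print(bs(ta, tc), bs(ta, tb) + bs(tb, tc))       # 4.0 2.0 -> inequality fails
print(bs_root(ta, tc), bs_root(ta, tb) + bs_root(tb, tc))  # 2.0 2.0 -> holds
```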