Post on 19-Dec-2015
The Evolution Trees
From: Computational Biology by R. C. T. Lee
S. J. Shyu
Department of Computer Science
Ming Chuan University

Biological Assumption
All species evolve from a common ancestor.
Root: the suspected common ancestor
Leaves: species (alive)
Internal nodes: unknown species
Length on edge (a, b): the time needed to evolve from a to b

Assumptions on Evolution Trees for Computer Scientists (I)
Rooted evolution tree: the degree of each internal node is 3, except the root.
Unrooted evolution tree: the degree of each internal node is 3.

Assumptions on Evolution Trees for Computer Scientists (II)
The input is a distance matrix among all of the species.
The distances satisfy the triangle inequality.
Depending upon different conditions, different ETs will be constructed to reflect the distances among species.
Let $d(s_i,s_j)$ and $d_t(s_i,s_j)$ be the distances between $s_i$ and $s_j$ in the distance matrix and in some evolution tree, respectively. Then $d_t(s_i,s_j) \ge d(s_i,s_j)$.
If the ET is rooted, then the distances from the root to all leaves are the same.
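
The triangle inequality assumption on the input can be checked directly. The following is a minimal sketch, assuming the distance matrix is given as a symmetric list of lists; the function name and the matrix `D` are hypothetical and not taken from the slides.

```python
from itertools import permutations

def satisfies_triangle_inequality(d):
    """Return True if d[i][k] <= d[i][j] + d[j][k] for all distinct i, j, k."""
    n = len(d)
    return all(d[i][k] <= d[i][j] + d[j][k]
               for i, j, k in permutations(range(n), 3))

# Hypothetical 4-species distance matrix (symmetric, zero diagonal).
D = [
    [0, 4, 4, 3],
    [4, 0, 2, 5],
    [4, 2, 0, 5],
    [3, 5, 5, 0],
]

# Hypothetical matrix that violates the inequality: 5 > 1 + 1.
BAD = [
    [0, 1, 5],
    [1, 0, 1],
    [5, 1, 0],
]

print(satisfies_triangle_inequality(D))
print(satisfies_triangle_inequality(BAD))
```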

How many unrooted evolution trees are there? (I)
$NE(n) = 2n-3$ (by induction), where $NE(n)$ is the number of edges of an unrooted ET with $n$ species.
Whenever a new species is added, the number of edges increases by 2.

How many unrooted evolution trees are there? (II)
Every edge can be split to add a new species.
Number of unrooted ETs for $n$ species: $TU(n+1) = (2n-3)\,TU(n)$, or equivalently $TU(n) = (2n-5)\,TU(n-1)$.
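
The recurrence can be unrolled in a few lines. This is a sketch only; the base case TU(3) = 1 (three species admit a single unrooted tree) is an assumption not stated on the slide, and the function names are hypothetical.

```python
def num_edges(n):
    """NE(n) = 2n - 3: edges of an unrooted ET with n species."""
    return 2 * n - 3

def num_unrooted(n):
    """TU(n) via TU(k) = (2k - 5) * TU(k - 1), assuming TU(3) = 1."""
    if n < 3:
        raise ValueError("need at least 3 species")
    count = 1  # assumed base case TU(3) = 1
    for k in range(4, n + 1):
        # A tree on k - 1 species has 2(k - 1) - 3 = 2k - 5 edges,
        # and each edge can be split to attach species k.
        count *= 2 * k - 5
    return count

for n in range(3, 11):
    print(n, num_edges(n), num_unrooted(n))
```

The count grows quickly, which is the reason the later slides fall back on heuristics for choosing a tree structure.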

How many rooted evolution trees are there? (I)
Every edge in an unrooted ET can be split to add a root, which turns the ET into a rooted one.

How many rooted evolution trees are there? (II)
Number of rooted ETs for $n$ species: $TR(n) = (2n-3)\,TU(n) = TU(n+1)$
The number of rooted ETs is much larger than that of unrooted ETs.
It is desirable to consider unrooted ETs.
Still, we cannot explain evolution by an unrooted tree.
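
As a worked check of the two counting formulas, assuming the base case $TU(3) = 1$ (not stated on the slides):

$$TU(4) = 3 \cdot TU(3) = 3, \qquad TU(5) = 5 \cdot TU(4) = 15, \qquad TR(4) = (2 \cdot 4 - 3)\,TU(4) = 5 \cdot 3 = 15 = TU(5).$$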

Transforming unrooted ETs into rooted ones
Add a species which is exceedingly different from the species analyzed. This outlier species causes a long link that can be used to identify a root.

Distance matrix vs. evolution trees
The input of an evolution tree problem is a distance matrix.
We are asked to construct an evolution tree to perfectly reflect these distances.
The goodness of an evolution tree is evaluated under some criterion.

Criteria of evolution trees
Let $d(s_i,s_j)$ and $d_t(s_i,s_j)$ be the distances between $s_i$ and $s_j$ in the distance matrix and in some evolution tree, respectively.
1. Minimax ETs: the maximum of $d_t(s_i,s_j) - d(s_i,s_j)$ is minimized
2. Minisum ETs: the total sum of all pairwise distances $d_t(s_i,s_j)$ is minimized
3. Minisize ETs: the total length of the tree is minimized
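
The three criteria are objective values that can be evaluated for any candidate tree. The sketch below assumes the tree is supplied both as its pairwise distance matrix `DT` and as a weighted edge list; all names and numbers are hypothetical illustrations, not data from the slides.

```python
from itertools import combinations

def minimax_cost(d, dt):
    """Largest gap dt(si, sj) - d(si, sj) over all pairs."""
    n = len(d)
    return max(dt[i][j] - d[i][j] for i, j in combinations(range(n), 2))

def minisum_cost(dt):
    """Sum of tree distances dt(si, sj) over all pairs."""
    n = len(dt)
    return sum(dt[i][j] for i, j in combinations(range(n), 2))

def minisize_cost(edges):
    """Total length of the tree: sum of its edge weights."""
    return sum(w for _, _, w in edges)

# Hypothetical input matrix for 3 species.
D = [[0, 3, 4],
     [3, 0, 4],
     [4, 4, 0]]

# Hypothetical star tree with center "x": edge weights 2, 2, 3.
EDGES = [("x", "s1", 2), ("x", "s2", 2), ("x", "s3", 3)]

# Pairwise distances in that tree (each is >= the matching entry of D).
DT = [[0, 4, 5],
      [4, 0, 5],
      [5, 5, 0]]

print(minimax_cost(D, DT), minisum_cost(DT), minisize_cost(EDGES))
```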

The complexity of ET problems
            Minimax       Minisum       Minisize
Unrooted    NP-complete   NP-complete   Unknown
Rooted      $O(n^2)$      NP-complete   NP-complete

A Minimax rooted ET Algorithm
Idea: Preserve the longest distance
Suppose $d(s_i,s_j)$ is the longest distance in the input matrix.
$d_t(s_i,s_j) = d(s_i,s_j)$
• Recursively apply for $T_i$ and $T_j$
• Which species are in $T_i$ ($T_j$)?

5.4 The determination of weights when the ET structure is given
What is the minisize unrooted ET if the ET structure is given?
As above for minisize unrooted ET
As above for minimax unrooted ET
How to determine the structure of the ET is a problem. (The number of possible ETs is exponential in n.)
By linear programming
Open

5.5 UPGMA for rooted ETs
A heuristic to determine a reasonably good structure of rooted ETs.
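
The slide names UPGMA without listing its steps. The following is a minimal sketch of the standard textbook UPGMA (merge the closest pair of clusters, place the new node at half their distance, average the distances by cluster size). It is not necessarily the exact variant used in the book, and plain UPGMA does not by itself guarantee $d_t(s_i,s_j) \ge d(s_i,s_j)$. The function name and the input matrix are hypothetical.

```python
def upgma(names, d):
    """Return (tree, height): tree is a nested tuple of species names,
    height is the distance from the root to every leaf."""
    # cluster id -> (subtree, number of species, height of subtree root)
    clusters = {i: (names[i], 1, 0.0) for i in range(len(names))}
    dist = {(i, j): d[i][j] for i in clusters for j in clusters if i < j}
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (a, b), dab = min(dist.items(), key=lambda kv: kv[1])
        ta, na, _ = clusters.pop(a)
        tb, nb, _ = clusters.pop(b)
        # Size-weighted average distance from the merged cluster to the rest.
        merged = {}
        for c in clusters:
            dac = dist[(min(a, c), max(a, c))]
            dbc = dist[(min(b, c), max(b, c))]
            merged[c] = (na * dac + nb * dbc) / (na + nb)
        # Drop every entry that mentions a or b, then add the new cluster.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        for c, v in merged.items():
            dist[(c, next_id)] = v  # c < next_id always holds
        clusters[next_id] = ((ta, tb), na + nb, dab / 2)
        next_id += 1
    tree, _, height = next(iter(clusters.values()))
    return tree, height

# Hypothetical distance matrix for four species.
names = ["s1", "s2", "s3", "s4"]
D = [[0, 2, 6, 6],
     [2, 0, 6, 6],
     [6, 6, 0, 4],
     [6, 6, 4, 0]]

tree, height = upgma(names, D)
print(tree, height)
```

Because every merge places the new node at half the cluster distance, all leaves end up at the same distance from the root, which matches the rooted-ET assumption stated earlier.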

5.6 The neighbor joining method for unrooted ETs
A heuristic to determine a reasonably good structure of unrooted ETs.

Example (NJ) (I)
$w(x, s_1) = \frac{1}{3}\,(d(s_1,s_2)+d(s_1,s_3)+d(s_1,s_4)) = \frac{1}{3}(4+4+3) = 3.67$
(the mean of the distances from this species to all other species)
$w(x, s_i) = \mathrm{average}(s_i) = \frac{1}{n-1}\sum_{j \ne i} d(s_i,s_j)$
[Figure: star tree with 1 center]
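
The star-tree weights can be computed directly from the matrix. In this sketch only the first row (4, 4, 3) comes from the slide; the remaining entries of `D` and the function name are hypothetical.

```python
def average_distance(d, i):
    """average(s_i): mean distance from species i to all other species."""
    n = len(d)
    return sum(d[i][j] for j in range(n) if j != i) / (n - 1)

# Row 0 matches the slide (d(s1,s2)=4, d(s1,s3)=4, d(s1,s4)=3);
# all other entries are hypothetical.
D = [[0, 4, 4, 3],
     [4, 0, 2, 5],
     [4, 2, 0, 5],
     [3, 5, 5, 0]]

# Weight of the edge from the star center x to each species.
star_weights = [average_distance(D, i) for i in range(len(D))]
print([round(w, 2) for w in star_weights])
```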

Example (NJ) (III)
$OC - NC = (\mathrm{average}(s_1)+\mathrm{average}(s_2)) - \frac{\mathrm{average}(s_1)+\mathrm{average}(s_2)+d(s_1,s_2)}{2} = \frac{\mathrm{average}(s_1)+\mathrm{average}(s_2)-d(s_1,s_2)}{2}$
$d(s_1,s_2)$ is preserved
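
The saving formula can be evaluated for every pair of species. This sketch reads OC and NC as the cost before and after joining two species, and treats the pair with the largest saving as the candidate to join; both readings are assumptions, since the slides do not spell them out. The matrix entries other than the first row are hypothetical.

```python
from itertools import combinations

def average_distance(d, i):
    """Mean distance from species i to all other species."""
    n = len(d)
    return sum(d[i][j] for j in range(n) if j != i) / (n - 1)

def saving(d, i, j):
    """OC - NC for joining i and j: (average(i) + average(j) - d(i,j)) / 2."""
    return (average_distance(d, i) + average_distance(d, j) - d[i][j]) / 2

def best_pair(d):
    """Pair with the largest saving (assumed joining rule)."""
    return max(combinations(range(len(d)), 2), key=lambda p: saving(d, *p))

# Row 0 follows the NJ example; the other entries are hypothetical.
D = [[0, 4, 4, 3],
     [4, 0, 2, 5],
     [4, 2, 0, 5],
     [3, 5, 5, 0]]

pair = best_pair(D)
print(pair, round(saving(D, *pair), 3))
```

With this matrix the closest pair (indices 1 and 2, distance 2) also gives the largest saving.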

5.6 An Approximation Algorithm for an unrooted minisize ET
Unrooted minisize ET: no polynomial-time algorithm is known, and the problem has never been proved to be NP-complete.
An approximation algorithm whose solution is smaller than twice the size of an optimal solution:

Example (I)
Minimal spanning tree → an unrooted minisize ET, with error ratio $\le 1$
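
The first step of this approximation, building a minimal spanning tree on the distance matrix, can be sketched with Prim's algorithm. The code also checks that path lengths inside the MST are never shorter than the input distances. It does not perform the conversion of the MST into an ET whose internal nodes have degree 3. The matrix and function names are hypothetical.

```python
def prim_mst(d):
    """Return the MST of the complete graph on d as a list of (u, v, weight)."""
    n = len(d)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree.
        u, v = min(((u, v) for u in in_tree for v in range(n) if v not in in_tree),
                   key=lambda e: d[e[0]][e[1]])
        edges.append((u, v, d[u][v]))
        in_tree.add(v)
    return edges

def tree_distances(n, edges):
    """All-pairs path lengths inside a tree given by its edge list."""
    adj = {i: [] for i in range(n)}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [[0] * n for _ in range(n)]
    for s in range(n):
        stack = [(s, -1, 0)]
        while stack:
            node, parent, acc = stack.pop()
            dist[s][node] = acc
            for nxt, w in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, acc + w))
    return dist

# Hypothetical distance matrix satisfying the triangle inequality.
D = [[0, 4, 4, 3],
     [4, 0, 2, 5],
     [4, 2, 0, 5],
     [3, 5, 5, 0]]

mst = prim_mst(D)
mst_size = sum(w for _, _, w in mst)
d_mst = tree_distances(len(D), mst)
print(mst_size)
```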

Proof of correctness (I)
Evolution tree?
Degree of each internal node = 3
$d_t(s_i,s_j) \ge d(s_i,s_j)$: $d_t(s_i,s_j) = d_{MST}(s_i,s_j)$ and $d_{MST}(s_i,s_j) \ge d(s_i,s_j)$ (triangle inequality)
Error ratio $\le 1$

Hamiltonian cycle and Euler cycle
Hamiltonian cycle: a cycle visiting all of the nodes in G=(V,E) exactly once
Euler cycle: a cycle traversing each edge in G=(V,E) exactly once

Proof of correctness (II)
Error ratio $\le 1$
1. $|MST| \le |TSP|$: delete any edge in TSP to obtain a path $P$; then $|P| < |TSP|$ and $|MST| \le |P|$
2. $|A| = |MST| \le |TSP|$
3. $|TSP| \le 2|OPT|$
4. Duplicate every edge of a tree → there is an Euler tour ($E_t$)
5. $T$: optimal ET → $|E_t| \le 2|OPT|$
6. $CE_t$: the cycle of species of $E_t$ → $|CE_t| \le |E_t|$ (since $d_t(s_i,s_j) \ge d(s_i,s_j)$)
7. $|A| = |MST| \le |TSP| \le |CE_t| \le |E_t| \le 2|OPT|$
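
Steps 4 to 6 can be illustrated on a small tree. The sketch walks a tree depth-first, which traverses every edge twice (the Euler tour of the doubled tree), records the order in which species are met, and compares the length of that cycle of species with the length of the tour. The tree, its weights, and the matrix `d` are hypothetical; `d` was chosen so that every entry is at most the matching tree distance.

```python
def euler_walk(adj, root):
    """DFS over the tree: return (length of the doubled-edge tour, leaf order)."""
    total = 0
    leaves = []

    def dfs(node, parent):
        nonlocal total
        children = [(v, w) for v, w in adj[node] if v != parent]
        if not children:
            leaves.append(node)
        for v, w in children:
            total += 2 * w  # walk down the edge and back up again
            dfs(v, node)

    dfs(root, None)
    return total, leaves

def path_length(adj, src, dst):
    """Length of the unique tree path between src and dst."""
    stack = [(src, None, 0)]
    while stack:
        node, parent, acc = stack.pop()
        if node == dst:
            return acc
        for v, w in adj[node]:
            if v != parent:
                stack.append((v, node, acc + w))
    raise ValueError("dst is not reachable from src")

# Hypothetical unrooted ET: internal nodes x and y, species s1..s4.
adj = {
    "x": [("s1", 1), ("s2", 2), ("y", 1)],
    "y": [("x", 1), ("s3", 2), ("s4", 3)],
    "s1": [("x", 1)], "s2": [("x", 2)],
    "s3": [("y", 2)], "s4": [("y", 3)],
}

# Hypothetical input distances, each no larger than the tree distance.
d = {frozenset(p): v for p, v in [
    (("s1", "s2"), 3), (("s1", "s3"), 4), (("s1", "s4"), 4),
    (("s2", "s3"), 4), (("s2", "s4"), 6), (("s3", "s4"), 5),
]}

euler_len, order = euler_walk(adj, "x")
pairs = list(zip(order, order[1:] + order[:1]))  # consecutive species, cyclic
cycle_dt = sum(path_length(adj, a, b) for a, b in pairs)
cycle_d = sum(d[frozenset((a, b))] for a, b in pairs)
print(order, euler_len, cycle_dt, cycle_d)
```

In this example the cycle measured with input distances (16) is no longer than the Euler tour (18), which is twice the tree size (9).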

5.6 The minimal spanning tree preservation approach for ET construction
$D$: input distance matrix
$D_t$: distance matrix of the ET
MST preservation: $MST(D)$ is an $MST(D_t)$