The Evolution Trees
From: Computational Biology by R. C. T. Lee
S. J. Shyu, Department of Computer Science, Ming Chuan University

Post on 19-Dec-2015



Biological Assumption

All species evolve from a common ancestor.
Root: the suspected common ancestor
Leaves: species (alive)
Internal nodes: unknown species
Length on edge (a, b): the time needed to evolve from a to b

Assumptions on Evolution Trees for Computer Scientists (I)

Rooted evolution tree: the degree of each internal node is 3, except the root.
Unrooted evolution tree: the degree of each internal node is 3.

Assumptions on Evolution Trees for Computer Scientists (II)

The input is a distance matrix among all of the species.
The distances satisfy the triangular inequality relationship.
Depending upon different conditions, different ETs will be constructed to reflect the distances among species.
Let $d(s_i,s_j)$ ($d_t(s_i,s_j)$) be the distance between $s_i$ and $s_j$ in the distance matrix (some evolution tree). Then $d_t(s_i,s_j) \ge d(s_i,s_j)$.
If the ET is rooted, then the distances from the root to all leaves are the same.
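The triangular inequality assumption on the input can be checked directly. The sketch below is only an illustration: the function name and both matrices are hypothetical and do not come from the slides.

```python
from itertools import permutations

def satisfies_triangle_inequality(d):
    """Return True if d[i][k] <= d[i][j] + d[j][k] for all distinct i, j, k."""
    n = len(d)
    return all(d[i][k] <= d[i][j] + d[j][k]
               for i, j, k in permutations(range(n), 3))

# Hypothetical 4-species distance matrix (symmetric, zero diagonal).
D = [
    [0, 4, 4, 3],
    [4, 0, 6, 5],
    [4, 6, 0, 5],
    [3, 5, 5, 0],
]

# Hypothetical matrix that violates the assumption: 9 > 1 + 1.
BAD = [
    [0, 1, 9],
    [1, 0, 1],
    [9, 1, 0],
]

print(satisfies_triangle_inequality(D))
print(satisfies_triangle_inequality(BAD))
```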

How many unrooted evolution trees are there? (I)

Number of edges of an unrooted ET with n species: $NE(n) = 2n-3$ (by induction)
Whenever a new species is added, the number of edges increases by 2.

How many unrooted evolution trees are there? (II)

Every edge can be split to add a new species.
Number of unrooted ETs for n species: $TU(n+1) = (2n-3)\,TU(n)$, or $TU(n) = (2n-5)\,TU(n-1)$
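The recurrence can be evaluated with a short loop. This is a minimal sketch: the function names are made up for illustration, and the base case TU(3) = 1 (three species admit a single unrooted tree) is supplied here rather than stated on the slide.

```python
def num_edges(n):
    """Edges in an unrooted ET with n >= 3 species: NE(n) = 2n - 3."""
    return 2 * n - 3

def num_unrooted(n):
    """TU(n) via TU(k) = (2k - 5) * TU(k - 1), starting from TU(3) = 1."""
    count = 1
    for k in range(4, n + 1):
        # The k-th species can split any of the NE(k - 1) = 2k - 5 existing edges.
        count *= num_edges(k - 1)
    return count

for n in (3, 4, 5, 6, 10):
    print(n, num_edges(n), num_unrooted(n))
```

Even for 10 species the count already exceeds two million, which is why exhaustive enumeration of tree structures is impractical.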

How many rooted evolution trees are there? (I)

Every edge in an unrooted ET can be split to add a root, which turns the ET into a rooted one.

How many rooted evolution trees are there? (II)

Number of rooted ETs for n species: $TR(n) = (2n-3)\,TU(n) = TU(n+1)$
The number of rooted ETs is much larger than that of unrooted ETs.
It is desirable to consider unrooted ETs.
Still, we cannot explain evolution by an unrooted tree.
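A quick numerical comparison of the two counts, as a sketch only. The function names are hypothetical, and TU(n) is written as the product of odd numbers 1 · 3 · 5 · ... · (2n - 5), which follows from the recurrence with TU(3) = 1.

```python
def num_unrooted(n):
    """TU(n) = 1 * 3 * 5 * ... * (2n - 5) for n >= 3."""
    count = 1
    for odd in range(3, 2 * n - 4, 2):
        count *= odd
    return count

def num_rooted(n):
    """TR(n) = (2n - 3) * TU(n): the root can be placed on any of the 2n - 3 edges."""
    return (2 * n - 3) * num_unrooted(n)

for n in range(3, 11):
    # Adding a root behaves like adding one more species to an unrooted tree.
    assert num_rooted(n) == num_unrooted(n + 1)
    print(n, num_unrooted(n), num_rooted(n))
```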

Transforming unrooted ETs into rooted ones

Add a species which is exceedingly different from the species analyzed. This outlier species causes a long link that can be used to identify a root.

Distance matrix vs. evolution trees

The input of an evolution tree problem is a distance matrix.

We are asked to construct an evolution tree to perfectly reflect these distances.

The goodness of an evolution tree is evaluated under some criterion.

Criteria of evolution trees

Let $d(s_i,s_j)$ ($d_t(s_i,s_j)$) be the distance between $s_i$ and $s_j$ in the distance matrix (some evolution tree).

1. Minimax ETs: the maximum of $d_t(s_i,s_j) - d(s_i,s_j)$ is minimized
2. Minisum ETs: the total sum of all pairwise distances $d_t(s_i,s_j)$ is minimized
3. Minisize ETs: the total length of the tree is minimized
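To make the three criteria concrete, the sketch below evaluates them for one candidate tree. The tree (internal nodes `x` and `y`, edge weights) and the distance values are hypothetical, and the edge-list representation is only one possible choice.

```python
from itertools import combinations

def tree_distances(edges, species):
    """Path lengths between every pair of species in a weighted tree."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))

    def from_source(src):
        dist, stack = {src: 0.0}, [src]
        while stack:
            node = stack.pop()
            for nxt, w in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + w
                    stack.append(nxt)
        return dist

    all_dist = {s: from_source(s) for s in species}
    return {(a, b): all_dist[a][b] for a, b in combinations(species, 2)}

def criteria(edges, species, d):
    """Return (minimax value, minisum value, minisize value) of one tree."""
    dt = tree_distances(edges, species)
    assert all(dt[p] >= d[p] for p in dt), "tree must satisfy d_t >= d"
    minimax = max(dt[p] - d[p] for p in dt)   # largest excess over the matrix
    minisum = sum(dt.values())                # sum of all pairwise tree distances
    minisize = sum(w for _, _, w in edges)    # total edge length
    return minimax, minisum, minisize

# Hypothetical input distances and candidate unrooted ET.
species = ["s1", "s2", "s3", "s4"]
d = {("s1", "s2"): 4, ("s1", "s3"): 4, ("s1", "s4"): 3,
     ("s2", "s3"): 6, ("s2", "s4"): 5, ("s3", "s4"): 5}
edges = [("s1", "x", 1.5), ("s4", "x", 1.5), ("x", "y", 1.5),
         ("s2", "y", 3.0), ("s3", "y", 3.0)]

print(criteria(edges, species, d))  # (2.0, 33.0, 10.5)
```

Each criterion scores the same tree differently, so a tree that is optimal under one criterion need not be optimal under another.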

The complexity of ET problems

            Minimax       Minisum       Minisize
Unrooted    NP-complete   NP-complete   Unknown
Rooted      $O(n^2)$      NP-complete   NP-complete

A Minimax rooted ET Algorithm

Idea: Preserve the longest distance
Suppose $d(s_i,s_j)$ is the longest distance in the input matrix.
$d_t(s_i,s_j) = d(s_i,s_j)$
• Recursively apply for $T_i$ and $T_j$
• Which species are in $T_i$ ($T_j$)?

Minimax rooted ET Algorithm

Example

5.4 The determination of weights when the ET structure is given

What is the minisize unrooted ET if the ET structure is given? By linear programming.
As above for minisize unrooted ET
As above for minimax unrooted ET
How to determine the structure of the ET is a problem (the number of possible ETs is exponential in n): open.
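One way to read "by linear programming" for the minisize case is sketched below. The notation is an assumption introduced here, not taken from the slides: $E$ is the edge set of the given structure, $w_e$ is the unknown weight of edge $e$, and $P(s_i,s_j)$ is the set of edges on the tree path between $s_i$ and $s_j$.

```latex
\begin{aligned}
\text{minimize}\quad   & \sum_{e \in E} w_e \\
\text{subject to}\quad & \sum_{e \in P(s_i,s_j)} w_e \;\ge\; d(s_i,s_j) \quad \text{for all } i < j, \\
                       & w_e \ge 0 \quad \text{for all } e \in E.
\end{aligned}
```

Under the same assumptions, the minimax case would keep these constraints, add $\sum_{e \in P(s_i,s_j)} w_e - d(s_i,s_j) \le z$ for every pair, and minimize the extra variable $z$ instead of the total weight.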

Examples

5.5 UPGMA for rooted ETs

A heuristic to determine a reasonably good structure of rooted ETs.
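The slide does not spell out the steps of the heuristic, so the sketch below follows the usual textbook description of UPGMA: repeatedly merge the two clusters with the smallest average pairwise distance and place their parent at half that distance. The function name, the tuple representation of the tree, and the distance values are all hypothetical.

```python
from itertools import combinations

def upgma(names, d):
    """Textbook UPGMA. d[a][b] is the distance between species a and b.
    Returns (tree, height); a tree is a species name or (left, right, height)."""
    # Each cluster id maps to (subtree, member species, height of subtree root).
    clusters = {i: (name, [name], 0.0) for i, name in enumerate(names)}
    next_id = len(names)

    def avg_dist(a, b):
        ma, mb = clusters[a][1], clusters[b][1]
        return sum(d[x][y] for x in ma for y in mb) / (len(ma) * len(mb))

    while len(clusters) > 1:
        # Pick the closest pair of clusters (ties go to the first pair found).
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda pair: avg_dist(*pair))
        height = avg_dist(a, b) / 2  # root-to-leaf distance of the new node
        ta, ma, _ = clusters.pop(a)
        tb, mb, _ = clusters.pop(b)
        clusters[next_id] = ((ta, tb, height), ma + mb, height)
        next_id += 1

    (tree, _, height), = clusters.values()
    return tree, height

# Hypothetical distance matrix for four species.
names = ["s1", "s2", "s3", "s4"]
M = [[0, 4, 4, 3],
     [4, 0, 6, 5],
     [4, 6, 0, 5],
     [3, 5, 5, 0]]
D = {a: {b: M[i][j] for j, b in enumerate(names)} for i, a in enumerate(names)}

tree, height = upgma(names, D)
print(tree)    # ('s3', ('s2', ('s1', 's4', 1.5), 2.25), 2.5)
print(height)  # 2.5
```

Because every leaf under a merged node sits at the same height, the result is a rooted tree in which the root is equidistant from all leaves.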

Example (UPGMA)

5.6 The neighbor joining method for unrooted ETs

A heuristic to determine a reasonably good structure of unrooted ETs.

Example (NJ) (I)

1 center $x$:
$w(x, s_1) = \frac{1}{3}\,(d(s_1,s_2)+d(s_1,s_3)+d(s_1,s_4)) = \frac{1}{3}\,(4+4+3) = 3.67$
(the mean of the distances from this species to all other species)
$w(x, s_i) = \mathrm{average}(s_i) = \frac{1}{n-1}\sum_{j \ne i} d(s_i,s_j)$

Example (NJ) (II)

Geometrical center
$OC - NC = 8.67 - 6.33 = 2.34$

Example (NJ) (III)

$OC - NC = (\mathrm{average}(s_1)+\mathrm{average}(s_2)) - \frac{\mathrm{average}(s_1)+\mathrm{average}(s_2)+d(s_1,s_2)}{2} = \frac{\mathrm{average}(s_1)+\mathrm{average}(s_2)-d(s_1,s_2)}{2}$
$d(s_1,s_2)$ is preserved
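The OC - NC expression can be evaluated for every pair of species. In the sketch below, the row for s1 (4, 4, 3) comes from the example, and the s2 row is chosen so that average(s1) + average(s2) is about 8.67. The remaining entries are hypothetical, as is the rule of joining the pair with the largest saving.

```python
from itertools import combinations

def average(d, i):
    """Mean distance from species i to all other species."""
    n = len(d)
    return sum(d[i][j] for j in range(n) if j != i) / (n - 1)

def saving(d, i, j):
    """OC - NC when species i and j are joined: (avg_i + avg_j - d_ij) / 2."""
    return (average(d, i) + average(d, j) - d[i][j]) / 2

# Index 0..3 stands for s1..s4. Only row 0 and the sum of row 1 match the
# slides; the other entries are made up for illustration.
D = [
    [0, 4, 4, 3],
    [4, 0, 6, 5],
    [4, 6, 0, 5],
    [3, 5, 5, 0],
]

for i, j in combinations(range(len(D)), 2):
    print(f"s{i + 1}, s{j + 1}: {saving(D, i, j):.2f}")

# Assumed selection rule: join the pair with the largest saving.
best = max(combinations(range(len(D)), 2), key=lambda p: saving(D, *p))
print("join:", best)
```

For s1 and s2 this gives (3.67 + 5 - 4) / 2, about 2.33, which agrees with the example up to rounding.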

5.6 An Approximation Algorithm for an unrooted minisize ET

Unrooted minisize ET: no polynomial algorithm is known, and the problem has never been proved to be NP-complete.
An approximation algorithm with size smaller than twice the size of an optimal solution:

Example (I)

Minimal spanning tree → an unrooted minisize ET with error ratio $\le 1$

Proof of correctness (I)

Evolution tree?
Degree of each internal node = 3
$d_t(s_i,s_j) \ge d(s_i,s_j)$: $d_t(s_i,s_j) = d_{MST}(s_i,s_j)$ and $d_{MST}(s_i,s_j) \ge d(s_i,s_j)$ (triangular inequality)
Error ratio $\le 1$
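The claim that path lengths inside a minimal spanning tree never fall below the input distances can be checked numerically. The sketch below uses Prim's algorithm on a hypothetical matrix; it covers only this inequality and does not convert the spanning tree into an ET with degree-3 internal nodes.

```python
def prim_mst(d):
    """Prim's algorithm on the complete graph given by distance matrix d.
    Returns a list of (u, v, weight) edges."""
    n = len(d)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree.
        u, v = min(((a, b) for a in in_tree for b in range(n) if b not in in_tree),
                   key=lambda e: d[e[0]][e[1]])
        edges.append((u, v, d[u][v]))
        in_tree.add(v)
    return edges

def path_lengths(n, edges):
    """All-pairs path lengths inside a tree on nodes 0..n-1."""
    adj = {i: [] for i in range(n)}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [[0] * n for _ in range(n)]
    for src in range(n):
        seen, stack = {src}, [src]
        while stack:
            node = stack.pop()
            for nxt, w in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    dist[src][nxt] = dist[src][node] + w
                    stack.append(nxt)
    return dist

# Hypothetical distance matrix that satisfies the triangular inequality.
D = [
    [0, 4, 4, 3],
    [4, 0, 6, 5],
    [4, 6, 0, 5],
    [3, 5, 5, 0],
]

mst = prim_mst(D)
d_mst = path_lengths(len(D), mst)
print(sum(w for _, _, w in mst))  # 11
print(all(d_mst[i][j] >= D[i][j] for i in range(4) for j in range(4)))  # True
```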

Hamiltonian cycle and Euler cycle

Hamiltonian cycle: a cycle visiting all of the nodes in G=(V,E) exactly once

Euler cycle: a cycle traversing each edge in G=(V,E) exactly once

Proof of correctness (II)

Error ratio $\le 1$

1. $|MST| \le |TSP|$: let $P$ be the path obtained by deleting any edge of the TSP tour; then $|P| < |TSP|$ and $|MST| \le |P|$
2. $|A| = |MST| \le |TSP|$
3. $|TSP| \le 2|OPT|$
4. Duplicate every edge of a tree → there is an Euler tour ($Et$)
5. $T$: optimal ET → $|Et| \le 2|OPT|$
6. $CEt$: the cycle of species of $Et$ → $|CEt| \le |Et|$ (since $d_t(s_i,s_j) \ge d(s_i,s_j)$)
7. $|A| = |MST| \le |TSP| \le |CEt| \le |Et| \le 2|OPT|$
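The doubling and shortcut steps of the argument can be traced on a small case. In the sketch below a hypothetical tree T merely stands in for the optimal ET (it is not claimed to be optimal), and all names and distances are made up. The depth-first walk uses every edge twice, so its length is 2|T|; listing the species in order of first visit gives the cycle whose length, measured with the input distances, does not exceed that of the walk.

```python
def euler_tour(edges, start):
    """Walk every tree edge twice (DFS); return the node sequence and its length."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    tour, length = [start], 0.0

    def visit(node, parent):
        nonlocal length
        for nxt, w in adj[node]:
            if nxt != parent:
                tour.append(nxt)   # go down the edge
                length += w
                visit(nxt, node)
                tour.append(node)  # come back up the same edge
                length += w

    visit(start, None)
    return tour, length

def species_cycle(tour, species):
    """Keep the first visit of each species, then return to the start."""
    seen, order = set(), []
    for node in tour:
        if node in species and node not in seen:
            seen.add(node)
            order.append(node)
    return order + order[:1]

# Hypothetical input distances and a tree with d_t >= d for every pair.
names = ["s1", "s2", "s3", "s4"]
M = [[0, 4, 4, 3],
     [4, 0, 6, 5],
     [4, 6, 0, 5],
     [3, 5, 5, 0]]
D = {a: {b: M[i][j] for j, b in enumerate(names)} for i, a in enumerate(names)}
T = [("s1", "x", 1.5), ("s4", "x", 1.5), ("x", "y", 1.5),
     ("s2", "y", 3.0), ("s3", "y", 3.0)]

tour, tour_len = euler_tour(T, "s1")
cycle = species_cycle(tour, set(names))
cycle_len = sum(D[a][b] for a, b in zip(cycle, cycle[1:]))
print(tour_len, cycle, cycle_len)  # 21.0 ['s1', 's4', 's2', 's3', 's1'] 18
```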

5.6 The minimal spanning tree preservation approach for ET construction

$D$: input distance matrix
$D_t$: distance matrix of the ET
MST preservation: MST($D$) is also an MST of $D_t$

Example (I)

Example (II)