A Linear-Time Algorithm for Computing Inversion Distance between Signed Permutations with an Experimental Study

David Bader, Bernard Moret, Mi Yan

Presented by Anat Heilper

Post on 22-Dec-2015



Motivation

The evolutionary process in many single-chromosome organisms consists mostly of inversions.

For this reason, phylogenies are reconstructed from gene order, using inversion distance as the measure of evolutionary distance between two genomes.


Model assumptions

Each genome is an ordering (circular or linear) of a fixed set of genes {g1, g2, …, gn}.

Each gene has an orientation: positive (gi) or negative (-gi).


Inversion

Let G be the genome with the signed ordering (linear or circular) g1, g2, …, gn.

Inversion(i, j), with i ≤ j, transforms

g1, g2, …, gi-1, gi, gi+1, …, gj, gj+1, …, gn

into

g1, g2, …, gi-1, -gj, -gj-1, …, -gi, gj+1, …, gn
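The definition can be illustrated with a minimal Python sketch. It assumes a linear genome stored as a list of signed integers and 1-based, inclusive indices i ≤ j; the function name `invert` is illustrative and does not come from the paper.

```python
def invert(genome, i, j):
    """Return a copy of `genome` with genes i..j (1-based, inclusive)
    reversed in order and flipped in sign."""
    if not 1 <= i <= j <= len(genome):
        raise ValueError("need 1 <= i <= j <= n")
    # Reverse the segment and negate every gene in it.
    segment = [-g for g in reversed(genome[i - 1:j])]
    return genome[:i - 1] + segment + genome[j:]


genome = [1, 2, 3, 4, 5]
print(invert(genome, 2, 4))  # [1, -4, -3, -2, 5]
```

Applying the same inversion twice restores the original ordering, which is a quick sanity check for any implementation.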


Inversion: what happens if j < i?

Circular case: rotate the circular ordering until the two indices are in the proper relationship (i ≤ j).

Linear case: not applicable.


Inversion Distance

The inversion distance is the minimum number of inversions needed to transform one permutation into the other.

Computing the inversion distance between unsigned permutations is NP-hard.


Inversion Distance Sorting

Computing the shortest sequence of inversions can also be treated as a sorting problem.

This problem is also known as "sorting by reversals".


Previous work

Hannenhalli and Pevzner, 1995: first polynomial-time algorithm.

Berman and Hannenhalli, 1996: runs in O(n α(n)) time, where α(n) is the inverse Ackermann function.

Bader, Moret, and Yan, 2000: runs in linear time.


What shall we see today?

A simple and practical linear-time algorithm to compute the connected components of the overlap graph.

Experimental evidence that this algorithm is efficient in practice.


Transformation

Given a signed permutation of {1, …, n}, we transform it into an unsigned permutation π of {0, …, 2n+1}:

Substitute each positive element x by the ordered pair (2x-1, 2x).

Substitute each negative element -x by the ordered pair (2x, 2x-1).

Set π(0) = 0 and π(2n+1) = 2n+1.


Example

+3 +9 -7 +5 -10 +8 +4 -6 +11 +2 +1

5 6 17 18 14 13 9 10 20 19 15 16 7 8 12 11 21 22 3 4 1 2

0 5 6 17 18 14 13 9 10 20 19 15 16 7 8 12 11 21 22 3 4 1 2 23

Rules used: x → (2x-1, 2x); -x → (2x, 2x-1); π(0) = 0, π(2n+1) = 2n+1.
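The transformation is easy to check mechanically. The sketch below reproduces the example above in Python; the function name `to_unsigned` is an illustrative choice, and the signed permutation is assumed to be a list of nonzero integers.

```python
def to_unsigned(signed):
    """Map a signed permutation of 1..n to an unsigned permutation of 0..2n+1."""
    n = len(signed)
    unsigned = [0]                                # pi(0) = 0
    for x in signed:
        a = abs(x)
        if x > 0:
            unsigned.extend((2 * a - 1, 2 * a))   # +x -> (2x-1, 2x)
        else:
            unsigned.extend((2 * a, 2 * a - 1))   # -x -> (2x, 2x-1)
    unsigned.append(2 * n + 1)                    # pi(2n+1) = 2n+1
    return unsigned


example = [3, 9, -7, 5, -10, 8, 4, -6, 11, 2, 1]
print(to_unsigned(example))
# [0, 5, 6, 17, 18, 14, 13, 9, 10, 20, 19, 15, 16, 7, 8, 12, 11, 21, 22, 3, 4, 1, 2, 23]
```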


Assumptions

Both permutations have been turned into unsigned permutations.

Both unsigned permutations are relabeled so that the first permutation becomes the identity permutation (0, 1, …, 2n, 2n+1).

This is why the problem can be viewed as sorting: find the minimum number of inversions needed to transform the given permutation into the identity permutation.
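One way to picture the relabeling is the following Python sketch. As an assumption made for brevity, it renames genes on the signed permutations (before the unsigned transformation) rather than on the unsigned ones as the slide describes; the function name `relative_to_identity` is hypothetical.

```python
def relative_to_identity(first, second):
    """Rename genes so that `first` becomes +1, +2, ..., +n and return
    `second` expressed under the same renaming."""
    rename = {}
    for idx, g in enumerate(first, start=1):
        # A gene that appears negated in `first` is renamed with a flipped sign.
        rename[abs(g)] = idx if g > 0 else -idx
    return [rename[abs(g)] * (1 if g > 0 else -1) for g in second]


first = [3, -1, 2]
second = [2, 3, -1]
print(relative_to_identity(first, first))   # [1, 2, 3]
print(relative_to_identity(first, second))  # [3, 1, 2]
```

After this step only the second permutation matters, so the distance computation reduces to sorting it into the identity.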


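Cycle graph

We represent the unsigned permutation π by an edge-colored graph called the "cycle graph".

Properties of the graph:

2n+2 vertices.

For each i, 0 ≤ i ≤ n, there is a black edge between vertices π(2i) and π(2i+1), that is, between the elements at adjacent positions 2i and 2i+1.

For each i, 0 ≤ i ≤ n, there is a gray edge between vertices 2i and 2i+1, that is, between consecutive values.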


Cycle graph

i      0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

π(i)   0  5  6 17 18 14 13  9 10 20 19 15 16  7  8 12 11 21 22  3  4  1  2 23

Every vertex has exactly one black and one gray edge, so the resulting graph consists of disjoint cycles in which edges alternate colors.
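The cycles can be found by walking alternately along black and gray edges. The Python sketch below assumes the convention that black edges join positions 2i and 2i+1 and gray edges join values 2i and 2i+1; the function name `label_cycles` and the integer cycle identifiers are illustrative choices, not part of the slides.

```python
def label_cycles(pi):
    """Return (cycle, extents): cycle[p] is the id of the cycle through
    position p, and extents[c] = (B, E) are its first and last positions."""
    m = len(pi)                       # m = 2n + 2
    pos = [0] * m
    for p, v in enumerate(pi):
        pos[v] = p                    # inverse permutation: value -> position
    cycle = [None] * m
    extents = []
    for start in range(m):
        if cycle[start] is not None:
            continue
        c = len(extents)              # next unused cycle id
        members = []
        p = start
        while cycle[p] is None:
            q = p ^ 1                 # black edge: positions 2i and 2i+1
            cycle[p] = cycle[q] = c
            members += [p, q]
            p = pos[pi[q] ^ 1]        # gray edge: values 2i and 2i+1
        extents.append((min(members), max(members)))
    return cycle, extents


PI = [0, 5, 6, 17, 18, 14, 13, 9, 10, 20, 19, 15,
      16, 7, 8, 12, 11, 21, 22, 3, 4, 1, 2, 23]
cycle, extents = label_cycles(PI)
print("".join("ABCDEF"[c] for c in cycle))  # AABBCCDDEECCBBDDEEFFAAFF
print(extents)  # [(0, 21), (2, 13), (4, 11), (6, 15), (8, 17), (18, 23)]
```

Each position is visited once, so the labeling takes time linear in the length of the permutation.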


Overlapping edges

Two gray edges (π(i), π(j)) and (π(k), π(t)) overlap if the two intervals [i, j] and [k, t] overlap but neither one contains the other.


Overlapping cycles

Two cycles C1 and C2 overlap if there exist overlapping gray edges e1 ∈ C1 and e2 ∈ C2.

Extent of a cycle C: the interval [C.B, C.E], where C.B = min{i | π(i) ∈ C} is the beginning of the cycle and C.E = max{i | π(i) ∈ C} is the ending of the cycle.

Extent of a set of cycles {C1, …, Ck}: the interval [B, E], where B = min{Ci.B | 1 ≤ i ≤ k} and E = max{Ci.E | 1 ≤ i ≤ k}.


Overlap graph of a permutation

One vertex for each cycle in the cycle graph.

An edge between any two vertices that correspond to overlapping cycles.


Overlap graph

[Figure: the cycle graph of the example permutation, with its six cycles labeled a to f.]

One vertex for each cycle in the cycle graph.


Overlap graph

[Figure: the overlap graph of the example permutation, on the vertices a to f.]

An edge between any two vertices that correspond to overlapping cycles.
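For comparison with the linear-time method, the definitions above can be turned directly into a quadratic baseline. The Python sketch below tests every pair of gray edges and groups cycles with a small union-find; it is only an illustration of the definitions, all names are hypothetical, and cycles are identified by their beginning position C.B.

```python
from itertools import combinations


def find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]   # path halving
        x = parent[x]
    return x


def union(parent, a, b):
    ra, rb = find(parent, a), find(parent, b)
    parent[max(ra, rb)] = min(ra, rb)   # the smallest position stays the root


def overlap(a, b):
    (i, j), (k, t) = sorted((a, b))
    return i < k < j < t                # intervals cross, neither contains the other


def overlap_components(pi):
    m = len(pi)
    pos = {v: p for p, v in enumerate(pi)}
    # Gray edges join values 2i and 2i+1; store each as a position interval.
    gray = [tuple(sorted((pos[v], pos[v + 1]))) for v in range(0, m, 2)]
    cyc = list(range(m))
    for p in range(0, m, 2):
        union(cyc, p, p + 1)            # black edges
    for i, j in gray:
        union(cyc, i, j)                # gray edges
    comp = cyc[:]
    for a, b in combinations(gray, 2):  # quadratic number of pairs
        if overlap(a, b):
            union(comp, a[0], b[0])
    groups = {}
    for p in range(m):
        groups.setdefault(find(comp, p), set()).add(find(cyc, p))
    return sorted(sorted(g) for g in groups.values())


PI = [0, 5, 6, 17, 18, 14, 13, 9, 10, 20, 19, 15,
      16, 7, 8, 12, 11, 21, 22, 3, 4, 1, 2, 23]
print(overlap_components(PI))  # [[0, 18], [2, 4, 6, 8]]
```

On the running example this groups the cycles beginning at positions 0 and 18 into one component and those beginning at 2, 4, 6 and 8 into another.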


By now we know how to:

Transform a signed permutation into an unsigned permutation.

Create an overlap graph from the cycle graph.

Compute the minimum number of inversions and an inversion sequence from the overlap graph.

The bottleneck of the algorithm is building the overlap graph, which takes O(n^2) time.

How can we improve the algorithm?

Build a much smaller graph that captures the same information as the overlap graph.


Building the overlap forest

The overlap forest is created in two scans of the permutation:

First scan: build a trivial forest F0 in which each node is its own tree.

Second scan: iteratively refine the forest. At each iteration, the domain in which we search for overlapping cycles is expanded by one position (2n+2 iterations overall).

Active tree: a tree rooted at f is active at stage j whenever j lies properly within the extent of f. The extents of the active trees are stored on a stack.


Building the overlap forest: refinement stage

Let Fj-1 be the forest constructed from processing elements 0 through j-1, and let f be the cycle containing element j of the permutation. Build Fj from Fj-1 as follows:

If j is the beginning of its cycle f, then f is the root of a single-node tree.

Otherwise, if cycle f overlaps with another cycle g, add an edge (g, f) and compute the combined extent of g and of the tree rooted at f.


Building the overlap forest: the algorithm

1) Scan the permutation, label each position i with its cycle C[i], and set up the extent [C[i].B, C[i].E].

2) Initialize an empty stack.

3) For i ← 0 to 2n+1:

    a) If (i == C[i].B) then push C[i].    (i is the beginning of its own cycle)

    b) extent ← C[i]

       While (top.B > C[i].B) {    (top is the top element of the stack; the cycle that i belongs to overlaps another cycle)

           extent.B ← min{extent.B, top.B}

           extent.E ← max{extent.E, top.E}

           parent[top.B] ← C[i].B

           pop top

       } End while

       top.B ← min{extent.B, top.B}

       top.E ← max{extent.E, top.E}

    c) If (i == top.E) then pop top.    (the tree at the top of the stack is no longer active)
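The pseudocode translates almost line by line into Python. The sketch below is one possible reading, not the authors' implementation: it assumes the cycle label of every position is already known (step 1), keys `parent` by cycle label instead of by C.B, and uses the hypothetical name `overlap_forest`.

```python
def overlap_forest(cycle_of):
    """cycle_of[i] is the label of the cycle through position i.
    Returns a dict mapping each non-root cycle to its parent in the forest."""
    # Step 1: extents [B, E] of every cycle (mutable, they grow during the scan).
    ext = {}
    for i, c in enumerate(cycle_of):
        ext.setdefault(c, [i, i])[1] = i
    parent = {}
    stack = []                                   # step 2: roots of the active trees
    for i, c in enumerate(cycle_of):             # step 3
        if i == ext[c][0]:                       # (a) i begins its own cycle
            stack.append(c)
        b, e = ext[c]                            # (b) extent <- C[i]
        while ext[stack[-1]][0] > ext[c][0]:     # top began after c, so they are linked
            top = stack.pop()
            b = min(b, ext[top][0])
            e = max(e, ext[top][1])
            parent[top] = c
        top = stack[-1]
        ext[top][0] = min(b, ext[top][0])
        ext[top][1] = max(e, ext[top][1])
        if i == ext[top][1]:                     # (c) the top tree is no longer active
            stack.pop()
    return parent


print(overlap_forest("AABBCCDDEECCBBDDEEFFAAFF"))
# {'E': 'C', 'D': 'C', 'C': 'B', 'F': 'A'}
```

With parents written as beginning positions, as in the slides, the same run gives parent[8] = 4, parent[6] = 4, parent[4] = 2 and parent[18] = 0.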


Example of building an overlap forest

First stage: setup (the six cycles a to f of the cycle graph are labeled A to F)

i        0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

π(i)     0  5  6 17 18 14 13  9 10 20 19 15 16  7  8 12 11 21 22  3  4  1  2 23

C[i].B   0  0  2  2  4  4  6  6  8  8  4  4  2  2  6  6  8  8 18 18  0  0 18 18

C[i].E  21 21 13 13 11 11 15 15 17 17 11 11 13 13 15 15 17 17 23 23 21 21 23 23

C[i]     A  A  B  B  C  C  D  D  E  E  C  C  B  B  D  D  E  E  F  F  A  A  F  F


i = 0:

a) i == C[0].B, so push A.

b) extent ← A, with C[0].B = 0 and C[0].E = 21. while (0 = top.B > C[0].B = 0)? No.

c) if (0 == 21)? No.

Stack (top first): A


i = 1:

a) i = 1 ≠ C[1].B = 0, so nothing is pushed.

b) extent ← A, with C[1].B = 0 and C[1].E = 21. while (0 = top.B > C[1].B = 0)? No.

Stack (top first): A


i = 2:

a) i == C[2].B, so push B.

b) extent ← B, with C[2].B = 2 and C[2].E = 13. while (2 = top.B > C[2].B = 2)? No.

i = 3: very similar to i = 2.

Stack (top first): B, A


i = 4:

a) i == C[4].B, so push C.

b) extent ← C, with C[4].B = 4 and C[4].E = 11. while (4 = top.B > C[4].B = 4)? No.

i = 5: very similar to i = 4.

Stack (top first): C, B, A


i = 6:

a) i == C[6].B, so push D.

b) extent ← D, with C[6].B = 6 and C[6].E = 15. while (6 = top.B > C[6].B = 6)? No.

i = 7: very similar to i = 6.

Stack (top first): D, C, B, A


i = 8:

a) i == C[8].B, so push E.

b) extent ← E, with C[8].B = 8 and C[8].E = 17. while (8 = top.B > C[8].B = 8)? No.

i = 9: very similar to i = 8.

Stack (top first): E, D, C, B, A


i = 10:

a) i = 10 ≠ C[10].B = 4, so nothing is pushed.

b) extent ← C = [4, 11].

while (8 = top.B > C[10].B = 4): extent.B ← min{4, 8} = 4; extent.E ← max{11, 17} = 17; parent[8] ← 4; pop top (E).

while (6 = top.B > C[10].B = 4): extent.B ← min{4, 6} = 4; extent.E ← max{17, 15} = 17; parent[6] ← 4; pop top (D).

Stack (top first): C, B, A

Forest: C now has children E and D; the combined extent is [4, 17].


i = 10 (continued):

while (4 = top.B > C[10].B = 4)? No.

top.B ← min{4, 4} = 4; top.E ← max{17, 11} = 17.

c) if (10 == 17)? No.

Stack (top first): C, B, A

Forest: C with children E and D, extent [4, 17]. C[i].E is now 17 at the positions of cycle C (4, 5, 10, 11).


i = 11:

a) i = 11 ≠ C[11].B = 4, so nothing is pushed.

b) extent ← C = [4, 17]. while (4 = top.B > C[11].B = 4)? No.

top.B ← min{4, 4} = 4; top.E ← max{17, 17} = 17.

c) if (11 == 17)? No.

Stack (top first): C, B, A


i = 12:

a) i = 12 ≠ C[12].B = 2, so nothing is pushed.

b) extent ← B = [2, 13].

while (4 = top.B > C[12].B = 2): extent.B ← min{2, 4} = 2; extent.E ← max{13, 17} = 17; parent[4] ← 2; pop top (C).

while (2 = top.B > C[12].B = 2)? No.

top.B ← min{2, 2} = 2; top.E ← max{17, 13} = 17.

c) if (12 == 17)? No.

Stack (top first): B, A

Forest: B has child C, which has children E and D; the extent of the tree is [2, 17]. C[i].E is now 17 at the positions of cycle B (2, 3, 12, 13).


i = 13:

a) i = 13 ≠ C[13].B = 2, so nothing is pushed.

b) extent ← B = [2, 17]. while (2 = top.B > C[13].B = 2)? No.

top.B ← min{2, 2} = 2; top.E ← max{17, 17} = 17.

c) if (13 == 17)? No.

Stack (top first): B, A (the tree rooted at B has extent [2, 17])


i = 14:

a) i = 14 ≠ C[14].B = 6, so nothing is pushed.

b) extent ← D = [6, 15]. while (2 = top.B > C[14].B = 6)? No.

top.B ← min{6, 2} = 2; top.E ← max{15, 17} = 17.

c) if (14 == 17)? No.

i = 15, 16: similar to i = 14.

i = 17: i == top.E, so pop top (B).

Stack (top first): A


i = 18:

a) i == C[18].B, so push F.

b) extent ← F, with C[18].B = 18 and C[18].E = 23. while (18 = top.B > C[18].B = 18)? No.

c) if (18 == 23)? No.

i = 19: very similar to i = 18.

Stack (top first): F, A

Forest so far: B has child C, which has children E and D.


i = 20:

a) i = 20 ≠ C[20].B = 0, so nothing is pushed.

b) extent ← A = [0, 21].

while (18 = top.B > C[20].B = 0): extent.B ← min{0, 18} = 0; extent.E ← max{21, 23} = 23; parent[18] ← 0; pop top (F).

while (0 = top.B > C[20].B = 0)? No.

top.B ← min{0, 0} = 0; top.E ← max{23, 21} = 23.

i = 21, 22: nothing changes, because top.E is now 23.

i = 23: i == top.E, so pop top (A). The stack is empty.

Final forest: A has child F; B has child C, which has children E and D.


Lemma 1

At iteration i of step (3) of the algorithm, if the tree rooted at top is active, i lies on cycle f, and f.B < top.B, then there exists a cycle h in the tree rooted at top such that h overlaps with f.

Proof: Since top is active, it must have been pushed onto the stack before the current iteration (top.B < i), and we have not yet reached the end of top's extent (i < top.E). Hence i is contained in top's extent (top.B < i < top.E). Since i lies on cycle f, which begins before top (f.B < top.B), there must be a gray edge of cycle f that overlaps a gray edge of some cycle h in the tree rooted at top.


Theorem

The algorithm produces a forest in which each tree is composed of exactly those nodes that form a connected component.
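Given the parent pointers, reading off the connected components is a matter of following each node to its root. The Python sketch below uses the parent values from the worked example (cycles identified by their beginning positions); the function name `components_from_forest` is hypothetical, and a traversal from the roots would be needed to keep this step strictly linear on deep trees.

```python
def components_from_forest(cycles, parent):
    """Group cycle identifiers by the root of their tree.
    `parent` maps each non-root cycle to its parent."""
    def root_of(c):
        while c in parent:          # climb until a node without a parent
            c = parent[c]
        return c

    groups = {}
    for c in cycles:
        groups.setdefault(root_of(c), []).append(c)
    return groups


cycles = [0, 2, 4, 6, 8, 18]                 # C.B of cycles A, B, C, D, E, F
parent = {8: 4, 6: 4, 4: 2, 18: 0}           # from the example run
print(components_from_forest(cycles, parent))
# {0: [0, 18], 2: [2, 4, 6, 8]}
```

The two groups are the cycles {A, F} and {B, C, D, E} of the example.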


Proof

Invariant: after each iteration, the trees in the forest correspond exactly to the connected components determined by the permutation values scanned up to that point. The proof is by induction on the number of applications of step 3 of the algorithm.

Base: each tree of F0 has a single node, and no two nodes belong to the same connected component.

Step: assume the invariant holds after (i-1) iterations and let i lie on cycle f. We prove that the nodes of the tree containing i form the same set as the nodes of the connected component containing i (other connected components are unaffected and so still obey the invariant).


Proof, first direction: a node in the tree containing i must be in the same connected component as i.

Case i = f.B: nothing changes in the overlap graph (and thus in the connected components), and by step (3) the forest remains unchanged, so the invariant is preserved.

Case i > f.B: an edge (top, f) is added to the forest whenever f.B < top.B holds. The edge (top, f) joins the subtree rooted at f with the subtree rooted at top into a single subtree.

f.B < top.B ⇒ there exists h in the tree rooted at top such that h and f overlap (Lemma 1) ⇒ (h, f) must belong to the overlap graph ⇒ (h, f) connects the connected component containing f with the one containing top, merging them into a single connected component ⇒ the invariant is preserved.


Proof, second direction: a node in the same connected component as i must be in the tree containing i.

Whenever (j, i) and (k, l), with j < k < i < l, are gray edges on cycles f and h, respectively, edge (f, h) must belong to the overlap graph. In such a case, our algorithm ensures that edge (h, f) belongs to the overlap forest.

[Figure: positions j < k < i < l, with the gray edge (j, i) of cycle f crossing the gray edge (k, l) of cycle h.]


Complexity

1) Setup and initialization of the empty stack: linear time.

2) Main loop (for i ← 0 to 2n+1, steps a to c): each cycle is pushed onto the stack exactly once, and it is popped either after an edge is added to the forest or after the end of its extent is passed. Thus the loop also takes linear time.

Every step of the algorithm takes linear time, so the entire algorithm runs in worst-case linear time.
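The amortized argument can be written out as a short count. This is a sketch: the constants c_1 and c_2 are illustrative symbols that do not appear in the slides, and it uses the fact that every cycle contains at least one of the n+1 black edges, so there are at most n+1 cycles.

```latex
\begin{align*}
\#\text{pushes} &\le \#\text{cycles} \le n+1, \\
\#\text{pops}   &\le \#\text{pushes} \le n+1, \\
T(n) &\le c_1\,(2n+2) + c_2\,(\#\text{pushes} + \#\text{pops})
      \le c_1\,(2n+2) + 2c_2\,(n+1) = O(n).
\end{align*}
```

Here the first term covers the constant work done in each of the 2n+2 iterations, and the second covers all stack operations, including every iteration of the inner while loop, since each such iteration pops one element.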


Experimental setup

Signed permutations of length 10, 20, 40, 60, 80, 160, 320, and 640:

For each length, 10 groups of 3 signed permutations were generated from the identity permutation using Nadeau and Taylor's model.

5 evolutionary rates were used: 4, 16, 64, 256, and 1024 inversions per edge.

For each length, 10 groups of 3 random permutations were also generated and used as an extreme test case.


Experimental setup (continued)

For each of these test suites, the 3 distances among the 3 genomes in each group were computed 20,000 times in a tight loop. The average and standard deviation were computed over the 10 groups.

Computed inversion distances are expected to be at most twice the evolutionary rate, since there are 2 edges between each pair of genomes.


Run time of connected components computation as a function of the permutation size


Total run time as a function of the permutation size


Inversion distance as a function of the permutation size


Speed comparison of the linear-time algorithm of Bader et al. and the union-find (UF) approach (connected components only)


Speed comparison of the linear-time algorithm of Bader et al. and the union-find (UF) algorithm (total run time)


Summary

We presented a simple, practical, linear-time algorithm for computing the inversion distance between two signed permutations, together with a detailed experimental study.