simple algorithm for sorting the fibonacci string rotations

14
Simple Algorithm for Sorting the Fibonacci String Rotations Manolis Christodoulakis Manolis Christodoulakis King’s College London Joint work with Costas S. Iliopoulos Yoan José Pinzón Ardila

Upload: bryga

Post on 06-Feb-2016

21 views

Category:

Documents


2 download

DESCRIPTION

Simple Algorithm for Sorting the Fibonacci String Rotations. Manolis Christodoulakis King’s College London Joint work with Costas S. Iliopoulos Yoan Jos é Pinz ó n Ardila. Our Goal. What makes Fibonacci strings a best case input for the Burrows-Wheeler Transform (BWT)? - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Simple Algorithm for Sorting the Fibonacci String Rotations

Simple Algorithm for Sorting theFibonacci String Rotations

Manolis ChristodoulakisManolis Christodoulakis King’s College London

Joint work with Costas S. IliopoulosYoan José Pinzón Ardila

Page 2: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 2

Our GoalOur Goal

What makes Fibonacci strings a best case input for the Burrows-Wheeler Transform (BWT)?

Relationship between different rotations of a Fibonacci string

What is their lexicographic order? Side effect: we can deduce the symbol

stored at any position of any Fibonacci string in constant time (without using , provided that the fn values are known)

Page 3: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 3

Fibonacci Strings & NumbersFibonacci Strings & Numbers

The n-th Fibonacci stringFn = Fn-1Fn-2 n≥2 F0=b, F1=a

The n-th Fibonacci numberfn = fn-1+fn-2 n≥2 f0=1, f1=1

F2 a= b

F3 a= b a

F4 a= b a a b

F1 a=

F0 b= f0 1=

f1 1=f2 2=f3 3=f4 5=

Page 4: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 4

NotationNotation

The i-th rotation of a string

where i is taken modulo n.

rank(i,x) = the rank of Ri(x) rot(ρ,x) = the rotation whose rank is ρ

0 1 … i-1 i …n-1x =

0 1 … i-1 i …n-1Ri(x)

=

Page 5: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 5

Burrows-Wheeler Transform (BWT)Burrows-Wheeler Transform (BWT)

M.Burrows and D.J.Wheeler. 1994 Purpose: to make a string more

compressible BWT Algorithm:

1. Create list of all rotations2. Sort them3. Output last symbol of every rotation4. Output the rank of the 0-th rotation

Page 6: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 6

BWT on Fibonacci StringsBWT on Fibonacci Strings

F5 = abaababa, f5 = 8

R0(F5) a= b a a b a b aR1(F5) b= a a b a b a aR2(F5) a= a b a b a a bR3(F5) a= b a b a a b aR4(F5) b= a b a a b a aR5(F5) a= b a a b a a bR6(F5) b= a a b a a b aR7(F5) a= a b a a b a b

R0(F5) a= b a a b a b a

R1(F5) b= a a b a b a a

R2(F5) a= a b a b a a b

R3(F5) a= b a b a a b a

R4(F5) b= a b a a b a a

R5(F5) a= b a a b a a b

R6(F5) b= a a b a a b a

R7(F5) a= a b a a b a b

Page 7: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 7

Properties of Fibonacci StringsProperties of Fibonacci Strings

The number of ‘b’ in Fn is fn-2

Proof: By induction.

C.S.Iliopoulos, D.W.Moore and W.F.Smyth. 1997Fn = Fn-2Fn-3…F1un, un = ba (n odd)

un = ab (n even)

Let’s call this the IMS formula.

Page 8: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 8

Similarities in RotationsSimilarities in Rotations

R0(Fn) differs from Rfn-2(Fn) in 2 symbols Proof:

R0(Fn) = Fn-2Fn-3…F1un

Rfn-2(Fn) = Fn-3…F1unFn-2 (1)

R0(Fn) = Fn-1Fn-2

= Fn-3…F1un-1Fn-2 (2) Ri(Fn) differs from Ri+fn-2(Fn) in 2 symbols Proof:

Ri(Fn) = Ri(R0(Fn))

Ri+fn-2(Fn) = Ri(Rfn-2(Fn))

Page 9: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 9

Relative Order of RotationsRelative Order of Rotations

Ri(Fn) < Ri+fn-2(Fn) for n odd, i fn-1-1 Proof:

R0(Fn) = Fn-3…F1un-1Fn-2

Rfn-2(Fn) = Fn-3…F1un Fn-2

For i=fn-1-1:

Ri(Fn) = bFn-2Fn-3…F1a

Ri+fn-2(Fn)= aFn-2Fn-3…F1b

Similarly, Ri(Fn) > Ri+fn-2(Fn) for n even, i fn-1-1

= Fn-3 … F1 ab Fn-2

= Fn-3 … F1 ba Fn-2

Page 10: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 10

Sorted List of RotationsSorted List of Rotations

We proved (n odd):Ri(Fn) < Ri+fn-2(Fn) i fn-1-1 (3)

We will now prove that there is no j s.t.Ri(Fn) < Rj(Fn) < Ri+fn-2(Fn)

Proof: (constructive)Start at i=fn-1 and construct the partial list

Ri Ri+fn-2 Ri+2fn-2 Ri+3fn-2 … Ri+kfn-2 …

for as long asi+kfn-2 fn-1-1 (mod fn) kfn-1

I.e. the list is complete!

Page 11: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 11

Identify Rotation Identify Rotation (i)(i) by Rank by Rank ((ρρ))

Therefore, for n odd:rot(ρ,Fn) = fn-1

= (ρfn-2-1) mod fn

Similarly, for n even, the sorted list is constructed bottom-up giving

rot(ρ,Fn) = (-(ρ+1)fn-2-1) mod fn

+ρfn-2) mod fn(

Page 12: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 12

Identify Rank Identify Rank ((ρρ)) of a Rotation of a Rotation (i)(i)

This is simply the inverse of the previous function

n oddrank(i,Fn) = ((i+1)fn-2) mod fn

n evenrank(i,Fn) = ((i+1)fn-2-1) mod fn

Page 13: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 13

Symbols of Fibonacci StringsSymbols of Fibonacci Strings

Fn[i] = ? Observe that

Fn[i] = Ri(Fn)[0]

In the sorted list of rotations, the first fn-1 rotations start with ‘a’, the rest with ‘b’

Thus Fn[i] can be deduced from rank(i,Fn)

If rank(i,Fn) ≤ fn-1 then Fn[i]=a else b.

Page 14: Simple Algorithm for Sorting the Fibonacci String Rotations

SOFSEM 2006 14

BWT & Fibonacci ― The Quick WayBWT & Fibonacci ― The Quick Way

The first fn-2 symbols of BWT are ‘b’ Proof: (n odd)

We proved the first fn-2 rotations have index

(ρ·fn-2-1)modfn for 0 ≤ ρ < fn-2

The last symbol of these rotations isFn[ (ρ·fn-2-1 )modfn ]

Which for 0 ≤ ρ < fn-2 is ‘b’

The next fn-1 symbols of BWT are ‘a’ Proof: Consequence of previous lemma

+fn-1