simple algorithm for sorting the fibonacci string rotations

Post on 06-Feb-2016


Simple Algorithm for Sorting the Fibonacci String Rotations

Manolis Christodoulakis, King’s College London

Joint work with Costas S. Iliopoulos and Yoan José Pinzón Ardila

SOFSEM 2006

Our Goal

What makes Fibonacci strings a best-case input for the Burrows-Wheeler Transform (BWT)?

Relationship between different rotations of a Fibonacci string

What is their lexicographic order?

Side effect: we can deduce the symbol stored at any position of any Fibonacci string in constant time (without using …, provided that the f_n values are known)


Fibonacci Strings & Numbers

The n-th Fibonacci string: F_n = F_{n-1} F_{n-2} for n ≥ 2, with F_0 = b, F_1 = a

The n-th Fibonacci number: f_n = f_{n-1} + f_{n-2} for n ≥ 2, with f_0 = 1, f_1 = 1

n   F_n     f_n
0   b       1
1   a       1
2   ab      2
3   aba     3
4   abaab   5
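The two recurrences can be sketched in a few lines of Python. The function names `fib_string` and `fib_number` are illustrative choices, not taken from the slides.

```python
def fib_string(n: int) -> str:
    """Return F_n, with F_0 = 'b', F_1 = 'a' and F_n = F_{n-1} F_{n-2}."""
    a, b = "b", "a"            # (F_0, F_1)
    for _ in range(n):
        a, b = b, b + a        # shift the pair (F_k, F_{k+1}) one step
    return a


def fib_number(n: int) -> int:
    """Return f_n, with f_0 = f_1 = 1 and f_n = f_{n-1} + f_{n-2}."""
    a, b = 1, 1                # (f_0, f_1)
    for _ in range(n):
        a, b = b, a + b
    return a


for n in range(5):
    # Each row matches the table above; the length of F_n is f_n.
    print(n, fib_string(n), fib_number(n))
```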


Notation

The i-th rotation of a string x = x[0] x[1] … x[i-1] x[i] … x[n-1] is

R_i(x) = x[i] … x[n-1] x[0] x[1] … x[i-1]

where i is taken modulo n.

rank(i, x) = the rank of R_i(x) in the sorted list of rotations

rot(ρ, x) = the rotation whose rank is ρ (identified by its index i)
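A brute-force reading of this notation, as a sketch. It assumes ranks are counted from 0 and that all rotations of x are distinct; the helper names are illustrative only.

```python
def rotation(x: str, i: int) -> str:
    """R_i(x): x read cyclically from position i (i taken modulo len(x))."""
    i %= len(x)
    return x[i:] + x[:i]


def sorted_rotation_indices(x: str) -> list[int]:
    """Rotation indices listed in lexicographic order of R_i(x)."""
    return sorted(range(len(x)), key=lambda i: rotation(x, i))


def rank(i: int, x: str) -> int:
    """0-based rank of R_i(x) among all rotations of x."""
    return sorted_rotation_indices(x).index(i % len(x))


def rot(rho: int, x: str) -> int:
    """Index i of the rotation of x whose 0-based rank is rho."""
    return sorted_rotation_indices(x)[rho]


x = "abaab"
print(rotation(x, 1))           # baaba
print(rank(0, x), rot(0, x))    # 1 2
```

Both `rank` and `rot` re-sort all rotations on every call, so they only serve as a reference against which faster formulas can be checked.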


Burrows-Wheeler Transform (BWT)

M. Burrows and D. J. Wheeler, 1994

Purpose: to make a string more compressible

BWT Algorithm:

1. Create list of all rotations
2. Sort them
3. Output last symbol of every rotation
4. Output the rank of the 0-th rotation
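The four steps above can be sketched directly in Python. This is the naive version that materialises every rotation, not an efficient implementation; the function name `bwt` and the 0-based rank are assumptions made for the example, and the input is assumed to be non-empty.

```python
def bwt(x: str) -> tuple[str, int]:
    """Naive BWT: returns (last column, 0-based rank of the 0-th rotation)."""
    n = len(x)
    rotations = [x[i:] + x[:i] for i in range(n)]            # step 1
    order = sorted(range(n), key=lambda i: rotations[i])     # step 2
    last_column = "".join(rotations[i][-1] for i in order)   # step 3
    return last_column, order.index(0)                       # step 4


print(bwt("banana"))   # ('nnbaaa', 3)
```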


BWT on Fibonacci Strings

F_5 = abaababa, f_5 = 8

R_0(F_5) = a b a a b a b a
R_1(F_5) = b a a b a b a a
R_2(F_5) = a a b a b a a b
R_3(F_5) = a b a b a a b a
R_4(F_5) = b a b a a b a a
R_5(F_5) = a b a a b a a b
R_6(F_5) = b a a b a a b a
R_7(F_5) = a a b a a b a b
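To see the transform at work on this example, the rotations of F_5 can be sorted by brute force. The snippet below is only a worked illustration; variable names are arbitrary and ranks are counted from 0.

```python
F5 = "abaababa"
rotations = {i: F5[i:] + F5[:i] for i in range(len(F5))}

# Rotation indices in lexicographic order of the rotations.
order = sorted(rotations, key=rotations.get)
for rho, i in enumerate(order):
    print(rho, f"R_{i}(F_5)", rotations[i])

bwt_output = "".join(rotations[i][-1] for i in order)  # last symbol of each sorted rotation
rank_of_R0 = order.index(0)
print(bwt_output, rank_of_R0)   # bbbaaaaa 3
```

The sorted order of indices is 7, 2, 5, 0, 3, 6, 1, 4, and the output groups all the b symbols ahead of all the a symbols.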


Properties of Fibonacci Strings

The number of ‘b’ in F_n is f_{n-2}

Proof: By induction.

C. S. Iliopoulos, D. W. Moore and W. F. Smyth, 1997:

F_n = F_{n-2} F_{n-3} … F_1 u_n, where u_n = ba (n odd) and u_n = ab (n even)

Let’s call this the IMS formula.
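Both properties are easy to check numerically for small n. The sketch below is a sanity check, not a proof; `ims_factorization` is a hypothetical helper name, and the product F_{n-2} … F_1 is taken to be empty when n = 2.

```python
def fib_string(n: int) -> str:
    a, b = "b", "a"            # (F_0, F_1)
    for _ in range(n):
        a, b = b, b + a
    return a


def fib_number(n: int) -> int:
    a, b = 1, 1                # (f_0, f_1)
    for _ in range(n):
        a, b = b, a + b
    return a


def ims_factorization(n: int) -> str:
    """F_{n-2} F_{n-3} ... F_1 u_n, with u_n = 'ba' (n odd) or 'ab' (n even)."""
    u_n = "ba" if n % 2 == 1 else "ab"
    return "".join(fib_string(k) for k in range(n - 2, 0, -1)) + u_n


for n in range(2, 15):
    assert fib_string(n).count("b") == fib_number(n - 2)   # number of 'b' is f_{n-2}
    assert fib_string(n) == ims_factorization(n)           # IMS formula
```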


Similarities in Rotations

R_0(F_n) differs from R_{f_{n-2}}(F_n) in 2 symbols

Proof:

R_0(F_n) = F_{n-2} F_{n-3} … F_1 u_n

R_{f_{n-2}}(F_n) = F_{n-3} … F_1 u_n F_{n-2}    (1)

R_0(F_n) = F_{n-1} F_{n-2}
         = F_{n-3} … F_1 u_{n-1} F_{n-2}    (2)

(1) and (2) differ only in u_n versus u_{n-1}, i.e. in ba versus ab.

R_i(F_n) differs from R_{i+f_{n-2}}(F_n) in 2 symbols

Proof:

R_i(F_n) = R_i(R_0(F_n))

R_{i+f_{n-2}}(F_n) = R_i(R_{f_{n-2}}(F_n))
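A quick numerical check of the two-symbol claim, assuming the rotation convention R_i(x) = x[i..] x[..i]. The helper names are illustrative.

```python
def fib_string(n: int) -> str:
    a, b = "b", "a"            # (F_0, F_1)
    for _ in range(n):
        a, b = b, b + a
    return a


def fib_number(n: int) -> int:
    a, b = 1, 1                # (f_0, f_1)
    for _ in range(n):
        a, b = b, a + b
    return a


def rotation(x: str, i: int) -> str:
    i %= len(x)
    return x[i:] + x[:i]


def mismatches(u: str, v: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(c != d for c, d in zip(u, v))


for n in range(2, 12):
    F, shift = fib_string(n), fib_number(n - 2)
    for i in range(len(F)):
        # R_i(F_n) and R_{i+f_{n-2}}(F_n) differ in exactly 2 symbols.
        assert mismatches(rotation(F, i), rotation(F, i + shift)) == 2
```

For F_5, for example, R_0 = abaababa and R_3 = ababaaba differ only at positions 3 and 4.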


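Relative Order of Rotations

R_i(F_n) < R_{i+f_{n-2}}(F_n) for n odd, i ≠ f_{n-1} - 1

Proof:

R_0(F_n) = F_{n-3} … F_1 u_{n-1} F_{n-2} = F_{n-3} … F_1 ab F_{n-2}

R_{f_{n-2}}(F_n) = F_{n-3} … F_1 u_n F_{n-2} = F_{n-3} … F_1 ba F_{n-2}

The two rotations first differ at ab versus ba, and rotating both by the same i keeps that order unless the differing pair is split across the ends.

For i = f_{n-1} - 1 (the exception):

R_i(F_n) = b F_{n-2} F_{n-3} … F_1 a

R_{i+f_{n-2}}(F_n) = a F_{n-2} F_{n-3} … F_1 b

Similarly, R_i(F_n) > R_{i+f_{n-2}}(F_n) for n even, i ≠ f_{n-1} - 1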
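The ordering claim, including the single exceptional index, can be checked by brute force. In this sketch `order_lemma_holds` is a hypothetical name; it compares every R_i(F_n) with R_{i+f_{n-2}}(F_n) and expects the inequality to flip exactly at i = f_{n-1} - 1.

```python
def fib_string(n: int) -> str:
    a, b = "b", "a"            # (F_0, F_1)
    for _ in range(n):
        a, b = b, b + a
    return a


def fib_number(n: int) -> int:
    a, b = 1, 1                # (f_0, f_1)
    for _ in range(n):
        a, b = b, a + b
    return a


def rotation(x: str, i: int) -> str:
    i %= len(x)
    return x[i:] + x[:i]


def order_lemma_holds(n: int) -> bool:
    F = fib_string(n)
    f_n, f_n1, f_n2 = len(F), fib_number(n - 1), fib_number(n - 2)
    for i in range(f_n):
        is_smaller = rotation(F, i) < rotation(F, i + f_n2)
        is_exception = i == f_n1 - 1
        # n odd: '<' everywhere except the exception; n even: the reverse.
        expected_smaller = (n % 2 == 1) != is_exception
        if is_smaller != expected_smaller:
            return False
    return True


print(all(order_lemma_holds(n) for n in range(2, 12)))
```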


Sorted List of Rotations

We proved (n odd):

R_i(F_n) < R_{i+f_{n-2}}(F_n), i ≠ f_{n-1} - 1    (3)

We will now prove that there is no j s.t.

R_i(F_n) < R_j(F_n) < R_{i+f_{n-2}}(F_n)

Proof: (constructive)

Start at i = f_n - 1 and construct the partial list

R_i  R_{i+f_{n-2}}  R_{i+2f_{n-2}}  R_{i+3f_{n-2}}  …  R_{i+kf_{n-2}}  …

for as long as

i + k f_{n-2} ≢ f_{n-1} - 1 (mod f_n), i.e. k ≠ f_n - 1

By (3) the list is increasing, and it stops only after f_n distinct rotations have been listed.

I.e. the list is complete!
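The constructive argument translates into a short loop: start at index f_n - 1 and keep adding f_{n-2} modulo f_n until the exceptional index f_{n-1} - 1 is reached. The sketch below assumes n is odd and at least 3; the function names are illustrative, and the brute-force sort is included only for comparison.

```python
def fib_string(n: int) -> str:
    a, b = "b", "a"            # (F_0, F_1)
    for _ in range(n):
        a, b = b, b + a
    return a


def fib_number(n: int) -> int:
    a, b = 1, 1                # (f_0, f_1)
    for _ in range(n):
        a, b = b, a + b
    return a


def sorted_by_chain(n: int) -> list[int]:
    """Rotation indices of F_n in sorted order, for odd n >= 3."""
    f_n, f_n1, f_n2 = fib_number(n), fib_number(n - 1), fib_number(n - 2)
    chain = [f_n - 1]                       # smallest rotation
    while chain[-1] != f_n1 - 1:            # stop at the largest rotation
        chain.append((chain[-1] + f_n2) % f_n)
    return chain


def sorted_by_brute_force(n: int) -> list[int]:
    F = fib_string(n)
    return sorted(range(len(F)), key=lambda i: F[i:] + F[:i])


print(sorted_by_chain(5))   # [7, 2, 5, 0, 3, 6, 1, 4]
```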


Identify Rotation (i) by Rank (ρ)

Therefore, for n odd:

rot(ρ, F_n) = (f_n - 1 + ρ f_{n-2}) mod f_n
            = (ρ f_{n-2} - 1) mod f_n

Similarly, for n even, the sorted list is constructed bottom-up giving

rot(ρ, F_n) = (-(ρ+1) f_{n-2} - 1) mod f_n
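As a sketch, the two closed forms can be written as one function and compared against a brute-force sort. The name `rot_formula` is an assumption, ranks are counted from 0, and Python's `%` already returns a non-negative result for a positive modulus, so the negative expression in the even case needs no extra adjustment.

```python
def fib_string(n: int) -> str:
    a, b = "b", "a"            # (F_0, F_1)
    for _ in range(n):
        a, b = b, b + a
    return a


def fib_number(n: int) -> int:
    a, b = 1, 1                # (f_0, f_1)
    for _ in range(n):
        a, b = b, a + b
    return a


def rot_formula(rho: int, n: int) -> int:
    """Index of the rotation of F_n with 0-based rank rho (n >= 2)."""
    f_n, f_n2 = fib_number(n), fib_number(n - 2)
    if n % 2 == 1:
        return (rho * f_n2 - 1) % f_n
    return (-(rho + 1) * f_n2 - 1) % f_n


for n in range(2, 12):
    F = fib_string(n)
    brute = sorted(range(len(F)), key=lambda i: F[i:] + F[:i])
    assert [rot_formula(rho, n) for rho in range(len(F))] == brute
```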


Identify Rank (ρ) of a Rotation (i)

This is simply the inverse of the previous function

n odd: rank(i, F_n) = ((i+1) f_{n-2}) mod f_n

n even: rank(i, F_n) = ((i+1) f_{n-2} - 1) mod f_n
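The rank formulas can be sketched and checked the same way. `rank_formula` is an illustrative name and ranks are again counted from 0.

```python
def fib_string(n: int) -> str:
    a, b = "b", "a"            # (F_0, F_1)
    for _ in range(n):
        a, b = b, b + a
    return a


def fib_number(n: int) -> int:
    a, b = 1, 1                # (f_0, f_1)
    for _ in range(n):
        a, b = b, a + b
    return a


def rank_formula(i: int, n: int) -> int:
    """0-based rank of R_i(F_n) among the rotations of F_n (n >= 2)."""
    f_n, f_n2 = fib_number(n), fib_number(n - 2)
    if n % 2 == 1:
        return ((i + 1) * f_n2) % f_n
    return ((i + 1) * f_n2 - 1) % f_n


for n in range(2, 12):
    F = fib_string(n)
    brute = sorted(range(len(F)), key=lambda i: F[i:] + F[:i])
    # brute[rho] is the index of the rotation with rank rho.
    assert all(rank_formula(i, n) == rho for rho, i in enumerate(brute))
```

For F_5 this gives rank(0, F_5) = 3, matching the position of R_0 in the sorted list of its rotations.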


Symbols of Fibonacci Strings

F_n[i] = ?

Observe that F_n[i] = R_i(F_n)[0]

In the sorted list of rotations, the first f_{n-1} rotations start with ‘a’, the rest with ‘b’

Thus F_n[i] can be deduced from rank(i, F_n)

With ranks counted from 0: if rank(i, F_n) < f_{n-1} then F_n[i] = a, else b.
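A sketch of the constant-time lookup, assuming the Fibonacci numbers are available in a precomputed list `f` with f[k] = f_k. The name `fib_symbol` and the list-based interface are assumptions made for the example; building the full string is done here only to check the result.

```python
def fib_string(n: int) -> str:
    a, b = "b", "a"            # (F_0, F_1)
    for _ in range(n):
        a, b = b, b + a
    return a


def fib_symbol(i: int, n: int, f: list[int]) -> str:
    """F_n[i] for n >= 2 and 0 <= i < f[n], using only arithmetic on f."""
    if n % 2 == 1:
        r = ((i + 1) * f[n - 2]) % f[n]         # rank of R_i(F_n), n odd
    else:
        r = ((i + 1) * f[n - 2] - 1) % f[n]     # rank of R_i(F_n), n even
    return "a" if r < f[n - 1] else "b"         # first f_{n-1} ranks start with 'a'


f = [1, 1]
while len(f) < 15:
    f.append(f[-1] + f[-2])

print("".join(fib_symbol(i, 5, f) for i in range(f[5])))   # abaababa
```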


BWT & Fibonacci: The Quick Way

The first f_{n-2} symbols of BWT are ‘b’

Proof: (n odd)

We proved the first f_{n-2} rotations have index

(ρ·f_{n-2} - 1) mod f_n for 0 ≤ ρ < f_{n-2}

The last symbol of these rotations is

F_n[(ρ·f_{n-2} - 1 + f_n - 1) mod f_n]

Which for 0 ≤ ρ < f_{n-2} is ‘b’

The next f_{n-1} symbols of BWT are ‘a’

Proof: Consequence of previous lemma
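Put together, the BWT of F_n can be written down without sorting anything. In this sketch `bwt_fibonacci` is a hypothetical name; the rank of the 0-th rotation is obtained from the rank formulas with i = 0, and the naive transform is included only to compare results.

```python
def fib_string(n: int) -> str:
    a, b = "b", "a"            # (F_0, F_1)
    for _ in range(n):
        a, b = b, b + a
    return a


def fib_number(n: int) -> int:
    a, b = 1, 1                # (f_0, f_1)
    for _ in range(n):
        a, b = b, a + b
    return a


def bwt_fibonacci(n: int) -> tuple[str, int]:
    """BWT of F_n (n >= 2): f_{n-2} copies of 'b', then f_{n-1} copies of 'a'."""
    f_n1, f_n2 = fib_number(n - 1), fib_number(n - 2)
    rank_of_r0 = f_n2 if n % 2 == 1 else f_n2 - 1   # rank(0, F_n), 0-based
    return "b" * f_n2 + "a" * f_n1, rank_of_r0


def bwt_naive(x: str) -> tuple[str, int]:
    order = sorted(range(len(x)), key=lambda i: x[i:] + x[:i])
    return "".join(x[i - 1] for i in order), order.index(0)   # x[i-1] ends R_i(x)


for n in range(2, 13):
    assert bwt_fibonacci(n) == bwt_naive(fib_string(n))

print(bwt_fibonacci(5))   # ('bbbaaaaa', 3)
```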
