Sample-Midterm Solution

Post on 09-Oct-2020

Page 1: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

Sample-Midterm Solution


QUESTION 1(6)

Suffix Tree (Suffix Trie):

1- Consider all suffixes of text 𝑇 except for the ones which are a prefix of another suffix.

2- Build a compressed Trie to store all suffixes of (1).

3- Insert the suffixes in decreasing order of length.

Assume our text is 𝑇 = 00001. Let's build the suffix tree for it.

1- All suffixes except for the ones which are a prefix of another suffix

All suffixes of 𝑇 = 00001 :

0 0 0 0 1
0 0 0 1
0 0 1
0 1
1

There is no suffix which is a prefix of another suffix.
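Step 1 can be checked mechanically. The following is a minimal Python sketch, not part of the original solution; the function names `suffixes` and `suffixes_to_store` are hypothetical and chosen only for illustration.

```python
def suffixes(text):
    """All non-empty suffixes of text, longest first."""
    return [text[i:] for i in range(len(text))]


def suffixes_to_store(text):
    """Keep only the suffixes that are not a prefix of another suffix."""
    all_suffixes = suffixes(text)
    return [
        s
        for s in all_suffixes
        if not any(other != s and other.startswith(s) for other in all_suffixes)
    ]


print(suffixes("00001"))
print(suffixes_to_store("00001"))
```

For 𝑇 = 00001 nothing is filtered out, because only the shortest suffix starts with 1 and every other suffix ends in a 1 that no longer suffix has at that position. A text such as 0010 behaves differently: its suffix 0 is a prefix of 0010 and is dropped.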

2&3- Build a compressed Trie, while inserting the suffixes in decreasing order of length

a) Align suffixes along the left

0 0 0 0 1

0 0 0 1

0 0 1

0 1

1

b) Build the tree from the root. First, create a node to serve as the root.

c) How many options are there for the first bit of all suffixes?

2 options (0 and 1) ⇒ Draw one edge from the root for each.

c) How many options are there for the second bit of all suffixes?

After 0: 2 options (0 and 1) ⇒ Draw one edge for each.

After 1: No option (the suffix 1 ends here) ⇒ This node is the leaf for suffix 1.

c) How many options are there for the third bit of all suffixes?

After 00: 2 options (0 and 1) ⇒ Draw one edge for each.

After 01: No option ⇒ This node is the leaf for suffix 01.

c) How many options are there for the fourth bit of all suffixes?

After 000: 2 options (0 and 1) ⇒ Draw one edge for each.

After 001: No option ⇒ This node is the leaf for suffix 001.

c) How many options are there for the fifth bit of all suffixes?

After 0000: 1 option (1) ⇒ Draw a single edge.

After 0001: No option ⇒ This node is the leaf for suffix 0001.

c) How many options are there for the sixth bit of all suffixes?

After 00001: No option (no suffix has a sixth bit) ⇒ This node is the leaf for suffix 00001.

d) Compress the tree (eliminate nodes with only one child)

The node reached by 0000 has a single child (edge 1, leading to the leaf for 00001), so it is merged into its parent. The node reached by 000 then has two outgoing edges: 1 (to the leaf for 0001) and 01 (to the leaf for 00001).

Done!

The suffix tree of a bit string can be a binary tree of height 𝑂(𝑛).

𝑇 = 0 0 0 0 1 (4 zeros): Height=4

𝑇 = 0 0 … 0 1 (𝑛 zeros): Height=𝑛
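The construction and the height claim can be reproduced with a short Python sketch. It is an illustration only: the nested-dict representation and the names `build_suffix_trie`, `compress` and `height` are assumptions, and the code relies on step 1 holding (no suffix is a prefix of another), as is the case for texts of the form 0…01.

```python
def build_suffix_trie(text):
    """Uncompressed trie of all suffixes, as nested dicts keyed by one bit."""
    root = {}
    for start in range(len(text)):  # decreasing order of suffix length
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def compress(node):
    """Merge each chain of single-child nodes into one edge with a longer label."""
    compressed = {}
    for label, child in node.items():
        while len(child) == 1:
            ((next_label, next_child),) = child.items()
            label += next_label
            child = next_child
        compressed[label] = compress(child)
    return compressed


def height(node):
    """Number of edges on the longest root-to-leaf path."""
    return 0 if not node else 1 + max(height(child) for child in node.values())


tree = compress(build_suffix_trie("00001"))
print(tree)
print(height(tree))
```

In the compressed tree the only merged edge is the one labelled 01, which matches step d). Building the tree for 𝑛 zeros followed by a single 1 gives height 𝑛.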


QUESTION 2

Christofides Algorithm:

1- Take MST, but do not double edges

2- Create a minimum weight matching between odd-degree vertices (the result is an Eulerian subgraph)

3- Take the Eulerian tour

4- Take shortcuts to avoid re-visiting vertices

[Figure: planar points 𝑎, 𝑏, 𝑐, 𝑑, 𝑒, 𝑓, 𝑔, ℎ, 𝑦 and 𝑥]

1- Take MST

If you remove any edge and add another one instead of it, the new edge will be longer.

2- Minimum weight matching for odd-degree vertices

2.1 – Find the odd-degree vertices

2.2 – Compare all possible matchings for them to find the minimum

𝑑(𝑎, 𝑥) < 𝑑(𝑎, 𝑦)

𝑑(ℎ, 𝑦) < 𝑑(ℎ, 𝑥)

⇒ 𝑑(𝑎, 𝑥) + 𝑑(ℎ, 𝑦) < 𝑑(𝑎, 𝑦) + 𝑑(ℎ, 𝑥)

Min. Matching: (𝑎, 𝑥) and (ℎ, 𝑦)
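Comparing all possible matchings is feasible here because only a handful of vertices have odd degree. The sketch below does the comparison by brute force in Python. The coordinates are hypothetical (the transcript does not give them); they were chosen only so that 𝑑(𝑎, 𝑥) < 𝑑(𝑎, 𝑦) and 𝑑(ℎ, 𝑦) < 𝑑(ℎ, 𝑥) hold, and the function name is an assumption as well.

```python
from math import dist


def min_weight_perfect_matching(points):
    """Brute-force minimum-weight perfect matching.

    points maps a vertex name to its (x, y) coordinates and must contain an
    even number of vertices. Returns (total_cost, list_of_pairs).
    """
    names = sorted(points)
    if not names:
        return 0.0, []
    first, rest = names[0], names[1:]
    best_cost, best_pairs = float("inf"), None
    for partner in rest:
        # Match `first` with `partner`, then solve the rest recursively.
        remaining = {v: points[v] for v in rest if v != partner}
        cost, pairs = min_weight_perfect_matching(remaining)
        cost += dist(points[first], points[partner])
        if cost < best_cost:
            best_cost, best_pairs = cost, [(first, partner)] + pairs
    return best_cost, best_pairs


# Hypothetical coordinates for the four odd-degree vertices.
odd_vertices = {"a": (0.0, 0.0), "h": (7.0, 0.0), "x": (0.8, -1.0), "y": (1.2, 1.0)}
cost, matching = min_weight_perfect_matching(odd_vertices)
print(matching, round(cost, 3))
```

The running time grows very quickly with the number of odd-degree vertices, so this approach only suits hand-sized examples like the one in the question.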

3- Take the Eulerian tour (on MST + matching for odd-degree vertices)

𝑎 → 𝑏 → 𝑐 → 𝑑 → 𝑒 → 𝑓 → 𝑔 → ℎ → 𝑦 → 𝑏 → 𝑥 → 𝑎

4- Take shortcuts

Remove repetitive vertices

𝑎 → 𝑏 → 𝑐 → 𝑑 → 𝑒 → 𝑓 → 𝑔 → ℎ → 𝑦 → 𝑏 → 𝑥 → 𝑎

The second visit to 𝑏 is removed, so the tour goes directly from 𝑦 to 𝑥.

4- Resulting TSP Tour

𝑎 → 𝑏 → 𝑐 → 𝑑 → 𝑒 → 𝑓 → 𝑔 → ℎ → 𝑦 → 𝑥 → 𝑎
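The shortcut step is a single pass over the Eulerian tour that keeps only the first visit to each vertex. A minimal Python sketch (the function name `shortcut` is an assumption, not from the slides):

```python
def shortcut(euler_tour):
    """Drop repeated vertices from a closed tour, keeping the first visit of each."""
    seen = set()
    tour = []
    for vertex in euler_tour[:-1]:  # the last entry repeats the start vertex
        if vertex not in seen:
            seen.add(vertex)
            tour.append(vertex)
    tour.append(euler_tour[0])  # close the cycle again
    return tour


euler_tour = list("abcdefghybxa")
print(" → ".join(shortcut(euler_tour)))
```

Under the triangle inequality, skipping a vertex never lengthens the tour, which is why the shortcut from 𝑦 to 𝑥 is safe for planar points.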


QUESTION 3

Push-Relabel algorithm:

1- Assign the height of the source 𝑛 (no. vertices) and the height of other vertices 0.

2- Push the max. possible flow through the outgoing edges of the source.
- Increase the excess of the neighbors accordingly.
- Increase the capacity of the reverse edges (as in any push operation).

3- Choose a vertex 𝑥 with positive excess (active vertex) with maximum height.

4- If there is a downhill outgoing edge (𝑥, 𝑢) then push max possible excess from 𝑥 to 𝑢; otherwise, relabel 𝑥 by incrementing its height.

5- Repeat from step (3) until there is no active vertex.
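The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the course's reference implementation: "downhill" is taken to mean a residual edge (𝑥, 𝑢) with height(𝑥) = height(𝑢) + 1, the source and the sink are never treated as active, and the small network at the bottom is hypothetical (it is not the network from the figures).

```python
def push_relabel_max_flow(n, edges, s, t):
    """Highest-label push-relabel with relabel-by-one, on vertices 0..n-1.

    edges is a list of (u, v, capacity) triples. Returns the max-flow value.
    """
    cap = [[0] * n for _ in range(n)]  # residual capacities
    for u, v, c in edges:
        cap[u][v] += c
    height = [0] * n
    excess = [0] * n

    def push(x, u, amount):
        cap[x][u] -= amount
        cap[u][x] += amount  # reverse edge gains capacity
        excess[x] -= amount
        excess[u] += amount

    # Steps 1 and 2: lift the source to n and saturate its outgoing edges.
    height[s] = n
    for v in range(n):
        if cap[s][v] > 0:
            push(s, v, cap[s][v])

    while True:
        # Step 3: pick an active vertex of maximum height.
        active = [v for v in range(n) if v not in (s, t) and excess[v] > 0]
        if not active:
            break
        x = max(active, key=lambda v: height[v])
        # Step 4: push along a downhill residual edge, or relabel by one.
        for u in range(n):
            if cap[x][u] > 0 and height[x] == height[u] + 1:
                push(x, u, min(excess[x], cap[x][u]))
                break
        else:
            height[x] += 1
    return excess[t]


# Hypothetical network: 0 = source, 3 = sink.
network = [(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 3)]
print(push_relabel_max_flow(4, network, 0, 3))
```

Relabeling by one step at a time mirrors the slides, where a vertex climbs through every intermediate height; jumping directly to one more than the lowest residual neighbour is a common alternative that reaches the same result with fewer relabel operations.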

1- Assign the height of the source 𝑛 (no. vertices) and the height of other vertices 0.

[Figure: flow network on the vertices S, a, b, c, d, t with edge capacities; H = height, E = excess]

S: H=6
a: H=0, E=0
b: H=0, E=0
c: H=0, E=0
d: H=0, E=0
t: H=0

2- Push the max. possible flow through the outgoing edges of the source.

a) Increase the excess of the neighbors accordingly: 15 units are pushed from S to a, so a: E=15.

b) Increase the capacity of the reverse edges: the residual capacity of (S, a) drops from 15 to 0, and that of (a, S) becomes 15.

S: H=6
a: H=0, E=15
b: H=0, E=0
c: H=0, E=0
d: H=0, E=0
t: H=0

3- Choose a vertex 𝑥 with positive excess (active vertex) with maximum height: the only active vertex is a (H=0, E=15).

4- If there is a downhill outgoing edge then push max possible excess; otherwise, relabel the vertex by incrementing its height. No neighbor of a is lower than a, so a is relabeled.

a: H=1, E=15

3- Choose a vertex 𝑥 with positive excess (active vertex) with maximum height: again a (H=1, E=15).

4- If there is a downhill outgoing edge then push max possible excess; otherwise, relabel the vertex by incrementing its height. The edges from a to b and from a to d are now downhill:
- Push 8 from a to b ⇒ a: E=7, b: E=8
- Push 4 from a to d ⇒ a: E=3, d: E=4

S: H=6
a: H=1, E=3
b: H=0, E=8
c: H=0, E=0
d: H=0, E=4
t: H=0

3- Choose a vertex 𝑥 with positive excess (active vertex) with maximum height: a (H=1, E=3) is higher than the other active vertices b and d.

4- If there is a downhill outgoing edge then push max possible excess; otherwise, relabel the vertex by incrementing its height. a has no downhill edge with remaining capacity, so it is relabeled repeatedly: H=2, 3, 4, 5, 6, 7. At H=7 the edge (a, S) is downhill (S: H=6), and a pushes its remaining excess of 3 back to S. The residual capacity of (a, S) drops from 15 to 12, and that of (S, a) rises to 3.

S: H=6
a: H=7, E=0
b: H=0, E=8
c: H=0, E=0
d: H=0, E=4
t: H=0

3- Choose a vertex 𝑥 with positive excess (active vertex) with maximum height: the active vertices are b (H=0, E=8) and d (H=0, E=4); b is chosen.

4- If there is a downhill outgoing edge then push max possible excess; otherwise, relabel the vertex by incrementing its height. b has no downhill edge, so it is relabeled to H=1. It then pushes along its downhill edges: the excess of b drops from 8 to 5 and then to 2, and the excess of c rises to 3.

S: H=6
a: H=7, E=0
b: H=1, E=2
c: H=0, E=3
d: H=0, E=4
t: H=0

3- Choose a vertex 𝑥 with positive excess (active vertex) with maximum height

Continue in the same way until the following state is reached:

S: H=6
a: H=7, E=0
b: H=8, E=0
c: H=1, E=0
d: H=1, E=0
t: H=0

There is no active vertex anymore ⇒ Done!


QUESTION 4(A)

Boyer-Moore Algorithm:

1- Calculate the last-occurrence table 𝐿 for the pattern

2- Calculate the suffix skip table 𝑆 for the pattern

3- Compare pattern 𝑃 (the 𝑃[𝑗]s) with the text 𝑇 (the 𝑇[𝑖]s), moving backwards

4- When a mismatch occurs at 𝑇[𝑖], set 𝑖 as follows (using the value of 𝑗 at the mismatch), then set 𝑗 to the index of the last character of 𝑃, that is (𝑚 − 1):

𝑖 = 𝑖 + 𝑚 − 1 − min(𝐿[𝑇[𝑖]], 𝑆[𝑗])

5- Repeat from step (3) until the pattern is found or 𝑖 is out of range

1- Calculate the last-occurrence table 𝐿 for the pattern

𝐿(𝑥) is defined as the index of the last occurrence of 𝑥 in 𝑃, or −1 if no such index exists

j : 0 1 2 3 4 5 6 7
P = d c b c a a b c

𝑥    : a b c d
𝐿(𝑥) : 5 6 7 0
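The table can be reproduced with a few lines of Python. This is a minimal sketch; the function name and the dictionary representation are assumptions, and characters absent from the pattern are looked up with a default of −1.

```python
def last_occurrence(pattern):
    """Map each character of the pattern to the index of its last occurrence."""
    table = {}
    for index, ch in enumerate(pattern):
        table[ch] = index  # later occurrences overwrite earlier ones
    return table


L = last_occurrence("dcbcaabc")
print(L)
print(L.get("z", -1))  # a character that does not occur in the pattern
```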

2- Calculate the suffix skip table 𝑆 for the pattern

𝑆(𝑗) is defined as the index of the last occurrence of the sub-string (!𝑃[𝑗]).𝑃[𝑗+1..𝑚−1], that is, a character different from 𝑃[𝑗] followed by 𝑃[𝑗+1..𝑚−1]. The pattern is padded on the left with wildcards (*) at negative indices, which match anything.

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7
P = *  *  *  *  *  *  *  *  d c b c a a b c

𝑗 = 7: the last occurrence of (!𝒄) is at index 6 ⇒ 𝑆(7) = 6

𝑗 = 6: the last occurrence of (!𝒃)𝒄

Page 83: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

83

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * * * * * * * * d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) 0 6

the last occurrence of (! 𝒃)𝒄

Page 84: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

84

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * * * * * * * * d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) 0 6

the last occurrence of (! 𝒂)𝒃𝒄

Page 85: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

85

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * * * * * * * * d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) 1 0 6

the last occurrence of (! 𝒂)𝒃𝒄

Page 86: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

86

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * * * * * * * * d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) 1 0 6

the last occurrence of (! 𝒂)𝒂𝒃𝒄

Page 87: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

87

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * * * * * a b c d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) -4 1 0 6

the last occurrence of (! 𝒂)𝒂𝒃𝒄

Page 88: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

88

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * * * * * * * * d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) -4 1 0 6

the last occurrence of (! 𝒄)𝒂𝒂𝒃𝒄

Page 89: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

89

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * * * * a a b c d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) -5 -4 1 0 6

the last occurrence of (! 𝒄)𝒂𝒂𝒃𝒄

Page 90: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

90

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * * * * * * * * d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) -5 -4 1 0 6

the last occurrence of (! 𝒃)𝒄𝒂𝒂𝒃𝒄

Page 91: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

91

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * * * c a a b c d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) -6 -5 -4 1 0 6

the last occurrence of (! 𝒃)𝒄𝒂𝒂𝒃𝒄

Page 92: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

92

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * * * * * * * * d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) -6 -5 -4 1 0 6

the last occurrence of (! 𝒄)𝒃𝒄𝒂𝒂𝒃𝒄

Page 93: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

93

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * * b c a a b c d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) -7 -6 -5 -4 1 0 6

the last occurrence of (! 𝒄)𝒃𝒄𝒂𝒂𝒃𝒄

Page 94: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

94

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * * * * * * * * d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) -7 -6 -5 -4 1 0 6

the last occurrence of (! 𝒅)𝒄𝒃𝒄𝒂𝒂𝒃𝒄

Page 95: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(A)

95

2- Calculate the suffix skip table 𝑆 for the pattern𝑆(𝑗) is defined as the index of the last occurrence of sub-string

! 𝑃 𝑗 . 𝑃[𝑗 + 1 . .𝑚 − 1]

j : -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7

P = * c b c a a b c d c b c a a b c

𝑗 0 1 2 3 4 5 6 7

𝑆(𝑗) -8 -7 -6 -5 -4 1 0 6

the last occurrence of (! 𝒅)𝒄𝒃𝒄𝒂𝒂𝒃𝒄
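The suffix skip table can be checked with a brute-force sketch that follows the definition literally. This is an illustration under stated assumptions, not the method used in the course: the name `suffix_skip` is hypothetical, negative positions play the role of the wildcards, and the cubic running time is accepted for the sake of clarity.

```python
def suffix_skip(pattern):
    """Brute-force suffix skip table.

    S[j] is the rightmost index `start` < j such that the character at `start`
    differs from pattern[j] and pattern[j+1:] matches right after it.
    Negative positions act as wildcards that match anything.
    """
    m = len(pattern)
    table = []
    for j in range(m):
        suffix = pattern[j + 1:]
        # Try candidate positions from right to left; the leftmost candidate
        # lies entirely over wildcards, so the loop always finds a value.
        for start in range(j - 1, -(m - j) - 1, -1):
            if start >= 0 and pattern[start] == pattern[j]:
                continue  # must be a character different from pattern[j]
            if all(pos < 0 or pattern[pos] == ch
                   for pos, ch in enumerate(suffix, start + 1)):
                table.append(start)
                break
    return table


if __name__ == "__main__":
    print(suffix_skip("dcbcaabc"))
```

For the pattern of Question 4(A) this reproduces the table above, including the negative entries that come from sliding the suffix onto the wildcard padding.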

3- Compare the pattern with the text, moving backwards.
When a mismatch occurs at 𝑇[𝑖], set 𝑖 to
𝑖 = 𝑖 + 𝑚 − 1 − min( 𝐿[𝑇[𝑖]], 𝑆[𝑗] )
and restart the comparison from the last pattern character (𝑗 = 𝑚 − 1).

i:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T:  c  b  a  d  b  c  a  c  b  a  d  c  b  b  a  c  a  d  c  b  c  a  a  b  c
P:  d  c  b  c  a  a  b  c
j:  0  1  2  3  4  5  6  7

𝑗    :  0  1  2  3  4  5  6  7
𝑆(𝑗) : -8 -7 -6 -5 -4  1  0  6

𝑥    : a b c d
𝐿(𝑥) : 5 6 7 0

Mismatch at 𝑖 = 6, 𝑗 = 6 (𝑇[6] = a, 𝑃[6] = b):
𝐿[𝑇[𝑖]] = 𝐿[a] = 5
𝑆[𝑗] = 𝑆[6] = 0
⇒ 𝑖 = 6 + 7 − 0 = 13

Mismatch at 𝑖 = 13, 𝑗 = 7 (𝑇[13] = b, 𝑃[7] = c):
𝐿[𝑇[𝑖]] = 𝐿[b] = 6
𝑆[𝑗] = 𝑆[7] = 6
⇒ 𝑖 = 13 + 7 − 6 = 14

Mismatch at 𝑖 = 14, 𝑗 = 7 (𝑇[14] = a, 𝑃[7] = c):
𝐿[𝑇[𝑖]] = 𝐿[a] = 5
𝑆[𝑗] = 𝑆[7] = 6
⇒ 𝑖 = 14 + 7 − 5 = 16

Mismatch at 𝑖 = 16, 𝑗 = 7 (𝑇[16] = a, 𝑃[7] = c):
𝐿[𝑇[𝑖]] = 𝐿[a] = 5
𝑆[𝑗] = 𝑆[7] = 6
⇒ 𝑖 = 16 + 7 − 5 = 18

Mismatch at 𝑖 = 17, 𝑗 = 6 (𝑇[17] = d, 𝑃[6] = b), after 𝑇[18] = 𝑃[7] = c matched:
𝐿[𝑇[𝑖]] = 𝐿[d] = 0
𝑆[𝑗] = 𝑆[6] = 0
⇒ 𝑖 = 17 + 7 − 0 = 24

From 𝑖 = 24 down to 𝑖 = 17 all eight characters match.

Matched! ⇒ Return 𝑖 = 17
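The whole trace can be replayed with a short Python sketch. It is a minimal illustration under assumptions, not the reference implementation from the course: the function names, the brute-force construction of 𝑆, and the `trace` list of (𝑖 at mismatch, 𝑗, new 𝑖) triples are all choices made for this example.

```python
def last_occurrence(pattern, alphabet):
    """L[x] = index of the last occurrence of x in pattern, or -1."""
    table = {ch: -1 for ch in alphabet}
    for j, ch in enumerate(pattern):
        table[ch] = j
    return table


def suffix_skip(pattern):
    """Brute-force S[j]; negative positions act as wildcards."""
    m = len(pattern)
    table = []
    for j in range(m):
        suffix = pattern[j + 1:]
        for start in range(j - 1, -(m - j) - 1, -1):
            if start >= 0 and pattern[start] == pattern[j]:
                continue
            if all(pos < 0 or pattern[pos] == ch
                   for pos, ch in enumerate(suffix, start + 1)):
                table.append(start)
                break
    return table


def boyer_moore(text, pattern):
    """Return (index of first match or -1, list of mismatch steps)."""
    m, n = len(pattern), len(text)
    L = last_occurrence(pattern, set(text) | set(pattern))
    S = suffix_skip(pattern)
    trace = []
    i = j = m - 1
    while i < n and j >= 0:
        if text[i] == pattern[j]:
            i -= 1  # compare backwards
            j -= 1
        else:
            new_i = i + m - 1 - min(L[text[i]], S[j])
            trace.append((i, j, new_i))
            i = new_i
            j = m - 1  # restart at the last pattern character
    return (i + 1 if j < 0 else -1), trace


if __name__ == "__main__":
    T = "cbadbcacbadcbbacadcbcaabc"
    P = "dcbcaabc"
    position, steps = boyer_moore(T, P)
    print(position)
    for step in steps:
        print(step)
```

Each recorded step corresponds to one of the mismatch computations listed above, so the sketch is a convenient way to check a hand trace.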

You can show the result of each comparison in a row of a table:

i:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
T:  c  b  a  d  b  c  a  c  b  a  d  c  b  b  a  c  a  d  c  b  c  a  a  b  c
P:  d  c  b  c  a  a  b  c
                      d  c  b  c  a  a  b  c
                         d  c  b  c  a  a  b  c
                               d  c  b  c  a  a  b  c
                                     d  c  b  c  a  a  b  c
                                                       d  c  b  c  a  a  b  c


QUESTION 4(B)

Compressed Trie (Patricia Trie):

1- Create a regular Trie by inserting each item

2- Convert the regular Trie to a compressed Trie by eliminating nodes with only one child
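The two steps above can be sketched in Python. This is an illustrative sketch under assumptions, not the course's implementation: keys are equal-length, distinct bit strings; the regular Trie is a nested dict keyed by '0'/'1' with the key stored at the leaf; a compressed node is a pair (bit index, children). The names `build_trie`, `compress` and `search` are hypothetical.

```python
def build_trie(keys):
    """Step 1: regular Trie as nested dicts; the leaf value is the key itself."""
    root = {}
    for key in keys:
        node = root
        for bit in key[:-1]:
            node = node.setdefault(bit, {})
        node[key[-1]] = key
    return root


def compress(node, depth=0):
    """Step 2: drop every node that has only one child.

    Returns either a key (leaf) or a pair (bit_index, {bit: subtree}).
    """
    if isinstance(node, str):
        return node  # leaf
    if len(node) == 1:
        (child,) = node.values()
        return compress(child, depth + 1)  # skip the single-child node
    return (depth, {bit: compress(child, depth + 1)
                    for bit, child in node.items()})


def search(node, key):
    """Follow only the checked bits, then compare the whole key at the leaf."""
    while not isinstance(node, str):
        bit_index, children = node
        node = children.get(key[bit_index])
        if node is None:
            return False
    return node == key


if __name__ == "__main__":
    keys = ["0001", "0010", "1000", "1001", "1100", "1101"]
    print(compress(build_trie(keys)))
```

Because skipped bits are never inspected on the way down, the final comparison against the stored key is what makes the search correct.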

1- Creating the regular Trie

1.1- Inserting <0001>: Search <0001> → unsuccessful. Extra bits: 0001 → Expand Trie

1.2- Inserting <0010>: Search <0010> → unsuccessful. Extra bits: 10 → Expand Trie

1.3- Inserting <1000>: Search <1000> → unsuccessful. Extra bits: 1000 → Expand Trie

1.4- Inserting <1001>: Search <1001> → unsuccessful. Extra bits: 1 → Expand Trie

1.5- Inserting <1100>: Search <1100> → unsuccessful. Extra bits: 100 → Expand Trie

1.6- Inserting <1101>: Search <1101> → unsuccessful. Extra bits: 1 → Expand Trie

[Figure: the regular Trie after the six insertions, with leaves 0001, 0010, 1000, 1001, 1100, 1101; each edge is labelled 0 or 1.]

1- Creating the regular Trie: Done!
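The "search, then expand with the extra bits" step can be sketched as follows. This is a minimal illustration under assumptions: the Trie is a nested dict keyed by '0'/'1', a leaf is marked with a hypothetical "$" entry holding the key, and `insert` returns the extra bits so they can be compared with the slides.

```python
def insert(root, key):
    """Insert `key` and return the extra bits that had no path in the Trie."""
    node = root
    depth = 0
    # Search: follow existing edges for as long as possible.
    while depth < len(key) and key[depth] in node:
        node = node[key[depth]]
        depth += 1
    extra = key[depth:]  # bits for which the search was unsuccessful
    # Expand: add one new node per extra bit.
    for bit in extra:
        node = node.setdefault(bit, {})
    node["$"] = key  # mark the leaf with the stored key
    return extra


if __name__ == "__main__":
    trie = {}
    for k in ["0001", "0010", "1000", "1001", "1100", "1101"]:
        print(k, "extra bits:", insert(trie, k))
```

An empty return value would mean the search was successful, which does not happen for the six distinct keys of this question.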

Page 139: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

139

1100 1101100110000001 0010

1

1

11

1

1

0

0

0 0

0

00 0

0

2- Convert to compressed Trie2.1- Write the number of bit is being checked at each node

Page 140: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

140

0

1 1

2

3

2 3

3 3

1100 1101100110000001

3

0010

1

1

11

1

1

0

0

0 0

0

00 0

0

2- Convert to compressed Trie2.1- Write the number of bit is being checked (from the left) at each

node

Page 141: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

141

0

1

2

3

2 3

3 3

1100 1101100110000001

3

0010

1

1

11

1

1

0

0

0 0

0

00 0

0

2- Convert to compressed Trie2.2- Eliminate nodes with only one child

Page 142: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

142

0

1

2

3

2 3

3 3

1100 1101100110000001

3

0010

1

1

11

1

1

0

0

0 0

0

00 0

0

2- Convert to compressed Trie2.2- Eliminate nodes with only one child

Page 143: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

143

0

1

2

3

2 3

3 3

1100 1101100110000001

3

0010

1

1

11

1

1

0 0

0

00 0

0

2- Convert to compressed Trie2.2- Eliminate nodes with only one child

Page 144: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

144

0

1

2

3

2 3

3 3

1100 1101100110000001

3

0010

1

1

11

1

1

0 0

0

00 0

0

2- Convert to compressed Trie2.2- Eliminate nodes with only one child

Page 145: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

145

0

1

2 2 3

3 3

1100 1101100110000001

3

0010

1

1

11

1

1

0 0

0

00 0

0

2- Convert to compressed Trie2.2- Eliminate nodes with only one child

Page 146: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

146

0

1

2 2 3

3 3

1100 1101100110000001

3

0010

1

1

11

1 0

0

00 0

0

2- Convert to compressed Trie2.2- Eliminate nodes with only one child

Page 147: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

147

0

1

2 2 3

3 3

1100 1101100110000001

3

0010

1

1

11

1 0

0

00 0

0

2- Convert to compressed Trie2.2- Eliminate nodes with only one child

Page 148: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

148

0

1

2 2 3

3 3

1100 1101100110000001 0010

1

1

11

1 0

0

00 0

0

2- Convert to compressed Trie2.2- Eliminate nodes with only one child

Page 149: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

149

0

1

2 2 3

3 3

1100 1101100110000001 0010

1

1

11

0

0

0 0

0

2- Convert to compressed Trie2.2- Eliminate nodes with only one child

Page 151: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

151

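2- Convert to compressed Trie

2.2- Eliminate nodes with only one child

[Figure: Trie with leaves 1100, 1101, 1001, 1000, 0001, 0010; internal-node labels as extracted: 0, 1, 2, 3, 3, 3. Edge labels are not reliably recoverable from the extraction.]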

Page 152: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

152

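2- Convert to compressed Trie

2.2- Eliminate nodes with only one child

[Figure: Trie with leaves 1100, 1101, 1001, 1000, 0001, 0010; internal-node labels as extracted: 0, 1, 2, 3, 3, 3. Edge labels are not reliably recoverable from the extraction.]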

Page 154: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

154

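2- Convert to compressed Trie

2.2- Eliminate nodes with only one child

[Figure: Trie with leaves 1100, 1101, 1001, 1000, 0001, 0010; internal-node labels as extracted: 0, 1, 2, 3, 3. Edge labels are not reliably recoverable from the extraction.]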

Page 155: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

155

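2- Convert to compressed Trie

2.2- Eliminate nodes with only one child

[Figure: Trie with leaves 1100, 1101, 1001, 1000, 0001, 0010; internal-node labels as extracted: 0, 1, 2, 3, 3. Edge labels are not reliably recoverable from the extraction.]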

Page 157: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 4(B)

157

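2- Convert to compressed Trie

Done!

[Figure: final compressed Trie with leaves 1100, 1101, 1001, 1000, 0001, 0010; internal-node labels as extracted: 0, 1, 2, 3, 3. Edge labels are not reliably recoverable from the extraction.]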
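
The elimination of single-child nodes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's reference solution: it assumes the six leaf strings visible in the figure (1100, 1101, 1001, 1000, 0001, 0010) are the stored keys, it represents a Trie as nested dictionaries, it labels edges with bit strings, and the helper names `build_trie` and `compress` are hypothetical.

```python
def build_trie(keys):
    """Build an uncompressed binary Trie as nested dicts (one bit per edge)."""
    root = {}
    for key in keys:
        node = root
        for bit in key:
            node = node.setdefault(bit, {})
    return root


def compress(node):
    """Merge every chain of single-child nodes into one edge with a longer label."""
    compressed = {}
    for label, child in node.items():
        # Follow the chain while the child has exactly one child of its own.
        while len(child) == 1:
            next_label, next_child = next(iter(child.items()))
            label += next_label
            child = next_child
        compressed[label] = compress(child)
    return compressed


def count_internal(node):
    """Count nodes that have at least one child."""
    if not node:
        return 0
    return 1 + sum(count_internal(child) for child in node.values())


# Assumption: these are the keys read off the figure's leaves.
keys = ["1100", "1101", "1001", "1000", "0001", "0010"]
compressed_trie = compress(build_trie(keys))
```

For these keys the compressed Trie has five internal nodes, which agrees with the five node labels (0, 1, 2, 3, 3) that survive on the final slide.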

Page 159: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(1)

159

Prefix-free Integer Encoding:

1- Write down the binary representation of the integer (say we need 𝑘 bits for it)

2- Write 𝑘 − 1 zeros before the binary representation
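
The two steps above translate directly into code. The following Python sketch is illustrative only; the function names `prefix_free_encode` and `prefix_free_decode` are assumptions, not names used in the course. The scheme is commonly known as Elias gamma coding.

```python
def prefix_free_encode(n):
    """Return k - 1 zeros followed by the k-bit binary representation of n."""
    # Assumption: n >= 1, so the binary representation starts with a 1.
    if n < 1:
        raise ValueError("n must be a positive integer")
    binary = format(n, "b")                    # step 1: k bits
    return "0" * (len(binary) - 1) + binary    # step 2: k - 1 leading zeros


def prefix_free_decode(bits):
    """Decode one integer from the front of bits; return (value, remaining bits)."""
    zeros = 0
    while bits[zeros] == "0":                  # count the k - 1 leading zeros
        zeros += 1
    k = zeros + 1
    return int(bits[zeros:zeros + k], 2), bits[zeros + k:]
```

Because the number of leading zeros announces the length of the binary part, a decoder always knows where one codeword ends, which is what makes the code prefix-free. For example, `prefix_free_encode(78)` produces `0000001001110`, the 13-bit string derived on the following slides.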

Page 160: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(1)

160

1- Write down the binary representation of the integer

78 = 64 + 8 + 4 + 2

Place value:  128  64  32  16   8   4   2   1
Bits of 78:         1   0   0   1   1   1   0

𝑘 = 7 bits

Page 161: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(1)

161

2- Write 𝑘 − 1 zeros before the binary representation

78 = 64 + 8 + 4 + 2

Place value:  128  64  32  16   8   4   2   1
Bits of 78:         1   0   0   1   1   1   0

𝑘 = 7 bits, so we write 𝑘 − 1 = 6 zeros in front:

0 0 0 0 0 0 1 0 0 1 1 1 0

Page 162: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(1)

162

Done!

78 = 64 + 8 + 4 + 2

Prefix-free encoding of 78: 0 0 0 0 0 0 1 0 0 1 1 1 0

Page 164: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

164

Huffman Tree :

1- Consider a height-0 Trie for each character

2- Assign a weight to each Trie (equal to sum of the frequencies of all characters in Trie)

3- Merge two Tries with the least weights

4- Update the weight of the new Trie.

5- Repeat from step (3) until only one Trie remains.
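
The steps above can be sketched with a priority queue. This is a minimal Python sketch rather than the course's implementation: the function name `huffman_code_lengths` is hypothetical, each Trie is represented only by a map from character to its current depth, and a counter is used as an assumed tie-breaker so that Tries with equal weight are compared deterministically.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {character: code length} for a Huffman Trie built from freqs."""
    tiebreak = count()  # assumption: ties are broken by insertion order
    # Steps 1 and 2: one height-0 Trie per character, weighted by its frequency.
    heap = [(weight, next(tiebreak), {ch: 0}) for ch, weight in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Step 3: take the two Tries with the least weights and merge them.
        w1, _, depths1 = heapq.heappop(heap)
        w2, _, depths2 = heapq.heappop(heap)
        merged = {ch: d + 1 for ch, d in {**depths1, **depths2}.items()}
        # Step 4: the weight of the new Trie is the sum of the two weights.
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


lengths = huffman_code_lengths({"a": 55, "b": 20, "c": 10, "d": 10, "e": 5})
```

For the frequencies used in this question the code lengths are 1 for a, 2 for b, 4 for e, and 3 and 4 for c and d. Which of c and d receives the shorter code depends on how the tie between their equal weights is broken, so a run of this sketch may differ from the slides in that one detail while giving the same total cost.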

Page 165: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

165

1- Consider a height-0 Trie for each character

Tries: c (10), e (5), b (20), d (10), a (55)

Page 166: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

166

2- Assign a weight to each Trie

Weight = sum of the frequencies of all characters in the Trie

Tries: c (10), e (5), b (20), d (10), a (55)

Weights: 10, 5, 20, 10, 55

Page 167: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

167

3- Merge two Tries with the least weights

Tries: c (10), e (5), b (20), d (10), a (55)

Weights: 10, 5, 20, 10, 55

Page 168: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

168

3- Merge two Tries with the least weights

3.1- Select the two Tries with the least weights

Tries: c (10), e (5), b (20), d (10), a (55)

Weights: 10, 5, 20, 10, 55

Page 169: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

169

3- Merge two Tries with the least weights

3.2- Merge the selected Tries

Tries: c (10), e (5), b (20), d (10), a (55)

Weights: 10, 5, 20, 10, 55

Page 170: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

170

3- Merge two Tries with the least weights

3.2- Merge the selected Tries

a) Take one node with two children as the root of the new Trie

Tries: c (10), e (5), b (20), d (10), a (55)

Weights: 10, 5, 20, 10, 55

Page 171: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

171

3- Merge two Tries with the least weights

3.2- Merge the selected Tries

b) Assign one of the Tries as the right child, and the other one as the left child

Tries: c (10), e (5), b (20), d (10), a (55)

Weights: 10, 5, 20, 10, 55

New Trie: root with children e (5) and d (10)

Page 172: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

172

3- Merge two Tries with the least weights

3.2- Merge the selected Tries

c) Delete the old Tries

Tries: c (10), b (20), a (55)

Weights: 10, 20, 55

New Trie: root with children e (5) and d (10)

Page 173: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

173

4- Update the weight of the new Trie

Weight = sum of the frequencies of all characters in the Trie

Tries: c (10), b (20), a (55)

Weights: 10, 20, 55

New Trie: root with children e (5) and d (10), weight 15

Page 174: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

174

3- Merge two Tries with the least weights

Tries: c (10), b (20), a (55), and the Trie with leaves e (5), d (10)

Weights: 10, 20, 55, 15

Page 175: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

175

3- Merge two Tries with the least weights

3.1- Select the two Tries with the least weights

Tries: c (10), b (20), a (55), and the Trie with leaves e (5), d (10)

Weights: 10, 20, 55, 15

Page 176: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

176

3- Merge two Tries with the least weights

3.2- Merge the selected Tries

a) Take one node with two children as the root of the new Trie

Tries: c (10), b (20), a (55), and the Trie with leaves e (5), d (10)

Weights: 10, 20, 55, 15

Page 177: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

177

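3- Merge two Tries with the least weights

3.2- Merge the selected Tries

b) Assign one of the Tries as the right child, and the other one as the left child

Tries: c (10), b (20), a (55), and the Trie with leaves e (5), d (10)

Weights: 10, 20, 55, 15

New Trie: root whose children are the Trie with leaves e (5), d (10) and the Trie c (10)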

Page 178: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

178

4- Update the weight of the new Trie

Weight = sum of the frequencies of all characters in the Trie

Tries: b (20), a (55), and the Trie with leaves e (5), d (10), c (10)

Weights: 20, 55, 25

Page 179: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

179

3- Merge two Tries with the least weights

3.2- Merge the selected Tries

a) Take one node with two children as the root of the new Trie

Tries: b (20), a (55), and the Trie with leaves e (5), d (10), c (10)

Weights: 20, 55, 25

Page 180: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

180

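3- Merge two Tries with the least weights

3.2- Merge the selected Tries

b) Assign one of the Tries as the right child, and the other one as the left child

Tries: b (20), a (55), and the Trie with leaves e (5), d (10), c (10)

Weights: 20, 55, 25

New Trie: root whose children are the Trie with leaves e (5), d (10), c (10) and the Trie b (20)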

Page 181: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

181

4- Update the weight of the new Trie

Weight = sum of the frequencies of all characters in the Trie

Tries: a (55), and the Trie with leaves e (5), d (10), c (10), b (20)

Weights: 55, 45

Page 182: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

182

3- Merge two Tries with the least weights

3.2- Merge the selected Tries

a) Take one node with two children as the root of the new Trie

Tries: a (55), and the Trie with leaves e (5), d (10), c (10), b (20)

Weights: 55, 45

Page 183: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

183

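3- Merge two Tries with the least weights

3.2- Merge the selected Tries

b) Assign one of the Tries as the right child, and the other one as the left child

Tries: a (55), and the Trie with leaves e (5), d (10), c (10), b (20)

Weights: 55, 45

New Trie: root whose children are the Trie with leaves e (5), d (10), c (10), b (20) and the Trie a (55)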

Page 184: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

184

4- Update the weight of the new Trie

Weight = sum of the frequencies of all characters in the Trie

Trie with leaves e (5), d (10), c (10), b (20), a (55): weight 100

Page 185: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(2)

185

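Done!

Final Huffman Trie (weight 100): the root has children a (55) and a subtree of weight 45; that subtree has children b (20) and a subtree of weight 25; that subtree has children c (10) and a subtree of weight 15 with leaves d (10) and e (5).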

Page 187: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(3)

187

Burrows-Wheeler Transform (BWT):

1- Write all cyclic shifts

2- Sort cyclic shifts (lexicographically)

3- Extract last characters from sorted shifts
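
The three steps can be written almost verbatim in Python. This is a minimal sketch for small inputs such as exam questions, with a hypothetical function name `bwt`; it builds every cyclic shift explicitly, which takes quadratic space, and it relies on Python's default string ordering, in which `$` sorts before the uppercase letters.

```python
def bwt(text):
    """Burrows-Wheeler Transform of text (assumed to end with a unique '$')."""
    # Step 1: all cyclic shifts. These are left rotations; the slides rotate
    # to the right, which produces the same set of strings.
    shifts = [text[i:] + text[:i] for i in range(len(text))]
    # Step 2: sort the shifts lexicographically.
    shifts.sort()
    # Step 3: take the last character of every sorted shift.
    return "".join(shift[-1] for shift in shifts)


result = bwt("EXAMPLE$")
```

For the text used in this question, `bwt("EXAMPLE$")` gives `EXL$PAME`, the string read off the last column of the sorted shifts.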

Page 188: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(3)

188

1- Write all cyclic shifts

E X A M P L E $

Page 189: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(3)

189

1- Write all cyclic shifts

Move the last character of the last result to the first place

E X A M P L E $

$ E X A M P L E

Page 190: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(3)

190

1- Write all cyclic shifts

Move the last character of the last result to the first place

E X A M P L E $

$ E X A M P L E

E $ E X A M P L

L E $ E X A M P

P L E $ E X A M

M P L E $ E X A

A M P L E $ E X

X A M P L E $ E

Page 196: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(3)

196

2- Sort cyclic shifts (lexicographically)

E X A M P L E $

$ E X A M P L E

E $ E X A M P L

L E $ E X A M P

P L E $ E X A M

M P L E $ E X A

A M P L E $ E X

X A M P L E $ E

Page 197: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(3)

197

2- Sort cyclic shifts (lexicographically)

a) Sort by the first character

𝐿1:

E X A M P L E $
$ E X A M P L E
E $ E X A M P L
L E $ E X A M P
P L E $ E X A M
M P L E $ E X A
A M P L E $ E X
X A M P L E $ E

𝐿2:

$ E X A M P L E
A M P L E $ E X
E X A M P L E $
E $ E X A M P L
L E $ E X A M P
M P L E $ E X A
P L E $ E X A M
X A M P L E $ E

Page 198: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(3)

198

2- Sort cyclic shifts (lexicographically)

b) Sort the strings having the same character at the first place by their second characters.

𝐿2:

$ E X A M P L E
A M P L E $ E X
E X A M P L E $
E $ E X A M P L
L E $ E X A M P
M P L E $ E X A
P L E $ E X A M
X A M P L E $ E

𝐿3:

$ E X A M P L E
A M P L E $ E X
E $ E X A M P L
E X A M P L E $
L E $ E X A M P
M P L E $ E X A
P L E $ E X A M
X A M P L E $ E

Page 199: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(3)

199

3- Extract last characters from sorted shifts

𝐿3:

$ E X A M P L E
A M P L E $ E X
E $ E X A M P L
E X A M P L E $
L E $ E X A M P
M P L E $ E X A
P L E $ E X A M
X A M P L E $ E

Last characters of 𝐿3: E X L $ P A M E

Page 201: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

201

LZW Compression:

1- Encode the largest possible prefix 𝑦 that is present in the table.

2- Add 𝑥𝑐 to table, where 𝑥 is the previously encoded substring and 𝑐 is the first character of 𝑦 (the prefix just encoded).
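
The two rules can be sketched as a short encoder. This Python sketch follows the formulation above, where the new entry 𝑥𝑐 is added after 𝑦 has been encoded; the function name `lzw_encode` and its `alphabet` parameter are assumptions made for illustration, and the initial table is assumed to contain exactly the single characters in the order given.

```python
def lzw_encode(text, alphabet):
    """Encode text with LZW; return (list of codes, final table)."""
    # Assumption: the initial table maps each alphabet character to 0, 1, 2, ...
    table = {ch: code for code, ch in enumerate(alphabet)}
    codes = []
    prev = None  # x: the previously encoded substring
    i = 0
    while i < len(text):
        # Rule 1: find the longest prefix y of the remaining text in the table.
        j = i + 1
        while j < len(text) and text[i:j + 1] in table:
            j += 1
        y = text[i:j]
        codes.append(table[y])
        # Rule 2: add x + c, where c is the first character of y.
        if prev is not None and prev + y[0] not in table:
            table[prev + y[0]] = len(table)
        prev = y
        i = j
    return codes, table


codes, table = lzw_encode("ATGATCATGAG", "ATGC")
```

With the text and initial table of this question, the sketch emits the codes 0 1 2 4 3 4 6 2 and ends with the entries AT = 4, TG = 5, GA = 6, ATC = 7, CA = 8, ATG = 9, GAG = 10, which agrees with the table shown on the slides.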

Page 202: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

202

1- Encode the largest possible prefix 𝑦 that is present in the table

S = A T G A T C A T G A G

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 203: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

203

1- Encode the largest possible prefix 𝑦 that is present in the table

S = A T G A T C A T G A G

Codes so far: 0

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 204: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

204

2- Add 𝑥𝒄 to table, where 𝑥 is the previously encoded substring and 𝒄 is the first character of 𝑦

S = A T G A T C A T G A G

Codes so far: 0

For the first character, we do not have a previously encoded substring, so we do not add anything to the table.

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 205: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

205

1- Encode the largest possible prefix 𝑦 that is present in the table

S = A T G A T C A T G A G

Codes so far: 0 1

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 206: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

206

2- Add 𝑥𝒄 to table, where 𝑥 is the previously encoded substring and 𝒄 is the first character of 𝑦

S = A T G A T C A T G A G

Codes so far: 0 1

𝑥 = A, 𝒄 = T, so 𝑥𝒄 = AT is added to the table.

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 207: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

207

1- Encode the largest possible prefix 𝑦 that is present in the table

S = A T G A T C A T G A G

Codes so far: 0 1 2

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 208: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

208

2- Add 𝑥𝒄 to table, where 𝑥 is the previously encoded substring and 𝒄 is the first character of 𝑦

S = A T G A T C A T G A G

Codes so far: 0 1 2

𝑥 = T, 𝒄 = G, so 𝑥𝒄 = TG is added to the table.

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 209: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

209

1- Encode the largest possible prefix 𝑦 that is present in the table

S = A T G A T C A T G A G

Codes so far: 0 1 2 4

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 210: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

210

2- Add 𝑥𝒄 to table, where 𝑥 is the previously encoded substring and 𝒄 is the first character of 𝑦

S = A T G A T C A T G A G

Codes so far: 0 1 2 4

𝑥 = G, 𝒄 = A, so 𝑥𝒄 = GA is added to the table.

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 211: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

211

1- Encode the largest possible prefix 𝑦 that is present in the table

S = A T G A T C A T G A G

Codes so far: 0 1 2 4 3

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 212: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

212

2- Add 𝑥𝒄 to table, where 𝑥 is the previously encoded substring and 𝒄 is the first character of 𝑦

S = A T G A T C A T G A G

Codes so far: 0 1 2 4 3

𝑥 = AT, 𝒄 = C, so 𝑥𝒄 = ATC is added to the table.

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 213: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

213

1- Encode the largest possible prefix 𝑦 that is present in the table

S = A T G A T C A T G A G

Codes so far: 0 1 2 4 3 4

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 214: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

214

2- Add 𝑥𝒄 to table, where 𝑥 is the previously encoded substring and 𝒄 is the first character of 𝑦

S = A T G A T C A T G A G

Codes so far: 0 1 2 4 3 4

𝑥 = C, 𝒄 = A, so 𝑥𝒄 = CA is added to the table.

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 215: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

215

1- Encode the largest possible prefix 𝑦 that is present in the table

S = A T G A T C A T G A G

Codes so far: 0 1 2 4 3 4 6

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 216: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

216

2- Add 𝑥𝒄 to table, where 𝑥 is the previously encoded substring and 𝒄 is the first character of 𝑦

S = A T G A T C A T G A G

Codes so far: 0 1 2 4 3 4 6

𝑥 = AT, 𝒄 = G, so 𝑥𝒄 = ATG is added to the table.

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 217: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

217

1- Encode the largest possible prefix 𝑦 that is present in the table

S = A T G A T C A T G A G

Codes so far: 0 1 2 4 3 4 6 2

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 218: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

218

2- Add 𝑥𝒄 to table, where 𝑥 is the previously encoded substring and 𝒄 is the first character of 𝑦

S = A T G A T C A T G A G

Codes so far: 0 1 2 4 3 4 6 2

𝑥 = GA, 𝒄 = G, so 𝑥𝒄 = GAG is added to the table.

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Page 219: PowerPoint Presentation · Sample-Midterm Solution. COMP 4420. COMP 4420 QUESTION 1 (6) 3 Suffix Tree (Suffix Trie): 1- Consider all suffixes of text except for the ones which are

COMP 4420

QUESTION 5(4)

219

S = A T G A T C A T G A G

Codes so far: 0 1 2 4 3 4 6 2

String  Code
A       0
T       1
G       2
C       3
AT      4
TG      5
GA      6
ATC     7
CA      8
ATG     9
GAG     10

Done!

Code =