Trie/Suffix Trie/Suffix Tree

Upload: clinton-terry

Post on 17-Dec-2015



Trie

• A trie (from "retrieval") is a multi-way tree structure useful for storing strings over an alphabet. It has been used to store large dictionaries of English (say) words in spelling-checking programs and in natural-language "understanding" programs. Given the data:

– an, ant, all, allot, alloy, aloe, are, ate, be


Trie (Cont.)

• The idea is that all strings sharing a common stem or prefix hang off a common node. When the strings are words over {a..z}, a node has at most 27 children: one for each of the 26 letters plus one for a terminator.

• The elements in a string can be recovered in a scan from the root to the leaf that ends a string. All strings in the trie can be recovered by a depth-first scan of the tree.
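The two points above can be sketched in a few lines of Python. This is only an illustrative sketch, not the slides' implementation: the nested-dict representation, the choice of "$" as the terminator, and the names `make_trie` and `all_strings` are all assumptions made for the example.

```python
TERMINATOR = "$"  # assumed end-of-word marker; the slides only say "a terminator"

def make_trie(words):
    """Build a trie as nested dicts: each key is a symbol, each value a child node."""
    root = {}
    for word in words:
        node = root
        # Strings sharing a prefix walk the same path and reuse the same nodes.
        for symbol in word + TERMINATOR:
            node = node.setdefault(symbol, {})
    return root

def all_strings(node, prefix=""):
    """Recover every stored string with a depth-first scan from this node."""
    for symbol in sorted(node):
        if symbol == TERMINATOR:
            yield prefix  # reached the leaf that ends a string
        else:
            yield from all_strings(node[symbol], prefix + symbol)

words = ["an", "ant", "all", "allot", "alloy", "aloe", "are", "ate", "be"]
trie = make_trie(words)
recovered = list(all_strings(trie))
print(recovered)
```

Because the children are visited in sorted order, the depth-first scan returns the stored words in lexicographic order.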


Suffix Trie

• The idea behind a suffix trie is to assign to each symbol in a text an index corresponding to its position in the text (i.e., the first symbol has index 1 and the last symbol has index n, where n is the number of symbols in the text).


Suffix Trie (Cont.)

• A suffix trie is an ordinary trie in which the input strings are all possible suffixes.

• A suffix of a text [t1 ... tn] is a substring [ti ... tn] where i is an integer between 1 and n.
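As a small illustration of this definition, the sketch below lists every suffix of a text together with its 1-based starting index i. The function name `suffixes` and the sample text "abab" are hypothetical, chosen only for the example.

```python
def suffixes(text):
    """Return (i, suffix) pairs, where suffix i is t_i ... t_n and i is 1-based."""
    n = len(text)
    # Python slices are 0-based, so the suffix starting at position i is text[i - 1:].
    return [(i, text[i - 1:]) for i in range(1, n + 1)]

for i, suffix in suffixes("abab"):
    print(i, suffix)
```

A text of n symbols has exactly n suffixes under this definition, one per starting position.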


Suffix Trie (Cont.)

• To demonstrate the structure of the resulting tree we will build the suffix trie corresponding to the following text:

TEXT:     G O O G O L $
POSITION: 1 2 3 4 5 6 7
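A suffix trie for this text could be built naively by inserting each suffix into an ordinary trie, as in the sketch below. The `Node` class and the helpers `build_suffix_trie`, `contains`, and `leaf_positions` are assumptions made for illustration. This direct construction takes time and space quadratic in the text length, so it is only suitable for short texts.

```python
class Node:
    def __init__(self):
        self.children = {}    # symbol -> Node
        self.position = None  # 1-based start index of the suffix ending at this node

def build_suffix_trie(text):
    """Insert every suffix of text into an ordinary trie."""
    root = Node()
    for start in range(len(text)):
        node = root
        for symbol in text[start:]:
            node = node.children.setdefault(symbol, Node())
        node.position = start + 1  # record where this suffix begins in the text
    return root

def contains(root, pattern):
    """A pattern occurs in the text iff it labels a path starting at the root."""
    node = root
    for symbol in pattern:
        if symbol not in node.children:
            return False
        node = node.children[symbol]
    return True

def leaf_positions(node):
    """Collect the suffix start positions stored at the leaves below node."""
    if not node.children:
        return [node.position]
    found = []
    for child in node.children.values():
        found.extend(leaf_positions(child))
    return found

root = build_suffix_trie("GOOGOL$")
print(sorted(leaf_positions(root)))
print(contains(root, "OGO"), contains(root, "GOG"))
```

Because "$" occurs only at the end of the text, no suffix is a prefix of another, so each of the seven suffixes ends at its own leaf.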


Suffix Trie (Cont.)


Suffix Tree

• The suffix tree is created by compacting every unary node (a node with exactly one child) in the suffix trie.
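One way to picture the compaction is the sketch below, which builds a nested-dict suffix trie for GOOGOL$ and then merges every chain of unary nodes into a single edge labeled with a substring. The function names and the dict representation are assumptions for illustration. Building the full trie first and then compacting it is a simplification for clarity, not an efficient suffix tree construction.

```python
def build_suffix_trie(text):
    """Naive suffix trie as nested dicts (symbol -> child node)."""
    root = {}
    for start in range(len(text)):
        node = root
        for symbol in text[start:]:
            node = node.setdefault(symbol, {})
    return root

def compact(node):
    """Merge each chain of unary nodes into one edge labeled with a substring."""
    compacted = {}
    for label, child in node.items():
        # Follow the chain while the child has exactly one outgoing edge.
        while len(child) == 1:
            (symbol, grandchild), = child.items()
            label += symbol
            child = grandchild
        compacted[label] = compact(child)
    return compacted

tree = compact(build_suffix_trie("GOOGOL$"))
for label, subtree in tree.items():
    print(repr(label), "->", sorted(subtree))
```

After compaction, every internal node other than the root has at least two children, and the tree still has one leaf per suffix.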


Suffix Tree (Cont.)