huffman code decoding




Huffman Code Decoding

Rex Yuan

May 13, 2015

Huffman Code Decoding

We take as input a sequence to be decoded and a table of characters with their corresponding probabilities. From the given characters and probabilities, the algorithm builds a binary tree by repeatedly selecting the two nodes with the lowest probabilities and merging them into a subtree whose probability is the sum of theirs. Starting from the root, each step into the left subtree represents a "1" and each step into the right subtree a "0".
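The tree construction and root-to-leaf traversal described above can be sketched as follows. This is a minimal illustration rather than the author's implementation: it uses Python's `heapq` to pick the two lowest-probability nodes (a substitute for whatever selection method the original used), the names `build_tree` and `decode_with_tree` are hypothetical, and it assumes at least two characters, each given as a string.

```python
import heapq
from itertools import count


def build_tree(prob_table):
    """Return the root of a Huffman tree for {character: probability}.

    A leaf is a character (a string); an internal node is a (left, right) tuple.
    """
    tie = count()  # tie-breaker so the heap never compares nodes directly
    heap = [(p, next(tie), ch) for ch, p in prob_table.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, lowest = heapq.heappop(heap)  # lowest probability
        p2, _, second = heapq.heappop(heap)  # second lowest probability
        # Assumption: the lowest-probability node becomes the left ("1") child.
        heapq.heappush(heap, (p1 + p2, next(tie), (lowest, second)))
    return heap[0][2]


def decode_with_tree(root, bits):
    """Walk from the root: "1" goes left, anything else goes right."""
    out = []
    node = root
    for bit in bits:
        left, right = node
        node = left if bit == "1" else right
        if not isinstance(node, tuple):  # reached a leaf: emit and restart
            out.append(node)
            node = root
    return "".join(out)


root = build_tree({"a": 0.6, "b": 0.3, "c": 0.1})
print(decode_with_tree(root, "01110"))
```

With these probabilities, "c" and "b" are merged first, so the codes come out as a = "0", b = "10", c = "11".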

Time Complexity

Suppose there are m possible characters to be encoded and a sequence of length n to be decoded. The time complexity is then quadratic in m and linear in n, $m^2 + n$, i.e.,

$O(m^2 + n)$

because in the worst case the tree-building operation iterates over all m characters and produces a completely unbalanced tree, and thus a recursion tree containing m levels. The operations required per level increase from 1 to $m - 1$, which yields $O(m^2)$ for building the tree, in addition to the n operations needed to read the sequence during decoding.
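The quadratic term can be made explicit. As a sketch, write k for the number of levels in the completely unbalanced tree (k is notation introduced here for illustration) and assume that level i costs i operations:

```latex
\[
  \sum_{i=1}^{k-1} i \;=\; \frac{k(k-1)}{2} \;=\; O(k^2)
\]
```

For example, k = 4 levels give 1 + 2 + 3 = 6 operations, which matches 4 · 3 / 2.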

Run Time Stats

I created the following table using the UNIX POSIX time function, taking the mean time of 10 trials and rounding it to five digits after the decimal point. All samples use the same ten possible characters, that is, 0 to 9. The input length is the length of the sequence to be decoded.
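A comparable measurement could be scripted as below. This is only a sketch of the averaging and rounding procedure: it uses Python's `time.perf_counter` as a stand-in for the POSIX time function mentioned above, and `mean_runtime` is a hypothetical helper name.

```python
import time


def mean_runtime(func, *args, trials=10, digits=5):
    """Return the mean wall-clock time in seconds of func(*args) over `trials` runs."""
    total = 0.0
    for _ in range(trials):
        start = time.perf_counter()
        func(*args)
        total += time.perf_counter() - start
    # Round the mean to `digits` places after the decimal point.
    return round(total / trials, digits)


# Example: time a cheap built-in call; substitute the decoder under test.
print(mean_runtime(sum, range(1000)))
```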

Run Time Stats

Sample  Input Length  Huffman Decoding
1       35            0.00031
2       37            0.00858
3       368           0.00902


Pseudo Code

Algorithm 1 Huffman Code Decoding Algorithm

procedure DECODE(prob_table, seq)
    tree = a key-value associative array in which each item's key is the set of nodes in that subtree and its value is the subtree's current total probability, according to prob_table
    code_table = an array, with one entry per character in prob_table, which maps each code to its corresponding decoded character, with each code initialised as an empty string
    while more than one node in tree do
        sml_sub_nodes = key of node with smallest probability in tree
        sec_sub_nodes = key of node with second smallest probability in tree
        for node in sml_sub_nodes do
            code_table[node].key = "1" + code_table[node].key
        end for
        for node in sec_sub_nodes do
            code_table[node].key = "0" + code_table[node].key
        end for
        tree[sml_sub_nodes + sec_sub_nodes] = tree[sml_sub_nodes] + tree[sec_sub_nodes]
        delete tree[sml_sub_nodes]
        delete tree[sec_sub_nodes]
    end while
    temp = ""
    result = ""
    for code in seq do
        temp = temp + code
        if temp in code_table then
            result = result + code_table[temp]
            temp = ""
        end if
    end for
    return result
end procedure
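The pseudocode translates fairly directly into Python. The version below is a sketch under a few assumptions not stated in the original: probabilities arrive as a dictionary, the sequence is a string of "0" and "1" symbols, and codes are accumulated in a character-to-code dictionary that is inverted once before decoding (the pseudocode keeps the code-to-character map throughout). The function name `decode` mirrors the procedure name.

```python
def decode(prob_table, seq):
    """Decode a string of "0"/"1" symbols given {character: probability}."""
    # tree: key = tuple of characters in a subtree, value = subtree probability
    tree = {(ch,): p for ch, p in prob_table.items()}
    # codes: character -> code built so far, starting from empty strings
    codes = {ch: "" for ch in prob_table}

    while len(tree) > 1:
        # Linear scans for the two smallest subtrees, as in the pseudocode.
        sml = min(tree, key=tree.get)
        sec = min((k for k in tree if k != sml), key=tree.get)
        for ch in sml:
            codes[ch] = "1" + codes[ch]
        for ch in sec:
            codes[ch] = "0" + codes[ch]
        # Merge the two subtrees and remove the originals.
        tree[sml + sec] = tree.pop(sml) + tree.pop(sec)

    code_table = {code: ch for ch, code in codes.items()}  # code -> character

    temp, result = "", []
    for symbol in seq:
        temp += symbol
        if temp in code_table:  # a complete code has been read
            result.append(code_table[temp])
            temp = ""
    return "".join(result)


print(decode({"a": 0.6, "b": 0.3, "c": 0.1}, "01110"))
```

Because Huffman codes are prefix-free, matching the accumulated string `temp` against the table as soon as it appears is unambiguous. A trailing incomplete code is silently dropped in this sketch.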
