1 huffman codes. 2 ascii use same size encoding for all characters. variable length codes can...

15
1 Huffman Codes

Upload: michael-owens

Post on 18-Jan-2018



TRANSCRIPT


Huffman Codes


Huffman Codes

• ASCII uses the same encoding size for all characters.

• Variable-length codes can produce shorter messages than fixed-length codes.

• Huffman's algorithm constructs an optimal variable-length code for an alphabet in which the frequency (probability) of each character is known.


Average Length for a Character Set

character    A     B     C     D     E
probability  0.2   0.1   0.1   0.15  0.45
length       2     3     4     4     1

Expected length = 0.2*2 + 0.1*3 + 0.1*4 + 0.15*4 + 0.45*1 = 2.15
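The expected length is the sum, over all characters, of probability times code length. The following is a minimal sketch of that arithmetic; the function name expected_length is an illustrative assumption and does not appear in the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expected code length: sum of probability[i] * length[i].
// Both vectors must list the characters in the same order.
// (expected_length is a hypothetical helper, not part of the slides.)
double expected_length(const std::vector<double>& probability,
                       const std::vector<int>& length) {
    assert(probability.size() == length.size());
    double total = 0.0;
    for (std::size_t i = 0; i < probability.size(); ++i) {
        total += probability[i] * length[i];
    }
    return total;
}
```

Called with the probabilities {0.2, 0.1, 0.1, 0.15, 0.45} and the lengths {2, 3, 4, 4, 1} from the table, it returns 2.15 up to floating-point rounding.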


Immediate Decodability

• Prefix codes: no valid code word is a prefix of another valid code word.
  – Such codes are immediately decodable: each character can be output as soon as its last bit has been read.
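The prefix condition can be checked mechanically by comparing every pair of code words. The sketch below is one simple way to do that; the names is_prefix_free and check_transcript_codes are illustrative assumptions, and the quadratic pairwise comparison is chosen for clarity rather than speed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns true if no code word is a prefix of a different code word.
// (is_prefix_free is a hypothetical helper, not part of the slides.)
bool is_prefix_free(const std::vector<std::string>& codes) {
    for (std::size_t i = 0; i < codes.size(); ++i) {
        for (std::size_t j = 0; j < codes.size(); ++j) {
            if (i == j) continue;
            // compare(...) == 0 means codes[j] starts with codes[i].
            if (codes[j].compare(0, codes[i].size(), codes[i]) == 0) {
                return false;
            }
        }
    }
    return true;
}

// Usage check with the code words from the example result later in this
// transcript, plus one set that violates the prefix condition.
void check_transcript_codes() {
    assert(is_prefix_free({"011", "000", "001", "010", "1"}));
    assert(!is_prefix_free({"0", "01", "1"}));  // "0" is a prefix of "01"
}
```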


Huffman's code

• Optimal codes
  – For a given character set with known frequencies (probabilities)
  – Immediately decodable
  – Average message length over a large number of messages is minimal


Huffman's Algorithm

• Huffman_tree_node: ch, freq, *left and *right
• Initialize a list of Huffman_tree_node
  – One node for each character, containing the character and its frequency; left and right are NULL.
• While there is more than one node in the list:
  – Find the two nodes with minimal frequency.
  – Remove those nodes from the list and make a new node whose left and right subtrees are the nodes just removed; the new node's frequency is the sum of the left and right nodes' frequencies.
  – Label the arc to the left subtree with 0.
  – Label the arc to the right subtree with 1.
  – Add the new node to the list.
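The steps above can be sketched compactly as follows. This is not the list-based implementation shown later in the slides: as an assumption for brevity, it swaps in std::priority_queue as a min-heap and stores children as indices instead of pointers. The names Node and huffman_codes are hypothetical. Because ties between equal frequencies and the left/right orientation are resolved differently here, the bit patterns may differ from the slides' example, but the code lengths agree.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Illustrative node layout (indices instead of the slides' pointers).
struct Node {
    char ch;
    double freq;
    int left;   // index of left child, -1 for a leaf
    int right;  // index of right child, -1 for a leaf
};

// Builds a Huffman tree from (character, frequency) pairs and returns the
// code word of every character. Left arcs are 0, right arcs are 1.
std::map<char, std::string> huffman_codes(
        const std::vector<std::pair<char, double>>& symbols) {
    assert(!symbols.empty());
    std::vector<Node> nodes;
    // Min-heap of (frequency, node index): smallest frequency on top.
    using Entry = std::pair<double, int>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (const auto& s : symbols) {
        nodes.push_back({s.first, s.second, -1, -1});
        heap.push({s.second, static_cast<int>(nodes.size()) - 1});
    }
    // Repeatedly merge the two nodes with minimal frequency.
    while (heap.size() > 1) {
        Entry a = heap.top(); heap.pop();
        Entry b = heap.top(); heap.pop();
        nodes.push_back({'\0', a.first + b.first, a.second, b.second});
        heap.push({a.first + b.first, static_cast<int>(nodes.size()) - 1});
    }
    // Walk the tree from the root, appending 0 for left and 1 for right.
    std::map<char, std::string> codes;
    std::vector<std::pair<int, std::string>> stack;
    stack.emplace_back(heap.top().second, std::string());
    while (!stack.empty()) {
        int index = stack.back().first;
        std::string code = stack.back().second;
        stack.pop_back();
        const Node& n = nodes[index];
        if (n.left < 0) {
            // A one-character alphabet still needs one bit.
            codes[n.ch] = code.empty() ? "0" : code;
        } else {
            stack.emplace_back(n.left, code + "0");
            stack.emplace_back(n.right, code + "1");
        }
    }
    return codes;
}
```

For the frequencies A 0.2, B 0.1, C 0.1, D 0.15, E 0.45, this sketch gives E a 1-bit code and the other four characters 3-bit codes.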


Example

character    A     B     C     D     E
probability  0.2   0.1   0.1   0.15  0.45

Initial list, ordered by frequency:

node   B     C     D     A     E
freq   0.1   0.1   0.15  0.2   0.45
left   NULL  NULL  NULL  NULL  NULL
right  NULL  NULL  NULL  NULL  NULL

Remove the two nodes with minimal frequency, B and C. Make a new Huffman_tree_node BC with left = B, right = C, freq = 0.1 + 0.1 = 0.2, and put the new node into the list:

node   D     A     BC    E
freq   0.15  0.2   0.2   0.45
left   NULL  NULL  B     NULL
right  NULL  NULL  C     NULL

Next select D and A. Keep doing this, visualizing the tree as it grows, until only one node is left in the list.


Example Result

character     A     B     C     D     E
probability   0.2   0.1   0.1   0.15  0.45
Huffman code  011   000   001   010   1
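With the code table above, encoding a message is just a lookup and concatenation per character. The following is a minimal sketch; the names kExampleCodes and encode are illustrative assumptions, and the table entries are copied from the example result.

```cpp
#include <cassert>
#include <map>
#include <string>

// Code table copied from the example result in this transcript.
// (kExampleCodes and encode are hypothetical names.)
const std::map<char, std::string> kExampleCodes = {
    {'A', "011"}, {'B', "000"}, {'C', "001"}, {'D', "010"}, {'E', "1"}};

// Concatenates the code word of each character in the message.
std::string encode(const std::string& message,
                   const std::map<char, std::string>& codes) {
    std::string bits;
    for (char ch : message) {
        auto it = codes.find(ch);
        assert(it != codes.end());  // every character must be in the table
        bits += it->second;
    }
    return bits;
}
```

For instance, encode("BEAD", kExampleCodes) yields "0001011010": 10 bits for four characters.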


Huffman Decoding Algorithm

HuffmanDecode(Huffman_Tree ht, String s)

    Initialize pointer p to the root of the Huffman tree.
    While the end of the message string is not reached:
        Move p to the left child if the next bit is 0; otherwise move p to the right child.
        If p points to a leaf:
            Display the character at that leaf.
            Reset p to the root of the Huffman tree.


Huffman Decoding Algorithm

• Decoding 0001011010
• Remember that a Huffman code is immediately decodable: 000 1 011 010 decodes to B E A D
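The decoding walk can be sketched as follows. As an assumption for self-containment, the tree is rebuilt from a code table rather than taken from the slides' Huffman_Tree class, and the names TreeNode, build_tree and huffman_decode are hypothetical. The sketch returns the decoded text instead of displaying it.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Illustrative tree node; not the slides' Huffman_tree_node.
struct TreeNode {
    char ch = '\0';
    std::unique_ptr<TreeNode> left;   // followed on bit 0
    std::unique_ptr<TreeNode> right;  // followed on bit 1
    bool is_leaf() const { return !left && !right; }
};

// Rebuilds the decoding tree from a code table (character -> bit string).
std::unique_ptr<TreeNode> build_tree(const std::map<char, std::string>& codes) {
    auto root = std::make_unique<TreeNode>();
    for (const auto& entry : codes) {
        TreeNode* p = root.get();
        for (char bit : entry.second) {
            auto& child = (bit == '0') ? p->left : p->right;
            if (!child) child = std::make_unique<TreeNode>();
            p = child.get();
        }
        p->ch = entry.first;
    }
    return root;
}

// Walk down one arc per bit, emit the character at a leaf, reset to the root.
std::string huffman_decode(const TreeNode& root, const std::string& bits) {
    std::string out;
    const TreeNode* p = &root;
    for (char bit : bits) {
        assert(bit == '0' || bit == '1');
        p = (bit == '0') ? p->left.get() : p->right.get();
        assert(p != nullptr);  // input must be a valid code sequence
        if (p->is_leaf()) {
            out += p->ch;
            p = &root;
        }
    }
    return out;
}
```

Built from the table A 011, B 000, C 001, D 010, E 1, the tree decodes "0001011010" to "BEAD", matching the example.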


Implementing a Huffman Code Program

class Huffman_tree_node
{
    // functions to set ch, freq, left and right

    char ch;
    double freq;
    Huffman_tree_node *left, *right;
};


Huffman_Tree

class Huffman_Tree
{
public:
    Huffman_Tree(void);
    ~Huffman_Tree(void) {};

    // Add a single node tree to the list.
    void Add(char c, double frequency);

    // Return type is not given on the slide; void is assumed here.
    void Make_Huffman_code_Tree(void);

private:
    list<Huffman_tree_node> node_list;
};


main.cpp

#include <iostream>
#include "Huffman_Tree.h"   // header name assumed; not shown on the slides
using namespace std;

Huffman_Tree huffman_tree;

int main(void)
{
    cout << "This is the Huffman code program.\n\n";

    huffman_tree.Add('a', 0.2 );
    huffman_tree.Add('b', 0.1 );
    huffman_tree.Add('c', 0.1 );
    huffman_tree.Add('d', 0.15);
    huffman_tree.Add('e', 0.45);
}


Implementing Huffman’s Algorithm

• Function Make_Huffman_code_Tree
• Repeatedly:
  – Sort the list of trees by frequency.
  – Remove the first two trees.
  – Create a new node with these trees as subtrees.
    • Frequency is the sum of their frequencies.
  – Add the new node to the list.
• Continue until there is only one node on the list.


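Huffman_Tree.cpp

void Huffman_Tree::Make_Huffman_code_Tree()
{
    while (node_list.size() > 1)
    {
        // Sort first so the two lowest-frequency trees are at the front.
        // list::sort() needs operator< on Huffman_tree_node (by frequency).
        node_list.sort();

        Huffman_tree_node* cf1 = new Huffman_tree_node(node_list.front());
        node_list.pop_front();
        Huffman_tree_node* cf2 = new Huffman_tree_node(node_list.front());
        node_list.pop_front();

        // New internal node: no character, frequency is the sum of the subtrees.
        Huffman_tree_node cf3(0, cf1->Freq() + cf2->Freq(), cf1, cf2);
        node_list.push_back(cf3);
    }
}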