assignment no 6 huffmancoding

Upload: saad-iftikhar

Post on 03-Jun-2018


  • 8/12/2019 Assignment No 6 HuffmanCoding


    Assignment # 6

    Huffman coding

    Saad Iftikhar 039

    Munhal Imran

    Muhammad Hassan Zia 029

    Moeez Aslam

    Hamza Hashmi

    Junaid Afzal Swatti

    Ali Tausif 061

    Armaghan Ahmed

    Zohair Fakhar

    Mohsin Altaf

    Instructor:

    Sir Qasim Umer Khan


    Huffman Coding Implementation in MATLAB

    Abstract: In this assignment we implemented the Huffman entropy encoding algorithm for data compression. Testing with several data sets gave acceptable results and confirmed that the more similar the elements of a data set are, the better the compression achieved by the Huffman algorithm.

    I. INTRODUCTION

    In computer science and information theory, Huffman coding is an entropy encoding algorithm used for lossless data compression. The term refers to the use of a variable-length code table for encoding a source symbol (such as a character in a file), where the variable-length code table has been derived in a particular way based on the estimated probability of occurrence for each possible value of the source symbol. Huffman coding uses a specific method for choosing the representation for each symbol, resulting in a prefix code (sometimes called a "prefix-free code"; that is, the bit string representing some particular symbol is never a prefix of the bit string representing any other symbol) that expresses the most common source symbols using shorter strings of bits than are used for less common source symbols.
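The prefix property and the benefit of shorter codes for frequent symbols can be checked numerically. The sketch below uses a made-up four-symbol code table (the symbols, probabilities and codewords are assumptions for illustration, not output of the assignment code) and compares the average code length with a 2-bit fixed-length code.

```matlab
% Hypothetical code table for four symbols a, b, c, d (not from the assignment):
% the most probable symbol gets the shortest codeword
prob  = [0.5 0.25 0.125 0.125];
codes = {'0', '10', '110', '111'};

% Prefix-free check: no codeword may be the beginning of another codeword
isPrefixFree = true;
for i = 1:numel(codes)
    for j = 1:numel(codes)
        if i ~= j && strncmp(codes{i}, codes{j}, length(codes{i}))
            isPrefixFree = false;
        end
    end
end

% Average code length in bits per symbol, to compare with 2 bits fixed-length
avgLength = sum(prob .* cellfun(@length, codes));
fprintf('prefix-free: %d, average length: %.2f bits (fixed-length: 2 bits)\n', ...
        isPrefixFree, avgLength);
```

With these assumed probabilities the average length is 1.75 bits per symbol, below the 2 bits a fixed-length code would need.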

    II. ASSIGNMENT

    In this assignment we were required to implement the Huffman algorithm in MATLAB.

    III. PERFORMANCE

    The MATLAB code follows.

    Class of Huffman code:

    %----- Huffman coding -----%
    %----- version 1 -----%
    %----- data structure (classes) -----%
    %----- 18-12-2013 -----%
    classdef huffman
        % data structure: the values that each node of the tree holds
        properties
            leftNode = []
            rightNode = []
            probability
            code = [];
            symbol
            huffy % stores the Huffman code, just for checking
        end
    end

    Code for probability finding of data:

    %--- calculating frequency of elements ---%
    %--- Saad Iftikhar ---%
    %--- 17 December 2013 ---%
    %% calculate how many times each number occurs
    function [data_unique, data_freq] = frequency(data)
    % data = [22 33 55 66 11 22 33 44 66];
    % unique() returns an array sorted in ascending order in which every
    % distinct element appears exactly once
    data_unique = unique(data);
    data_count = zeros(1, length(data_unique));
    for i = 1:length(data_unique)
        % number of occurrences of the i-th unique element in data
        data_count(i) = sum(data == data_unique(i));
    end
    % normalize the counts so that the frequencies sum to 1
    data_freq = data_count / sum(data_count);
    end
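As a quick check of the frequency calculation, the following sketch computes the same quantities for the sample vector that appears in the comment of the function. It uses `unique` together with `accumarray` instead of a loop; this is an alternative formulation for illustration, not the authors' code, and the variable names are assumptions.

```matlab
data = [22 33 55 66 11 22 33 44 66];     % sample vector from the comment in frequency()
[dataUnique, ~, idx] = unique(data);     % idx maps every element to its unique value
counts   = accumarray(idx(:), 1).';      % number of occurrences of each unique value
dataFreq = counts / sum(counts);         % relative frequencies, summing to 1
disp([dataUnique; dataFreq]);            % first row: values, second row: frequencies
```

For this vector the unique values are 11, 22, 33, 44, 55, 66 with counts 1, 2, 2, 1, 1, 2 out of 9 elements.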

    Conversion of data from binary to decimal:

    %--- converting binary data to decimal symbols ---%
    %--- 20-21 December 2013 ---%
    %% convert the data to decimal
    function [convData] = dataConv(data, M)
    % k always comes out a whole number, not a fraction (M is a power of 2)
    % M = 8;
    k = log2(M);
    convData = [];
    remainder = mod(length(data), k);
    % check whether the data length is exactly divisible by k;
    % otherwise zero bits are added in front of the data
    if (remainder ~= 0)
        append = k - remainder;
    else
        append = 0;
    end
    data = [zeros(1, append) data];
    for dataLength = 1:k:length(data)
        % build the k-bit string without spaces between the digits
        string = sprintf('%d', data(dataLength:dataLength+k-1));
        decimal = bin2dec(string);
        convData = [convData decimal];
    end
    end
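To make the grouping step concrete, the sketch below converts the 10-bit test vector used in the main script with M = 4, so that every pair of bits becomes one decimal symbol. It replaces the loop and `bin2dec` by a `reshape` and a multiplication with the binary place values; this is an equivalent formulation chosen for brevity, not the authors' implementation.

```matlab
information = [0 0 1 0 1 0 0 0 0 1];   % test vector used in the main script
M = 4;
k = log2(M);                            % bits per symbol, here 2
pad  = mod(-length(information), k);    % zero bits needed to reach a multiple of k
bits = [zeros(1, pad) information];     % padding goes in front, as in dataConv
groups   = reshape(bits, k, []).';      % one k-bit group per row
weights  = 2 .^ (k-1:-1:0).';           % binary place values, most significant first
convData = (groups * weights).';        % decimal value of every group
disp(convData);                         % 0 2 2 0 1
```

The bit pairs 00, 10, 10, 00, 01 become the symbols 0, 2, 2, 0, 1.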

    Main code of Huffman:

    %--- main code of the Huffman coding algorithm using classes ---%
    %--- Saad Iftikhar 18-12-2013 ---%
    %--- creating binary tree ---%
    %% Huffman code using classes
    % function codedData = sourceHuffman(information, M)
    clc;
    clear symbol;
    clear codeHuff;
    clear codeBits;
    clear arrr;
    %% initializing
    global symbol;   % global variable that holds the symbols
    global codeHuff; % global variable that holds the Huffman codes of the symbols
    global codeBits; % global variable that holds the code length of each symbol
    decodeData = [];
    symbol = [];
    codeHuff = [];
    codeBits = [];
    M = 4;
    information = [0 0 1 0 1 0 0 0 0 1];
    % information = randint(1,3000);
    % convert the data from binary to decimal format, as the rest of the
    % program is written for decimal symbols
    convdata = dataConv(information, M);
    [data, prob] = frequency(convdata);

    %% Empty arrays of huffman objects
    array = huffman.empty(length(prob), 0);
    array_final = huffman.empty(length(prob), 0);

    %% Initial assignment: save the probability and the symbol of every
    % number in the corresponding properties of the class
    for i = 1:length(data)
        array(i).probability = prob(i);
        array(i).symbol = data(i);
        array_final(i).probability = prob(i);
        array_final(i).symbol = data(i);
    end
    % temporary array used for the sorting; the algorithm used is
    % bubble sort in ascending order
    temparray = array;

    %% Creating the binary tree
    % size(a,2) gives the number of columns.
    % In the binary tree every parent node has two children: the lower
    % probability one on the left and the higher probability one on the right.
    % To create the tree we iterate (number of nodes - 1) times; the number
    % of columns is used because size always returns a two-dimensional size.
    for k = 1:size(temparray,2)-1
        % first sort the temporary array using bubble sort
        for i = 1:size(temparray,2)
            for j = 1:size(temparray,2)-1
                if (temparray(j).probability > temparray(j+1).probability)
                    % swap operation
                    tempnode = temparray(j);
                    temparray(j) = temparray(j+1);
                    temparray(j+1) = tempnode;
                end
            end
        end

        % create a new node of the class huffman
        newnode = huffman;

        % the two lowest-probability nodes are merged into a single node
        % whose probability is the sum of the two
        newnode.probability = temparray(1).probability + temparray(2).probability;

        % label the lower-probability node 0 and the higher-probability node 1
        temparray(1).code = [0];
        temparray(2).code = [1];

        % attach the children to the newly created parent node
        newnode.leftNode = temparray(1);
        newnode.rightNode = temparray(2);

        % remove the two child nodes and replace them by the parent node,
        % just as in C++ we would replace the pointers to the children by
        % a pointer to the parent
        temparray = temparray(3:size(temparray,2)); % first two nodes are gone

        % prepend the new parent node
        temparray = [newnode temparray];
    end % end of the loop; the binary tree is created

    rootNode = temparray(1); % the root node is always the first node
    le_code = [];            % will hold the final Huffman code

    %% Looping through the tree
    % see the recursive function traverse
    final_data = []; % variable definition, used later
    check = huffman;
    % traverse the tree and generate the Huffman codes
    f = traverse(rootNode, le_code);

    % replace every data symbol with its Huffman code
    for codeLength = 1:length(convdata)
        for inner = 1:length(symbol)
            if (convdata(codeLength) == symbol(inner))
                level = sum(codeBits(1:inner-1)) + 1;
                final_data = [final_data codeHuff(level:level+codeBits(inner)-1)];
            end
        end
    end
    % codedData = final_data
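The merge-and-label procedure of the main script can also be sketched without the class, which makes it easy to follow on the small test vector. The version below is an illustrative simplification under stated assumptions: it works on symbol counts instead of probabilities, uses `sort` instead of bubble sort, and keeps the codewords as character vectors in a cell array. All variable names are hypothetical, and the codewords may differ from those of the class-based code when probabilities tie.

```matlab
convData = [0 2 2 0 1];                   % symbols of the test vector for M = 4
[symbols, ~, idx] = unique(convData);
counts = accumarray(idx(:), 1).';         % [2 1 2] for the symbols [0 1 2]

n = numel(symbols);
codes  = repmat({''}, 1, n);              % codeword of every symbol
groups = num2cell(1:n);                   % symbol indices below every tree node
weight = counts;
while numel(weight) > 1
    [weight, order] = sort(weight);       % ascending; sort keeps the order of ties
    groups = groups(order);
    for s = groups{1}, codes{s} = ['0' codes{s}]; end   % lower weight gets bit 0
    for s = groups{2}, codes{s} = ['1' codes{s}]; end   % higher weight gets bit 1
    weight = [weight(1) + weight(2), weight(3:end)];    % merge the two nodes
    groups = [{[groups{1} groups{2}]}, groups(3:end)];
end

encoded = [codes{idx}];                   % concatenate the codeword of every symbol
fprintf('%s (%d bits instead of %d)\n', encoded, length(encoded), 2 * numel(convData));
```

Here the symbols 0, 1, 2 receive the codewords 11, 10, 0, so the five symbols take 8 bits instead of the original 10.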

    Traversal code for code generation:

    %--- function for traversal of the binary tree ---%
    %--- 20-12-2013 ---%
    %--- algorithm for traversing the whole tree to get the code ---%
    function f = traverse(tempNode, codec)
    % global variables store the symbols, codes and code lengths, because
    % inside a recursive function local variables are continuously overwritten
    global symbol;
    global codeHuff;
    global codeBits;
    if ~isempty(tempNode) % is there a node at this position or not
        codec = [codec tempNode.code]; % append to the code of the previous nodes
        if ~isempty(tempNode.symbol)
            % disp(tempNode.symbol);
            tempNode.huffy = [codec];
            % disp(codec);
            symbol = [symbol tempNode.symbol];
            codeHuff = [codeHuff codec];
            codeBits = [codeBits length(codec)];
        end
        traverse(tempNode.leftNode, codec);
        traverse(tempNode.rightNode, codec);
    end
    f = codec;
    end

    Code for decoding of Huffman:

    %--- Huffman decoding algorithm ---%
    %--- Saad Iftikhar ---%
    %--- 21-12-2013 ---%
    %% function decodedData = decodeHuffman(data, M, rootNode)
    % traverse the data and create the decoded string using the structures
    function [realdata, olright] = dHuffman(tempNode, data, M)
    %% traverse the tree; when a leaf is reached, the value of the leaf
    % node is assigned to the data vector
    decoded = []; % defining variables
    realdata = [];
    i = 1;
    k = 1;
    centerNode = tempNode; % variable of class huffman
    while(k [...] length(data)) % if the data is over, this is the leaf
                realdata = [realdata centerNode.symbol];
                flag = 1;
                k = i + 1;
                break
            end
        end
    end
    %% convert the data from decimal back to binary format
    binaryReal = dec2bin(realdata, log2(M));
    binaryReal1 = binaryReal';
    binaryRealFinal = binaryReal1(:);
    binaryRealFinal = binaryRealFinal';
    for loop = 1:length(binaryRealFinal)
        olright(loop) = str2double(binaryRealFinal(loop));
    end
    end
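Because a Huffman code is prefix-free, decoding can also be done by reading bits until they match a complete codeword. The sketch below illustrates that idea with a lookup table instead of a walk over the tree, so it is a stand-in technique and not a reconstruction of the listing above. The code table, the bit string and all variable names are assumptions made for the example; the final step mirrors the `dec2bin` conversion in the listing.

```matlab
% Hypothetical code table for the symbols 0, 1, 2 (an assumption for this example)
symbols = [0 1 2];
codes   = {'11', '10', '0'};
encoded = '11001110';
M = 4;

decoded = [];
buffer  = '';
for b = 1:length(encoded)
    buffer = [buffer encoded(b)];            % read one more bit
    hit = find(strcmp(buffer, codes), 1);    % does the buffer equal a codeword?
    if ~isempty(hit)
        decoded = [decoded symbols(hit)];    % emit the symbol and start over
        buffer  = '';
    end
end

% Convert the decimal symbols back to bits, log2(M) bits per symbol
binaryRows = dec2bin(decoded, log2(M));      % one row of '0'/'1' characters per symbol
bitsOut = reshape(binaryRows.', 1, []) - '0';
disp(bitsOut);
```

For this input the decoded symbols are 0, 2, 2, 0, 1, which expand to the ten bits of the test vector used in the main script.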


    Results:

    information is the original data; it is 21 bits long. final_data is the Huffman-compressed data; its size is greatly reduced, to 10 bits, with M=8 and k=3, because the data in this case is quite similar.
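The reduction can be expressed as a compression ratio. The short calculation below uses only the two lengths reported in this result; the variable names are chosen for illustration.

```matlab
originalBits   = 21;    % length of information, as reported in the result
compressedBits = 10;    % length of final_data, as reported in the result
ratio  = originalBits / compressedBits;        % 2.1
saving = 1 - compressedBits / originalBits;    % about 0.52
fprintf('compression ratio %.2f, saving %.1f%%\n', ratio, 100 * saving);
```

Note that this counts only the encoded bits; the code table that a receiver would need is not included in the 10 bits.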


    This window shows that final_data, the Huffman-encoded data, returns the original data when it is passed to the decoding function: sum(ol==information) returns 21, which means that all 21 bits of the original data and the decoded data match.

    IV. CONCLUSION

    The implementation of the Huffman lossless entropy encoding compression algorithm has confirmed the notion that when the data has many similar elements in it, this compression reduces the length of the code and hence increases the entropy of the encoded stream, i.e. the amount of useful information sent per bit.
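The link between similar data and better compression can be illustrated with the Shannon entropy of the source, which is a lower bound on the average number of bits per symbol that any lossless symbol code can reach. The two distributions below are made up for the comparison and are not measurements from the assignment.

```matlab
pUniform = [0.25 0.25 0.25 0.25];   % hypothetical: four equally likely symbols
pSkewed  = [0.85 0.05 0.05 0.05];   % hypothetical: one symbol dominates the data
shannonEntropy = @(p) -sum(p .* log2(p));   % bits per symbol, assumes all p > 0
fprintf('uniform: %.3f bits/symbol\n', shannonEntropy(pUniform));
fprintf('skewed:  %.3f bits/symbol\n', shannonEntropy(pSkewed));
```

The uniform source needs 2 bits per symbol, while the skewed source needs less than 1 bit per symbol on average, which is why data with many repeated elements compresses better.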