lecture6: arithmetic codes - guceee.guc.edu.eg/courses/communications/comm901... · lecture6:...

SOURCE CODING PROF. A.M.ALLAM

LECTURES 11/13/2018

Lecture6: ARITHMETIC CODES

It has been shown that the Huffman algorithm will generate a code whose rate is within Pmax + 0.086 of the entropy, where Pmax is the probability of the most frequently occurring symbol.

In applications where the alphabet size is large, Pmax is generally quite small, and the deviation of the average code length from the entropy (in absolute terms or as a percentage of the rate) is quite small.

However, in cases where the alphabet size is small and the probabilities of occurrence of the different letters are skewed, Pmax can be quite large and the Huffman code can become rather inefficient when compared to the entropy.

Ex: Find the Huffman code for the following source given the corresponding probabilities

Letter Probability Code

a1 0.95 0

a2 0.02 10

a3 0.03 11

[Figure: Huffman tree for this source, with the branches labelled 0 and 1]

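$$L = 0.95 \times 1 + 0.02 \times 2 + 0.03 \times 2 = 1.05 \text{ bits/symbol}$$

$$H = 0.95 \log_2 \frac{1}{0.95} + 0.02 \log_2 \frac{1}{0.02} + 0.03 \log_2 \frac{1}{0.03} = 0.335 \text{ bits/symbol}$$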

The redundancy is ρ = L − H = 1.05 − 0.335 = 0.715 bits/symbol, which is 0.715/0.335 ≈ 213% of the entropy; i.e., to code this sequence we would need more than twice the number of bits promised by the entropy
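The figures in this example can be checked numerically. The following is a minimal Python sketch, assuming the probabilities and codeword lengths of the example above; the names `probabilities` and `code_lengths` are illustrative and do not come from the lecture.

```python
import math

# Source model and Huffman codeword lengths from the example above
# (the dictionary names are illustrative, not part of the lecture).
probabilities = {"a1": 0.95, "a2": 0.02, "a3": 0.03}
code_lengths = {"a1": 1, "a2": 2, "a3": 2}  # codewords 0, 10, 11

# Entropy H = sum of p * log2(1/p), in bits/symbol.
entropy = sum(p * math.log2(1 / p) for p in probabilities.values())

# Average codeword length L = sum of p * length, in bits/symbol.
average_length = sum(probabilities[s] * code_lengths[s] for s in probabilities)

# Redundancy: how far the rate of the code lies above the entropy.
redundancy = average_length - entropy

print(f"H = {entropy:.3f} bits/symbol")
print(f"L = {average_length:.2f} bits/symbol")
print(f"redundancy = {redundancy:.3f} bits/symbol "
      f"({100 * redundancy / entropy:.0f}% of the entropy)")
```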




Encoding the source symbols in longer blocks of symbols can get a rate closer to entropy

Letter Probability Code

a1a1 0.9025 0

a1a2 0.0190 111

a1a3 0.0285 100

a2a1 0.0190 1101

a2a2 0.0004 110011

a2a3 0.0006 110001

a3a1 0.0285 101

a3a2 0.0006 110010

a3a3 0.0009 110000

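$$L = 1.222 \text{ bits/block} = 1.222/2 \text{ bits/symbol} = 0.611 \text{ bits/symbol}$$

$$H = 0.335 \text{ bits/symbol}$$

The redundancy is ρ = 0.611 − 0.335 = 0.276 bits/symbol, which is 0.276/0.335 ≈ 82% of the entropy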
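The effect of blocking on the Huffman rate can be reproduced with a short Python sketch. It assumes the source symbols are independent, so that the probability of a block is the product of its symbol probabilities; the helper names `huffman_average_length` and `rate_per_symbol` are hypothetical and chosen only for this illustration.

```python
import heapq
import math
from itertools import product

def huffman_average_length(probabilities):
    """Average codeword length (bits per codeword) of a binary Huffman code.

    Each merge of the two least probable nodes adds one bit to every
    codeword below the merged node, so the average length equals the
    sum of the merged probabilities.
    """
    heap = list(probabilities)
    heapq.heapify(heap)
    average_length = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        average_length += merged
        heapq.heappush(heap, merged)
    return average_length

def rate_per_symbol(symbol_probabilities, block_length):
    """Huffman rate in bits per source symbol when coding blocks of symbols."""
    # Assumption: independent symbols, so block probability = product.
    block_probabilities = [
        math.prod(block)
        for block in product(symbol_probabilities, repeat=block_length)
    ]
    return huffman_average_length(block_probabilities) / block_length

source = [0.95, 0.02, 0.03]  # P(a1), P(a2), P(a3) from the example
for n in (1, 2, 3):
    # Columns: block length, extended alphabet size, bits per source symbol.
    print(n, len(source) ** n, rate_per_symbol(source, n))
```

Note that the sketch has to enumerate every block of the extended alphabet, which is the exponential growth that makes long blocks impractical.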

If we group the symbols in blocks of 8, the redundancy drops to acceptable values; the corresponding alphabet size for this level of blocking is $3^8 = 6561$

A code of this size is impractical for a number of reasons:

1-Storage of a code like this requires memory that may not be available for many applications

2-While it may be possible to design reasonably efficient encoders, decoding a Huffman code of

this size would be a highly inefficient and time consuming procedure

Huffman's original algorithm is optimal for symbol-by-symbol coding (i.e., a stream of unrelated symbols) with a known input probability distribution

It is not optimal when the symbol-by-symbol restriction is dropped, or when the probability mass function is unknown


The growth in alphabet size with blocking arises because there must be a codeword for every possible combination of symbols, so the number of blocks increases exponentially with the block length



We need a way of assigning a codeword to a particular sequence of length m without having to generate codes for all sequences of that length. The arithmetic coding technique fulfills this requirement

Arithmetic coding is similar to Huffman coding; they both achieve their compression by

reducing the average number of bits required to represent a symbol

Unlike Huffman coding, arithmetic coding does not need a whole number of bits for each symbol: it represents the sequence by a fractional value (in practice a fixed-point rather than a floating-point number)

Arithmetic coding is especially useful when dealing with:

1.Sources with small alphabets, such as binary sources

2.Alphabets with highly skewed probabilities

3.Streams of input symbols that are to be replaced with a single floating point number in [0,1)

4.When the modeling and coding aspects of lossless compression are to be kept separate


In arithmetic coding a unique identifier or tag is generated for the sequence to be

encoded. This tag corresponds to a binary fraction , which becomes the binary code

for the sequence

A unique arithmetic code can be generated for a sequence of length m without the

need for generating codewords for all sequences of length m



(A) Generate a unique tag or identifier

-In order to distinguish a sequence of symbols from another sequence of symbols we need a unique identifier or tag

-One possible set of tags for representing sequences of symbols is the set of numbers in the unit interval [0, 1)

Square brackets '[' and ']' mean the adjacent number is included

Parentheses '(' and ')' mean the adjacent number is excluded

-Because the number of numbers in the unit interval is infinite, it should be possible to assign a unique tag to each distinct sequence of symbols

-In order to do that we need a function that will map sequences of symbols into the unit interval. This function is the Cumulative Distribution Function (CDF) of the random variable associated with the source

-Consider a discrete source with alphabet A = {a1, a2, …, am} and a random variable X. We will use the mapping

$$X(a_i) = i, \quad a_i \in A$$

i.e., we map the symbols or letters to numbers

This mapping means that given a probability model P for the source, we also have the probability mass function of the random variable

$$P(X = i) = P(a_i)$$

and the CDF of X is

$$F_X(i) = \sum_{k=1}^{i} P(X = k)$$



Hence, for each symbol ai with a nonzero probability we have a

distinct value of FX(i) in the unit interval




Generating Tag Graphically:

Divide the unit interval into subintervals of the form

$[F_X(i-1), F_X(i))$, $i = 1, \ldots, m$


Ex: For the alphabet source A ={ a1, a2, a3 } with P(a1)=0.7, P(a2)=0.1,

and P(a3)=0.2

Using the mapping equations, FX ( 1) = 0.7, FX ( 2) = 0.8, and FX ( 3) = 1

Basically, the procedure for generating the tag works by

reducing the size of the interval in which the tag resides

as more and more elements of the sequence are received

We associate the subinterval $[F_X(i-1), F_X(i))$ with the symbol $a_i$:

i=1, $[F_X(0), F_X(1)) = [0, 0.7)$, symbol a1

i=2, $[F_X(1), F_X(2)) = [0.7, 0.8)$, symbol a2

i=3, $[F_X(2), F_X(3)) = [0.8, 1)$, symbol a3
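The partition of the unit interval follows directly from the CDF. A minimal Python sketch, assuming the probabilities of this example; the function name `subintervals` is hypothetical and used only for illustration:

```python
from itertools import accumulate

def subintervals(probabilities):
    """Return [(F_X(i-1), F_X(i)), ...] for i = 1..m, with F_X(0) = 0."""
    # Running sums of the probabilities give F_X(1), ..., F_X(m).
    cdf = [0.0] + list(accumulate(probabilities))
    return list(zip(cdf[:-1], cdf[1:]))

intervals = subintervals([0.7, 0.1, 0.2])  # P(a1), P(a2), P(a3)
for i, (low, high) in enumerate(intervals, start=1):
    print(f"a{i}: [{low:.1f}, {high:.1f})")
```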

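[Figure: the unit interval partitioned at $F_X(0) = 0.0$, $F_X(1) = 0.7$, $F_X(2) = 0.8$, and $F_X(3) = 1.0$ into the subintervals of a1, a2, and a3]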

For a sequence of symbols of length one, the tag is the midpoint of the corresponding interval: 0.35 for a1, 0.75 for a2, and 0.9 for a3



Suppose we want to encode the sequence of symbols a1a2a3

The appearance of the first symbol in the sequence restricts the interval containing the tag to one of these subintervals, that of a1 or a2 or a3:

If the first symbol in the input stream to be encoded is $a_k = a_1$, the tag lies in the interval [0, 0.7)

If $a_k = a_2$, the tag lies in the interval [0.7, 0.8)

If $a_k = a_3$, the tag lies in the interval [0.8, 1)

Once the interval containing the tag has been determined, the rest of the unit interval is discarded and this restricted interval is again partitioned in exactly the same proportions as the original interval

Here the first symbol is a1, so the subinterval [0, 0.7) is now partitioned in exactly the same proportions as the original interval, yielding the subintervals [0.0, 0.49), [0.49, 0.56), and [0.56, 0.7)

The first partition [0.0, 0.49), as before, corresponds to the symbol a1, the second partition [0.49, 0.56) corresponds to the symbol a2, and the third partition [0.56, 0.7) corresponds to the symbol a3

[Figure: the unit interval partitioned at 0.7 and 0.8, with the subinterval [0.0, 0.7) of a1 expanded and partitioned at 0.49 and 0.56]



(3) Each succeeding symbol causes the tag to be restricted to a subinterval that is further

partitioned in the same proportions as the original interval and so on


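[Figure: successive partitioning of the unit interval for the sequence a1a2a3]

Symbol received, Interval containing the tag, Partition points of that interval

a1, [0.0, 0.7), 0.49 and 0.56

a2, [0.49, 0.56), 0.539 and 0.546

a3, [0.546, 0.56), 0.5558 and 0.5572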

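Each limit of the new interval is obtained as: New limit = Initial + (Final − Initial) × (corresponding limit of the first subdivision), where Initial and Final are the limits of the current interval and Final − Initial is its range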
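The interval narrowing for a1a2a3 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a practical coder: it uses exact fractions instead of the fixed-point arithmetic a real implementation would need, and the function name `narrow_interval` is hypothetical.

```python
from fractions import Fraction

def narrow_interval(sequence, probabilities):
    """Return the final [low, high) interval for a sequence of symbol indices.

    `sequence` holds 1-based symbol indices (1 for a1, 2 for a2, ...) and
    `probabilities` lists P(a1), ..., P(am) as exact fractions.
    """
    # cdf[i] = F_X(i), with cdf[0] = F_X(0) = 0.
    cdf = [Fraction(0)]
    for p in probabilities:
        cdf.append(cdf[-1] + p)

    low, high = Fraction(0), Fraction(1)
    for i in sequence:
        width = high - low
        # Both new limits are measured from the old lower limit.
        low, high = low + width * cdf[i - 1], low + width * cdf[i]
    return low, high

model = [Fraction("0.7"), Fraction("0.1"), Fraction("0.2")]
low, high = narrow_interval([1, 2, 3], model)
print(float(low), float(high))  # 0.546 0.56
```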




One popular choice for the tag is the midpoint of the interval. Let's use the midpoint of the final interval [0.546, 0.56) as the tag:

Midpoint tag = (0.546 + 0.56)/2 = 0.553



Generating Tag Mathematically

Mathematically, the tag could be taken as either the lower limit of the interval or the midpoint of the interval. Taking the midpoint, one gets:

$$T_X(a_i) = \sum_{k=1}^{i-1} P(X = k) + \frac{1}{2} P(X = i) = F_X(i-1) + \frac{1}{2} P(X = i), \quad F_X(0) = 0$$

For each $a_i$, $T_X(a_i)$ will have a unique value. This value can be used as a unique tag for $a_i$

Sequence of symbols of length one

Ex: For the alphabet source A = {a1, a2, a3} with P(a1) = 0.7, P(a2) = 0.1, and P(a3) = 0.2

Using the mapping equations, $F_X(1) = 0.7$, $F_X(2) = 0.8$, and $F_X(3) = 1$

$$T_X(a_1) = 0 + \frac{1}{2} P(X = 1) = 0.5 \times 0.7 = 0.35$$

$$T_X(a_2) = P(X = 1) + \frac{1}{2} P(X = 2) = 0.7 + 0.5 \times 0.1 = 0.75$$

$$T_X(a_3) = \{P(X = 1) + P(X = 2)\} + \frac{1}{2} P(X = 3) = 0.7 + 0.1 + 0.5 \times 0.2 = 0.9$$

We can get the same result graphically, as in the first step of the previous example
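The midpoint formula for a single symbol translates directly into code. A minimal Python sketch, assuming exact fractions for the probabilities; the function name `symbol_tag` is hypothetical:

```python
from fractions import Fraction

def symbol_tag(i, probabilities):
    """Midpoint tag T_X(a_i) = sum_{k<i} P(X=k) + P(X=i)/2, with 1-based i."""
    preceding = sum(probabilities[: i - 1], Fraction(0))  # F_X(i-1)
    return preceding + probabilities[i - 1] / 2

model = [Fraction("0.7"), Fraction("0.1"), Fraction("0.2")]
tags = [symbol_tag(i, model) for i in (1, 2, 3)]
print([float(t) for t in tags])  # [0.35, 0.75, 0.9]
```

The same function applies to any single-symbol model, for instance a list of six probabilities of 1/6 for a fair die.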




Ex: The outcomes of a roll of a die can be mapped into the numbers {1, 2, …, 6}

For a fair die, P(X = k) = 1/6 for k = 1, 2, …, 6

$$T_X(a_2) = P(X = 1) + \frac{1}{2} P(X = 2) = 1/6 + 0.5 \times 1/6 = 0.25$$

$$T_X(a_5) = \sum_{k=1}^{4} P(X = k) + \frac{1}{2} P(X = 5) = 0.75$$

Similarly, $T_X(a_1) \approx 0.0833$, $T_X(a_3) \approx 0.4166$, $T_X(a_4) \approx 0.5833$, and $T_X(a_6) \approx 0.9166$




Sequence of symbols of long length m

$$T_X^{(m)}(a_i) = \sum_{y < a_i} P(y) + \frac{1}{2} P(a_i)$$

where $a_i$ now denotes a sequence, $y < a_i$ means that y precedes $a_i$ in the ordering, and the superscript denotes the length of the sequence

Ex: the sequence consists of two rolls of a die; the outcomes in order would be 11, 12, 13, …, 66

Each of the 36 sequences is equally likely: P(X = k) = 1/36 for k = 1, 2, …, 36

The tag for the sequence 13 would be

$$T_X(13) = P(X = 11) + P(X = 12) + \frac{1}{2} P(X = 13) = 1/36 + 1/36 + \frac{1}{2}(1/36) = 5/72$$

Note: To generate the tag for the sequence 13 we do not have to generate a tag for every other possible message

However, it requires that the probabilities of all sequences that precede the sequence being tagged be calculated explicitly, which is as lengthy a task as generating codewords for all sequences of a given length (as with Huffman coding)
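The explicit summation over preceding sequences can be sketched in Python as follows. The sketch assumes independent rolls, so that the probability of a sequence is the product of its symbol probabilities, and lexicographic ordering of the sequences; the function name `sequence_tag` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def sequence_tag(sequence, probabilities):
    """Midpoint tag of `sequence`, summing over every preceding sequence.

    Sequences are tuples of 1-based symbol indices, ordered
    lexicographically (11, 12, ..., 66 for two rolls of a die).
    """
    def probability(seq):
        # Assumption: independent symbols, so probabilities multiply.
        result = Fraction(1)
        for symbol in seq:
            result *= probabilities[symbol - 1]
        return result

    symbols = range(1, len(probabilities) + 1)
    preceding = Fraction(0)
    # product() yields the candidate sequences in lexicographic order.
    for candidate in product(symbols, repeat=len(sequence)):
        if candidate == tuple(sequence):
            break
        preceding += probability(candidate)
    return preceding + probability(sequence) / 2

fair_die = [Fraction(1, 6)] * 6
print(sequence_tag((1, 3), fair_die))  # 5/72
```

The loop visits every sequence that precedes the target, which illustrates why this direct computation becomes lengthy as the sequence length grows.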
