Lecture 6: Arithmetic Codes - GUC
SOURCE CODING PROF. A.M.ALLAM
Lecture6: ARITHMETIC CODES
Fortunately, we can compute the tag for a given sequence of symbols using only the
probabilities of the individual symbols (the probability model), by means of a recursive
formula for the upper and lower limits of the interval that contains the tag
Using the midpoint of the interval as the tag, we have
$$T_X(x) = \frac{u^{(n)} + l^{(n)}}{2} \qquad (4)$$
Notice that throughout this process we did not need to compute any joint probabilities
Therefore, the tag for any sequence of length m can be computed in a sequential
fashion. The only information required by the tag generation procedure is the CDF
of the source, which can be obtained directly from the probability model
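In general, we can show that for any sequence $x = (x_1 x_2 \ldots x_n)$
$$l^{(n)} = l^{(n-1)} + \left(u^{(n-1)} - l^{(n-1)}\right) F_X(x_n - 1) \qquad (2)$$
$$u^{(n)} = l^{(n-1)} + \left(u^{(n-1)} - l^{(n-1)}\right) F_X(x_n) \qquad (3)$$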
LECTURES 11/20/2016
Ex: For the source alphabet A = {a1, a2, a3} with P(a1) = 0.8, P(a2) = 0.02, and P(a3) = 0.18,
encode the sequence 1 3 2 1
From the probability model we get
$$F_X(k) = 0 \text{ for } k \le 0, \quad F_X(1) = 0.8, \quad F_X(2) = 0.82, \quad F_X(3) = 1.0, \quad F_X(k) = 1.0 \text{ for } k > 3$$
Initializing $u^{(0)}$ to 1 and $l^{(0)}$ to 0, and using equations (2) and (3), the first element of the
sequence, 1, results in the following update
$$l^{(1)} = l^{(0)} + \left(u^{(0)} - l^{(0)}\right) F_X(0) = 0 + (1 - 0)(0) = 0$$
$$u^{(1)} = l^{(0)} + \left(u^{(0)} - l^{(0)}\right) F_X(1) = 0 + (1 - 0)(0.8) = 0.8$$
i.e., the interval containing the tag for the sequence 1 is [0, 0.8)
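The second element of the sequence is 3; using the update equations we get
$$l^{(2)} = l^{(1)} + \left(u^{(1)} - l^{(1)}\right) F_X(2) = 0 + (0.8 - 0)(0.82) = 0.656$$
$$u^{(2)} = l^{(1)} + \left(u^{(1)} - l^{(1)}\right) F_X(3) = 0 + (0.8 - 0)(1.0) = 0.8$$
i.e., the interval containing the tag for the sequence 1 3 is [0.656, 0.8)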
Define the random variable X(ai) = i
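The third element, 2, results in the following update equations
$$l^{(3)} = 0.656 + (0.8 - 0.656) F_X(1) = 0.656 + (0.144)(0.8) = 0.7712$$
$$u^{(3)} = 0.656 + (0.8 - 0.656) F_X(2) = 0.656 + (0.144)(0.82) = 0.77408$$
i.e., the interval containing the tag for the sequence 1 3 2 is [0.7712, 0.77408)
For the last element, 1, the lower and upper limits of the interval containing the tag are
$$l^{(4)} = 0.7712 + (0.77408 - 0.7712) F_X(0) = 0.7712$$
$$u^{(4)} = 0.7712 + (0.77408 - 0.7712) F_X(1) = 0.7712 + (0.00288)(0.8) = 0.773504$$
The tag for the sequence 1 3 2 1 can be generated using equation (4) as
$$T_X(1321) = \frac{0.7712 + 0.773504}{2} = 0.772352$$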
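The recursion in equations (2), (3) and (4) can be checked numerically. The following is a minimal sketch, not part of the lecture: the function name generate_tag and the list layout cdf[k] = F_X(k) are assumptions made for illustration, and ordinary floating-point arithmetic is used.

```python
def generate_tag(sequence, cdf):
    """Return (lower, upper, tag) for a sequence of symbol indices 1..K.

    cdf[k] holds F_X(k) for k = 0..K, with cdf[0] = 0 and cdf[K] = 1
    (an assumed layout, chosen so that cdf[x - 1] is F_X(x - 1)).
    """
    lower, upper = 0.0, 1.0
    for x in sequence:
        width = upper - lower
        # Both limits are computed from the previous interval.
        upper = lower + width * cdf[x]        # equation (3)
        lower = lower + width * cdf[x - 1]    # equation (2)
    return lower, upper, (lower + upper) / 2  # equation (4): midpoint tag


cdf = [0.0, 0.8, 0.82, 1.0]  # F_X(0), F_X(1), F_X(2), F_X(3) from the example
lower, upper, tag = generate_tag([1, 3, 2, 1], cdf)
print(lower, upper, tag)     # close to 0.7712, 0.773504, 0.772352
```

Only the CDF of single symbols is used; no joint probability of the sequence is ever computed.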
Decoding Graphically
[Figure: graphical decoding; the labels refer to an initial value of x, the low end and range of a subinterval, and a recursive update of x]
(B) Generating Binary Code of the Tag
-Since the tag forms a unique representation for the sequence, the binary representation of
the tag forms a unique binary code for the sequence
-Let us assume an 8-bit binary sign-magnitude fixed-point representation
comprising a sign bit, three integer bits, and four fractional bits
-The sign bit is used only to represent the sign of the value
(0 = positive, 1 = negative) [only 0 is used in arithmetic coding]
-Let us give an example; assume:
Three integer bits, which can represent an integer in the range 0 to 7
[not relevant to arithmetic coding]
Four fractional bits, which can represent values from 0.0 to 0.9375 in steps of 0.0625,
with bit weights 0.5, 0.25, 0.125, and 0.0625
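To make the 8-bit layout concrete, here is a small sketch that decodes a bit string with one sign bit, three integer bits, and four fractional bits. The function name sign_magnitude_value and the use of a Python string for the bits are assumptions made for illustration only.

```python
def sign_magnitude_value(bits):
    """Decode an 8-bit string: 1 sign bit, 3 integer bits, 4 fractional bits."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of exactly 8 binary digits")
    sign = -1.0 if bits[0] == "1" else 1.0
    integer_part = int(bits[1:4], 2)       # 0 to 7
    fraction_part = int(bits[4:], 2) / 16  # multiples of 1/16 = 0.0625
    return sign * (integer_part + fraction_part)


print(sign_magnitude_value("00001111"))  # 0.9375, the largest 4-bit fraction
print(sign_magnitude_value("00001101"))  # 0.8125 = 0.5 + 0.25 + 0.0625
print(sign_magnitude_value("10010100"))  # -1.25
```

For arithmetic coding only the four fractional bits matter, because a tag always lies in [0, 1).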
Signed Fixed-Point Arithmetic
[Figures: signed fixed-point arithmetic examples (slides 6 to 8)]
We have said that the tag forms a unique representation for the sequence. This means that
the binary representation of the tag forms a unique binary code for the sequence
However, we have placed no restrictions on what values in the unit interval the tag can
take. The binary representation of some of these values would be infinitely long, in which
case, although the code is unique, it may not be efficient
To make the code efficient, the binary representation has to be truncated
Hence, a binary code for $T_X(x)$ can be obtained by taking the binary representation of this
number and truncating it to
$$l(x) = \left\lceil \log_2 \frac{1}{P(x)} \right\rceil + 1$$
bits
This code can be proven to be unique and uniquely decodable
It becomes more efficient as the length m of the sequence increases: the average code
length per source symbol, $l_A$, satisfies
$$H(X) \le l_A < H(X) + \frac{2}{m}$$
Ex: For the source alphabet A = {a1, a2, a3, a4} with P(a1) = 1/2, P(a2) = 1/4, and
P(a3) = P(a4) = 1/8.
Define the random variable X(ai) = i
From the probability model we get
$$F_X(k) = 0 \text{ for } k \le 0, \quad F_X(1) = 0.5, \quad F_X(2) = 0.75, \quad F_X(3) = 0.875, \quad F_X(4) = 1.0$$
Using equation (1), or graphically, you can get the midpoint tag for each symbol. The tag,
its binary representation, the truncated length, and the binary code of each symbol are

Symbol   F_X     Tag T_X   In binary   Length l(x)   Code
1        0.500   0.2500    .0100       2             01
2        0.750   0.6250    .1010       3             101
3        0.875   0.8125    .1101       4             1101
4        1.000   0.9375    .1111       4             1111
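The truncation rule can be tried on this four-symbol source. The sketch below is illustrative only: the helper name truncated_code is an assumption, the midpoint tag of each symbol is taken as the cumulative probability of the preceding symbols plus half the symbol's own probability, and the binary expansion is obtained by repeated doubling.

```python
import math


def truncated_code(tag, probability):
    """Return the first ceil(log2(1/P)) + 1 bits of the binary expansion of tag."""
    length = math.ceil(math.log2(1 / probability)) + 1
    bits = []
    value = tag
    for _ in range(length):
        value *= 2
        bit = int(value >= 1)  # next binary digit after the point
        bits.append(str(bit))
        value -= bit
    return "".join(bits)


probabilities = [0.5, 0.25, 0.125, 0.125]  # P(a1)..P(a4) from the example
cumulative = 0.0
codes = []
for p in probabilities:
    tag = cumulative + p / 2  # midpoint of the symbol's interval
    codes.append(truncated_code(tag, p))
    cumulative += p
print(codes)  # ['01', '101', '1101', '1111']
```

No codeword in the result is a prefix of another, which is consistent with the claim that the truncated code is uniquely decodable.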
Tag Generation with Scaling
The big problem:
Consider the values of $l^{(n)}$ and $u^{(n)}$ in tag generation: as n gets larger, these
values come closer and closer together,
i.e., in order to represent all the subintervals uniquely we need increasing
precision as the length of the sequence increases
However, the binary representation of these values would then become arbitrarily long,
i.e., the code would be unique but not efficient
In a system with finite precision, the two values are bound to converge, and we
will lose all information about the sequence from the point at which the two
values converged
To avoid this situation, we need to rescale the interval
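The convergence of the two limits is easy to observe with ordinary double-precision floats. The sketch below is a hedged illustration rather than part of the lecture: the function name symbols_until_collapse and the choice of repeating the pattern 1 3 2 1 from the earlier example are assumptions.

```python
def symbols_until_collapse(pattern, cdf, max_symbols=1000):
    """Count how many symbols are encoded before the float limits become equal."""
    lower, upper = 0.0, 1.0
    for count in range(1, max_symbols + 1):
        x = pattern[(count - 1) % len(pattern)]  # cycle through the pattern
        width = upper - lower
        upper = lower + width * cdf[x]
        lower = lower + width * cdf[x - 1]
        if lower >= upper:  # the interval has no width left
            return count
    return None             # no collapse within max_symbols


cdf = [0.0, 0.8, 0.82, 1.0]
print(symbols_until_collapse([1, 3, 2, 1], cdf))
```

Once the limits coincide, every later symbol leaves them unchanged, so nothing about the rest of the sequence can be recovered from the tag.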
We would also like to perform the encoding incrementally i.e., to transmit portions of the
code as the sequence is being observed, rather than wait until the entire sequence has
been observed before transmitting the first bit
Synchronized Rescaling and Incremental Coding
Consider the case where the tag interval is confined to either the lower half [0, 0.5), where
the most significant bit of the tag is 0, or the upper half [0.5, 1), where the most significant bit is 1
We can indicate to the decoder whether the tag is confined to the upper or lower half of the unit
interval by sending the first bit of the tag: a "1" for the upper half and a "0" for the lower half
Once the encoder and decoder know which half contains the tag, we can ignore the half
of the unit interval not containing the tag and concentrate on the half containing the tag,
mapping that half interval to the full [0, 1) interval as:
E1: [0, 0.5) → [0, 1), E1(x) = 2x
E2: [0.5, 1) → [0, 1), E2(x) = 2(x − 0.5)
We can now continue with this process, generating another bit of the tag every
time the tag interval is restricted to either half of the unit interval
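The two mappings can be sketched as a single helper that keeps emitting bits while the interval stays inside one half. The name rescale and the returned tuple are assumptions made for this illustration; the interval is treated as half-open, [lower, upper).

```python
def rescale(lower, upper):
    """Apply E1/E2 while [lower, upper) lies in one half of [0, 1).

    Returns (bits, lower, upper) with the emitted bits and the rescaled limits.
    """
    if not lower < upper:
        raise ValueError("interval must have positive width")
    bits = ""
    while upper <= 0.5 or lower >= 0.5:
        if upper <= 0.5:  # lower half: send 0, apply E1(x) = 2x
            bits += "0"
            lower, upper = 2 * lower, 2 * upper
        else:             # upper half: send 1, apply E2(x) = 2(x - 0.5)
            bits += "1"
            lower, upper = 2 * (lower - 0.5), 2 * (upper - 0.5)
    return bits, lower, upper


print(rescale(0.656, 0.8))  # one bit "1"; limits close to 0.312 and 0.6
print(rescale(0.1, 0.2))    # two bits "00"; limits close to 0.4 and 0.8
```

Each pass doubles the width of the interval, so the loop stops as soon as the interval straddles 0.5.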
Ex: For the source alphabet A = {a1, a2, a3} with P(a1) = 0.8, P(a2) = 0.02, and P(a3) = 0.18,
encode the sequence 1 3 2 1, this time with rescaling
Define the random variable X(ai) = i
From the probability model we get
$$F_X(k) = 0 \text{ for } k \le 0, \quad F_X(1) = 0.8, \quad F_X(2) = 0.82, \quad F_X(3) = 1.0, \quad F_X(k) = 1.0 \text{ for } k > 3$$
Initializing $u^{(0)}$ to 1 and $l^{(0)}$ to 0, and using equations (2) and (3), the first element of the
sequence, 1, results in the following update
$$l^{(1)} = l^{(0)} + \left(u^{(0)} - l^{(0)}\right) F_X(0) = 0 + (1 - 0)(0) = 0$$
$$u^{(1)} = l^{(0)} + \left(u^{(0)} - l^{(0)}\right) F_X(1) = 0 + (1 - 0)(0.8) = 0.8$$
The interval [0, 0.8) is not confined to either the upper or lower
half of the unit interval, so we proceed
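The second element of the sequence is 3, which results in the update
$$l^{(2)} = 0 + (0.8 - 0) F_X(2) = (0.8)(0.82) = 0.656$$
$$u^{(2)} = 0 + (0.8 - 0) F_X(3) = (0.8)(1.0) = 0.8$$
The interval [0.656, 0.8) is contained entirely in the upper half of
the unit interval, so we send the binary code 1 and rescale using E2
$$l^{(2)} = 2(0.656 - 0.5) = 0.312, \quad u^{(2)} = 2(0.8 - 0.5) = 0.6$$
The third element of the sequence is 2, which results in the update
$$l^{(3)} = 0.312 + (0.6 - 0.312) F_X(1) = 0.312 + (0.288)(0.8) = 0.5424$$
$$u^{(3)} = 0.312 + (0.6 - 0.312) F_X(2) = 0.312 + (0.288)(0.82) = 0.54816$$
The interval [0.5424, 0.54816) is contained entirely in the upper half
of the unit interval, so we send the binary code 1 and rescale using E2
$$l^{(3)} = 2(0.5424 - 0.5) = 0.0848, \quad u^{(3)} = 2(0.54816 - 0.5) = 0.09632$$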
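The interval [0.0848, 0.09632) is contained entirely in the lower half of the unit
interval, so we send the binary code 0 and rescale using E1
$$l^{(3)} = 2(0.0848) = 0.1696, \quad u^{(3)} = 2(0.09632) = 0.19264$$
The interval [0.1696, 0.19264) is contained entirely in the lower half of the unit
interval, so we send the binary code 0 and rescale using E1
$$l^{(3)} = 2(0.1696) = 0.3392, \quad u^{(3)} = 2(0.19264) = 0.38528$$
The interval [0.3392, 0.38528) is contained entirely in the lower half of the unit
interval, so we send the binary code 0 and rescale using E1
$$l^{(3)} = 2(0.3392) = 0.6784, \quad u^{(3)} = 2(0.38528) = 0.77056$$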
The interval [0.6784, 0.77056) is contained entirely in the upper half of the unit
interval, so we send the binary code 1 and rescale using E2
$$l^{(3)} = 2(0.6784 - 0.5) = 0.3568, \quad u^{(3)} = 2(0.77056 - 0.5) = 0.54112$$
Continuing with the last element, 1, which results in the update
$$l^{(4)} = 0.3568 + (0.54112 - 0.3568) F_X(0) = 0.3568$$
$$u^{(4)} = 0.3568 + (0.54112 - 0.3568) F_X(1) = 0.3568 + (0.18432)(0.8) = 0.504256$$
At this point the encoding can be terminated by sending the binary representation of any
value in the final tag interval
Generally, this value is taken to be $l^{(n)}$
In this particular example, it is convenient to use the value 0.5. The binary representation of
0.5 is .10, thus we would transmit a 1 followed by as many 0s as required by the word length of
the implementation being used
Notice that the tag interval size at this stage (0.504256 − 0.3568 = 0.147456) is 64 times
the size it was when we were using the unmodified algorithm (0.773504 − 0.7712 = 0.002304),
one doubling for each of the six rescalings
Rescaling therefore solves the finite-precision problem
The bits that we have been sending with each mapping constitute the tag itself, which
satisfies our desire for incremental encoding. Including the terminating 1, the transmitted code is
1100011
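The whole incremental procedure for this example can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a complete coder: the name encode_with_rescaling is hypothetical, only the E1 and E2 mappings are used, floats stand in for fixed-point registers, and termination follows the example's choice of sending the value 0.5 (a single 1).

```python
def encode_with_rescaling(sequence, cdf):
    """Encode symbol indices incrementally and return the emitted bit string."""
    lower, upper = 0.0, 1.0
    bits = []
    for x in sequence:
        width = upper - lower
        upper = lower + width * cdf[x]      # equation (3)
        lower = lower + width * cdf[x - 1]  # equation (2)
        if not lower < upper:
            raise ValueError("interval collapsed (zero probability or precision loss)")
        # Emit a bit and rescale while the interval sits inside one half.
        while upper <= 0.5 or lower >= 0.5:
            if upper <= 0.5:
                bits.append("0")
                lower, upper = 2 * lower, 2 * upper                  # E1
            else:
                bits.append("1")
                lower, upper = 2 * (lower - 0.5), 2 * (upper - 0.5)  # E2
    # Here lower < 0.5 < upper, so 0.5 (binary .1) lies inside the interval.
    bits.append("1")
    return "".join(bits)


cdf = [0.0, 0.8, 0.82, 1.0]
code = encode_with_rescaling([1, 3, 2, 1], cdf)
print(code)                            # 1100011
print(int(code, 2) / 2 ** len(code))  # 0.7734375
```

An interval that keeps straddling 0.5 while shrinking is never rescaled by this sketch; the guard raises an error if the limits ever coincide.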
We can find that the binary number .1100011 corresponds to the decimal
number 0.7734375
Notice that this number lies within the final tag interval of the unmodified
algorithm, therefore, we could use this to decode the sequence
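Decoding from that value can be sketched by mirroring the encoder: at each step, find the symbol whose subinterval contains the tag, then narrow the interval using equations (2) and (3). The function name decode_tag is an assumption, and the decoder is assumed to know the sequence length in advance.

```python
def decode_tag(tag, cdf, length):
    """Recover `length` symbol indices from a tag value in [0, 1)."""
    lower, upper = 0.0, 1.0
    symbols = []
    for _ in range(length):
        width = upper - lower
        scaled = (tag - lower) / width  # position of the tag inside the interval
        # Smallest k with scaled < F_X(k), i.e. F_X(k - 1) <= scaled < F_X(k)
        x = next(k for k in range(1, len(cdf)) if scaled < cdf[k])
        symbols.append(x)
        upper = lower + width * cdf[x]      # equation (3)
        lower = lower + width * cdf[x - 1]  # equation (2)
    return symbols


cdf = [0.0, 0.8, 0.82, 1.0]
tag = int("1100011", 2) / 2 ** 7  # binary .1100011
print(tag, decode_tag(tag, cdf, 4))  # 0.7734375 [1, 3, 2, 1]
```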