HAND OUT COMPRESSION
TRANSCRIPT
7/29/2019
01014419 DATA COMPRESSION
Dr. Yuttapong Rangsanseri
OUTLINE
PART I: INTRODUCTION
1. Introduction
PART II: LOSSLESS COMPRESSION
2. Introduction to Information Theory
3. Huffman Coding
4. Arithmetic Coding
5. Dictionary Techniques
6. Predictive Coding
PART III: LOSSY COMPRESSION
7. Information Theory Revisited
8. Scalar Quantization
9. Vector Quantization
10. Differential Encoding
12. Transform Coding
PART IV: COMPRESSION STANDARDS
DATA COMPRESSION: MOTIVATION
Digital representation of analog signals for transmission, storage
(and manipulation)
What?
Speech, audio, images, video, ...
Where?
Digital telephony (GSM)
Digital audio Compact Disc (CD-audio)
Internet telephony
Internet radio
Digital image databases (Internet)
Digital video storage (VCD, DVD)
Digital camcorder (DVC)
Digital (high-definition) television
Desktop videoconferencing
etc.
Advantages
Insensitive to noise (during transmission)
Easy storage (CD, DVD, Hard disks)
Easy transmission (on a single network)
Error detection and correction
Encryption
(Dynamic) Multiplexing
Digital signal processing
Disadvantages
Digitization process introduces a distortion of the information
Increasing the sampling rate & the number of bits to code each sample reduces the distortion, but increases the bit rate
Issue: balance between the bit rate and the perceived distortions by users
Speech/Audio Bit Rates
Telephone speech (PCM)
200 - 3400 Hz, sampled at 8 kHz
Coded 8 bit/sample
Bit rate = 64 kbps
CD quality
Sampled at 44.1 kHz,
Coded 16 bit/sample
Stereo
Bit rate = 1.41 Mbps (44,100 x 16 x 2)
Image
Digital images are represented by bitmaps
A bitmap is a spatial 2D array made up of individual picture elements called pixels.
Image resolution is defined by:
Spatial resolution: number of pixels in a digital image (rows x columns)
Pixel depth: number of bits used to code a pixel value
1 bit binary (black and white) image
8 bits gray-level (continuous tone) image
24 bits (full) color image
Binary image
Each pixel is stored as a single bit;
referred to as a binary or bitonal image.
A 640 x 480 image requires
(640x480)/8 = 38,400 bytes.
Grayscale image
For an 8-bit image, each pixel has a gray
value between 0 and 255; e.g., a dark
pixel might have a value of 10, and a
bright one might be 230.
A 640 x 480 grayscale image requires
300 kB of storage
(640 x 480 = 307,200 bytes).
Color image
Example of a 24-bit color image (a) and its separate R, G, B color channels (b, c, d).
A 640 x 480, 24-bit color image would require 921.6 kB of storage.
Image/Video Bit Rates
Still images
512 x 512 x 3 bytes/pel = 6.29 Mbits
Needs 112 sec at 56 kbits/s
Video
                          Pixels/line  Lines  Frames/s  Bytes/pixel  Bit rate
Video telephony (CIF)         352       288     10         1.5       12.2 Mbits/s
Broadcast TV (ITU-R 601)      720       480     30         2         166 Mbits/s
HDTV                        ~1280      ~720     60         2         885 Mbits/s
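These raw bit rates are just pixels/line x lines x frames/s x bytes/pixel x 8. A quick sanity check of the table in Python (a minimal sketch; the slide rounds the results):

```python
def video_bit_rate_mbps(pixels_per_line, lines, frames_per_s, bytes_per_pixel):
    """Raw (uncompressed) video bit rate in Mbit/s."""
    return pixels_per_line * lines * frames_per_s * bytes_per_pixel * 8 / 1e6

print(video_bit_rate_mbps(352, 288, 10, 1.5))   # CIF video telephony, ~12.2
print(video_bit_rate_mbps(720, 480, 30, 2))     # broadcast TV, ~166
print(video_bit_rate_mbps(1280, 720, 60, 2))    # HDTV, ~885
```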
SOLUTION?
Compression required for efficient transmission
send more data in available bandwidth
send the same data in less bandwidth
more users on same bandwidth
and storage
can store more data
can transfer real-time video from slow storage devices
Also useful for
progressive reconstruction, scalable delivery, browsing
as a front end to other signal processing
Future: Combine compression and subsequent user-specific processing
EXAMPLE: PLAYING COLOR VIDEO
Consider playing a one-hour movie from a CD-ROM.
Assuming color video frames with a resolution of 620x560 pixels and 24 bits/pixel,
each frame requires about 1 MB of storage.
At 30 frames/second, this gives a total of 30 MB for one second,
or 108 GB for the whole one-hour movie.
A CD-ROM's transfer rate is only about 300 KB/sec.
Therefore, a one-hour movie would take about 100 hours to play!
Real-time transmission would require a bandwidth of 240 Mbps.
The bandwidth of traditional networks (Ethernet, token ring) is too low even for
the transfer of a single motion video in uncompressed form.
In a high-speed network, only a few simultaneous video sessions would be possible
if the data were transmitted in uncompressed form.
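The arithmetic above can be reproduced directly (a sketch using the slide's rounded figure of 1 MB/frame):

```python
frame_bytes = 620 * 560 * 3              # 24 bits/pixel -> 3 bytes/pixel
print(frame_bytes)                       # 1,041,600 bytes, i.e. ~1 MB/frame

per_second = 30 * 1_000_000              # ~30 MB/s at 30 frames/s (1 MB rounding)
movie_bytes = per_second * 3600          # ~108 GB for one hour

playback_hours = movie_bytes / (300_000 * 3600)   # CD-ROM at ~300 KB/s
print(playback_hours)                    # ~100 hours

bandwidth_mbps = per_second * 8 / 1e6    # real-time transmission
print(bandwidth_mbps)                    # 240 Mbit/s
```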
GENERAL COMMUNICATION SYSTEM
Input Signal -> Encoder -> Channel -> Decoder -> Reconstructed Signal
Classic Shannon model of a point-to-point communication system
Signal: The information to be communicated to a user
image, video, speech, audio, computer files, multimedia (mixed datatypes)
Channel: Portion of the communication system out of the designer's control: random or deterministic
alteration of the signal, e.g., addition of random noise, linear or non-linear filtering.
telephone lines, ethernet, the air (wireless), deep space, magnetic disk, CD
Encoder: What the designer gets to do to the signal before the channel. Can include preprocessing,
sampling, A/D conversion, signal decompositions, modulation, compression. The goal is
to prepare the signal for the channel in a way the decoder can recover a good reproduction.
Decoder: What the decoder gets to do to the channel output in order to reconstruct or render a
version of the signal for the user. Can include inverses or approximate inverses of
encoder operations, or other processing to enhance the reproduction.
General goal: Given the signal and the channel, find an encoder and decoder which give
the best possible reconstruction.
Shannon model usually breaks encoder (& decoder) into two pieces:
Information Source
Source Coding
Channel Coding
Source Coding: reduce the number of bits in order to save on transmission time
or storage space -> Compression
Channel Coding: typically increases the number of bits or chooses different bits
in order to protect against channel errors -> Error-control coding
Image coding and speech coding mean source coding.
Error correction coding means channel coding.
COMPRESSION TECHNIQUES
X (original data) -> [Compression algorithm] -> Xc (compact representation)
Xc -> [Decompression algorithm] -> Y (reconstructed data)
Compression techniques = compression/decompression algorithms
Lossless compression
Lossy compression
Lossless compression: X = Y
Can perfectly recover original data (if no storage or transmission bit errors).
Uses of lossless compression:
computer generated data (text, computer program, etc.)
data that will be enhanced/processed to yield more info.
- medical images
- satellite images
Lossy compression: X ≠ Y
Introduces some loss of information; the original data cannot be reconstructed exactly.
Uses of lossy compression:
speech
pictures/video
Human communications
when high compression rates are required
Measures of Performance
Compression ratio
The ratio of the number of bits required to represent the data before compression to
the number of bits required to represent the data after compression
Example: 8-bit image of 256x256 pixels
before: 65536 bytes
after: 16384 bytes
compression ratio: 4:1
Rate
Average number of bits required to represent a single sample
Example: 2 bit/pixel (from previous example)
Distortion (for lossy compression)
Difference between the original and the reconstructed data
Example: mean squared error, signal to noise ratio
MODELING AND CODING
Development of data compression algorithms can be divided into two phases:
modeling and coding.
Modeling
Extract information about any redundancy that exists in the data and describe the
redundancy in the form of a model
Coding
A description of the model and a description of how the data differ from the model are
encoded, usually in a binary alphabet
Residual: difference between the data and the model
Example 1: Consider the sequence {x1, x2, ..., x12}:
9 11 11 11 14 13 15 17 16 17 20 21
If we use binary representation we need 5 bits/sample -> 60 bits
Model: x̂n = n + 8, n = 1, 2, ...
{x̂n} = {9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}
Residual: {en} = {xn - x̂n} = {0, 1, 0, -1, 1, -1, 0, 1, -1, -1, 1, 1}
Alphabet: {-1, 0, 1} -> 2 bits/sample -> 24 bits for {en}
Compression ratio = 60:24 = 2.5:1
Rate = 2 bits/sample
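The model and residual from Example 1 can be computed as follows (a minimal sketch of the slide's example):

```python
x = [9, 11, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21]
model = [n + 8 for n in range(1, len(x) + 1)]     # x̂_n = n + 8
residual = [a - b for a, b in zip(x, model)]
print(residual)                  # values drawn from {-1, 0, 1}

# 3-letter residual alphabet -> 2 bits/sample -> 24 bits vs 60 bits raw
print(60 / (2 * len(x)))         # compression ratio 2.5
```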
Example 2
Consider the sequence {x1,x2, ... ,x13}:
27 28 29 28 26 27 29 28 30 32 34 36 38
Each value is close to the previous one
Predictive coding: use past value(s) to predict the current one
Model: x̂n = x(n-1) (the previous sample), n = 2, 3, ...
Residual: {en} = {xn - x(n-1)} = {1, 1, -1, -2, 1, 2, -1, 2, 2, 2, 2, 2}
Alphabet: {-2, -1, 1, 2} -> Rate = 2 bits/sample
Example 3
Consider the sequence {x1,x2, ... ,x41}:
a_barayaran_array_ran_far_faar_faaar_away
8-symbol alphabet: {_, a, b, f, n, r, w, y}
Fixed-length coding -> 3 bits/symbol -> 123 bits
Variable-length coding according to table
Alphabet  Codeword
a         1
_         001
b         01100
f         0100
n         0111
r         000
w         01101
y         0101
Shorter codewords are assigned to alphabets that appear more often
Needs 106 bits
Compression ratio = 123 : 106 = 1.16 : 1
Rate = 106/41 = 2.58 bits/symbol
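A sketch of encoding and decoding with this variable-length code. Greedy decoding works because the code is prefix-free; the exact bit count depends on the sequence as transcribed:

```python
# Codeword table from the example; '_' denotes the space character.
code = {'a': '1', '_': '001', 'b': '01100', 'f': '0100',
        'n': '0111', 'r': '000', 'w': '01101', 'y': '0101'}

def encode(s):
    return ''.join(code[ch] for ch in s)

def decode(bits):
    inv = {v: k for k, v in code.items()}
    out, cur = [], ''
    for b in bits:
        cur += b
        if cur in inv:            # safe because no codeword prefixes another
            out.append(inv[cur]); cur = ''
    return ''.join(out)

s = 'a_barayaran_array_ran_far_faar_faaar_away'
bits = encode(s)
print(len(bits), len(bits) / len(s))   # encoded length and rate in bits/symbol
```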
Example 4 (Braille coding, mid 19th century)
Exploits redundancy in the form of words that repeat
Codes characters as a 2x3 array of dots, where each array element is either raised or flat:
2^6 = 64 possible characters
26 reserved for letters a-z
37 reserved for frequently used words
1 used to indicate that the next character is a word, not a character
INTRODUCTION TO INFORMATION THEORY (Claude Elwood Shannon, Bell Labs, 1948)
Let event A be a set of outcomes of some random experiment
Let P(A) be the probability that event A will occur
The self-information associated with event A is
i(A) = log2 [1/P(A)] = -log2 P(A)
Remark:
log(1) = 0
-log P(A) increases as P(A) decreases
low probability events have high self-information
high probability events have low self-information
Property: The information obtained from the occurrence of two independent events is the
sum of the information obtained from the occurrence of the individual events
Let A and B be independent events
The self-information associated with the occurrence of both events is
i(AB) = -log2 P(AB)
      = -log2 [P(A)P(B)]
      = -log2 P(A) - log2 P(B)
      = i(A) + i(B)
Example: Let H and T be the outcomes of flipping a coin
Fair coin: P(H) = P(T) = 1/2
i(H) = i(T) = 1 bit
Biased coin: P(H) = 1/8, P(T) = 7/8
i(H) = 3 bits, i(T) = 0.193 bits
ENTROPY
Ai: independent events, which are sets of outcomes of some experiment S such that
U Ai = S
The average self-information associated with the random experiment S is
H = Σ P(Ai) i(Ai) = -Σ P(Ai) log2 P(Ai)
Also called the entropy associated with the experiment.
Shannon showed that:
If the experiment is a source that puts out symbols Ai from a set A, then the entropy is a
measure of the average number of binary symbols needed to code the output of the source.
A is called the alphabet of the source; the symbols Ai are referred to as the letters
The best that a lossless compression scheme can do is to encode the output of a source
with an average number of bits equal to the entropy of the source.
Example: Consider the sequence
{xn} = 1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10
Model 1
Assume the frequency of occurrence of each number is reflected in the sequence
P(1) = P(6) = P(7) = P(10) = 1/16
P(2) = P(3) = P(4) = P(5) = P(8) = P(9) = 2/16
If the sequence is iid (independent and identically distributed), the entropy is
H = -Σ_{i=1..10} P(i) log2 P(i) = 3.25 bits
Model 2
Assume sample-to-sample correlation
Remove the correlation by taking differences of neighboring sample values
Residual sequence: {rn} = 1, 1, 1, -1, 1, 1, 1, -1, 1, 1, 1, 1, 1, -1, 1, 1
Alphabet of residual sequence: {-1, 1} with P(-1) = 3/16 and P(1) = 13/16
Entropy: H = -P(-1) log2 P(-1) - P(1) log2 P(1) = 0.7 bits
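Both entropies can be checked with a small helper (a minimal sketch; `entropy` is a hypothetical helper name):

```python
from math import log2

def entropy(probs):
    """Entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Model 1: iid with the estimated letter probabilities
print(entropy([1/16]*4 + [2/16]*6))   # 3.25 bits

# Model 2: residual alphabet {-1, 1}
print(entropy([3/16, 13/16]))         # ~0.70 bits
```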
CODING
Coding: assignment of binary sequence to elements of an alphabet
The set of binary sequence is called a code
Individual members of the set are called codewords
Fixed-length code: uses the same number of bits to represent each letter
Example: ASCII code E = 1000101 Z = 1011010
Variable-length code: uses different number of bits to represent different letters
Fewer bits are used to represent more frequently used symbols
Example: Morse code E = .   Z = --..
Average number of bits/symbol (rate):
l = Σ_{i=1..m} P(ai) n(ai) bits/symbol
where n(ai) is the number of bits in the codeword for letter ai
Uniquely Decodable Codes
Any given sequence of codewords can be decoded in one, and only one, way.
Example: Four-letter source alphabet S = {a1, a2, a3, a4}
Letter  Probability  Code 1  Code 2  Code 3  Code 4
a1      0.5          0       0       0       0
a2      0.25         0       1       10      01
a3      0.125        1       00      110     011
a4      0.125        10      11      111     0111
Average length       1.125   1.25    1.75    1.875
Entropy: H(S) = 1.75 bits/symbol
Code 1: a1 and a2 are assigned the codeword 0
when 0 is received, there is no way to know whether a1 or a2 was transmitted
Code 2: a2 a1 a1 is encoded as 100, which can be decoded as a2 a1 a1 or as a2 a3
Code 3: the first 3 codewords all end in a 0
Code 4: each codeword starts with a 0
PREFIX CODE
A code in which no codeword is a prefix to another codeword is called a
prefix code.
Prefix codes are uniquely decodable.
For any uniquely decodable code that is not a prefix code, we can always find an
equivalent prefix code, i.e., one with the same average number of bits/symbol.
HUFFMAN CODING (David Huffman, MIT, 1951)
Input: A source alphabet and a probability model
Output: An optimum prefix code, i.e., a prefix code of minimum average length
Observations:
In an optimum code, symbols that occur more frequently will have shorter codewords
than symbols that occur less frequently.
In an optimum code, the two symbols that occur least frequently will have the same length.
Additional requirement:
The codewords corresponding to the two lowest probability symbols differ only in the last bit.
Procedure
Source reduction:
Find and merge the 2 symbols with the smallest probabilities into a single symbol.
Continue until we reach a source with only 2 symbols.
Code assignment:
Start with the smallest source and work back to the original source.
Append a "0" and "1" to each symbol, arbitrarily
(reversing the order of "0" and "1" would work just as well).
Repeat for each reduced source until the original source is reached.
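The source-reduction step can be sketched with a priority queue. This computes only the codeword lengths, which are what determine the average length (`huffman_lengths` is a hypothetical helper; tie-breaking among equal probabilities may yield either of the valid trees):

```python
import heapq, itertools

def huffman_lengths(probs):
    """Codeword length per symbol via repeated merging of the two least probable."""
    counter = itertools.count()           # tie-breaker for equal probabilities
    heap = [(p, next(counter), [i]) for i, p in enumerate(probs)]
    lengths = [0] * len(probs)
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)   # two smallest-probability "symbols"
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:                 # every leaf under the merge gains one bit
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(counter), s1 + s2))
    return lengths

probs = [0.2, 0.4, 0.2, 0.1, 0.1]         # a1..a5 from the next slide's example
lengths = huffman_lengths(probs)
print(sum(p * l for p, l in zip(probs, lengths)))   # average length ~2.2
```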
Example: Five-letter alphabet S = {a1, a2, a3, a4, a5} with
P(a1) = P(a3) = 0.2, P(a2) = 0.4, and P(a4) = P(a5) = 0.1
Letter  Probability  Codeword
a2      0.4          1
a1      0.2          01
a3      0.2          000
a4      0.1          0010
a5      0.1          0011
Average length: l = .4x1 + .2x2 + .2x3 + .1x4 + .1x4 = 2.2 bits/symbol
Entropy: H(S) = 2.122 bits/symbol
Redundancy = 0.078 bits/symbol
(difference between average length and entropy)
Minimum Variance Huffman Codes
Minimum variance codes are important in applications where transmission rates are
fixed. They facilitate buffer design.
We would like the code in which the variance of the codeword lengths is minimized:
minimize Σ_{i=1..m} P(ai) [n(ai) - l]^2
Example: Minimum variance code from the previous example
Letter  Probability  Codeword
a1      0.2          10
a2      0.4          00
a3      0.2          11
a4      0.1          010
a5      0.1          011
Average length: l = .4x2 + .2x2 + .2x2 + .1x3 + .1x3 = 2.2 bits/symbol
Two Huffman trees corresponding to the same probabilities
Discussion: Transmitting a message at 10,000 symbols per second needs a channel
capacity of 22,000 bps. Compare the two codes.
[Figure: the two Huffman trees, one giving codeword lengths {1, 2, 3, 4, 4} and the other {2, 2, 2, 3, 3}; internal node probabilities 0.2, 0.4, 0.6, 1.0 in both trees]
Extended Huffman Codes
Motivation: All codewords in Huffman coding have integer bit lengths.
This is wasteful when pi is very large and hence log2(1/pi) is close to 0.
Why not group several symbols together and assign a single codeword to the group as a whole?
Example:
Letter  Probability  Codeword
a1      0.8          0
a2      0.02         11
a3      0.18         10
Entropy: H(S) = 0.816 bits/symbol
Average length: 1.2 bits/symbol
Redundancy: 0.384 bits/symbol (47%)
Reduce the coding rate by blocking more than one symbol together.
The alphabet size increases exponentially as we block more and more symbols together.
Blocking two symbols:
Average length: 1.7516 bits/(extended) symbol
Average length: 0.8758 bits/(original) symbol
Redundancy: 0.06 bits/symbol (7%)
Use of Huffman Coding in Lossless Image Compression
Original image
256 gray-level image (8 bits/pixel)
image size 256x256 pixels
uncompressed file size = 65,536 bytes
Compression using Huffman codes on pixel values
Compression using Huffman codes on pixel difference values
ADVANCED VLCs
Limitations of Huffman Coding
Rate always larger than 1.0 bit/sample; extended Huffman coding does not always work
Predesign of the code -> fixed code table; if the actual probabilities differ from the ones
used in the design, data expansion may occur.
Advanced VLCs
block coding: combinations of consecutive symbols are coded as a single word
avoid storage of (large) codeword tables
learn data statistics as they encode
ARITHMETIC CODING [1963]
Huffman coding assigns each symbol a codeword which has an integral bit length.
Arithmetic coding can treat the whole message as one unit.
A message is represented by a half-open interval [a, b) where a and b are real numbers
between 0 and 1. Initially, the interval is [0, 1). As the message becomes longer, the
length of the interval shortens and the number of bits needed to represent the interval
increases.
Coding a sequence
We tag each sequence with a unique identifier.
The tag is a real number in the unit interval [0, 1).
Since the number of numbers in the interval is infinite, it is possible to assign
a unique tag to each sequence.
We use the cumulative distribution function (cdf) of the random variable associated
with the source.
Generating a tag
Partition the [0, 1) interval into subintervals defined by the cdf of the source.
While there are more symbols to be encoded do:
get the next symbol
restrict the tag to the subinterval corresponding to the new symbol
partition the new subinterval of the tag proportionally based on the cdf
The tag for the sequence is any number in the final subinterval
Example:
Letter  Prob. (Pi)  cdf (Qi)
(Q0)                0.0
a1      0.7         0.7
a2      0.1         0.8
a3      0.2         1.0
Coding the sequence a1 a2 a3 ...
a1:      [0.0, 0.7)
a1a2:    [0.49, 0.56)
a1a2a3:  [0.546, 0.560)
Selecting a tag
The interval in which the tag for a particular sequence resides is disjoint from all
intervals in which the tag for any other sequence may reside
Any number in the interval can be chosen as the tag
lower limit of interval, midpoint, ...
We will use the midpoint of the interval as the tag
Recursive computation of the tag
Consider the sequence X = x1 x2 ... xn
Denote by l(k) and u(k) the lower and upper limits of the interval corresponding to the
sequence x1 x2 ... xk, respectively
Compute the lower and upper limits of the tag interval from the recurrence relations:
l(0) = 0
u(0) = 1
l(k) = l(k-1) + [u(k-1) - l(k-1)] Q(xk - 1)
u(k) = l(k-1) + [u(k-1) - l(k-1)] Q(xk)
T(X) = [l(n) + u(n)] / 2
Example: Encode the sequence 1 3 2 1 generated by the source below
Letter  Prob. (Pi)  cdf (Qi)
(Q0)                0.0
1       0.8         0.8
2       0.02        0.82
3       0.18        1.0
l(0) = 0
u(0) = 1
l(1) = 0 + (1-0) Q0 = 0 + (1)(0) = 0
u(1) = 0 + (1-0) Q1 = 0 + (1)(0.8) = 0.8
l(2) = 0 + (0.8-0) Q2 = 0 + (0.8)(0.82) = 0.656
u(2) = 0 + (0.8-0) Q3 = 0 + (0.8)(1) = 0.8
l(3) = 0.656 + (0.8-0.656) Q1 = 0.656 + (0.144)(0.8) = 0.7712
u(3) = 0.656 + (0.8-0.656) Q2 = 0.656 + (0.144)(0.82) = 0.77408
l(4) = 0.7712 + (0.77408-0.7712) Q0 = 0.7712
u(4) = 0.7712 + (0.77408-0.7712) Q1 = 0.773504
T(X) = (0.7712 + 0.773504)/2 = 0.772352
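The recurrence can be coded directly; a minimal sketch reproducing the worked example (floating-point, unlike the integer arithmetic used in practical implementations):

```python
# cdf values for the 3-letter source: Q0=0, Q1=0.8, Q2=0.82, Q3=1.0
Q = [0.0, 0.8, 0.82, 1.0]

def tag(seq):
    """Midpoint tag for a sequence of letters in {1, 2, 3} (update rules above)."""
    low, high = 0.0, 1.0
    for x in seq:
        rng = high - low
        low, high = low + rng * Q[x - 1], low + rng * Q[x]
    return (low + high) / 2

print(tag([1, 3, 2, 1]))   # ~0.772352
```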
Deciphering a tag
Initialize l(0) = 0 and u(0) = 1
k = 1
Repeat until the whole sequence has been decoded:
t* = [T(X) - l(k-1)] / [u(k-1) - l(k-1)]
Find the value of xk: xk = ai for which Q(i-1) <= t* < Q(i)
Update l(k) and u(k)
k = k + 1
Question: How do we know when the entire sequence has been decoded?
The decoder knows the length of the sequence in advance, or
use an end-of-transmission symbol
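A matching decoder sketch (assumes the decoder knows the sequence length in advance, as noted above):

```python
Q = [0.0, 0.8, 0.82, 1.0]   # same source as in the encoding example

def untag(t, n):
    """Decode n letters from tag t by inverting the interval updates."""
    low, high = 0.0, 1.0
    out = []
    for _ in range(n):
        t_star = (t - low) / (high - low)
        x = next(i for i in range(1, len(Q)) if Q[i - 1] <= t_star < Q[i])
        rng = high - low
        low, high = low + rng * Q[x - 1], low + rng * Q[x]
        out.append(x)
    return out

print(untag(0.772352, 4))   # [1, 3, 2, 1]
```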
Generating a binary code
We want to find a binary code that represents X in a unique and efficient manner.
The tag T(X) = [l(n) + u(n)]/2 might be infinitely long.
Use as the binary code for T(X) the binary representation of T(X) truncated to
l(X) = ceil(log2 [1/P(X)]) + 1 bits
Example: Block of one symbol
Letter  P(X)   Q(X)   T(X)    In Binary  ceil(log2 1/P(X))+1  Code
1       0.5    0.5    0.25    .010       2                    01
2       0.25   0.75   0.625   .101       3                    101
3       0.125  0.875  0.8125  .1101      4                    1101
4       0.125  1.0    0.9375  .1111      4                    1111
Example: Block of two symbols
X    P(X)      T(X)       In Binary  ceil(log2 1/P(X))+1  Code
11   0.25      0.125      .001       3                    001
12   0.125     0.3125     .0101      4                    0101
13   0.0625    0.40625    .01101     5                    01101
14   0.0625    0.46875    .01111     5                    01111
21   0.125     0.5625     .1001      4                    1001
22   0.0625    0.65625    .10101     5                    10101
23   0.03125   0.703125   .101101    6                    101101
24   0.03125   0.734375   .101111    6                    101111
31   0.0625    0.78125    .11001     5                    11001
32   0.03125   0.828125   .110101    6                    110101
33   0.015625  0.8515625  .1101101   7                    1101101
34   0.015625  0.8671875  .1101111   7                    1101111
41   0.0625    0.90625    .11101     5                    11101
42   0.03125   0.953125   .111101    6                    111101
43   0.015625  0.9765625  .1111101   7                    1111101
44   0.015625  0.984375   .1111111   7                    1111111
Comparison of Huffman and Arithmetic Coding
Arithmetic coding is more complicated than Huffman coding.
Coding a sequence of length m:
Huffman coding requires building the entire code for all possible sequences of length m.
Arithmetic coding does not need to build the entire codebook, just a single tag.
Example: If the original alphabet size was k, the size of the codebook would be k^m
(k = 16 and m = 20 gives a codebook size of 16^20!)
It might not be worth the extra complexity to use arithmetic coding when the source has a
large alphabet and the probabilities are not too skewed (small Pmax).
It is much easier to adapt arithmetic codes to changing input statistics, by keeping a
count of the letters as they are coded.
DICTIONARY TECHNIQUES
Incorporate the structure in the data in order to increase the amount of compression
Build a list of commonly occurring patterns, the dictionary, and encode these patterns by
transmitting their index in the list.
Can be static or dynamic
Algorithm variations:
LZ77, LZ78, LZW
Static dictionary technique
Appropriate when considerable prior knowledge about the source is available
Suitable for particular applications
Digram Coding
The dictionary consists of all letters of the source alphabet and as many pairs of
letters, called digrams, as can be accommodated by the dictionary
Example: Assume the alphabet A = {a, b, c, d, r}.
Code the sequence abracadabra based on the dictionary:
Code  Entry    Code  Entry
0     a        4     r
1     b        5     ab
2     c        6     ac
3     d        7     ad
Sequence: ab r ac ad ab r a
Code:     5  4 6  7  5  4 0
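A minimal digram-coding sketch using this dictionary (greedy: try the two-letter entry first, fall back to the single letter):

```python
digrams = {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'r': 4, 'ab': 5, 'ac': 6, 'ad': 7}

def digram_encode(s):
    out, i = [], 0
    while i < len(s):
        if s[i:i + 2] in digrams:          # prefer the digram entry
            out.append(digrams[s[i:i + 2]]); i += 2
        else:                              # fall back to the single letter
            out.append(digrams[s[i]]); i += 1
    return out

print(digram_encode('abracadabra'))   # [5, 4, 6, 7, 5, 4, 0]
```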
Adaptive dictionary techniques
Based on the work of Ziv and Lempel
LZ77 (assumes patterns recur close together)
LZ78 (based on the entire previously coded sequence)
The LZ77 approach
Asymptotically, the performance of the algorithm approaches the best that could be obtained
by using a static scheme that has full knowledge about the statistics of the source.
The encoder examines the input sequence through a sliding window which is
partitioned into:
Search buffer: contains a portion of the most recently encoded sequence
Look-ahead buffer: contains the next portion of the sequence to be encoded
Example: [c a b r a c a] [d a b r a r] r a r r a d
          search buffer   look-ahead buffer
The coding process
A match consists of two identical strings of length "length": the first starts in the search
buffer, "offset" locations to the left of the look-ahead buffer; the second occupies the
leftmost positions of the look-ahead buffer
The string starting in the search buffer can extend into the look-ahead buffer
The encoder repeats:
Find the longest match.
Let o and l be the offset and the length of the match, respectively.
Let c be the character that follows the match in the look-ahead buffer
Encode the triple <o, l, c> using a fixed-length code
Advance the window by l+1 positions
The decoding process
Similar to the encoding
Faster/simpler since no searching is required
Example: Encode the sequence cabracadabrarrarrad
size of search buffer = 7 letters
size of look-ahead buffer = 6 letters
assume that the 7 leading letters have been encoded
c a b r a c a [d a b r a r] ...   no match for d            -> <0, 0, C(d)>
... a [a b r a r r] ...           match "abra", offset 7     -> <7, 4, C(r)>
... r [r a r r a d]               match "rarra", offset 3    -> <3, 5, C(d)>
Example (cont.)
Decode the sequence: <0, 0, C(d)>, <7, 4, C(r)>, <3, 5, C(d)>
size of search buffer = 7 letters
assume that the sequence cabraca has been decoded
<0, 0, C(d)>: append d                                -> c a b r a c a d
<7, 4, C(r)>: copy 4 letters starting 7 back, then r  -> ... a b r a r
<3, 5, C(d)>: copy 5 letters starting 3 back, then d  -> ... r a r r a d
Decoded as: cabracadabrarrarrad
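An LZ77 round-trip can be sketched as follows. This is a simplified version: the search buffer grows from empty rather than being pre-loaded, so the individual triples differ from the worked example, but decoding recovers the sequence:

```python
def lz77_encode(data, search=7, lookahead=6):
    """Produce (offset, length, next_char) triples; offset 0 means no match."""
    i, out = 0, []
    while i < len(data):
        best_o, best_l = 0, 0
        for o in range(1, min(search, i) + 1):          # candidate offsets
            l = 0
            while (l < lookahead and i + l < len(data) - 1
                   and data[i + l - o] == data[i + l]):  # may extend into look-ahead
                l += 1
            if l > best_l:
                best_o, best_l = o, l
        out.append((best_o, best_l, data[i + best_l]))
        i += best_l + 1                                  # advance past match + char
    return out

def lz77_decode(triples):
    out = []
    for o, l, c in triples:
        for _ in range(l):
            out.append(out[-o])     # copying one char at a time handles overlaps
        out.append(c)
    return ''.join(out)

s = 'cabracadabrarrarrad'
print(lz77_decode(lz77_encode(s)) == s)   # True
```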
Variations of the LZ77 scheme
Encode triples with a variable-length code: PKZip, Zip, LHarc, PNG, gzip, ARJ
Vary the size of the search and look-ahead buffers;
larger buffers require more effective search strategies
Eliminate the use of a triple to encode a single character by using a flag bit;
the 3rd element of the triple can then be dropped (LZSS algorithm)
The LZ78 approach
Problem with LZ77: assumes that like patterns occur close together
LZ78 keeps an explicit dictionary containing all distinct patterns seen during the
encoding
Both the encoder and the decoder have to build the dictionary, in an identical manner
The input sequence is coded as a sequence of doubles <i, c> where:
i is the index corresponding to the dictionary entry that was the longest match
to the input (0: no match), and
c is the character that follows the matched portion of the input
The coding process
m denotes the longest match in the dictionary; the encoder outputs <index(m), c> and
inserts m*c as a new entry
Example: Encode the sequence: DADDADADADDYDADO
Encoder output  Index  Entry
<0, D>          1      D
<0, A>          2      A
<1, D>          3      DD
<2, D>          4      AD
<4, A>          5      ADA
<3, Y>          6      DDY
<1, A>          7      DA
<1, O>          8      DO
The decoding process
Similar to the encoding; the decoder also builds the dictionary from the received doubles
Faster/simpler since only indexing of the dictionary is required
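A minimal LZ78 encoder sketch (the doubles and dictionary entries follow from the procedure described above):

```python
def lz78_encode(s):
    """Return (index, char) doubles; index 0 means no dictionary match."""
    dictionary = {}                     # pattern -> index (1-based)
    out, i = [], 0
    while i < len(s):
        m, idx = '', 0
        while i < len(s) - 1 and m + s[i] in dictionary:   # extend the match
            m += s[i]; idx = dictionary[m]; i += 1
        out.append((idx, s[i]))                            # double <i, c>
        dictionary[m + s[i]] = len(dictionary) + 1         # insert match + char
        i += 1
    return out

print(lz78_encode('DADDADADADDYDADO'))
# [(0,'D'), (0,'A'), (1,'D'), (2,'D'), (4,'A'), (3,'Y'), (1,'A'), (1,'O')]
```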
The LZW approach
A variation of LZ78 which avoids the transmission of the input character that follows the match
The dictionary, at both the coder and the decoder sides, initially contains all alphabet symbols
Assume that string m is the match and that a is the input character that follows it.
The encoder repeats:
Transmit the index of m in the dictionary
Insert m*a into the dictionary
Build the next match starting with a
Example (cont.): Encode the sequence: DADDADADADDYDADO
Initial dictionary: 1 A, 2 D, 3 O, 4 Y
Output: 2 1 2 5 8 6 2 4 8 3
Index  Entry     Index  Entry
5      DA        10     ADD
6      AD        11     DY
7      DD        12     YD
8      DAD       13     DADO
9      DADA
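A minimal LZW encoder sketch (initial dictionary = the four alphabet symbols; the decoder rebuilds the same dictionary symmetrically from the received indices):

```python
def lzw_encode(s, alphabet):
    dictionary = {ch: i for i, ch in enumerate(alphabet, start=1)}
    out, m = [], ''
    for a in s:
        if m + a in dictionary:
            m += a                                      # keep extending the match
        else:
            out.append(dictionary[m])                   # transmit index of m
            dictionary[m + a] = len(dictionary) + 1     # insert m*a
            m = a                                       # next match starts with a
    out.append(dictionary[m])                           # flush the final match
    return out

print(lzw_encode('DADDADADADDYDADO', 'ADOY'))   # [2, 1, 2, 5, 8, 6, 2, 4, 8, 3]
```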
Applications of the LZW coding
File Compression: UNIX compress
Adaptive dictionary size: 2^9 to 2^16 entries
Codewords increase in length as the dictionary size increases
When the dictionary reaches maximum size:
- It performs static coding
- It monitors the compression ratio and flushes the dictionary
if the compression ratio drops below a threshold
Image Compression: The Graphics Interchange Format (GIF)
An implementation of LZW (similar to compress)
Works well with computer-generated images
It is an unfortunate choice for continuous-tone images
Compression over Modems: V.42 bis
Operates in two modes:
Transparent mode: no compression, used when the sequence does not contain
repeating patterns (usually previously compressed files)
Compressed mode: LZW algorithm; forbids the transmission of an entry
immediately after its insertion into the dictionary
Variable size dictionary
RUN-LENGTH CODING (RLC)
Rationale for RLC: if the information source has the property that symbols tend to form
continuous groups, then such a symbol and the length of the group can be coded.
Run indicates the repetition of a symbol
Run-length represents the number of consecutive symbols of the same value
Instead of encoding the consecutive symbols, encoding the run-length
and the value that these consecutive symbols commonly share may be more efficient.
Example 1
Run-length encoding for two symbols
RLC is mostly used for bi-level (black and white) images,
where we assume that the first run always represents white pixels.
Example 2
Facsimile coding
RLC has been adopted in the international standards for facsimile coding: the CCITT Recommendations T.4 and T.6.
- 1-D RLC uses only horizontal correlation between pixels on the same scan line.
- 2-D RLC uses both horizontal and vertical correlation between pixels, to achieve higher coding efficiency.
Group of     Speed Requirement   Analog or        CCITT            Compression Technique
Facsimile    for A4 Size         Digital Scheme   Recommendation   (Model / Basic Coder / Acronym)
Apparatuses
G1           6 min               Analog           T.2              -
G2           3 min               Analog           T.3              -
G3           1 min               Digital          T.4              1-D RLC, 2-D RLC (optional) / Modified Huffman / MH, MR
G4           1 min               Digital          T.6              2-D RLC / Modified Modified READ / MMR
One-dimensional scheme
- Each scan line is encoded independently.
- Each scan line can be considered as a sequence of alternating white and black runs.
- The first run in each scan line is assumed to be a white run. If the first actual pixel is black, the run-length of the first white run is set to zero.
- White and black runs are encoded separately; Huffman coding is applied to the two source alphabets.
- According to T.4, each line (A4 page) contains 1728 pixels, so plain Huffman coding would require two large codebooks (size 1728).
- Encoded using Modified Huffman (MH), each run-length rl is instead written as

      rl = 64m + t,   m = 1, ..., 27,  t = 0, ..., 63

  We use a code for m (make-up code) and a code for t (terminating code); runs shorter than 64 need only a terminating code.
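The make-up/terminating split is just integer division by 64. A small sketch (the function name is illustrative):

```python
def mh_split(run_length):
    # Split a run length into its make-up part (64 * m) and
    # terminating part (t), so that run_length = 64 * m + t.
    m, t = divmod(run_length, 64)
    return m, t

print(mh_split(30))    # (0, 30)  -> terminating code only
print(mh_split(200))   # (3, 8)   -> make-up code for 192, terminating code for 8
print(mh_split(1728))  # (27, 0)  -> longest run on a T.4 line
```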
Modified Huffman Code Table
Two-dimensional scheme
The coding is performed using line-to-line correlation. Instead of reporting the length of each run, it reports the position of the start of each run.
Example: Encode a line with runs beginning at positions 1, 8, and 16:
- Run-length encoding: (figure)
- Start-of-run encoding: (figure)
It encodes the transition points with reference to the previously encoded line (the line above).
A modification of Relative Element Address Designate (READ), referred to as Modified READ (MR).
01014419 DATA COMPRESSION Linear predictive coding
Dr.Yuttapong Rangsanseri 65
LINEAR PREDICTIVE MODELS
- Assumes that pixels are related to their neighbors.
- Predicts the current pixel based on its neighborhood and then transmits the prediction error (or residual).
- Usually assumes a raster scan order.
- Two-dimensional predictive scheme: predict the current pixel based on its two-dimensional neighborhood.
LOSSLESS JPEG
Lossless JPEG: a special case of the JPEG image compression standard.
- Forming a prediction: the predictor can use any one of the 7 schemes below.
- Encoding: the encoder compares the prediction with the actual pixel value and encodes the difference using Huffman coding.
Neighboring pixels for predictors
Predictor Prediction
P1 A
P2 B
P3 C
P4 A+B-C
P5 A+(B-C)/2
P6 B+(A-C)/2
P7 (A+B)/2
Predictors for lossless JPEG
Note: Non-real time compression can try all schemes and select the best
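The predictor table can be sketched directly in code. The neighbor values below are hypothetical; note that the `/2` operations are taken as integer division here, which is an assumption about the rounding convention:

```python
def predict(a, b, c, mode):
    # A = left neighbor, B = above neighbor, C = upper-left neighbor.
    return {
        1: a,                   # P1
        2: b,                   # P2
        3: c,                   # P3
        4: a + b - c,           # P4
        5: a + (b - c) // 2,    # P5
        6: b + (a - c) // 2,    # P6
        7: (a + b) // 2,        # P7
    }[mode]

# Residual transmitted for a pixel X = 103 with neighbors A=100, B=104, C=98.
x, a, b, c = 103, 100, 104, 98
residuals = {p: x - predict(a, b, c, p) for p in range(1, 8)}
print(residuals)  # P5 predicts exactly here (residual 0)
```

A non-real-time encoder could compute all seven residual streams like this and keep the one that codes most compactly.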
01014419 DATA COMPRESSION Quantization
Dr.Yuttapong Rangsanseri 67
QUANTIZATION
The process of representing a large, possibly infinite, set of values with a smaller set is called quantization.
A quantizer consists of two mappings:
- Encoder mapping
  - The encoder divides the range of values that the source generates into a number of intervals.
  - All values that fall into an interval are represented by the codeword for that interval.
- Decoder mapping
  - For every codeword generated by the encoder, the decoder generates a reconstruction value.
Example

Encoder mapping: the input range is divided into 8 unit-width intervals, each assigned a 3-bit code.

Decoder mapping:

Input code   Output
000          -3.5
001          -2.5
010          -1.5
011          -0.5
100           0.5
101           1.5
110           2.5
111           3.5

Quantizer input-output map
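The two mappings of this 3-bit uniform quantizer can be written out directly; this is a sketch matching the table above (codes 0–7 stand for 000–111):

```python
def uq_encode(f, step=1.0, levels=8):
    # Encoder mapping: pick the interval index 0..levels-1 (clamped at the ends).
    half = levels // 2
    i = int(f // step) + half          # floor division maps f to its interval
    return max(0, min(levels - 1, i))

def uq_decode(i, step=1.0, levels=8):
    # Decoder mapping: reconstruct at the midpoint of interval i.
    half = levels // 2
    return (i - half + 0.5) * step

print(uq_encode(1.2))   # 5  (binary 101)
print(uq_decode(5))     # 1.5
print(uq_decode(uq_encode(-0.3)))  # -0.5
```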
Formal definition

Let f̂ denote an f that has been quantized:

    f̂ = Q(f) = r_i,   if d_{i-1} < f ≤ d_i

where the d_i are the decision levels and the r_i are the reconstruction levels.
Example: uniform quantizer

    d_i - d_{i-1} = Δ,             i = 1, ..., L
    r_i = (d_{i-1} + d_i) / 2,     i = 1, ..., L
NON-UNIFORM QUANTIZER
Make quantization intervals small in regions that have more probability mass
OPTIMUM QUANTIZER
r_i and d_i are determined by minimizing the average distortion D:

    D = E[d(f, f̂)] = ∫ d(f, f̂) p(f) df

d(f, f̂): distortion measure, e.g. (f - f̂)^2

Lloyd-Max quantizer, based on the Minimum Mean Square Error (MMSE) criterion:

    D = E[e_Q^2] = E[(f - f̂)^2] = ∫ (f - f̂)^2 p(f) df

Solution:
- r_k is the centroid of p(f) over the interval d_{k-1} < f ≤ d_k
- d_k (except d_0 and d_L) is the middle point between the two reconstruction levels r_k and r_{k+1}
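The two conditions suggest an iterative procedure: alternately recompute boundaries as midpoints and levels as centroids. The sketch below runs an empirical version on training samples, which stand in for p(f) — an illustrative assumption, not the analytic solution:

```python
import random

def lloyd_max(samples, levels, iters=50):
    # Empirical Lloyd iteration: alternate the two optimality conditions
    # (boundaries = midpoints between levels, levels = centroids of intervals).
    lo, hi = min(samples), max(samples)
    r = [lo + (hi - lo) * (k + 0.5) / levels for k in range(levels)]  # uniform start
    for _ in range(iters):
        d = [(r[k] + r[k + 1]) / 2 for k in range(levels - 1)]  # decision boundaries
        buckets = [[] for _ in range(levels)]
        for f in samples:
            buckets[sum(f > b for b in d)].append(f)            # assign f to its interval
        # Centroid (mean) of each interval; keep the old level if a bucket is empty.
        r = [sum(b) / len(b) if b else r[k] for k, b in enumerate(buckets)]
    return r

random.seed(0)
data = [random.gauss(0, 1) for _ in range(5000)]
print(sorted(lloyd_max(data, 4)))  # approaches the 4-level Gaussian optimum (about ±0.45, ±1.51)
```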
UNIFORM VS. NON-UNIFORM QUANTIZERS
- The uniform quantizer is the optimal MMSE quantizer when the PDF is uniform.
- The more the PDF deviates from being uniform, the higher the gain from non-uniform quantization over uniform quantization.
- A non-uniform quantizer is usually complex.

Companded quantization
Instead of making the step size small, make the interval in which the input lies with high probability large.
1. Map f to g by a nonlinear (compressor) function, in such a way that the PDF of g is uniform.
2. Quantize g with a uniform quantizer.
3. Perform the inverse nonlinear (expander) function.
- Simple implementation
- Popular for audio: logarithmic curves
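The three steps can be sketched with the μ-law logarithmic curve used in telephony (μ = 255; the uniform quantizer size here is an arbitrary choice for illustration):

```python
import math

MU = 255.0  # mu-law parameter

def compress(f):
    # Step 1: compressor — stretches small amplitudes, |f| <= 1.
    return math.copysign(math.log1p(MU * abs(f)) / math.log1p(MU), f)

def expand(g):
    # Step 3: expander — exact inverse of the compressor.
    return math.copysign((math.exp(abs(g) * math.log1p(MU)) - 1) / MU, g)

def quantize_uniform(g, levels=256):
    # Step 2: uniform quantizer on [-1, 1].
    step = 2.0 / levels
    i = min(levels - 1, int((g + 1) / step))
    return -1 + (i + 0.5) * step

f = 0.01  # a small-amplitude sample, where companding helps most
companded_err = abs(f - expand(quantize_uniform(compress(f))))
direct_err = abs(f - quantize_uniform(f))
print(companded_err < direct_err)  # True
```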
Example: Companded Quantization
COMPRESSOR EXPANDER
01014419 DATA COMPRESSION Vector quantization
Dr.Yuttapong Rangsanseri 76
VECTOR QUANTIZATION
- Scalar quantization processes the samples of the input signal one by one and independently.
- Vector quantization uses the dependency between N consecutive samples to break up an N-dimensional space into cells in a more efficient way than scalar quantization can.

The signal to be quantized is an N-dimensional vector f that consists of N real-valued scalars:

    f = [f_1, f_2, ..., f_N]^T

f is mapped to another N-dimensional vector:

    r = [r_1, r_2, ..., r_N]^T

If f is in the cell C_i, f is mapped to r_i (for 1 ≤ i ≤ L):

    f̂ = VQ(f) = r_i,   f ∈ C_i
Example: N = 2 and L = 9
VQ ENCODER & DECODER
- The encoder simply searches for the closest codevector r_i in the codebook.
- The label i of this reconstruction vector is then entropy coded for transmission (or storage).
- The decoder performs a table lookup using the label to find the respective reconstruction vector.
Example: N = 2 and L = 4

CODEBOOK
i    r_i
1    (0, 0)
2    (2, 1)
3    (1, 3)
4    (1, 4)

Signal:               0  1  2  3  2  0
Transmit to decoder:  1     3     2
Decoded signal:       0  0  1  3  2  1
Quantization error:   0 -1 -1  0  0  1
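A sketch of this encoder/decoder pair, using the example codebook; it reproduces the transmitted labels and decoded signal above:

```python
CODEBOOK = {1: (0, 0), 2: (2, 1), 3: (1, 3), 4: (1, 4)}

def vq_encode(vector):
    # Nearest-neighbor search over the codebook (squared Euclidean distance).
    return min(CODEBOOK,
               key=lambda i: sum((a - b) ** 2 for a, b in zip(vector, CODEBOOK[i])))

def vq_decode(index):
    # Decoder is just a table lookup.
    return CODEBOOK[index]

signal = [0, 1, 2, 3, 2, 0]
blocks = [tuple(signal[i:i + 2]) for i in range(0, len(signal), 2)]  # N = 2
labels = [vq_encode(b) for b in blocks]
decoded = [x for i in labels for x in vq_decode(i)]
print(labels)   # [1, 3, 2]
print(decoded)  # [0, 0, 1, 3, 2, 1]
```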
Advantages of vector quantization over scalar quantization
Example 1: For the same distortion, VQ needs fewer reconstruction levels.
- (f1, f2) are jointly uniformly distributed over the shaded region.
- The PDFs of both f1 and f2 are uniform distributions between (-a, a).
- Scalar quantization: 4 reconstruction levels. If we allow 2 reconstruction levels for each scalar, the optimal levels are -a/2 and a/2.
- Vector quantization: 2 reconstruction levels provide the same distortion.

Scalar quantization          Vector quantization
Example 2: For a given rate, VQ results in lower distortion.
- Scalar quantization: D = E[e_Q^T e_Q] = a^2/2. If we allow 2 reconstruction levels for each scalar, the optimal levels are -a/2 and a/2.
- Vector quantization (L = 4): D = E[e_Q^T e_Q] = 5a^2/12, which is lower.

Scalar quantization          Vector quantization
DESIGN OF A VECTOR QUANTIZER
Quantization error:

    e_Q = f - f̂ = f - VQ(f)

- r_i and C_i are determined by minimizing some error criterion, such as the average distortion measure D = E[d(f, f̂)], e.g. d(f, f̂) = e_Q^T e_Q.
- The expression is similar to the scalar quantizer design, but we do not know p(f), so it cannot be optimized analytically (cf. Lloyd-Max).
- The optimal codebook should be designed such that the overall distortion is minimized.
- Instead of p(f), a set of representative samples, or training vectors, is used to design the codebook in an iterative optimization procedure.
- The most popular method is known as the Linde-Buzo-Gray or LBG algorithm.
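A minimal sketch of the LBG (generalized Lloyd) iteration on 2-D training vectors; seeding the codebook with the first L training vectors is an illustrative simplification (real implementations use splitting or random seeding):

```python
def lbg(training, L, iters=20):
    # LBG: alternate a nearest-neighbor partition of the training set
    # with a centroid update of each cell.
    codebook = list(training[:L])  # naive seed: first L training vectors
    for _ in range(iters):
        cells = [[] for _ in range(L)]
        for v in training:  # partition: assign each vector to its nearest codevector
            j = min(range(L),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(v, codebook[i])))
            cells[j].append(v)
        for i, cell in enumerate(cells):  # update: centroid of each non-empty cell
            if cell:
                codebook[i] = tuple(sum(c) / len(cell) for c in zip(*cell))
    return codebook

# Two obvious clusters near (0, 0) and (5, 5):
train = [(0, 0), (0.2, 0.1), (5, 5), (5.1, 4.8), (0.1, -0.1), (4.9, 5.2)]
print(sorted(lbg(train, 2)))  # codevectors settle near the two cluster centroids
```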
01014419 DATA COMPRESSION Performance evaluation
Dr.Yuttapong Rangsanseri 83
PERFORMANCE EVALUATION
General description of a compression system:

    f(n) → Coder → 010100101... → Decoder → f̂(n)

- Average number of bits/sample (R)
- Distortion (D)
- The rate-distortion function R(D) offers answers to the best rate-versus-distortion performance that can be achieved.
- R(D) can be computed analytically for simple sources and distortion measures.
Compression Measures

Compression ratio:

    C_R = (Original data size) / (Compressed data size)

Number of bits per sample:

    R = (Encoded number of bits) / (Number of samples)

Number of bits per pixel (image):

    bpp = (Encoded number of bits) / (Number of pixels)

Mean square error:

    MSE = (1/N) Σ_{i=1..N} (f_i - f̂_i)^2

Root mean square error:

    RMSE = √MSE

Signal-to-noise ratio:

    SNR = 10 log10( [(1/N) Σ_{i=1..N} f_i^2] / MSE )   (dB)

Peak signal-to-noise ratio (8-bit image):

    PSNR = 10 log10( 255^2 / MSE )   (dB)
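The error measures translate directly into code; the pixel values below are hypothetical:

```python
import math

def mse(orig, recon):
    # Mean square error between two equally long sequences.
    return sum((a - b) ** 2 for a, b in zip(orig, recon)) / len(orig)

def psnr(orig, recon, peak=255):
    # Peak signal-to-noise ratio in dB (peak = 255 for 8-bit images).
    return 10 * math.log10(peak ** 2 / mse(orig, recon))

f  = [52, 55, 61, 59, 79, 61, 76, 41]  # original pixels (hypothetical)
fq = [50, 56, 60, 60, 80, 60, 75, 40]  # reconstructed pixels
print(mse(f, fq))             # 1.375
print(round(psnr(f, fq), 2))  # 46.75
```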
01014419 DATA COMPRESSION Differential encoding
Dr.Yuttapong Rangsanseri 85
DIFFERENTIAL ENCODING
Techniques that transmit information by encoding differences are called differential encoding techniques.
- They transmit the differences between the prediction and the sampled values.
- Easy to implement.
- Used extensively in speech coding.

Example
- Sinusoid sampled at 30 samples/cycle
  - range [-1, 1]
  - uniform 4-level quantizer, step size 0.5
  - quantization error in [-0.25, 0.25]
- Sample-to-sample differences
  - range [-0.2, 0.2]
  - step size 0.1
  - quantization error in [-0.05, 0.05]
The Basic Algorithm
Difference based on the original sequence of samples

Example:
Source sequence:         6.2   9.7  13.2   5.9   8     7.4   4.2   1.8
The differences:         6.2   3.5   3.5  -7.3   2.1  -0.6  -3.2  -2.4
7-level quantizer output values: -6, -4, -2, 0, 2, 4, 6
The quantized sequence:    6     4     4    -6    2     0    -4    -2
Decoder output:            6    10    14     8   10    10     6     4
Error:                   0.2  -0.3  -0.8  -2.1  -2    -2.6  -1.8  -2.2

As the reconstruction progresses, the magnitude of the error becomes significantly larger!
Explanation

At the transmitter (encoder):

    The differences:        d_n = x_n - x_{n-1}
    Quantized differences:  d̂_n = Q[d_n] = d_n + q_n,   where q_n is the quantization error

At the receiver (decoder):

    Reconstructed values:   x̂_n = x̂_{n-1} + d̂_n    (x_{n-1} available only at the encoder)

    d_1 = x_1 - x_0
    d̂_1 = Q[d_1] = d_1 + q_1
    x̂_1 = x_0 + d̂_1 = x_0 + d_1 + q_1 = x_1 + q_1

    d_2 = x_2 - x_1
    d̂_2 = Q[d_2] = d_2 + q_2
    x̂_2 = x̂_1 + d̂_2 = x_1 + q_1 + d_2 + q_2 = x_2 + q_1 + q_2

    ...

    x̂_n = x_n + Σ_{k=1..n} q_k

The quantization error accumulates as the process continues.
Difference based on previously reconstructed values

At the transmitter (encoder):

    The differences:        d_n = x_n - x̂_{n-1}    (x̂_{n-1} available at both encoder and decoder)
    Quantized differences:  d̂_n = Q[d_n] = d_n + q_n,   where q_n is the quantization error

At the receiver (decoder):

    Reconstructed values:   x̂_n = x̂_{n-1} + d̂_n

    d_1 = x_1 - x_0
    d̂_1 = Q[d_1] = d_1 + q_1
    x̂_1 = x_0 + d̂_1 = x_0 + d_1 + q_1 = x_1 + q_1

    d_2 = x_2 - x̂_1
    d̂_2 = Q[d_2] = d_2 + q_2
    x̂_2 = x̂_1 + d̂_2 = x̂_1 + d_2 + q_2 = x_2 + q_2

    x̂_n = x_n + q_n

No accumulation of the quantization noise.
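Both schemes can be compared on the worked example, using the same 7-level quantizer {-6, ..., 6}; the open-loop reconstruction drifts while the closed-loop one does not:

```python
LEVELS = [-6, -4, -2, 0, 2, 4, 6]  # the 7-level quantizer from the example

def quantize(d):
    # Map a difference to the nearest quantizer output value.
    return min(LEVELS, key=lambda q: abs(q - d))

def open_loop(x):
    # Differences taken against the ORIGINAL previous sample.
    d = [x[0]] + [x[i] - x[i - 1] for i in range(1, len(x))]
    recon, prev = [], 0
    for v in d:
        prev += quantize(v)
        recon.append(prev)
    return recon

def closed_loop(x):
    # Differences taken against the previously RECONSTRUCTED sample (DPCM).
    recon, prev = [], 0
    for s in x:
        prev += quantize(s - prev)
        recon.append(prev)
    return recon

x = [6.2, 9.7, 13.2, 5.9, 8, 7.4, 4.2, 1.8]
print(open_loop(x))    # [6, 10, 14, 8, 10, 10, 6, 4] -- errors grow toward -2.6
print(closed_loop(x))  # [6, 10, 14, 8, 8, 8, 4, 2]   -- errors stay bounded
```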
01014419 DATA COMPRESSION Differential encoding
Dr.Yuttapong Rangsanseri 89
Differential Encoding System

Predictor:  p_n = f(x̂_{n-1}, x̂_{n-2}, ..., x̂_0)

Known as Differential Pulse Code Modulation (DPCM). The quantizer maps the prediction error into a limited range of outputs, which establishes the amount of compression and distortion associated with lossy predictive coding.
DELTA MODULATION (DM)
- A simple form of DPCM used in speech coding applications: DPCM with a 1-bit quantizer and output values ±Δ.
- A source output sampled and coded using delta modulation:
  - In regions where the source output is relatively constant, the output alternates up or down by Δ; these regions are called the granular regions.
  - In regions where the source output rises or falls fast, the reconstructed output cannot keep up; these regions are called the slope overload regions.
Adaptive Delta Modulation
A source output sampled and coded using adaptive delta modulation:
- In quasi-constant regions, make the step size small in order to reduce the granular error.
- In regions of rapid change, increase the step size in order to reduce the slope overload error.

Examples:
- Constant Factor Adaptive Delta Modulation (CFDM)
- Continuously Variable Slope Delta Modulation (CVSD)
TRANSFORM CODING
Transform coding takes a sequence of inputs in a given representation and transforms it into an equivalent sequence in a different representation.
- The transform is reversible.
- The sequence resulting from the transform contains most of the information in a small number of its elements.

Transform coding consists of three steps:
1. Transform the data sequence {x_n} into a transformed sequence {θ_n} using a reversible mapping.
2. Quantize the transformed sequence.
3. Encode the quantized sequence.
Linear Transform
General form:

    T = A X

    [T_1]   [a_11 ... a_1N] [x_1]
    [ . ] = [  .        .  ] [ . ]
    [T_N]   [a_N1 ... a_NN] [x_N]

    T_k = Σ_{i=1..N} a_{ki} x_i,   k = 1, ..., N

Inverse transform:

    X = A^{-1} T
- The objective is to choose A such that the elements of T are uncorrelated.
- Another desirable property is energy compaction: packing the energy into as few transform coefficients as possible allows us to discard many coefficients without seriously affecting the reconstructed data.
Transform Selection
- Karhunen-Loève Transform (KLT): optimally decorrelating transform with optimal energy packing, but data dependent — the decoder must be notified what A is.
- Discrete Fourier Transform (DFT)
- Discrete Cosine Transform (DCT)
- Walsh-Hadamard Transform (WHT)

The DCT is the most widely used in image/video compression standards.
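Energy compaction can be seen on a small example. The sketch below implements the orthonormal 1-D DCT-II (the ramp input is illustrative); for a slowly varying input almost all the energy lands in the first coefficients:

```python
import math

def dct(x):
    # Orthonormal 1-D DCT-II:
    # T[k] = c(k) * sum_n x[n] * cos(pi * (2n + 1) * k / (2N))
    N = len(x)
    out = []
    for k in range(N):
        c = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(c * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                           for n in range(N)))
    return out

x = [10, 11, 12, 13, 14, 15, 16, 17]   # slowly varying input
T = dct(x)
print([round(t, 2) for t in T])        # T[0] (the DC term) carries most of the energy
```

Because the transform is orthonormal, the total energy of T equals that of x (Parseval), so discarding the small trailing coefficients loses little.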
01014419 DATA COMPRESSION JPEG compression
Dr.Yuttapong Rangsanseri 95
JPEG COMPRESSION
JPEG: Joint Photographic Experts Group
Modes of Operation
- Sequential DCT-based encoding, in which each image component is encoded in a single left-to-right, top-to-bottom scan
- Progressive DCT-based encoding, in which the image is encoded in multiple scans, in order to produce a quick, rough decoded image when the transmission time is long
- Lossless encoding, in which the image is encoded to guarantee exact reproduction
- Hierarchical encoding, in which the image is encoded at multiple resolutions
JPEG Encoder & Decoder
Discrete Cosine Transform (DCT)
Quantizing DCT Coefficients
- The quantization step is the main source of loss in JPEG compression.
- Each DCT coefficient is divided by a constant value and the result is rounded to the nearest integer:

      F̂(u, v) = round( F(u, v) / Q(u, v) )

- The entries of Q(u, v) tend to have larger values towards the lower right corner. This aims to introduce more loss at the higher spatial frequencies.
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99
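As a sketch, the quantize/dequantize step applied to one row of coefficients, using the first row of the table above (the DCT values themselves are hypothetical):

```python
Q = [16, 11, 10, 16, 24, 40, 51, 61]  # first row of the luminance table above

def quantize_row(F_row):
    # F_hat(u, v) = round(F(u, v) / Q(u, v))
    return [round(f / q) for f, q in zip(F_row, Q)]

def dequantize_row(Fq_row):
    # Decoder side: multiply back; the rounding loss is not recoverable.
    return [fq * q for fq, q in zip(Fq_row, Q)]

F = [235.6, -22.6, -10.9, -7.4, -0.6, 1.8, 0.4, 0.1]  # hypothetical DCT row
Fq = quantize_row(F)
print(Fq)                  # [15, -2, -1, 0, 0, 0, 0, 0]
print(dequantize_row(Fq))  # [240, -22, -10, 0, 0, 0, 0, 0]
```

Note how the small high-frequency coefficients quantize to zero — exactly the long zero runs that the zig-zag scan and run-length coding then exploit.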
Zig-Zag Ordering of AC Coefficients
- Run-length coding (RLC) aims to turn the F̂(u, v) values into {#-zeros-to-skip, next non-zero value} pairs.
- To make it most likely to hit a long run of zeros, a zig-zag scan is used to turn the 8x8 matrix F̂(u, v) into a 64-vector.
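The zig-zag order can be generated by sorting block positions by anti-diagonal; this is a compact sketch (the 3x3 block is a toy whose values are numbered in scan order so the result is easy to check):

```python
def zigzag_order(n=8):
    # (row, col) indices of an n x n block in zig-zag order: walk the
    # anti-diagonals r + c = s, alternating direction on odd/even s.
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def zigzag(block):
    # Flatten a square block into a vector in zig-zag order.
    return [block[r][c] for r, c in zigzag_order(len(block))]

block = [[1, 2, 6],
         [3, 5, 7],
         [4, 8, 9]]          # values chosen to equal their zig-zag rank
print(zigzag(block))          # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(zigzag_order(8)[:6])    # [(0,0), (0,1), (1,0), (2,0), (1,1), (0,2)]
```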
DPCM on DC coefficients
The DC coefficients are coded separately from the AC ones; Differential Pulse Code Modulation (DPCM) is the coding method.
If the DC coefficients for the first 5 image blocks are 150, 155, 149, 152, 144, then DPCM produces 150, 5, -6, 3, -8, assuming d_i = DC_i - DC_{i-1} and d_0 = DC_0.
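The DC differencing is a one-liner and reproduces the numbers above:

```python
def dc_dpcm(dc):
    # d_0 = DC_0; d_i = DC_i - DC_{i-1} for i >= 1.
    return [dc[0]] + [dc[i] - dc[i - 1] for i in range(1, len(dc))]

print(dc_dpcm([150, 155, 149, 152, 144]))  # [150, 5, -6, 3, -8]
```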
Entropy Coding

The DC and AC coefficients finally undergo an entropy coding step to gain possible further compression.
JPEG Progressive Algorithms
Progressive JPEG delivers low-quality versions of the image quickly, followed by higher-quality passes.
1. Spectral selection: Takes advantage of the spectral characteristics of the DCT
coefficients: higher AC components provide detail information.
Scan 1: Encode DC and the first few AC components, e.g., AC1, AC2.
Scan 2: Encode a few more AC components, e.g., AC3, AC4, AC5.
...
Scan k: Encode the last few ACs, e.g., AC61, AC62, AC63.
2. Successive approximation: Instead of gradually encoding spectral bands, all DCT
coefficients are encoded simultaneously but with their most significant bits (MSBs) first.
Scan 1: Encode the first few MSBs, e.g., bits 7, 6, 5, 4.
Scan 2: Encode a few more, less significant bits, e.g., bit 3.
...
Scan m: Encode the least significant bit (LSB), bit 0.
JPEG Hierarchical Mode
- The encoded image at the lowest resolution is basically a compressed low-pass filtered image, whereas the images at successively higher resolutions provide additional details (differences from the lower-resolution images).
- As with progressive JPEG, hierarchical JPEG images can be transmitted in multiple passes with progressively improving quality.
Block diagram for Hierarchical JPEG
01014419 DATA COMPRESSION Video compression
Dr.Yuttapong Rangsanseri 105
VIDEO COMPRESSION
A video consists of a time-ordered sequence of frames, i.e., images.
- An obvious approach to video compression is predictive coding based on previous frames.
- Compression proceeds by subtracting images: subtract in time order and code the residual error.
- It can be done even better by searching for just the right part of the previous frame to subtract from the current one.
Steps of video compression based on Motion Compensation (MC):
1. Motion Estimation (motion vector search).
2. MC-based Prediction.
3. Derivation of the prediction error, i.e., the difference.
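Step 1, the motion vector search, can be sketched as a full search over a small window minimizing the sum of absolute differences (SAD). The 4x4 frames and 2x2 blocks below are toy values; real coders use 16x16 macroblocks and faster search strategies:

```python
def sad(block_a, block_b):
    # Sum of absolute differences between two equally sized blocks.
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b)
                           for a, b in zip(ra, rb))

def block(frame, r, c, n):
    # Extract the n x n block whose top-left corner is (r, c).
    return [row[c:c + n] for row in frame[r:r + n]]

def motion_search(ref, cur, r, c, n=2, search=2):
    # Full search: find the displacement (dr, dc) into the reference
    # frame that minimizes the SAD with the current block at (r, c).
    best = None
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr <= len(ref) - n and 0 <= cc <= len(ref[0]) - n:
                cost = sad(block(cur, r, c, n), block(ref, rr, cc, n))
                if best is None or cost < best[0]:
                    best = (cost, (dr, dc))
    return best[1]

ref = [[0, 0, 0, 0],
       [0, 9, 8, 0],
       [0, 7, 6, 0],
       [0, 0, 0, 0]]
cur = [[9, 8, 0, 0],          # the 2x2 pattern moved up-left by one pixel
       [7, 6, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
print(motion_search(ref, cur, 0, 0))  # (1, 1): the match sits one down-right in ref
```

The prediction error (step 3) is then the difference between the current block and the displaced reference block, which is what actually gets transform coded.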
Spatial vs Temporal Compression
Spatial compression
- The spatial compression of each frame is done with JPEG, or a modification of it.
- Each frame is a picture that can be independently compressed.

Temporal compression
- In temporal compression, redundant frames are removed. For example, in a static scene in which someone is talking, most frames are the same except for the segment around the speaker's lips, which changes from one frame to the next.
H.261
- H.261: an early digital video compression standard; its principle of MC-based compression is retained in all later video compression standards.
- The standard was designed for videophone, video conferencing and other audiovisual services over ISDN.
- The video codec supports bit rates of p x 64 kbps, where p ranges from 1 to 30 (hence also known as p*64).
MPEG
MPEG: Moving Picture Experts Group, established in 1988 for the development of digital video standards.

MPEG standards:
- MPEG-1 (ISO/IEC 11172, Nov 92)
- MPEG-2 (ISO/IEC 13818, Nov 94)
- MPEG-4 (ISO/IEC 14496, Oct 98)