jpeg (joint photographic expert group) prepared by: md. abedul haque std no: 100505010p cse, buet

JPEG (Joint Photographic Experts Group)

Prepared By: Md. Abedul Haque

Std No: 100505010P

CSE, BUET

JPEG

JPEG is the image compression standard developed by the Joint Photographic Experts Group.

In 1992, JPEG became an ISO International Standard.

JPEG applies to color and grayscale still images.

JPEG requirements

The JPEG implementation should be independent of image size and applicable to any image and pixel aspect ratio.

Color representation should be independent of the particular implementation.

Image content may be of any complexity.

The JPEG standard specification should be state-of-the-art regarding the compression factor and achieved image quality.

Sequential and progressive decoding should be possible.

Steps of JPEG compression

The steps of JPEG compression are outlined below.

[Figure: steps of JPEG compression]

• Picture Preparation: Pixel; Block, MCU

• Picture Processing: Predictor; FDCT

• Quantization

• Entropy Coding: Run-Length; Huffman; Arithmetic

JPEG Modes

Four different variants of image compression lead to four modes.

The lossy sequential DCT-based mode (baseline process) must be supported by every JPEG implementation.

The expanded lossy DCT-based mode provides a set of further enhancements to the baseline process.

The lossless mode has a low compression ratio but allows perfect reconstruction of the original image.

The hierarchical mode accommodates images of different resolutions and selects its algorithms from the three modes above.

Lossy Sequential DCT-based Mode

Step: Image Preparation

Image Component/Plane:

An image consists of at least 1 and at most 255 planes (or components).

An RGB image consists of three components (or planes): red, green, and blue.

[Figure: red plane, green plane, blue plane]

A grayscale image consists of a single component: intensity.

A YUV image consists of three planes: the luminance plane (Y) and the two chrominance planes (U and V).

[Figure: components C1 (luminance Y), C2 (chrominance U), and C3 (chrominance V); each small rectangle represents a pixel]

Unlike the planes of an RGB image, the Y, U, and V planes may have different resolutions, so the following arrangement is also possible.

[Figure: C1 (luminance Y) at full resolution; C2 (chrominance U) and C3 (chrominance V) at reduced resolution]

In this case, 4 pixels share the same chrominance value.
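The sharing of one chrominance value among 4 pixels can be sketched as follows. This is only an illustration: the slides do not say how the shared value is obtained, so averaging each 2x2 group is an assumption, and the function name `subsample_2x2` and the sample values are hypothetical.

```python
def subsample_2x2(plane):
    """Replace each 2x2 group of samples by its rounded average.

    Assumption: the plane has even width and height, and averaging is
    used to obtain the shared value (the text does not specify this).
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


# Hypothetical 2x4 chrominance plane: two groups of four pixels.
u_plane = [
    [100, 102, 50, 52],
    [104, 106, 54, 56],
]
u_small = subsample_2x2(u_plane)  # one value per group of 4 pixels
```

After subsampling, the chrominance plane has half the width and half the height of the luminance plane, which is the situation drawn in the figure.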

Each pixel is represented by p bits. All pixels of all components within the same image are coded with the same number of bits.

Data Unit/Block:

Each block is made of 8x8 pixels. This definition of a block (data unit) comes from the DCT.

These blocks of 8x8 pixels are transferred to the next step as a unit for processing.

There are two ways these data units are passed to the next step.

Non-interleaved data ordering:

• In this case, data units are passed component by component.

• All the data units from the first component are passed, then those from the second component, and so on.

• Data units from each component are passed in left-to-right, top-to-bottom order.
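The non-interleaved ordering described above can be sketched as follows. It is a minimal illustration that assumes every plane is a list of rows whose width and height are multiples of 8; the names `data_units` and `non_interleaved` are hypothetical.

```python
def data_units(plane, size=8):
    """Yield size x size blocks in left-to-right, top-to-bottom order.

    Assumption: the plane dimensions are multiples of `size`.
    """
    height, width = len(plane), len(plane[0])
    for top in range(0, height, size):
        for left in range(0, width, size):
            yield [row[left:left + size] for row in plane[top:top + size]]


def non_interleaved(components):
    """Yield (component index, block): all blocks of component 0 first,
    then all blocks of component 1, and so on."""
    for index, plane in enumerate(components):
        for block in data_units(plane):
            yield index, block


# Three hypothetical planes, 16 pixels wide and 8 pixels high (2 data units each).
red = [list(range(16)) for _ in range(8)]
green = [list(range(16)) for _ in range(8)]
blue = [list(range(16)) for _ in range(8)]

blocks = list(non_interleaved([red, green, blue]))
order = [index for index, _ in blocks]
```

Here `order` lists the component that each data unit belongs to, so both red units come before any green unit.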

Using non-interleaved ordering for a very high resolution RGB-encoded image, the display would initially present only the red component, then the green, and then the blue, finally resulting in the original image.

[Figure: data units of 8x8 pixels passed one component at a time to the image processing step]

Interleaved data ordering:

In this approach, data units from different components are combined into a Minimum Coded Unit (MCU). If all components have the same resolution, an MCU consists of exactly one data unit from each component.

[Figure: MCU1, MCU2, MCU3, MCU4, MCU5, and so on, passed to the image processing step]

If the components do not all have the same resolution, a different number of data units from each component makes up the MCU.

The number of data units from each component is calculated from the relative horizontal and vertical sampling ratios.

Data units from each component are taken in left-to-right, top-to-bottom order.

MCUs are also constructed in left-to-right, top-to-bottom order.

Example 1: Suppose the three planes of an image have the following resolutions:

Plane 0: X0 = 512, Y0 = 256

Plane 1: X1 = 256, Y1 = 512

Plane 2: X2 = 128, Y2 = 256

[Figure: plane 0 (512x256 pixels), plane 1 (256x512 pixels), plane 2 (128x256 pixels)]

Counting the data units of each component, we find:

Plane 0: 64 x 32 data units

Plane 1: 32 x 64 data units

Plane 2: 16 x 32 data units

[Figure: the data units of each plane; each * represents a data unit, i.e. 8x8 pixels]

The relative sampling ratios Hi and Vi are calculated for each plane:

Hi = Xi / Xmin

Vi = Yi / Ymin

So we get:

Plane 0: H0 = 4, V0 = 1

Plane 1: H1 = 2, V1 = 2

Plane 2: H2 = 1, V2 = 1

Hi and Vi must be integer values between 1 and 4.

An MCU is then built by taking H0 x V0 data units from plane 0, H1 x V1 data units from plane 1, and H2 x V2 data units from plane 2.
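The calculation in Example 1 can be checked with a short sketch. The function name `sampling_ratios` is hypothetical, and the code assumes each plane dimension is an exact multiple of the smallest one and of 8.

```python
def sampling_ratios(sizes):
    """Return (Hi, Vi) for each plane, given (Xi, Yi) sizes in pixels."""
    x_min = min(x for x, _ in sizes)
    y_min = min(y for _, y in sizes)
    ratios = []
    for x, y in sizes:
        if x % x_min or y % y_min:
            raise ValueError("dimensions must be multiples of the minimum")
        h, v = x // x_min, y // y_min
        if not (1 <= h <= 4 and 1 <= v <= 4):
            raise ValueError("Hi and Vi must lie between 1 and 4")
        ratios.append((h, v))
    return ratios


# Plane sizes from Example 1.
planes = [(512, 256), (256, 512), (128, 256)]
ratios = sampling_ratios(planes)

# Data units contributed by each plane to one MCU: Hi x Vi.
units_per_mcu = [h * v for h, v in ratios]

# Number of MCUs implied by each plane: (8x8 data units in plane) / (Hi x Vi).
mcu_counts = [
    (x // 8) * (y // 8) // (h * v)
    for (x, y), (h, v) in zip(planes, ratios)
]
```

Every plane implies the same number of MCUs, which is what makes the interleaving consistent: each MCU holds 4 + 4 + 1 = 9 data units.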

[Figure: one MCU of Example 1, made of 4 data units from plane 0, 4 from plane 1, and 1 from plane 2]

Example 2: Suppose an image has four planes with the following dimensions:

Plane 0: X0 = 48 pixels, Y0 = 32 pixels

Plane 1: X1= 48 pixels, Y1 = 16 pixels



[Figure: the data units of each plane; each square represents a data unit or block, i.e. 8x8 pixels]

Calculating Hi and Vi as in the previous example gives:

H0 = 2, V0 = 2

H1 = 2, V1 = 1

H2 = 1, V2 = 2

H3 = 1, V3 = 1

[Figure: the resulting MCUs, MCU1 to MCU6]

The blocks of these MCUs are transferred to the next step.


Step: Image Processing

In the baseline mode, blocks of 8x8 pixels arrive at the image processing unit in a particular order from the previous step.

Each pixel is encoded using 8 bits.

The first step of image processing is a transformation performed by the DCT.

FDCT:

The FDCT (forward DCT) transforms an image from the spatial domain to the frequency domain.

Transforming an 8x8 pixel block through the FDCT yields 64 coefficients, each of which can be regarded as a two-dimensional frequency.



The formula for FDCT is given below
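The formula itself did not survive in this text. A reconstruction, assuming the standard 8x8 forward DCT of the JPEG specification, with the shifted sample values written in lowercase as s_yx and the coefficients in uppercase as S_vu:

```latex
S_{vu} = \frac{1}{4}\, C_u\, C_v \sum_{x=0}^{7} \sum_{y=0}^{7}
         s_{yx}\, \cos\frac{(2x+1)\,u\,\pi}{16}\, \cos\frac{(2y+1)\,v\,\pi}{16},
\qquad
C_u, C_v =
\begin{cases}
  \dfrac{1}{\sqrt{2}} & \text{for } u, v = 0 \\[4pt]
  1 & \text{otherwise}
\end{cases}
```

Here u and v are the horizontal and vertical frequency indices, each running from 0 to 7.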

Svu represents the DCT coefficients.

The coefficient S00 corresponds to the lowest frequency in both dimensions. It is known as the DC-coefficient, and it determines the fundamental color of the data unit of 64 pixels.

The other coefficients are called AC-coefficients.

The cosine functions that appear in the formula for Svu are called basis functions. The 64 basis functions can be illustrated by the following image.

[Figure: the 64 DCT basis functions, drawn as an 8x8 grid of squares]

The FDCT expresses the 64 pixels as a linear combination of these 64 squares.

Steps of FDCT:

At the beginning of the FDCT, the pixel values are shifted into the range [-128, 127], with zero as the center.

These shifted pixel values are the Syx used in the formula.
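The two steps, level shift and transform, can be sketched directly from the definition. This is a deliberately slow, unoptimized illustration that assumes the standard 8x8 forward DCT with normalization factors 1/4, Cu, and Cv; the function name `fdct_8x8` and the test block are hypothetical.

```python
import math


def fdct_8x8(block):
    """Level-shift an 8x8 block of 8-bit samples and return the 8x8 grid
    of DCT coefficients, indexed as coeffs[v][u]."""
    # Step 1: shift the samples from [0, 255] into [-128, 127].
    shifted = [[sample - 128 for sample in row] for row in block]

    def c(k):
        # Normalization factor: 1/sqrt(2) for index 0, otherwise 1.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    # Step 2: evaluate the double sum for every (v, u) pair.
    coeffs = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (shifted[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[v][u] = 0.25 * c(u) * c(v) * total
    return coeffs


# Hypothetical uniform block: every sample is 138, i.e. 10 after shifting.
flat = [[138] * 8 for _ in range(8)]
coeffs = fdct_8x8(flat)
```

For a uniform block all the energy lands in the DC-coefficient (here 8 times the shifted value), and every AC-coefficient is zero up to floating-point error.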

Example 3:

Suppose one of the blocks is as shown below.

[Figure: 8x8 block of pixel values]

After shifting, this results in:

[Figure: 8x8 block of shifted pixel values]

Taking the DCT and rounding to the nearest integer then results in:

[Figure: 8x8 block of DCT coefficients]


Step: Quantization

The human eye is fairly good at seeing small differences in brightness over a relatively large area, but not so good at distinguishing the exact strength of a high-frequency brightness variation.

This fact allows the amount of information in the high-frequency components to be reduced.

This is done by simply dividing each coefficient by a constant for that component and then rounding to the nearest integer.

This is the main lossy operation in the whole process.

As a result, many of the higher-frequency components are typically rounded to zero.

A common quantization matrix (i.e. the numbers by which the coefficients are divided) is:

[Figure: 8x8 quantization matrix]

Using this quantization matrix with the DCT coefficient matrix of Example 3 results in:

[Figure: 8x8 block of quantized coefficients]

Here we see that most of the high-frequency components are zero.

This matrix is sent to the next step for entropy encoding.
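The divide-and-round step of this section can be sketched as follows. The table `LUMA_Q` is the example luminance quantization table from Annex K of the JPEG standard; treating it as the "common" matrix meant here is an assumption. The function names and the coefficient values are hypothetical.

```python
import math

# Example luminance table from Annex K of the JPEG standard (assumed to be
# the kind of "common quantization matrix" referred to in the text).
LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def round_nearest(value):
    """Round to the nearest integer, with halves rounded away from zero."""
    return int(math.copysign(math.floor(abs(value) + 0.5), value))


def quantize(coeffs, table=LUMA_Q):
    """Divide each coefficient by its table entry and round."""
    return [
        [round_nearest(c / q) for c, q in zip(c_row, q_row)]
        for c_row, q_row in zip(coeffs, table)
    ]


# Hypothetical coefficient block: a strong DC term, one low-frequency term,
# and one high-frequency term.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = -416.0
coeffs[0][1] = -30.0
coeffs[7][7] = 40.0

quantized = quantize(coeffs)
```

The high-frequency term (40 divided by 99) rounds to zero, while the low-frequency terms survive, which is the effect described above.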


Step: Entropy Encoding

Encoding of DC-coefficient:

During the initial step of entropy encoding, a DC-coefficient is encoded as the difference between the current and the previous one.

Only the differences are subsequently processed.

[Figure: consecutive blocks i-1 and i with their DC-coefficients DC_{i-1} and DC_i]

Diff = DC_i - DC_{i-1}

Because the DC-coefficients of neighboring blocks are usually highly correlated, taking differences reduces the entropy of the DC data stream.

The result is a series of numbers.
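A minimal sketch of this differencing step, and of its inverse at the decoder, is given below. The DC values are hypothetical, and starting from a prediction of 0 for the first block is an assumption of this sketch.

```python
def dc_differences(dc_values, initial_prediction=0):
    """Replace each DC-coefficient by its difference from the previous one."""
    diffs = []
    previous = initial_prediction
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dc_from_differences(diffs, initial_prediction=0):
    """Invert dc_differences by accumulating the differences."""
    values = []
    previous = initial_prediction
    for diff in diffs:
        previous += diff
        values.append(previous)
    return values


# Hypothetical quantized DC-coefficients of seven consecutive blocks.
dc_values = [1, 3, 2, 2, 4, 7, 4]
diffs = dc_differences(dc_values)
restored = dc_from_differences(diffs)
```

Because the step is exactly invertible, it loses no information; it only makes the numbers smaller and therefore cheaper to entropy code.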

Example 4:

This may result in something like 1, 2, -1, 0, 2, 3, -3, ..., one number for each block in the image, which is then compressed with a lossless entropy encoding method.

These numbers are encoded according to the following table, in which each value is located by its row and its column index (…, -8, …, -1, 0, 1, …, 7, …).

Row  |   -8  -7  -6  -5  -4  -3  -2  -1 |   0   1   2   3   4   5   6   7
0    |                                  |   0
1    |                               -1 |   1
2    |                           -3  -2 |   2   3
3    |                   -7  -6  -5  -4 |   4   5   6   7
4    |  -15 -14 -13 -12 -11 -10  -9  -8 |   8   9  10  11  12  13  14  15
5    |    …                             |                               …
…    |    …                             |                               …

The rows are to be extended up to row 11. Any positive or negative integer in the range -2047 to +2047 can then be specified using its row and column index in this table:

• 1 is at (1, 0), in binary (0001, 00000000000), represented as (0001, 0)

• 3 is at (2, 1), in binary (0010, 00000000001), represented as (0010, 01)

• -3 is at (2, -2), in binary (0010, 11111111110), represented as (0010, 10)

Under the baseline mode, the first number (i.e. 0001 of (0001, 0)) is additionally compressed using either a default Huffman table or, optionally, one provided with the image.


The default table (for luminance DC coefficients) has the mapping:

Row    Huffman code
0000   00
0001   010
0010   011
0011   100
0100   101
0101   110
0110   1110
0111   11110
1000   111110
1001   1111110
1010   11111110
1011   111111110

So the seven coefficients 1, 2, -1, 0, 2, 3, -3 are represented as:

Value   Huffman code   Column bits
1       010            0
2       011            00
-1      010            1
0       00
2       011            00
3       011            01
-3      011            10
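The first half of each code word, the row number and its Huffman code, can be reproduced with a short sketch. It covers only that first number, not the column bits. The names `row_number` and `DC_LUMA_CODES` are hypothetical; the code words are the ones listed in the default table of this example.

```python
# Huffman code for each row number, as listed in the default table.
DC_LUMA_CODES = {
    0: "00", 1: "010", 2: "011", 3: "100", 4: "101", 5: "110",
    6: "1110", 7: "11110", 8: "111110", 9: "1111110",
    10: "11111110", 11: "111111110",
}


def row_number(value):
    """Row of the table holding `value`: the number of bits needed
    to write its magnitude (0 for the value 0)."""
    return abs(value).bit_length()


differences = [1, 2, -1, 0, 2, 3, -3]
rows = [row_number(d) for d in differences]
prefixes = [DC_LUMA_CODES[r] for r in rows]
```

The row number also says how many column bits follow the Huffman code, which is why the value 0 (row 0) needs no further bits.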

Zig-Zag ordering of AC-coefficients:

The AC-coefficients are taken in the order shown below.

[Figure: zig-zag scan order through the 8x8 block]

The Zig-Zag sequence collects the low-frequency coefficients before the high-frequency coefficients, thereby grouping the large numbers at the beginning of the sequence.

The Zig-Zag sequence for Example 3 after quantization will be

−3, 0, −3, −2, −6, 2, −4, 1, −4, 1, 1, 5, 1, 2, −1, 1, −1, 2, 0, 0, 0, 0, 0, −1, −1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
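The scan order can be generated rather than tabulated. The sketch below walks the anti-diagonals of the block and alternates direction on each one; the function names are hypothetical, and positions are given as (row, column) pairs.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + column identifies one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            # Even diagonals are walked from bottom-left to top-right.
            diagonal.reverse()
        order.extend(diagonal)
    return order


def zigzag_scan(block):
    """Read a square block in zig-zag order (DC-coefficient first)."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Hypothetical block whose entries encode their own position: row * 8 + column.
block = [[r * 8 + c for c in range(8)] for r in range(8)]
scanned = zigzag_scan(block)
```

The first element of the scan is the DC-coefficient; the remaining 63 elements are the AC-coefficients in the order used above.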

Encoding of AC-coefficient:

JPEG has a special Huffman code word for ending the sequence prematurely when the remaining coefficients are zero. Using this special code word, EOB, the sequence (shown here with the DC-coefficient −26 of the block at the front) becomes

−26, −3, 0, −3, −2, −6, 2, −4, 1, −4, 1, 1, 5, 1, 2, −1, 1, −1, 2, 0, 0, 0, 0, 0, −1, −1, EOB

This sequence is first run-length encoded (RLE) for zeros: every run of 0s is replaced by a special symbol followed by the number of consecutive zeros.

The sequence is then processed in a similar fashion to the DC-coefficients, i.e. each coefficient is transformed into two numbers using the table, and the first number is Huffman coded.
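The EOB and zero-run steps can be sketched as follows, in the simplified form described here. The marker `"Z"` for a run of zeros is a hypothetical symbol chosen for this illustration; the baseline standard instead folds the run length into the Huffman symbol of the coefficient that follows the run.

```python
def trim_with_eob(coefficients):
    """Drop the trailing zeros and mark the early end with 'EOB'."""
    end = len(coefficients)
    while end > 0 and coefficients[end - 1] == 0:
        end -= 1
    trimmed = list(coefficients[:end])
    if end < len(coefficients):
        trimmed.append("EOB")
    return trimmed


def run_length_zeros(symbols, marker="Z"):
    """Replace every run of zeros by the marker followed by the run length."""
    encoded = []
    run = 0
    for symbol in symbols:
        if symbol == 0:
            run += 1
            continue
        if run:
            encoded.extend([marker, run])
            run = 0
        encoded.append(symbol)
    if run:
        encoded.extend([marker, run])
    return encoded


# The block of the running example: 26 leading values, then zeros up to 64.
sequence = [-26, -3, 0, -3, -2, -6, 2, -4, 1, -4, 1, 1, 5, 1, 2, -1, 1, -1, 2,
            0, 0, 0, 0, 0, -1, -1] + [0] * 38

trimmed = trim_with_eob(sequence)
encoded = run_length_zeros(trimmed)
```

The 64 values shrink to 26 values plus EOB, and the run of five zeros in the middle becomes a single marker and count.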

In lossy sequential encoding, the whole image is coded and decoded in a single run.

The figure below shows an example of decoding with immediate presentation; the picture is presented from top to bottom.

[Figure: sequential picture presentation used in the lossy DCT-based mode]

Expanded Lossy DCT-based Mode

Step: Image Preparation

This step differs from the previous mode in that a sample precision of 12 bits per sample can be used in addition to 8 bits per sample.

Step: Image Processing

Image processing is DCT-based and follows rules analogous to the baseline DCT mode.


Step: Quantization and Encoding

In this mode, JPEG specifies progressive encoding in addition to sequential encoding.

In the first run, a very rough representation of the image appears, which looks out of focus and is refined during successive steps.

[Figure: progressive picture presentation used in the expanded DCT-based mode]

Progressive image representation is achieved by an expansion of quantization.

A buffer is added at the output of the quantizer that temporarily stores all coefficients of the quantized DCT.

Progressiveness is achieved in two different ways: spectral selection and successive approximation.

Spectral selection: The DC component and the first few AC coefficients are sent first, then gradually more AC components.


The following figure shows how the selection of more coefficients refines the image. It shows the original image and the images reconstructed using the additional coefficients.

[Figure: original image and reconstructions after sending the 1st AC coefficient, the 2nd AC coefficient, and the 3rd-5th, 6th-9th, 10th-15th, 16th-20th, 20th-30th, and 30th-40th AC coefficients]

Successive approximation: Here all the coefficients are sent in each run, but their bits are separated according to significance. The most significant bits (MSB) are encoded first, followed by the less significant bits (LSB).

For example, to encode the coefficient 13 (1101), 8 (1000) is encoded in the first run, 4 (100) in the second run, 0 (00) in the third run, and 1 (1) in the fourth run.

So the picture is refined gradually.
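The example with the coefficient 13 can be reproduced with a short sketch. It handles non-negative values only and sends one bit per run, as in the example; the function name `bit_plane_passes` is hypothetical.

```python
from itertools import accumulate


def bit_plane_passes(value, bits):
    """Split a non-negative value into the contribution of each bit,
    most significant bit first."""
    return [value & (1 << k) for k in range(bits - 1, -1, -1)]


# The coefficient 13 (binary 1101) sent over four runs.
passes = bit_plane_passes(13, 4)

# What the decoder knows after each run: the running sum of the passes.
approximations = list(accumulate(passes))
```

The decoder's estimate moves from 8 to 12 to 12 to 13, so the largest part of the value is available after the very first run.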


So, in this mode, image display may be sequential or progressive, the precision may be 8 or 12 bits per sample, and entropy coding may be Huffman or arithmetic.

With two variants of progressive display (successive approximation and spectral selection), this leads to 12 alternative processing modes.

Image Display            Bits Per Sample   Entropy Encoding
Sequential               8                 Huffman Coding
Sequential               8                 Arithmetic Coding
Sequential               12                Huffman Coding
Sequential               12                Arithmetic Coding
Progressive Successive   8                 Huffman Coding
Progressive Spectral     8                 Huffman Coding
Progressive Successive   8                 Arithmetic Coding
Progressive Spectral     8                 Arithmetic Coding
Progressive Successive   12                Huffman Coding
Progressive Spectral     12                Huffman Coding
Progressive Successive   12                Arithmetic Coding
Progressive Spectral     12                Arithmetic Coding

Lossless Mode

Data units of single pixels are taken for image preparation.

Any precision between 2 and 16 bits per pixel can be used.

Image processing and quantization use a predictive technique.

For each pixel X, one of eight possible predictions is selected. The selection criterion is a prediction that is as good as possible.

C B
A X

Principle of the prediction in the lossless mode: X is predicted from its left neighbor A, its upper neighbor B, and its upper-left neighbor C.

The number of the chosen predictor, as well as the difference between the prediction and the actual value, is passed to the entropy encoding step.

Entropy encoding can use Huffman or arithmetic encoding techniques.

Predictors for lossless coding:

Selection Value   Prediction
0                 No prediction
1                 X = A
2                 X = B
3                 X = C
4                 X = A + B - C
5                 X = A + (B - C)/2
6                 X = B + (A - C)/2
7                 X = (A + B)/2
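The predictors can be sketched as a small function. Selection 5 is written as A + (B - C)/2, the form used in the JPEG standard, and the halving is done with an integer shift. Treating selection 0 as a prediction of 0 and picking the predictor with the smallest absolute difference are assumptions of this sketch; the neighbor values are hypothetical.

```python
def predict(selection, a, b, c):
    """Predict pixel X from its neighbors: A (left), B (above), C (above-left)."""
    if selection == 0:
        return 0  # assumption: "no prediction" is treated as a prediction of 0
    if selection == 1:
        return a
    if selection == 2:
        return b
    if selection == 3:
        return c
    if selection == 4:
        return a + b - c
    if selection == 5:
        return a + ((b - c) >> 1)  # integer halving
    if selection == 6:
        return b + ((a - c) >> 1)
    if selection == 7:
        return (a + b) >> 1
    raise ValueError("selection value must be between 0 and 7")


# Hypothetical neighborhood and actual pixel value.
a, b, c, x = 100, 104, 98, 103

predictions = [predict(s, a, b, c) for s in range(1, 8)]
differences = [x - p for p in predictions]

# One possible selection criterion: the smallest absolute difference.
best = min(range(1, 8), key=lambda s: abs(x - predict(s, a, b, c)))
```

Only the selection value and the difference need to be stored; the decoder recomputes the same prediction from already decoded neighbors and adds the difference back.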

Hierarchical Mode

The hierarchical mode uses either the lossy DCT-based algorithms or the lossless compression. The main feature of this mode is the encoding of an image at different resolutions.

The prepared image is initially sampled at a lower resolution (down-sampled by a factor of 2^n).

That smaller image is coded using another JPEG mode (progressive, sequential, or lossless).

Then the image is decoded and up-sampled.

The difference between the up-sampled image and the original image is encoded using the progressive, sequential, or lossless mode.

This process can be repeated multiple times.

This is good for viewing a high-resolution image on a low-resolution display.
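One level of this scheme can be sketched on a single row of pixels. The sketch makes several assumptions that the slides do not state: down-sampling averages pairs of pixels, up-sampling repeats each pixel, and the coding of the small image and of the difference is treated as lossless and therefore left out. All names and values are hypothetical.

```python
def downsample(row):
    """Halve the resolution by averaging pairs (assumes an even length)."""
    return [(row[i] + row[i + 1]) // 2 for i in range(0, len(row), 2)]


def upsample(row):
    """Double the resolution by repeating every sample."""
    return [value for value in row for _ in range(2)]


# Hypothetical row of eight pixel values.
original = [10, 12, 20, 24, 30, 30, 41, 43]

low_resolution = downsample(original)          # coded first
predicted = upsample(low_resolution)           # what a decoder can rebuild
difference = [o - p for o, p in zip(original, predicted)]  # coded second

# A decoder with both layers recovers the full-resolution row.
reconstructed = [p + d for p, d in zip(predicted, difference)]
```

A low-resolution display can stop after the first layer, while a high-resolution display adds the difference layer on top.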

A three-level JPEG encoder is shown below.

[Figure: three-level hierarchical JPEG encoder]

The End
