JPEG (Joint Photographic Experts Group)
Prepared By: Md. Abedul Haque
Std No: 100505010P
CSE, BUET
JPEG
JPEG is the image compression standard developed by the Joint Photographic Experts Group.
In 1992, JPEG became an ISO International Standard.
JPEG applies to color and grayscale still images.
JPEG requirements
The JPEG implementation should be independent of image size and applicable to any image and pixel aspect ratio.
Color representation should be independent of any particular implementation.
Image content may be of any complexity.
The JPEG standard specification should be state of the art regarding the compression factor and the achieved image quality.
Sequential and progressive decoding should be possible.
Steps of JPEG compression
[Figure: the steps of JPEG compression]
Picture Preparation: Pixel; Block, MCU
Picture Processing: Predictor; FDCT
Quantization
Entropy Coding: Run-Length; Huffman; Arithmetic
JPEG Modes
Four different variants of image compression lead to four modes.
The lossy sequential DCT-based mode (baseline process) must be supported by every JPEG implementation.
The expanded lossy DCT-based mode provides a set of further enhancements to the baseline process.
The lossless mode has a low compression ratio but allows perfect reconstruction of the original image.
The hierarchical mode accommodates images of different resolutions and selects its algorithms from the three modes above.
Lossy Sequential DCT-based Mode
Step: Image Preparation
Image Component/Plane:
An image consists of at least 1 and at most 255 planes (or components).
An RGB image consists of three components (or planes): red, green, and blue.
[Figure: red plane, green plane, blue plane]
Image Component/Plane: A grayscale image consists of a single component, the intensity.
A YUV image consists of three planes: the luminance plane Y and the two chrominance planes U and V.
[Figure: components C1 (luminance Y), C2 (chrominance U), and C3 (chrominance V); each small rectangle represents a pixel]
Image Component/Plane: Unlike the planes of an RGB image, the Y, U, and V planes may have different resolutions, so the following arrangement is also possible.
[Figure: components C1 (luminance Y), C2 (chrominance U), and C3 (chrominance V), with the chrominance planes at a lower resolution than the luminance plane]
In this case, 4 pixels share the same chrominance value.
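A minimal sketch of how four pixels can come to share one chrominance value. It assumes the four pixels form a 2x2 group and that the shared value is their integer average; the slide states neither, and the function name and sample values are illustrative only.

```python
def subsample_2x2(plane):
    """Replace each 2x2 group of samples by its integer average.

    Assumes the plane has an even number of rows and columns.
    """
    rows, cols = len(plane), len(plane[0])
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1]) // 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Hypothetical 2x4 chrominance plane (values invented for illustration).
u_plane = [
    [100, 102, 50, 52],
    [104, 106, 54, 56],
]
print(subsample_2x2(u_plane))  # [[103, 53]]
```

The luminance plane would be left at full resolution, so each value in the result serves four luminance pixels.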
Each pixel is represented by p bits. All pixels of all components within the same image are coded with the same number of bits.
Data Unit/Block:
Each block is made of 8x8 pixels. This definition of a block (data unit) comes from the DCT.
These blocks of 8x8 pixels are passed to the next step as units for processing.
There are two ways to pass these data units to the next step.
Non-interleaved data ordering:
• In this case, data units are passed component by component.
• All the data units of the first component are passed, then those of the second component, and so on.
• Within each component, data units are passed from left to right and top to bottom.
Non-interleaved data ordering: With this ordering, a very high-resolution RGB-encoded image would initially be displayed with only its red component; the green and then the blue component would follow, finally yielding the original image.
[Figure: data units of 8x8 pixels passed to the image processing step, one component after another]
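The non-interleaved ordering can be sketched as follows. This is an illustration only: planes are assumed to be lists of rows whose width and height are multiples of 8, and the function names are hypothetical.

```python
def data_units(plane, size=8):
    """Cut a plane into size x size data units, left to right, top to bottom."""
    rows, cols = len(plane), len(plane[0])
    units = []
    for top in range(0, rows, size):
        for left in range(0, cols, size):
            units.append([row[left:left + size] for row in plane[top:top + size]])
    return units


def non_interleaved(components):
    """Return (component index, data unit) pairs, one whole component at a time."""
    order = []
    for index, plane in enumerate(components):
        for unit in data_units(plane):
            order.append((index, unit))
    return order


# Three hypothetical 8x16 planes (8 rows, 16 columns), so 2 data units each.
plane = [[col for col in range(16)] for _ in range(8)]
sequence = non_interleaved([plane, plane, plane])
print([index for index, _ in sequence])  # [0, 0, 1, 1, 2, 2]
```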
Interleaved data ordering: In this approach, data units from different components are combined into Minimum Coded Units (MCUs).
If all components have the same resolution, an MCU consists of exactly one data unit from each component.
[Figure: data units grouped into MCU1, MCU2, MCU3, MCU4, MCU5, and so on, and passed to the image processing step]
Interleaved data ordering: If the components do not all have the same resolution, an MCU comprises a different number of data units from each component.
The number of data units from each component is calculated from the relative horizontal and vertical sampling ratios.
Data units from each component are taken from left to right and top to bottom.
MCUs are also constructed from left to right and top to bottom.
Example 1: Let the three planes of an image have the following resolutions:
Plane 0: X0 = 512, Y0 = 256
Plane 1: X1 = 256, Y1 = 512
Plane 2: X2 = 128, Y2 = 256
[Figure: Plane 0 (512x256 pixels), Plane 1 (256x512 pixels), Plane 2 (128x256 pixels)]
Dividing each plane into data units of 8x8 pixels gives:
Plane 0: 64 x 32 data units
Plane 1: 32 x 64 data units
Plane 2: 16 x 32 data units
[Figure: the three planes drawn as grids of data units; each * represents a data unit, i.e. 8x8 pixels]
Hi and Vi are called the relative sampling ratios and are calculated for each plane:
Hi = Xi / Xmin
Vi = Yi / Ymin
where Xmin and Ymin are the smallest width and the smallest height among the planes.
So we get
Plane 0: H0 = 4, V0 = 1
Plane 1: H1 = 2, V1 = 2
Plane 2: H2 = 1, V2 = 1
Hi and Vi must be integer values between 1 and 4.
An MCU is then built by taking H0 x V0 data units from plane 0, H1 x V1 data units from plane 1, and H2 x V2 data units from plane 2.
[Figure: one MCU of Example 1, made of 4 data units from plane 0, 4 data units from plane 1, and 1 data unit from plane 2]
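The calculation of Example 1 can be checked with a short sketch. The function name is hypothetical; the input is the list of plane sizes (Xi, Yi) given in the example.

```python
def sampling_ratios(planes):
    """Return (Hi, Vi) for each plane, given (Xi, Yi) sizes in pixels."""
    x_min = min(x for x, _ in planes)
    y_min = min(y for _, y in planes)
    ratios = []
    for x, y in planes:
        if x % x_min or y % y_min:
            raise ValueError("sampling ratios must be integers")
        h, v = x // x_min, y // y_min
        if not (1 <= h <= 4 and 1 <= v <= 4):
            raise ValueError("sampling ratios must lie between 1 and 4")
        ratios.append((h, v))
    return ratios


example_1 = [(512, 256), (256, 512), (128, 256)]
ratios = sampling_ratios(example_1)
units_per_mcu = [h * v for h, v in ratios]
print(ratios)         # [(4, 1), (2, 2), (1, 1)]
print(units_per_mcu)  # [4, 4, 1]
```

Each MCU of Example 1 therefore holds 4 + 4 + 1 = 9 data units.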
Example 2: Let an image have four planes with the following dimensions:
Plane 0: X0 = 48 pixels, Y0 = 32 pixels
Plane 1: X1= 48 pixels, Y1 = 16 pixels
Plane 2: X2 = 24 pixels, Y2 = 32 pixels
Plane 3: X3 = 24 pixels, Y3 = 16 pixels
[Figure: the four planes of Example 2 divided into data units (blocks) of 8x8 pixels]
Calculating Hi and Vi as in the previous example gives the following values and the MCUs shown below.
H0 = 2, V0 = 2
H1 = 2, V1 = 1
H2 = 1, V2 = 2
H3 = 1, V3 = 1
[Figure: the resulting six MCUs, MCU1 to MCU6]
The blocks of these MCUs are passed to the next step.
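A sketch of how the six MCUs of Example 2 could be assembled. It assumes plane sizes that are exact multiples of the data unit size and lists, for every MCU, the (plane, row, column) position of each data unit it contains; the function name and the tuple layout are illustrative choices, not part of the slides.

```python
def mcu_order(planes_px, unit=8):
    """List the data units of each MCU as (plane, unit_row, unit_col) tuples."""
    x_min = min(x for x, _ in planes_px)
    y_min = min(y for _, y in planes_px)
    ratios = [(x // x_min, y // y_min) for x, y in planes_px]  # (Hi, Vi)
    mcus_across = x_min // unit
    mcus_down = y_min // unit
    mcus = []
    for my in range(mcus_down):            # MCUs top to bottom
        for mx in range(mcus_across):      # MCUs left to right
            mcu = []
            for plane, (h, v) in enumerate(ratios):
                for r in range(v):         # data units top to bottom
                    for c in range(h):     # data units left to right
                        mcu.append((plane, my * v + r, mx * h + c))
            mcus.append(mcu)
    return mcus


example_2 = [(48, 32), (48, 16), (24, 32), (24, 16)]
mcus = mcu_order(example_2)
print(len(mcus))     # 6
print(len(mcus[0]))  # 9 data units: 4 + 2 + 2 + 1
print(mcus[0])
```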
Lossy Sequential DCT-based Mode
Step: Image Processing
In this baseline mode, blocks of 8x8 pixels arrive at the image processing step in a particular order from the previous step.
Each pixel is encoded using 8 bits.
The first step of image processing is a transformation performed by the DCT.
FDCT:
The FDCT (forward DCT) transforms an image from the spatial domain to the frequency domain.
Transforming an 8x8 pixel block with the FDCT yields 64 coefficients, each of which can be regarded as a two-dimensional frequency.
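The formula for the FDCT is
$$S_{vu} = \frac{1}{4} C_u C_v \sum_{x=0}^{7} \sum_{y=0}^{7} s_{yx} \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$
where C_u, C_v = 1/\sqrt{2} for u, v = 0 and C_u, C_v = 1 otherwise, and s_{yx} (written Syx in these slides) is the value of the pixel in row y and column x.
Svu represents the DCT coefficients.
The coefficient S00 corresponds to the lowest frequency in both dimensions. It is known as the DC coefficient and determines the fundamental color of the data unit of 64 pixels.
The other coefficients are called AC coefficients.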
FDCT: Each product of two cosines in the formula for Svu is called a basis function. The 64 basis functions can be illustrated by the following image.
[Figure: the 64 DCT basis functions drawn as 8x8 squares]
The FDCT expresses the 64 pixels as a linear combination of these 64 squares.
Steps of FDCT: At the beginning of the FDCT, the 8-bit pixel values are shifted from the range [0, 255] into the range [-128, 127], with zero as the center.
These shifted pixel values are the Syx used in the formula.
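The two steps (level shift, then FDCT) can be sketched in plain Python. The sketch uses the 8x8 FDCT definition of the JPEG standard, S_vu = 1/4 C_u C_v ΣΣ s_yx cos((2x+1)uπ/16) cos((2y+1)vπ/16), evaluated directly and without any speed optimization; the function names and the flat test block are illustrative.

```python
import math


def level_shift(block, bits=8):
    """Shift samples from [0, 2^bits - 1] to a range centered on zero."""
    offset = 1 << (bits - 1)  # 128 for 8-bit samples
    return [[p - offset for p in row] for row in block]


def fdct_8x8(s):
    """Direct evaluation of the 8x8 forward DCT; s[y][x] are shifted samples."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (s[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[v][u] = 0.25 * c(u) * c(v) * total
    return out


# A uniformly gray block (hypothetical input): only the DC coefficient survives.
flat = [[200] * 8 for _ in range(8)]
coeffs = fdct_8x8(level_shift(flat))
print(round(coeffs[0][0]))  # 576, i.e. 8 * (200 - 128)
```

For a block with no variation every AC coefficient is zero up to floating-point error, which illustrates why the DC coefficient is said to carry the fundamental color of the data unit.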
Steps of FDCT: Example 3:
Suppose one of the blocks is as follows.
[Figure: 8x8 block of pixel values]
After shifting, this results in
[Figure: the 8x8 block after shifting]
Taking the DCT and rounding to the nearest integer then gives
[Figure: 8x8 matrix of DCT coefficients]
Lossy Sequential DCT-based Mode
Step: Quantization
The human eye is fairly good at seeing small differences in brightness over a relatively large area, but not so good at distinguishing the exact strength of a high-frequency brightness variation.
This fact allows the amount of information in the high-frequency components to be reduced.
This is done by simply dividing each coefficient by a constant chosen for that frequency component and then rounding to the nearest integer.
This is the main lossy operation in the whole process.
As a result, many of the higher-frequency components are typically rounded to zero.
A common quantization matrix (i.e. the numbers by which the coefficients are divided) is
[Figure: 8x8 quantization matrix]
Applying this quantization matrix to the DCT coefficient matrix of Example 3 results in
[Figure: 8x8 matrix of quantized coefficients]
Here we see that most of the high-frequency components are zero.
This matrix is sent to the next step for entropy encoding.
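A sketch of the divide-and-round step. The table used here is the example luminance quantization table from Annex K of the JPEG standard; it is an assumption that this is the "common" matrix the slide refers to. The two coefficient values are invented for illustration.

```python
import math

# Example luminance quantization table (JPEG standard, Annex K).
LUMINANCE_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def round_nearest(x):
    """Round to the nearest integer, halves away from zero."""
    return int(math.floor(abs(x) + 0.5)) * (1 if x >= 0 else -1)


def quantize(coeffs, table):
    return [[round_nearest(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


def dequantize(levels, table):
    return [[l * q for l, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]


# Hypothetical coefficients: a large DC value and one small high-frequency value.
coeffs = [[0] * 8 for _ in range(8)]
coeffs[0][0] = -415
coeffs[7][7] = 40
levels = quantize(coeffs, LUMINANCE_TABLE)
print(levels[0][0], levels[7][7])                   # -26 0
print(dequantize(levels, LUMINANCE_TABLE)[0][0])    # -416
```

The decoder recovers -416 instead of -415 and loses the high-frequency value entirely, which is the information loss described above.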
Lossy Sequential DCT-based Mode
Step: Entropy Encoding
Encoding of DC-coefficient:
During the initial step of entropy encoding, each DC coefficient is encoded as the difference between the current one and the previous one.
Only the differences are subsequently processed.
[Figure: consecutive blocks Block i-1 and Block i with their DC coefficients DC_{i-1} and DC_i]
Diff = DC_i - DC_{i-1}
Because the DC coefficients of neighboring blocks are usually highly correlated, this reduces the entropy of the DC data stream.
The result is a series of numbers.
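The differencing step can be sketched as below. The sketch assumes the predictor for the first block is 0; the DC values are hypothetical.

```python
def dc_differences(dc_values):
    """Return Diff = DC_i - DC_{i-1} for each block, with DC_{-1} taken as 0."""
    diffs = []
    previous = 0
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dc_from_differences(diffs):
    """Inverse operation, as a decoder would perform it."""
    values = []
    previous = 0
    for diff in diffs:
        previous += diff
        values.append(previous)
    return values


# Hypothetical quantized DC coefficients of seven consecutive blocks.
dc_values = [1, 3, 2, 2, 4, 7, 4]
diffs = dc_differences(dc_values)
print(diffs)  # [1, 2, -1, 0, 2, 3, -3]
```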
Example 4: This may result in something like 1, 2, -1, 0, 2, 3, -3, ..., one number for each block in the image, which is then compressed with a lossless entropy encoding method.
These numbers are encoded according to the following table.
Row   Values
0     0
1     -1, 1
2     -3, -2, 2, 3
3     -7, -6, -5, -4, 4, 5, 6, 7
4     -15, -14, -13, -12, -11, -10, -9, -8, 8, 9, 10, 11, 12, 13, 14, 15
5     …
…     …
Encoding of DC-coefficient:
Example 4: The rows are extended up to row 11. Any positive or negative integer in the range -2047 to +2047 can be specified by its row number and its column index in this table. The columns of a row are counted from 0, starting at the leftmost (most negative) value. The row number is written with 4 bits, and the column index with as many bits as the row number.
• 1 is at (1, 1), represented as (0001, 1)
• 3 is at (2, 3), represented as (0010, 11)
• -3 is at (2, 0), represented as (0010, 00)
In the baseline mode, the first number (e.g. 0001 of (0001, 1)) is additionally compressed using either a default Huffman table or, optionally, one provided with the image.
The default table (for luminance DC coefficients) has the mapping
Row    Huffman code
0000   00
0001   010
0010   011
0011   100
0100   101
0101   110
0110   1110
0111   11110
1000   111110
1001   1111110
1010   11111110
1011   111111110
So the seven coefficients 1, 2, -1, 0, 2, 3, -3 are represented as
Value   Huffman code   Column bits
1       010            1
2       011            10
-1      010            0
0       00
2       011            10
3       011            11
-3      011            00
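The two-part code of a DC difference can be sketched as follows. The Huffman codes are the default luminance DC codes listed above. The extra bits follow the convention of the JPEG standard (ITU-T T.81): a positive value is written as is, a negative value v in row n is written as v + 2^n - 1, which equals its column position counted from the leftmost value of the row. The function names are illustrative.

```python
# Default Huffman codes for the luminance DC rows (categories) 0..11.
DC_LUMINANCE_CODES = {
    0: "00", 1: "010", 2: "011", 3: "100", 4: "101", 5: "110",
    6: "1110", 7: "11110", 8: "111110", 9: "1111110",
    10: "11111110", 11: "111111110",
}


def category(value):
    """Row of the table: number of bits needed for |value|."""
    return abs(value).bit_length()


def extra_bits(value):
    """Column position of the value within its row, as a bit string."""
    size = category(value)
    if size == 0:
        return ""
    if value < 0:
        value += (1 << size) - 1
    return format(value, "0{}b".format(size))


def encode_dc_difference(value):
    return DC_LUMINANCE_CODES[category(value)], extra_bits(value)


for diff in [1, 2, -1, 0, 2, 3, -3]:
    print(diff, *encode_dc_difference(diff))
```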
Zig-Zag ordering of AC-coefficients:
The AC coefficients are taken in the order shown below.
[Figure: zig-zag scan order over the 8x8 coefficient matrix]
The zig-zag sequence collects the low-frequency coefficients before the high-frequency coefficients, thereby grouping the large numbers at the beginning of the sequence.
The zig-zag sequence for Example 3 after quantization is
−3, 0, −3, −2, −6, 2, −4, 1, −4, 1, 1, 5, 1, 2, −1, 1, −1, 2, 0, 0, 0, 0, 0, −1, −1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
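A sketch of how the zig-zag order can be generated by walking the anti-diagonals of the matrix in alternating directions. The function names are illustrative; the test block simply stores each position's row-major index so that the scan order is visible.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for d in range(2 * n - 1):  # d = row + col identifies one anti-diagonal
        cells = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        if d % 2 == 0:
            cells.reverse()  # even diagonals run from bottom-left to top-right
        order.extend(cells)
    return order


def zigzag_scan(block):
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Block whose entries are their own row-major index 8 * row + col.
index_block = [[8 * r + c for c in range(8)] for r in range(8)]
scan = zigzag_scan(index_block)
print(scan[:10])  # [0, 1, 8, 16, 9, 2, 3, 10, 17, 24]
ac_coefficients = scan[1:]  # everything after the DC position
```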
Encoding of AC-coefficient:
JPEG has a special Huffman code word for ending the sequence prematurely when all the remaining coefficients are zero. Using this special code word, EOB (end of block), the sequence (shown here with the quantized DC coefficient −26 in front) becomes
−26, −3, 0, −3, −2, −6, 2, −4, 1, −4, 1, 1, 5, 1, 2, −1, 1, −1, 2, 0, 0, 0, 0, 0, −1, −1, EOB
This sequence is first encoded using run-length encoding (RLE) for the zeros: every run of 0s is replaced by a special symbol followed by the number of consecutive zeros.
The resulting sequence is then processed in a similar fashion to the DC coefficients, i.e. each coefficient is transformed into two numbers using the table, and the first number is then Huffman coded.
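A simplified sketch of the zero run-length step. It pairs every nonzero AC coefficient with the number of zeros that precede it and ends with an EOB marker when only zeros remain. The (run, value) pair format is an illustrative choice, and the limit on run length that a real encoder must handle is left out.

```python
def run_length_encode(ac):
    """Encode AC coefficients as (zero_run, value) pairs plus an optional EOB."""
    # Find the position just after the last nonzero coefficient.
    last = len(ac)
    while last > 0 and ac[last - 1] == 0:
        last -= 1

    pairs = []
    run = 0
    for value in ac[:last]:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    if last < len(ac):
        pairs.append("EOB")  # only zeros remain
    return pairs


# Short hypothetical sequence in zig-zag order.
ac = [-3, 0, -3, -2, 0, 0, 0, 0, 0, -1, -1, 0, 0, 0]
print(run_length_encode(ac))
# [(0, -3), (1, -3), (0, -2), (5, -1), (0, -1), 'EOB']
```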
In lossy sequential encoding, the whole image is coded and decoded in a single run.
The figure below shows an example of decoding with immediate presentation; the picture is presented from top to bottom.
[Figure: sequential picture presentation used in the lossy DCT-based mode]
Expanded Lossy DCT-based Mode
Step: Image Preparation
This step differs from the previous mode in that a sample precision of 12 bits per sample can be used in addition to 8 bits per sample.
Step: Image Processing
Image processing is DCT-based and follows rules analogous to the baseline DCT mode.
Step: Quantization and Encoding
In this mode, JPEG specifies progressive encoding in addition to sequential encoding.
In the first run, a very rough representation of the image appears, which looks out of focus and is refined during successive runs.
[Figure: progressive picture presentation used in the expanded DCT-based mode]
Progressive image presentation is achieved by an expansion of the quantization step.
A buffer is added at the output of the quantizer that temporarily stores all coefficients of the quantized DCT.
Progressiveness is achieved in two different ways: spectral selection and successive approximation.
Spectral selection: The DC coefficient and the first few AC coefficients are sent first, followed gradually by more AC coefficients.
Spectral selection: The following figure shows how the selection of more coefficients refines the image. It shows the original image and the images reconstructed using the additional coefficients.
[Figure: the original image and the reconstructions after sending the 1st AC coefficient, the 2nd AC coefficient, and the 3rd-5th, 6th-9th, 10th-15th, 16th-20th, 20th-30th, and 30th-40th AC coefficients]
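Spectral selection can be sketched as cutting the zig-zag ordered coefficients of a block into bands, one band per run. The band boundaries and the coefficient values below are illustrative assumptions, and the function names are hypothetical.

```python
def spectral_selection_scans(zigzag_coeffs, bands):
    """Split zig-zag ordered coefficients into one list per (start, end) band."""
    return [zigzag_coeffs[start:end + 1] for start, end in bands]


def reconstruct_after(scans, bands, count, total=64):
    """Coefficients known to the decoder after the first `count` runs."""
    coeffs = [0] * total
    for scan, (start, end) in list(zip(scans, bands))[:count]:
        coeffs[start:end + 1] = scan
    return coeffs


# Hypothetical block: 64 coefficients in zig-zag order, index 0 is the DC.
coeffs = list(range(100, 164))
bands = [(0, 2), (3, 5), (6, 63)]  # inclusive zig-zag index ranges
scans = spectral_selection_scans(coeffs, bands)

print(scans[0])                                     # [100, 101, 102]
print(reconstruct_after(scans, bands, 1)[:6])       # [100, 101, 102, 0, 0, 0]
print(reconstruct_after(scans, bands, 3) == coeffs) # True
```

After the first run the decoder only knows the lowest frequencies, which is why the first picture looks out of focus.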
Successive approximation: Here all the coefficients are sent in each run, but their bits are separated according to their significance. The most significant bits (MSBs) are encoded first, followed by the less significant bits.
For example, to encode the coefficient 13 (binary 1101), 8 (1000) is encoded in the first run, 4 (100) in the second run, 0 (00) in the third run, and 1 (1) in the fourth run.
So the picture is refined gradually.
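The example of the coefficient 13 can be reproduced with a few lines. The sketch sends exactly one bit per run, as in the example above, and handles only non-negative values; the function name is illustrative.

```python
from itertools import accumulate


def successive_approximation(value, bits):
    """Contribution of each run, most significant bit first."""
    return [value & (1 << b) for b in range(bits - 1, -1, -1)]


runs = successive_approximation(13, 4)
print(runs)                    # [8, 4, 0, 1]
print(list(accumulate(runs)))  # [8, 12, 12, 13], the value known after each run
```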
So, in this mode, image display may be sequential or progressive, there may be 8 or 12 bits per sample, and entropy coding may be Huffman or arithmetic.
This leads to 12 alternative processing modes.
Image Display            Bits Per Sample   Entropy Encoding
Sequential               8                 Huffman Coding
Sequential               8                 Arithmetic Coding
Sequential               12                Huffman Coding
Sequential               12                Arithmetic Coding
Progressive Successive   8                 Huffman Coding
Progressive Spectral     8                 Huffman Coding
Progressive Successive   8                 Arithmetic Coding
Progressive Spectral     8                 Arithmetic Coding
Progressive Successive   12                Huffman Coding
Progressive Spectral     12                Huffman Coding
Progressive Successive   12                Arithmetic Coding
Progressive Spectral     12                Arithmetic Coding
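The count of 12 can be checked by enumerating the combinations of the three choices in the table: 3 kinds of display, 2 sample precisions, and 2 entropy coders. The variable names are illustrative.

```python
from itertools import product

displays = ["Sequential", "Progressive Successive", "Progressive Spectral"]
bits_per_sample = [8, 12]
entropy_coders = ["Huffman Coding", "Arithmetic Coding"]

modes = list(product(displays, bits_per_sample, entropy_coders))
print(len(modes))  # 12
for display, bits, coder in modes:
    print(display, bits, coder)
```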
Lossless Mode
Data units of single pixels are used for image preparation.
Any precision between 2 and 16 bits per pixel can be used.
Image processing and quantization use a predictive technique.
For each pixel X, one of eight possible predictors is selected. The selection criterion is to predict X as well as possible from its neighbors A (left), B (above), and C (above left).
Principle of the prediction in the lossless mode:
C B
A X
The number of the chosen predictor, as well as the difference between the prediction and the actual value, is passed to the entropy encoding step.
Entropy encoding can use Huffman or arithmetic encoding techniques.
Predictors for lossless coding:
Selection Value   Prediction
0                 No prediction
1                 X = A
2                 X = B
3                 X = C
4                 X = A + B - C
5                 X = A + (B - C)/2
6                 X = B + (A - C)/2
7                 X = (A + B)/2
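A sketch of the predictors, using the definitions of the JPEG standard (selection value 5 is A + (B - C)/2) with the halving done as an arithmetic right shift. Choosing the predictor with the smallest absolute difference is an assumption about what "as good as possible" means; the function names and the sample values are illustrative.

```python
def predict(selection, a, b, c):
    """Prediction of X from its neighbors A (left), B (above), C (above left)."""
    predictors = {
        0: 0,                      # no prediction
        1: a,
        2: b,
        3: c,
        4: a + b - c,
        5: a + ((b - c) >> 1),
        6: b + ((a - c) >> 1),
        7: (a + b) >> 1,
    }
    return predictors[selection]


def best_predictor(x, a, b, c):
    """Return (selection value, difference) with the smallest |difference|."""
    selection = min(range(1, 8), key=lambda s: abs(x - predict(s, a, b, c)))
    return selection, x - predict(selection, a, b, c)


# Hypothetical neighborhood: A = 100, B = 104, C = 98, actual pixel X = 105.
print([predict(s, 100, 104, 98) for s in range(1, 8)])
# [100, 104, 98, 106, 103, 105, 102]
print(best_predictor(105, 100, 104, 98))  # (6, 0)
```

Only the selection value and the (small) difference need to be entropy coded, and the decoder can rebuild X exactly from them.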
Hierarchical Mode
The hierarchical mode uses either the lossy DCT-based algorithms or the lossless compression technique. The main feature of this mode is the encoding of an image at different resolutions.
The prepared image is initially sampled at a lower resolution (down-sampled by a factor of 2^n).
That smaller image is coded using another JPEG mode (progressive, sequential, or lossless).
The coded image is then decoded and up-sampled.
The difference between the up-sampled image and the original image is encoded using the progressive, sequential, or lossless mode.
This process can be repeated multiple times.
This mode is good for viewing a high-resolution image on a low-resolution display.
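One level of this scheme can be sketched on a single row of samples. The sketch down-samples by 2 with integer averaging and up-samples by repeating each sample; both filters are assumptions, and the actual coding of the two layers is left out, so the reconstruction here is exact.

```python
def downsample(row):
    """Halve the resolution by averaging neighboring pairs of samples."""
    return [(row[i] + row[i + 1]) // 2 for i in range(0, len(row), 2)]


def upsample(row):
    """Double the resolution by repeating every sample."""
    return [value for value in row for _ in range(2)]


def hierarchical_encode(row):
    low = downsample(row)
    difference = [o - u for o, u in zip(row, upsample(low))]
    return low, difference


def hierarchical_decode(low, difference):
    return [u + d for u, d in zip(upsample(low), difference)]


# Hypothetical row of 8 samples.
row = [10, 12, 20, 24, 30, 30, 8, 4]
low, difference = hierarchical_encode(row)
print(low)         # [11, 22, 30, 6]
print(difference)  # [-1, 1, -2, 2, 0, 0, 2, -2]
print(hierarchical_decode(low, difference) == row)  # True
```

A low-resolution display only needs the first layer; a high-resolution display adds the difference layer on top of it.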
[Figure: a three-level hierarchical JPEG encoder]
The End