© 2002 by Yu Hen Hu, ECE533 Digital Image Processing
Agenda
• Overview
• Wavelet transform
• EBCOT: JPEG2000 coefficient modeling and context encoding
• MQ arithmetic coding
• ROI: Region of Interest
Introduction
The Joint Photographic Experts Group (JPEG) is an ISO standards committee whose mission is the “Coding and compression of still images”.
JPEG coding standard (1988): DCT (discrete cosine transform) based transform coding to compress bit-map images.
JPEG2000 efforts started in 1996 to use new methods such as fractals or wavelets. The target delivery date was the year 2000, hence the name.
JPEG2000 Features
• High compression efficiency
• Lossless color transformations
• Lossy and lossless coding in one algorithm
• Embedded lossy to lossless coding
• Progressive by resolution and quality
• Static and dynamic Region-of-Interest
• Error resilience
• Visual (fixed and progressive) coding
• Multiple component images
• Palettized images
Handling Large Images
• Partition in both the spatial and the frequency domain
• Spatial domain partition: tile, frame
» bit streams of different tiles or frames are not independent
» artifacts may occur at boundaries
• Special wavelet transforms:
» Spatially segmented wavelet transform (SSWT)
» Line-based wavelet transform
• Block: independent partition in the frequency domain (wavelet coefficients)
» bit streams are independently generated
JPEG at 0.125 bpp (enlarged)
C. Christopoulos, A. Skodras, T. Ebrahimi, JPEG2000 (online tutorial)
JPEG2000 at 0.125 bpp
C. Christopoulos, A. Skodras, T. Ebrahimi, JPEG2000 (online tutorial)
Wavelet Based Image Coding
Discrete Wavelet Transform → Context-based Quantization → Entropy coding
[Figure: two image panels with pixel-index axes (ticks from 10 to 60)]
• The 2D discrete wavelet transform converts images into “sub-bands”.
• The upper left is the DC coefficient.
• The lower right holds the higher-frequency sub-bands.
1D Discrete Wavelet Transform
[Figure: three-stage analysis filter bank. At each stage the input is filtered by H0 and H1 and each branch is down-sampled by 2. Input x(n); outputs y0, y1, y2, y3.]
H0: low-pass digital filter. H1: high-pass digital filter. z^-1: delay. ↓2: down-sample by 2.
Recursive application of the wavelet transform in the spatial domain corresponds to a dyadic partition of the data in the frequency domain.
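A minimal sketch of the recursion described above, assuming the two-tap Haar pair stands in for H0 and H1 (the slide does not name the filters) and that the low-pass branch is the one split again at each stage. The function names are illustrative only.

```python
def analysis_step(x):
    """One filter-bank stage: filter with H0/H1, then keep every second sample."""
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]   # H0 branch
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]  # H1 branch
    return low, high


def dwt_1d(x, levels):
    """Split the low-pass branch repeatedly.

    Returns [final low band, coarsest high band, ..., finest high band].
    """
    bands = []
    low = list(x)
    for _ in range(levels):
        low, high = analysis_step(low)
        bands.insert(0, high)
    bands.insert(0, low)
    return bands


bands = dwt_1d([4, 6, 10, 12, 8, 8, 2, 0], levels=3)
print([len(b) for b in bands])  # band sizes halve from fine to coarse
```

The band lengths 1, 1, 2, 4 mirror the dyadic split of the frequency axis: each further stage halves the remaining low-frequency band.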
2D Separate DWT
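• The 1D DWT is applied alternately in the vertical and horizontal directions, line by line.
• The LL band is recursively decomposed, first vertically and then horizontally.
• This is the Mallat method. Other methods have also been proposed.
[Figure: the image in the spatial domain is split into L and H halves, then into LL, HL, LH and HH sub-bands; the LL sub-band is split again into HL, LH and HH at each further level.]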
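One level of the separable decomposition can be sketched as follows, again assuming Haar averaging and differencing as the filter pair. This sketch filters rows first and columns second; for linear separable filters the order does not change the result. The band naming follows the convention used later in the slides (HL: horizontal high pass, LH: vertical high pass).

```python
def split_1d(v):
    """Haar low/high split of one line (assumed filter pair)."""
    low = [(v[i] + v[i + 1]) / 2 for i in range(0, len(v) - 1, 2)]
    high = [(v[i] - v[i + 1]) / 2 for i in range(0, len(v) - 1, 2)]
    return low, high


def transpose(m):
    return [list(r) for r in zip(*m)]


def split_columns(m):
    """Apply split_1d down every column of m."""
    lo_cols, hi_cols = [], []
    for col in transpose(m):
        lo, hi = split_1d(col)
        lo_cols.append(lo)
        hi_cols.append(hi)
    return transpose(lo_cols), transpose(hi_cols)


def dwt_2d_level(img):
    """One decomposition level; returns (LL, HL, LH, HH)."""
    low_rows, high_rows = [], []
    for row in img:                      # horizontal filtering
        lo, hi = split_1d(row)
        low_rows.append(lo)
        high_rows.append(hi)
    LL, LH = split_columns(low_rows)     # vertical filtering of the L half
    HL, HH = split_columns(high_rows)    # vertical filtering of the H half
    return LL, HL, LH, HH


# An image that varies only horizontally puts all detail energy in HL.
LL, HL, LH, HH = dwt_2d_level([[1, 3, 1, 3]] * 4)
```

Feeding LL back into `dwt_2d_level` gives the next, coarser level.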
Bit Plane Coding
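• Coefficients are represented in sign-magnitude format.
• Bit-plane coding starts from the most significant bit (MSB).
• The sign bit is encoded after the MSB is encoded.
• The context (surrounding bit patterns) at each bit plane is examined.
• Key: exploit patterns in the binary bit planes.
Example: a 3 × 3 block of coefficients and its three bit planes (a + or - marks the plane in which a coefficient's sign is coded):
Coefficients      MSB plane       Middle plane    LSB plane
 3  -1   7        0   0   1+      1+  0   1       1   1-  1
 4  -5   2        1+  1-  0       0   0   1+      0   1   0
 6   1  -2        1+  0   0       1   0   1-      0   1+  0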
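The 3 × 3 example can be reproduced with a short sketch. It assumes the sign is attached to the first 1 bit of each coefficient (the plane in which the coefficient becomes significant); `bit_planes` is an illustrative name, not part of any codec API.

```python
def bit_planes(coeffs, num_planes):
    """Split sign-magnitude integers into bit planes, MSB plane first.

    A '+' or '-' is appended to the first 1 bit of each coefficient.
    """
    planes = []
    significant = set()
    for p in range(num_planes - 1, -1, -1):
        plane = []
        for idx, c in enumerate(coeffs):
            bit = (abs(c) >> p) & 1
            symbol = str(bit)
            if bit and idx not in significant:
                significant.add(idx)
                symbol += "+" if c > 0 else "-"
            plane.append(symbol)
        planes.append(plane)
    return planes


coeffs = [3, -1, 7, 4, -5, 2, 6, 1, -2]   # the block above, row by row
planes = bit_planes(coeffs, 3)
for plane in planes:
    print(plane[0:3], plane[3:6], plane[6:9])
```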
SPIHT
• Set Partitioning in Hierarchical Trees
• Amir Said and William Pearlman (IEEE Trans. CSVT, 1996)
• Based on zero-tree wavelet coding
• Main ideas:
» Partial magnitude sorting of wavelet transform coefficients
» Ordered bit-plane transmission
» Exploitation of the self-similarity among wavelet coefficients between sub-bands having parent-descendant relations
JPEG2000
• Image components, tiles, and sub-band structures
• Wavelet transform
• Coefficient modeling
• Arithmetic coding
Tiling
XTOsiz + XTsiz > XOsiz, YTOsiz + YTsiz > YOsiz
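The constraint says that the first tile must reach past the image origin, so that it contains at least one image sample. A small sketch of the check follows. The tile-count formula and the `Xsiz`/`Ysiz` grid extents are taken from the usual SIZ marker definitions rather than from this slide, so treat them as assumptions; `tile_grid` is an illustrative name.

```python
def tile_grid(Xsiz, Ysiz, XOsiz, YOsiz, XTsiz, YTsiz, XTOsiz, YTOsiz):
    """Return (tiles across, tiles down) after checking the offset constraint."""
    if not (XTOsiz + XTsiz > XOsiz and YTOsiz + YTsiz > YOsiz):
        raise ValueError("first tile would contain no image samples")
    # Integer ceiling of (grid extent - tile origin) / tile size.
    num_x = -(-(Xsiz - XTOsiz) // XTsiz)
    num_y = -(-(Ysiz - YTOsiz) // YTsiz)
    return num_x, num_y


print(tile_grid(1000, 800, 0, 0, 256, 256, 0, 0))
```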
Image Structure
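[Figure: structural elements of a JPEG2000 image: image, image components, tiles, resolution, sub-band, precinct, code block, layers, packet. The accompanying sub-band diagram is labeled 4LL, 3HL, 3LH, 3HH, 2HL, 2LH, 2HH, 1HL, 1LH, 1HH.]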
Layered Bit stream
Each bit stream is organized as a succession of layers
Each layer contains additional contributions from each block (some contributions might be empty)
Block truncation points associated with each layer are optimal in the rate distortion sense
Rate distortion optimization can be performed but it does not need to be standardized
DC level shift and component transform
• The purpose of the component transform is to de-correlate the components.
• For multi-spectral images, PCA may be used.
• There are reversible and irreversible transforms.
Forward reversible component transform
Inverse reversible component transform
Reversible Color Transform
• Makes lossless color coding possible.
• All components must have identical sub-sampling parameters and the same bit depth.
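The slide's equations did not survive extraction. The sketch below uses the commonly cited integer form of the JPEG2000 reversible color transform, stated here as an assumption rather than as the slide's own formula. The point it illustrates is that the floor operation loses nothing, because the same floored quantity can be recomputed from the two difference channels.

```python
def rct_forward(r, g, b):
    """Assumed RCT form: integer luma plus two difference channels."""
    y = (r + 2 * g + b) >> 2   # floor((R + 2G + B) / 4)
    u = b - g
    v = r - g
    return y, u, v


def rct_inverse(y, u, v):
    # R + 2G + B = 4G + (U + V), so Y = G + floor((U + V) / 4).
    g = y - ((u + v) >> 2)
    r = v + g
    b = u + g
    return r, g, b


print(rct_forward(200, 100, 50))
print(rct_inverse(*rct_forward(200, 100, 50)))
```

Python's `>>` floors toward negative infinity, which is what the inverse relies on when `u + v` is negative.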
IDWT Procedure
IDWT
1. Start with lev = N_L.
2. If lev = 0: set I(x,y) = a_0LL(x,y). Done.
3. Otherwise: a_(lev−1)LL(u,v) = 2D_SR(a_levLL(u,v), a_levHL(u,v), a_levLH(u,v), a_levHH(u,v)), then set lev = lev − 1 and return to step 2.
Lossless 1D DWT
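• Reversible integer DWT.
• DWT coefficients are integers, without any truncation error, provided the image component pixel values are also integer-valued.
• The transform is exactly reversible.
• Non-causal filter.
• Xext(), Yext(): symmetrically (periodically) extended signals.
• Forward transform, index ranges of the two lifting steps: i0 − 1 ≤ 2n+1 < i1 + 1 (odd samples), then i0 ≤ 2n < i1 (even samples).
• Reverse transform, index ranges of the two lifting steps: i0 − 1 ≤ 2n < i1 + 1 (even samples), then i0 ≤ 2n+1 < i1 (odd samples).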
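A sketch of a reversible integer lifting transform of this kind, assuming the 5/3 predict and update steps usually associated with the JPEG2000 lossless path (the slide's equations are not legible, so the exact rounding terms are an assumption). Symmetric extension is done by reflecting indices, and the output is left interleaved: even positions hold low-pass values, odd positions hold high-pass values.

```python
def reflect(i, n):
    """Whole-sample symmetric extension: map any index into 0..n-1 (n >= 2)."""
    period = 2 * (n - 1)
    i = abs(i) % period
    return period - i if i >= n else i


def dwt53_forward(x):
    n = len(x)
    y = list(x)
    for i in range(1, n, 2):   # predict: odd samples become high-pass
        y[i] = x[i] - ((x[reflect(i - 1, n)] + x[reflect(i + 1, n)]) >> 1)
    for i in range(0, n, 2):   # update: even samples become low-pass
        y[i] = x[i] + ((y[reflect(i - 1, n)] + y[reflect(i + 1, n)] + 2) >> 2)
    return y


def dwt53_inverse(y):
    n = len(y)
    x = list(y)
    for i in range(0, n, 2):   # undo the update step first
        x[i] = y[i] - ((y[reflect(i - 1, n)] + y[reflect(i + 1, n)] + 2) >> 2)
    for i in range(1, n, 2):   # then undo the predict step
        x[i] = y[i] + ((x[reflect(i - 1, n)] + x[reflect(i + 1, n)]) >> 1)
    return x


signal = [10, 12, 15, 11, 9, 9, 20, 30]
assert dwt53_inverse(dwt53_forward(signal)) == signal
```

Reversibility does not depend on the rounding being "nice": each step changes one set of samples using only the other set, so the inverse can recompute and subtract exactly the same integer.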
Lossy 1D DWT
Daubechies (9,7) filter in lifting form.
Forward transform, index range of each lifting step:
Step 1: i0 − 3 ≤ 2n+1 < i1 + 3
Step 2: i0 − 2 ≤ 2n < i1 + 2
Step 3: i0 − 1 ≤ 2n+1 < i1 + 1
Step 4: i0 ≤ 2n < i1
Step 5: i0 ≤ 2n+1 < i1
Step 6: i0 ≤ 2n < i1
Reverse transform, index range of each lifting step:
Step 1: i0 − 3 ≤ 2n < i1 + 3
Step 2: i0 − 2 ≤ 2n+1 < i1 + 2
Step 3: i0 − 3 ≤ 2n < i1 + 3
Step 4: i0 − 2 ≤ 2n+1 < i1 + 2
Step 5: i0 − 1 ≤ 2n < i1 + 1
Step 6: i0 ≤ 2n+1 < i1
Lifting parameters: α = −1.586 134 342, β = −0.052 980 118, γ = 0.882 911 075, δ = 0.443 506 852, K = 1.230 174 105
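A floating-point sketch of the six lifting steps: four predict/update steps with α, β, γ, δ followed by two scaling steps. The step equations and the scaling convention (odd samples multiplied by −K, even samples divided by K) are assumptions based on the usual statement of the 9/7 lifting scheme, since the slide only preserves the index ranges and constants. The output is interleaved (even: low-pass, odd: high-pass).

```python
ALPHA = -1.586134342
BETA = -0.052980118
GAMMA = 0.882911075
DELTA = 0.443506852
K = 1.230174105


def reflect(i, n):
    """Whole-sample symmetric extension: map any index into 0..n-1 (n >= 2)."""
    period = 2 * (n - 1)
    i = abs(i) % period
    return period - i if i >= n else i


def lift(y, start, coeff):
    """Add coeff * (left + right neighbor) to every second sample, in place."""
    n = len(y)
    for i in range(start, n, 2):
        y[i] += coeff * (y[reflect(i - 1, n)] + y[reflect(i + 1, n)])


def dwt97_forward(x):
    y = [float(v) for v in x]
    lift(y, 1, ALPHA)   # step 1 (odd samples)
    lift(y, 0, BETA)    # step 2 (even samples)
    lift(y, 1, GAMMA)   # step 3
    lift(y, 0, DELTA)   # step 4
    # steps 5 and 6: scaling (sign convention assumed)
    return [-K * v if i % 2 else v / K for i, v in enumerate(y)]


def dwt97_inverse(y):
    x = [-v / K if i % 2 else v * K for i, v in enumerate(y)]
    lift(x, 0, -DELTA)
    lift(x, 1, -GAMMA)
    lift(x, 0, -BETA)
    lift(x, 1, -ALPHA)
    return x


flat = dwt97_forward([5.0] * 8)   # low-pass stays near 5, high-pass near 0
```

Unlike the integer transform, reconstruction here is exact only up to floating-point rounding.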
Row-based Wavelet Transform
• Problem with the traditional wavelet transform:
» Filtering must be performed in both the vertical and horizontal directions. While access in one direction is easy, access in the other requires the whole image to be buffered.
» Difficult to implement on PDAs or other hand-held devices with a limited amount of main memory.
• Row-based wavelet transform:
» consumes the minimum amount of resources
» gives the same results as the traditional wavelet transform
• Method:
» Use a rolling window for each decomposition level to keep enough rows (five) of image data in on-chip memory.
Context Coding Algorithm: EBCOT
• EBCOT: Embedded Block Coding with Optimal Truncation
• Block coding
» Divide each sub-band into code blocks of samples which are coded independently.
» For each block, a separate bit-stream is generated without utilizing any information from any of the other blocks.
• Optimal truncation
» The bit-stream of each block can be truncated to a variety of discrete lengths, each with an associated distortion.
» A post-processing step, run after all blocks are compressed, determines the truncation point for each block.
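The slide does not say how the post-processing step picks truncation points. A common formulation, used here as an assumption, is Lagrangian: for a given slope λ, each block independently picks the truncation point that minimizes distortion + λ · rate, and λ is then adjusted until the total rate meets the target. The rate/distortion figures below are made up for illustration.

```python
def choose_truncation(points, lam):
    """Index of the truncation point minimizing D + lam * R.

    points: list of (rate_in_bytes, distortion) pairs for one code block,
    including the 'send nothing' point with rate 0.
    """
    return min(range(len(points)),
               key=lambda j: points[j][1] + lam * points[j][0])


# Hypothetical candidate truncation points for one block.
block = [(0, 100.0), (10, 40.0), (25, 15.0), (60, 10.0)]

for lam in (5.0, 1.0, 0.1):
    j = choose_truncation(block, lam)
    print(f"lambda={lam}: keep {block[j][0]} bytes")
```

A large λ (rate is expensive) truncates early; a small λ keeps almost the whole block bit-stream.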
EBCOT Block Coding
• Taubman and Zakhor (IEEE Trans. IP, Sep. 94): Layered Zero Coding with Fractional Bit-Planes.
» For each bit plane, the encoding is applied in three passes.
• Four types of coding operations for arithmetic entropy coding:
» Zero Coding (ZC)
» Run-Length Coding (RLC)
» Sign Coding (SC)
» Magnitude Refinement (MR)
• Usage rule: If a pixel is not yet significant, use ZC and RLC to encode whether it becomes significant in the current bit plane. If so, use SC to encode its sign. If a pixel is already significant, use MR to encode the new bit position.
Two Tiered Coding in EBCOT
[Figure: blocks of sub-band samples → T1 (low-level embedded block coding engine) → embedded block bit-streams and summary information → T2 (layer formation and block summary information coding engine) → full-featured bit-stream]
All the complexity is concentrated in the low-level block coding engine, T1, which generates embedded block bit-streams. The second tier, T2, plays a vital role in efficiently representing the individually coded blocks in a full-featured bit-stream.
ISO/IEC JTC 1/SC 29/WG 1 N1422
Illustration of Layered Coding
Illustration of block contributions to bit-stream layers. Only five layers are shown with seven code blocks, for simplicity.
Notice that not all code blocks need contribute to every layer and that the number of bytes contributed by blocks to any given layer is generally highly variable.
Notice also that the block coding operation proceeds vertically through each code block independently, whereas the layered bit-stream organization is horizontal, distributing the embedded bit-streams for each block throughout the bit-stream.
[Figure: seven columns, one per code block (block 1 bit-stream to block 7 bit-stream), cut horizontally into layers 1 to 5; several block contributions are marked “empty”.]
Embedded Block Bit Stream
[Figure: the embedded bit-stream of block i as a sequence of coding passes: P_i^(Mi−1,3), P_i^(Mi−2,1), P_i^(Mi−2,2), P_i^(Mi−2,3), …, P_i^(0,1), P_i^(0,2), P_i^(0,3)]
P_i^(p,k): k-th pass of the p-th bit plane of the i-th block (1 ≤ p ≤ Mi − 1)
Scanning order:
for i = 1, 2, …
    for p = 1, 2, …, Mi − 1
        for k = 1, 2, 3
Three-pass process:
» Significance Propagation Pass (P_i^(p,1))
» Magnitude Refinement Pass (P_i^(p,2))
» Cleanup Pass (P_i^(p,3))
Coefficient Bit Modeling
Wavelet coefficients are associated with different sub-bands arising from the 2D separable transform applied
These coefficients are then arranged into rectangular blocks within each sub-band, called code-blocks.
These code-blocks are then coded a bit-plane at a time starting from the most significant bit-plane with a non-zero element to the least significant bit-plane.
For each bit-plane in a code-block, a special code-block scan pattern is used for each of three coding passes.
Each coefficient bit in the bit-plane is coded in only one of the three coding passes:
» significance propagation
» magnitude refinement
» cleanup
Three Passes Scanning
• Significance Pass
» Scan all insignificant samples that have at least one significant neighbor to determine whether they become significant at the current bit plane.
» Use ZC to encode whether a sample is still insignificant.
» If a sample becomes significant, also apply SC to encode its sign bit.
• Magnitude Refinement Pass
» Scan samples that became significant in a previous bit plane, using MR encoding.
• Normalization Pass
» Scan all remaining samples and encode them using ZC + RLC.
Scanning Order within a code block
Each bit plane within a code block is scanned during the context coding process in a specific order.
All quantized transform coefficients are represented in sign-magnitude representation.
For a particular sub-band, there is a maximum number of magnitude bits, Mb. The “significance state” changes from insignificant to significant at the bit plane where the most significant 1 bit is found.
For a code-block, the number of bit-planes, starting from the most significant bit-plane, that are all zero is signaled in the packet header.
Neighboring states used to form context
Each coefficient in a code-block has an associated binary state variable called its significance state.
Significance states are initialized to 0 (coefficient is insignificant) and may become 1 (coefficient is significant) during the course of the coding of the code-block.
Four different context formation rules are defined, one for each of the four coding operations:
» significance propagation pass:
– significance coding
– sign coding
» magnitude refinement pass:
– magnitude refinement coding
» cleanup pass:
– cleanup coding
The current context obtained during context coding is provided to the arithmetic MQ coder.
Bit plane encoding orders
The number of bit-planes starting from the most significant bit that have no significant coefficients (only insignificant bits) is signaled in the packet headers.
The first bit-plane with a non-zero element has a cleanup pass only. The remaining bit-planes are coded in three coding passes.
Each coefficient bit is coded in exactly one of the three coding passes. Which pass a coefficient bit is coded in depends on the conditions for that pass.
In general, the significance propagation pass includes the coefficients that are predicted, or “most likely,” to become significant and their sign bits, as appropriate.
The magnitude refinement pass includes bits from already significant coefficients.
The cleanup pass includes all the remaining coefficients.
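The pass membership rules above reduce to a small decision function. This is a sketch with illustrative names; `significant` means the coefficient was already significant before the current bit plane, and `has_significant_neighbor` stands for a non-zero context at the moment the coefficient is visited.

```python
def coding_pass(significant, has_significant_neighbor, first_nonzero_plane=False):
    """Return the pass in which the current bit of a coefficient is coded."""
    if first_nonzero_plane:
        return "cleanup"                   # first non-zero plane: cleanup only
    if significant:
        return "magnitude refinement"      # already significant coefficients
    if has_significant_neighbor:
        return "significance propagation"  # likely to become significant
    return "cleanup"                       # everything that remains


print(coding_pass(False, True))
print(coding_pass(True, True))
print(coding_pass(False, False))
```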
Context of Significance and Cleanup Passes
LL and LH sub-bands      HL sub-bands             HH sub-bands            Context
(vertical high pass)     (horizontal high pass)   (diagonally high pass)  label
H     V     D            H     V     D            (H+V)   D
2     x     x            x     2     x            x       ≥3              8
1     ≥1    x            ≥1    1     x            ≥1      2               7
1     0     ≥1           0     1     ≥1           0       2               6
1     0     0            0     1     0            ≥2      1               5
0     2     x            2     0     x            1       1               4
0     1     x            1     0     x            0       1               3
0     0     ≥2           0     0     ≥2           ≥2      0               2
0     0     1            0     0     1            1       0               1
0     0     0            0     0     0            0       0               0
x: don’t care
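Read from the top row down, the LL/LH column of the table becomes a chain of tests. The sketch below assumes H, V and D are the counts of significant horizontal (0 to 2), vertical (0 to 2) and diagonal (0 to 4) neighbors; the HL column is the same rule with H and V swapped. The function names are illustrative.

```python
def significance_context_ll_lh(h, v, d):
    """Context label 0..8 for LL and LH sub-bands."""
    if h == 2:
        return 8
    if h == 1:
        if v >= 1:
            return 7
        return 6 if d >= 1 else 5
    # h == 0 from here on
    if v == 2:
        return 4
    if v == 1:
        return 3
    return min(d, 2)   # d >= 2 -> 2, d == 1 -> 1, d == 0 -> 0


def significance_context_hl(h, v, d):
    """HL sub-bands use the same table with the roles of H and V exchanged."""
    return significance_context_ll_lh(v, h, d)


print(significance_context_ll_lh(1, 0, 2))
print(significance_context_hl(0, 2, 0))
```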
Significance propagation pass
The significance propagation pass includes only bits of coefficients that were insignificant (the significance bit has yet to be encountered) and have a non-zero context. All other coefficients are skipped.
The context is delivered to the arithmetic decoder (along with the bit stream) and the decoded coefficient bit is returned.
If the value of this bit is 1 then the significance state is set to 1 and the immediate next bit to be decoded is the sign bit for the coefficient. Otherwise, the significance state remains 0.
When the contexts of successive coefficients and coding passes are considered, the most current significance state for this coefficient is used.
Sign Bit Coding
Two phases:
» Summarize the contributions of the vertical and horizontal neighbors.
» Reduce these contributions to one of 5 context labels.
The context labels are sent to the MQ arithmetic coder.
Signbit = AC(contextlabel) XOR XORbit
» Signbit: sign bit of the current coefficient
» AC(contextlabel): the value returned from the arithmetic decoder given the context label and the bit stream
Neighbor contributions (S: significant, I: insignificant, P: positive, N: negative):
V0 (or H0)   V1 (or H1)   V (or H) contribution
S, P         S, P          1
S, N         S, P          0
I            S, P          1
S, P         S, N          0
S, N         S, N         −1
I            S, N         −1
S, P         I             1
S, N         I            −1
I            I             0
Context labels:
H contribution   V contribution   Context label   XORbit
 1                1               13              0
 1                0               12              0
 1               −1               11              0
 0                1               10              0
 0                0                9              0
 0               −1               10              1
−1                1               11              1
−1                0               12              1
−1               −1               13              1
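Both phases fit in a few lines. In this sketch each neighbor is encoded as +1 (significant, positive), −1 (significant, negative) or 0 (insignificant); with that encoding the contribution table is simply the neighbor sum clamped to the range −1..1. The names are illustrative.

```python
def contribution(n0, n1):
    """Combine two neighbors (+1, -1 or 0 each) into a contribution."""
    return max(-1, min(1, n0 + n1))


# (H contribution, V contribution) -> (context label, XORbit)
SIGN_TABLE = {
    (1, 1): (13, 0), (1, 0): (12, 0), (1, -1): (11, 0),
    (0, 1): (10, 0), (0, 0): (9, 0), (0, -1): (10, 1),
    (-1, 1): (11, 1), (-1, 0): (12, 1), (-1, -1): (13, 1),
}


def sign_context(h0, h1, v0, v1):
    return SIGN_TABLE[(contribution(h0, h1), contribution(v0, v1))]


def decode_sign(ac_bit, xor_bit):
    """Signbit = AC(contextlabel) XOR XORbit."""
    return ac_bit ^ xor_bit


label, xor_bit = sign_context(h0=1, h1=1, v0=0, v1=-1)
print(label, xor_bit)
```

The XORbit lets mirror-image neighborhoods share one context label: the table is symmetric under flipping all signs, so only 5 labels are needed for 9 combinations.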
Magnitude Refinement
The magnitude refinement pass includes the bits from coefficients that are already significant (except those that have just become significant in the immediately preceding significance propagation pass).
The context used is determined by the summation of the significance state of the horizontal, vertical, and diagonal neighbors. These are the states as currently known to the decoder, not the states used before the significance decoding pass.
Further, it is dependent on whether this is the first refinement bit (the bit immediately after the significance and sign bits) or not.
H + V + D    1st refinement for this coefficient    Context label
x            False                                  16
≥1           True                                   15
0            True                                   14
Cleanup Pass
• The first and only coding pass for the first significant bit-plane.
• The third and last pass of all the remaining bit-planes.
• Uses both the neighbor context, as in the significance propagation pass, and run-length coding.
Context-based Arithmetic Entropy Coding
The MQ-coder, a low-complexity entropy coder, is used.
Contexts are based on the significance of the horizontal, vertical, and diagonal neighbors of the pixel concerned.
Currently there are 46 contexts.
Tagged Tree
Each node has an associated current value, which is initialized to zero (the minimum). A 0 bit in the tag tree means that the minimum (or the value in the case of the highest level) is larger than the current value, and a 1 bit means that the minimum (or the value in the case of the highest level) is equal to the current value. For each contiguous 0 bit in the tag tree the current value is incremented by one. Nodes at higher levels cannot be coded until lower level node values are fixed (i.e., a 1 bit is coded). The top node on level 0 (the lowest level) is queried first. The next corresponding node on level 1 is then queried, and so on.
Tagged tree encoding example
K = 0 (top level)
    t0(0,0) = 0 (initialize)
    t0(0,0) = 0 < q0(0,0) = 1: output 0, t0(0,0) = t0(0,0) + 1 = 1
    t0(0,0) = 1 = q0(0,0) = 1: output 1, K = K + 1 = 1
    Note: q0(0,0) is encoded.
K = 1
    t1(0,0) = q0(0,0) = 1 (initialize)
    t1(0,0) = 1 = q1(0,0): output 1, K = K + 1 = 2
    Note: q1(0,0) is encoded.
K = 2
    t2(0,0) = q1(0,0) = 1 (initialize)
    t2(0,0) = 1 = q2(0,0) = 1: output 1, K = K + 1 = 3
    Note: q2(0,0) is encoded.
K = 3
    t3(0,0) = q2(0,0) = 1 (initialize)
    t3(0,0) = q3(0,0) = 1: output 1, done
    Note: q3(0,0) is encoded.
Thus, the code for q3(0,0) is 01111.
[Figure: tree path q0(0,0) = 1, q1(0,0) = 1, q2(0,0) = 1, q3(0,0) = 1, emitting 01, 1, 1, 1]
Example continued
Next, encode q3(1,0). Since its parent node q2(0,0) is known, we start with K = 3:
K = 3
    t3(1,0) = q2(0,0) = 1 (initialize)
    t3(1,0) = 1 < q3(1,0) = 3: output 0, t3(1,0) = t3(1,0) + 1 = 2
    t3(1,0) = 2 < q3(1,0) = 3: output 0, t3(1,0) = t3(1,0) + 1 = 3
    t3(1,0) = 3 = q3(1,0): output 1, done
    Note: q3(1,0) is encoded as 001.
Now, consider q3(2,0). Its parent is q2(1,0), which needs to be encoded first.
K = 2
    t2(1,0) = q1(0,0) = 1
    t2(1,0) = 1 = q2(1,0): output 1, K = K + 1 = 3
K = 3
    t3(2,0) = q2(1,0) = 1
    t3(2,0) = 1 < q3(2,0) = 2: output 0, t3(2,0) = t3(2,0) + 1 = 2
    t3(2,0) = 2 = q3(2,0): output 1, done
Hence q3(2,0) is encoded as 101.
[Figure: tree with q0(0,0) = 1, q1(0,0) = 1, q2(0,0) = 1, q2(1,0) = 1 and leaves q3(0,0) = 1, q3(1,0) = 3, q3(2,0) = 2; emitted bits 01, 1, 1, 1 for the first leaf, 001 for the second, and 1, 01 for the third]
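The worked example can be replayed with a simplified encoder. This sketch codes every node value completely as soon as it is reached, which matches the example; a full tag-tree coder is driven by a threshold and may stop part-way through a node, so this is not a complete implementation. Node names and the `encode_leaf` helper are illustrative.

```python
def encode_leaf(path, coded):
    """Emit the bits for one leaf.

    path:  (node_name, value) pairs from the top node down to the leaf.
    coded: set of node names whose values are already known to the decoder.
    """
    bits = ""
    lower_bound = 0                 # the top node starts counting from zero
    for node, value in path:
        if node not in coded:
            # one 0 per increment of the current value, then a closing 1
            bits += "0" * (value - lower_bound) + "1"
            coded.add(node)
        lower_bound = value         # children start from the parent's value
    return bits


coded = set()
top = [("q0(0,0)", 1), ("q1(0,0)", 1)]
code_a = encode_leaf(top + [("q2(0,0)", 1), ("q3(0,0)", 1)], coded)
code_b = encode_leaf(top + [("q2(0,0)", 1), ("q3(1,0)", 3)], coded)
code_c = encode_leaf(top + [("q2(1,0)", 1), ("q3(2,0)", 2)], coded)
print(code_a, code_b, code_c)
```

The saving comes from the shared ancestors: after the first leaf, later leaves pay only for nodes that have not been coded yet.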
Layers
• The bit-stream is a succession of layers.
• Each layer contains the contributions from each code block.
• The block truncation points associated with each layer are optimal in the rate-distortion sense.
• A single layer can achieve “progressive in resolution”.
• Multiple layers can achieve “progressive in SNR”.
Basic Arithmetic Coding
MPS: more probable symbol with probability Pe
LPS: less probable symbol with probability Qe
If M is encoded, current interval is the Pe part, else, it is the Qe part (bottom). The length is kept in variable A.
Code string C points to the base of the current interval.
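[Figure: the unit interval from 0.0 to 1.0 split into a Pe part and a Qe part, subdivided successively for the symbol sequence M, M, L, M]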
Encoding of the Sequence MMLM
[Figure: the current interval A, starting from A(0), and the code-string pointer C, starting from 0, while the sequence M, M, L, M is encoded; each step splits off a sub-interval of size Qe for the current context]
if MPS is encoded
    C <- C + Qe
    A <- A − Qe
else (LPS is encoded)
    A <- Qe
end
if A < 0.75
    Renormalize A and C; update Qe
• Interval A is kept between 0.75 and 1.5. The hexadecimal value 0x8000 is used to represent 0.75 to make the comparison easy.
• Each time A is doubled, C is doubled as well. The high-order byte of the C register overflows to an external buffer (the compressed code stream).
Decoding of the sequence MMLM
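[Figure: the current interval A, starting from A(0), and the code-string pointer C while the sequence M, M, L, M is decoded; each step compares C against Qe for the current context]
if C >= Qe (MPS is decoded)
    C <- C − Qe
    A <- A − Qe
else (LPS is decoded)
    A <- Qe
end
if A < 0.75
    Renormalize A and C; update Qe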
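The encoder and decoder rules can be checked against each other with a toy model. This is a heavily simplified sketch, not the MQ coder: Qe is fixed at 1/4 instead of being adapted per context, exact fractions replace the 16-bit registers, A starts at 1 by assumption, and there is no byte output or carry handling. The code value is the base of the final interval divided by the accumulated doublings.

```python
from fractions import Fraction

QE = Fraction(1, 4)         # fixed LPS interval size (assumption)
THRESHOLD = Fraction(3, 4)  # renormalize while A < 0.75


def encode(symbols, qe=QE):
    """Encode a string of 'M' (MPS) and 'L' (LPS); return the code value."""
    a, c, shifts = Fraction(1), Fraction(0), 0
    for s in symbols:
        if s == "M":
            c += qe          # MPS takes the upper part of the interval
            a -= qe
        else:
            a = qe           # LPS takes the bottom Qe part
        while a < THRESHOLD: # renormalize: double A and C together
            a *= 2
            c *= 2
            shifts += 1
    return c / 2 ** shifts


def decode(code, count, qe=QE):
    a, c = Fraction(1), Fraction(code)
    out = []
    for _ in range(count):
        if c >= qe:
            out.append("M")
            c -= qe
            a -= qe
        else:
            out.append("L")
            a = qe
        while a < THRESHOLD:
            a *= 2
            c *= 2
    return "".join(out)


code = encode("MMLM")
print(code, decode(code, 4))
```

Setting A to Qe on an LPS, rather than multiplying A by Qe, is the approximation that keeping A close to 1 makes acceptable.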
Context model
JPEG2000 Arithmetic Codec
[Figure: encoder and decoder block diagrams. Encoder: uncompressed data, context model, decision (D) and context (CX), probability estimator (Qe, MPS), arithmetic encoder, compressed data. Decoder: compressed data, arithmetic decoder, probability estimator (Qe, MPS), decision (D) and context (CX), context model, uncompressed data.]
Encoder Register Structure
• “a” bits: fractional bits in the A-register (the current interval value)
• “x” bits: fractional bits in the code register
• “s” bits: spacer bits, which provide useful constraints on carry-over
• “b” bits: bit positions from which the completed bytes of the data are removed from the C-register
• “c” bit: a carry bit
Encode MPS, LPS
A total of 46 context symbols are listed. Encoding is similar to a finite state machine: from the current row, find the next row depending on whether an MPS or an LPS is coded, and output the code stream.
Region of Interests Coding
• An ROI is a part of an image that is coded earlier in the code stream than the rest of the image (the background).
• The coding is also done in such a way that the information associated with the ROI precedes the information associated with the background.
• The method used is the Maxshift method.
• ROI allows certain parts of the image to be coded at better quality.
• Static:
» The ROI is decided and coded once and for all at the encoder side.
• Dynamic:
» The ROI can be decided and decoded on the fly from the same bit stream.
MaxShift Method
Encoding
1. Generate the ROI mask, M(x,y).
– M(x,y) = 1: wavelet coefficient (x,y) is needed for the ROI
– M(x,y) = 0: wavelet coefficient (x,y) belongs to background pixels and can be sacrificed without affecting the ROI
2. Find the scaling value s and scale up all ROI wavelet coefficients by s bits so that ROI coefficients ≥ 2^s > background coefficients.
3. Write the scaling value s into the code stream using the RGN marker.
Decoding
1. Get s from the RGN marker.
2. Scale the background wavelet coefficients by 2^s. Background coefficients are recognized by magnitude alone (they are below 2^s), so no mask is needed.
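A sketch of the shift and its inversion on a flat list of integer coefficients. It picks s as the smallest value with 2^s above every background magnitude, which is one way to satisfy the condition in step 2. On the decoder side it shifts the ROI coefficients back down rather than scaling the background up; the two differ only by a common factor of 2^s. Function names are illustrative.

```python
def maxshift_encode(coeffs, mask):
    """Shift ROI coefficients (mask == 1) above every background coefficient."""
    background_max = max(
        (abs(c) for c, m in zip(coeffs, mask) if not m), default=0)
    s = background_max.bit_length()   # smallest s with 2**s > background_max
    shifted = [c * (1 << s) if m else c for c, m in zip(coeffs, mask)]
    return shifted, s


def maxshift_decode(shifted, s):
    """Recover the coefficients using only s: no mask is needed."""
    out = []
    for c in shifted:
        if abs(c) >= (1 << s):        # magnitude alone identifies ROI samples
            magnitude = abs(c) >> s
            out.append(magnitude if c > 0 else -magnitude)
        else:
            out.append(c)
    return out


coeffs = [5, -3, 12, 0, -7, 2]
mask = [1, 0, 1, 1, 0, 0]
shifted, s = maxshift_encode(coeffs, mask)
print(s, shifted, maxshift_decode(shifted, s))
```

Because every non-zero ROI magnitude ends up at or above 2^s, bit-plane coding from the MSB down sends all ROI bits before any background bit.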
ROI Mask Computation
Must track wavelet coefficients that will contribute to ROI region pixels.
C. Christopoulos, A. Skodras, T. Ebrahimi, JPEG2000 (online tutorial)
Scale Operation
C. Christopoulos, A. Skodras, T. Ebrahimi, JPEG2000 (online tutorial)
Advantages of Maxshift method
• Support for arbitrarily shaped ROIs with minimal complexity
• No need to send shape information
• No need for a shape encoder and decoder
• No need for an ROI mask at the decoder side
• Decoder as simple as a non-ROI-capable decoder
• Can decide in which sub-band the ROI will begin
» therefore it can give results similar to the general scaling method
Conclusion
JPEG2000 is an emerging image coding standard for the next generation of digital imaging.
No IPR (intellectual property right) on part I of the standard (free licensing)
More complex than JPEG but designed with hardware implementation in mind.
Many companies are working to incorporate JP2 into the next generation of digital cameras and scanners.