M. Wu: ENEE631 Digital Image Processing (Spring'09)

Optimal Bit Allocation and Unitary Transform in Image Compression

Spring ’09 Instructor: Min Wu

Electrical and Computer Engineering Department,

University of Maryland, College Park

bb.eng.umd.edu (select ENEE631 S’09) [email protected]

ENEE631 Spring’09, Lecture 14 (3/23/2009)


Overview and Logistics

Last Time: Wavelet transform and coding
– EZW example: exploit coefficient structures to remove redundancy
– Haar basis and transform; considerations for wavelet filters
– JPEG 2000

Today
– Optimal bit allocation: “equal slope” in R-D curves
– Optimal unitary transform – KLT

Midterm exam: Wednesday March 25 in class
– Sample exam from previous year was posted online for reference
– Emphasis: (1) Theoretical foundations: 2-D signals & systems, 2-D Fourier analysis, quantization, basis vectors/images and unitary transform; (2) Principles and algorithmic techniques for: human visual system (color and monochrome vision); image enhancement and restoration; image coding.

UMCP ENEE631 Slides (created by M. Wu © 2004)


Review: Non-Dyadic Decomposition – Wavelet Packets

Figures are from slides at Gonzalez/Woods DIP book 2/e website (Chapter 7)


Bit Allocation in Image Coding

Focus on MSE-based optimization; can further adjust based on HVS



Bit Allocation

How many bits to be allocated for each coefficient?
– Related to each coeff’s variance (and probability distribution)
– More bits for high variance $\sigma_k^2$ to keep total MSE small

$$\min D = \frac{1}{N} \sum_{k=0}^{N-1} E\left[\, |v(k) - v_q(k)|^2 \,\right] = \frac{1}{N} \sum_{k=0}^{N-1} \sigma_k^2 f(r_k) \quad \text{subject to} \quad \frac{1}{N} \sum_{k=0}^{N-1} r_k = R \ \text{bits/coeff.}$$

Here $r_k$ is the number of bits allocated for v(k); $\sigma_k^2$ is the variance of the k-th coeff. v(k); f(·) is a function relating bit rate to distortion based on a coeff’s p.d.f.

“Reverse Water-filling”
– Try to keep same amount of error in each frequency band



Rate/Distortion Allocation via Reverse Water-filling

Optimal solution for Gaussian distributed coefficients
– Idea: try to keep the same amount of error in each freq. band; no need to spend bits to represent coeff. w/ smaller variance than the water level
– Results based on R-D function and via Lagrange-multiplier optimization
– Note the convex shape of the R-D function for Gaussian => slope becomes milder; more bits are required to further reduce distortion

[Figure: R-D curve for a Gaussian r.v., R(D) versus D]

Given total rate R, determine the water level and then D; or vice versa, given D, determine R

[Figure: two allocation examples, labeled “Optimal” and “Not Optimal”]


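Details on Reverse Water-filling Solution

$$\min \sum_{i=1}^{N} D_i(R_i) \quad \text{subj. to} \quad \sum_{i=1}^{N} R_i = R$$

Construct func. using Lagrange multiplier:

$$J(R_1, \ldots, R_N) = \sum_{i=1}^{N} D_i(R_i) + \lambda \left( \sum_{i=1}^{N} R_i - R \right)$$

Necessary condition (keep the same marginal gain):

$$\frac{\partial J}{\partial R_i} = \frac{dD_i}{dR_i} + \lambda = 0 \ \text{ for all } i, \ \text{ and } \sum_{i} R_i = R \quad \Rightarrow \quad \frac{dD_i}{dR_i} = -\lambda \ \text{ for all } i$$

Equal Slope Condition for bit allocation. For Gaussian r.v.’s:

$$R_i = \max\left\{ \frac{1}{2} \ln \frac{\sigma_i^2}{D_i},\ 0 \right\} \quad \Rightarrow \quad \frac{dR_i}{dD_i} = \frac{d}{dD_i} \left( \frac{1}{2} \ln \frac{\sigma_i^2}{D_i} \right) = -\frac{1}{2 D_i} \ \text{ for } R_i > 0$$

[Figure: R-D curve for a Gaussian r.v., R(D) versus D]

Resulting in equal distortions on Gaussian r.v.’s, as interpreted in “reverse water filling”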
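The reverse water-filling result can be sketched numerically as follows. This is a minimal illustration, not code from the lecture: the function name `reverse_water_filling`, the bisection search for the water level, and the example variances are all assumptions made here. Rates use base-2 logarithms (bits) rather than the natural logarithm on the slide; this rescales the rates but does not change the water level.

```python
import math

def reverse_water_filling(variances, total_rate, iters=100):
    """Hypothetical helper: find the water level theta such that the
    Gaussian R-D rates sum to total_rate (in bits)."""
    def rate_at(theta):
        # Only components with variance above the water level receive bits.
        return sum(0.5 * math.log2(v / theta) for v in variances if v > theta)

    lo, hi = 1e-12, max(variances)   # total rate decreases as theta rises
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rate_at(mid) > total_rate:
            lo = mid                 # spending too many bits: raise the level
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    # Each component keeps distortion theta, or its full variance if smaller.
    dists = [min(theta, v) for v in variances]
    rates = [0.5 * math.log2(v / d) for v, d in zip(variances, dists)]
    return theta, rates, dists

# Illustrative values (assumed): three coefficients, 2 bits in total.
theta, rates, dists = reverse_water_filling([4.0, 1.0, 0.25], total_rate=2.0)
print(theta, rates, dists)
```

In this example the coefficient with variance 0.25 lies below the water level and receives no bits, while the other two end up with the same distortion, matching the equal-distortion interpretation above.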


Key Result in Rate Allocation: Equal R-D Slope

Keep the slope in R-D curve the same (not just for Gaussian r.v)

– Otherwise some bits can be applied to other r.v. for better “return” in reducing the overall distortion

If all r.v. are Gaussian, the same slope in the R-D curves for these r.v. corresponds to an identical amount of distortion.

$R(D) = \frac{1}{2} [\, \log \sigma^2 - \log D \,] \ \Rightarrow \ dR/dD = -1/(2D)$

Use operational/measured rate-distortion information for general r.v.

Rate allocation results: => Equal slope (otherwise there exists a better allocation strategy)


Optimal Transform

Recall: Why use transform in coding/compression?
– Decorrelate the correlated data
– Pack energy into a small number of coefficients
– Interested in unitary/orthogonal or approximately orthogonal transforms: energy preservation, so that quantization effects can be better understood and controlled

Unitary transforms we’ve dealt with so far are data independent
– Transform basis/filters do not depend on the signals we are processing
– Can hard-wire them in encoder and decoder

What unitary transform gives the best energy compaction and decorrelation?
– “Optimal” in a statistical sense, to allow the codec to work well with many images => signal statistics would play an important role



Review: Correlation After a Linear Transform

Consider an N×1 zero-mean random vector x
– Covariance (or autocorrelation) matrix $R_x = E[x x^H]$ gives an idea of the correlation between elements
– $R_x$ is a diagonal matrix if all N r.v.’s are uncorrelated

Apply a linear transform to x: y = A x

What is the correlation matrix for y?

$R_y = E[y y^H] = E[(Ax)(Ax)^H] = E[A x x^H A^H] = A \, E[x x^H] \, A^H = A R_x A^H$

Decorrelation: search for an A that produces a decorrelated y (equiv. a diagonal correlation matrix $R_y$)
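As a small numeric check of $R_y = A R_x A^H$, the sketch below applies a real 2×2 transform to an assumed correlation matrix with correlation coefficient 0.9. The matrix values and the helper names `matmul` and `transpose` are illustrative choices, not from the slides. The particular A used here subtracts 0.9 times the first sample from the second, a non-unitary transform that happens to decorrelate this $R_x$.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

rho = 0.9                          # assumed correlation between the two samples
Rx = [[1.0, rho],
      [rho, 1.0]]

# Assumed example transform: y1 = x1, y2 = x2 - rho * x1 (real, so A^H = A^T).
A = [[1.0, 0.0],
     [-rho, 1.0]]

Ry = matmul(matmul(A, Rx), transpose(A))
print(Ry)                          # off-diagonal entries vanish for this A
```

Because this A is not unitary, the trace of $R_y$ (total energy) differs from the trace of $R_x$, even though the output is decorrelated.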



K-L Transform (Principal Component Analysis)

Eigen decomposition of $R_x$: $R_x u_k = \lambda_k u_k$
– Recall the properties of $R_x$: Hermitian (conjugate symmetric, $R^H = R$); nonnegative definite (real non-negative eigenvalues)

Karhunen-Loeve Transform (KLT): $y = U^H x \ \Leftrightarrow \ x = U y$ with $U = [u_1, \ldots, u_N]$
– KLT is a unitary transform: it represents x with basis vectors that are the orthonormalized eigenvectors of $R_x$
– $R_x U = [\lambda_1 u_1, \ldots, \lambda_N u_N] = U \, \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_N\}$ => $U^H R_x U = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_N\}$, i.e. KLT performs decorrelation
– We often order $\{u_i\}$ so that $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_N$
– Also known as Hotelling transform or Principal Component Analysis (PCA)
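To make the decorrelation property concrete, here is a minimal sketch for an assumed 2×2 case where the eigenvectors are known in closed form. For $R_x$ with unit variances and correlation 0.9, the orthonormal eigenvectors are $(1, 1)/\sqrt{2}$ and $(1, -1)/\sqrt{2}$, with eigenvalues 1.9 and 0.1. The example values and the helpers `matmul` and `transpose` are illustrative assumptions.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

rho = 0.9                              # assumed correlation coefficient
Rx = [[1.0, rho],
      [rho, 1.0]]

s = 1.0 / math.sqrt(2.0)
U = [[s, s],                           # columns are u1 = (1, 1)/sqrt(2)
     [s, -s]]                          #         and u2 = (1, -1)/sqrt(2)

# Real-valued case, so U^H reduces to the transpose.
Ry = matmul(matmul(transpose(U), Rx), U)
print(Ry)                              # approximately diag{1 + rho, 1 - rho}
```

The diagonal entries sum to 2, the same as the trace of $R_x$, which reflects the energy preservation of a unitary transform.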



Properties of K-L Transform

Decorrelation
– $E[y y^H] = E[(U^H x)(U^H x)^H] = U^H E[x x^H] U = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_N\}$
– Note: Other matrices (unitary or nonunitary) may also decorrelate the transformed sequence [Jain’s Example 5.7, p. 166]

Minimizing MSE under basis restriction
– If only allowed to keep m coefficients for any $1 \le m \le N$, what’s the best way to minimize reconstruction error?
– Retain the coefficients corresponding to the eigenvectors of the m largest eigenvalues

Reference: Theorem 5.1 and Proof in Jain’s Book (p. 166)



KLT Basis Restriction

Basis restriction
– Keep only a subset of m transform coefficients and then perform inverse transform ($1 \le m \le N$)
– Basis restriction error: MSE between original & new sequences

Goal: to find the forward and backward transform matrices that minimize the restriction error for each and every m
– The minimum is achieved by KLT arranged according to the decreasing order of the eigenvalues of R
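The basis-restriction claim can be probed with a small search, sketched below under illustrative assumptions: a 2×2 correlation matrix with correlation 0.9, one retained coefficient (m = 1), and a coarse 5-degree grid over real unit-norm basis vectors $u = (\cos\theta, \sin\theta)$. For a single kept coefficient with orthonormal basis, the restriction error equals $\mathrm{trace}(R) - u^T R u$. The name `restriction_error` is hypothetical.

```python
import math

rho = 0.9                                  # assumed correlation coefficient
R = [[1.0, rho],
     [rho, 1.0]]

def restriction_error(theta_deg):
    """MSE when only the coefficient along u = (cos t, sin t) is kept."""
    t = math.radians(theta_deg)
    u = (math.cos(t), math.sin(t))
    kept = sum(u[i] * R[i][j] * u[j] for i in range(2) for j in range(2))
    return (R[0][0] + R[1][1]) - kept      # energy of the discarded component

# Scan candidate basis directions on a 5-degree grid.
errors = {deg: restriction_error(deg) for deg in range(0, 180, 5)}
best = min(errors, key=errors.get)
print(best, errors[best], errors[0])
```

The search lands on the 45-degree direction, which is the eigenvector of the larger eigenvalue, whereas keeping the first sample unchanged (0 degrees) discards a full unit of variance.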



K-L Transform for Images

Work with 2-D autocorrelation function
– $R(m,n;\, m',n') = E[\, x(m,n) \, x(m',n') \,]$ for all $0 \le m, m', n, n' \le N-1$
– K-L basis images are the orthonormalized eigenfunctions of the 4-variable function R(·)

Rewrite images into vector form ($N^2 \times 1$)
– Need to solve the eigen problem for an $N^2 \times N^2$ matrix! ~ $O(N^6)$

Reduced computation for separable R
– $R(m,n;\, m',n') = r_1(m,m') \, r_2(n,n')$
– Only need to solve the eigen problem for two N×N matrices ~ $O(N^3)$
– KLT can now be performed separably on rows and columns, reducing the transform complexity from $O(N^4)$ to $O(N^3)$



Pros and Cons of K-L Transform

Optimality
– Decorrelation and MMSE for the same number of partial transform coeff.

Applications
– (Non-universal) compression
– Pattern recognition: e.g., eigenfaces
– Analyze the principal (“dominating”) components and reduce feature dimensions

Data dependent:
– Have to estimate the 2nd-order statistics from a collection of images to determine the transform
– Need to let decoder know the KL basis vectors used
– Can we get a data-independent transform with similar performance? DCT for highly correlated data



Energy Compaction of DCT vs. KLT

DCT has excellent energy compaction for highly correlated data

DCT is a good replacement for K-L
– Close to optimal for highly correlated data
– Does not depend on specific data like K-L does
– Fast algorithm available

[ref and statistics: Jain’s pp. 153, 168-175]



Energy Compaction of DCT vs. KLT (cont’d)

Preliminaries
– The matrices $R$, $R^{-1}$, and a scaled version of $R^{-1}$ share the same set of eigenvectors
– DCT basis vectors are eigenvectors of a symmetric tri-diagonal matrix $Q_c$
– Covariance matrix R of a 1st-order stationary Markov sequence (AR process) has an inverse in the form of a symmetric tri-diagonal matrix

DCT is close to KLT on 1st-order stationary Markov
– For a highly correlated sequence, a scaled version of $R^{-1}$ approximates $Q_c$
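One way to see how close the DCT comes to decorrelating a 1st-order Markov sequence is to transform its covariance matrix and measure what is left off the diagonal. The sketch below assumes N = 8, correlation coefficient 0.95, covariance entries $\rho^{|i-j|}$, and the orthonormal DCT-II. These parameter values and the helper names are illustrative choices, not taken from the slides.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

N, rho = 8, 0.95                       # assumed block size and correlation
R = [[rho ** abs(i - j) for j in range(N)] for i in range(N)]

C = dct_matrix(N)
Ry = matmul(matmul(C, R), transpose(C))    # covariance of the DCT coefficients

total_energy = sum(v * v for row in Ry for v in row)
off_energy = total_energy - sum(Ry[k][k] ** 2 for k in range(N))
off_fraction = off_energy / total_energy               # residual correlation
dc_fraction = Ry[0][0] / sum(Ry[k][k] for k in range(N))  # energy compaction
print(off_fraction, dc_fraction)
```

For these settings the off-diagonal entries carry less than 1% of the squared matrix energy, and the DC coefficient alone holds most of the variance, in line with the “close to KLT” statement above.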



Summary and Review on Unitary Transform

Representation with orthonormal basis => Unitary transform
– Preserve energy

Common unitary transforms
– KLT; DFT, DCT, Haar …

Which transform to choose?
– Depends on the need in a particular task/application
– DFT ~ reflects physical meaning of frequency or spatial frequency
– KLT ~ optimal in energy compaction
– DCT ~ real-to-real, and close to KLT’s energy compaction

=> A comparison table in Jain’s book Table 5.3



Summary of Today’s Lecture

Optimal bit allocation: “equal slope” in R-D curves
– Leads to reverse water filling for Gaussian coefficients

Optimal transform – KLT

Next lecture
– Compression of image sequence and video

Readings:
– Gonzalez’s 3/e book Section 8.2.8
– Optimal bit allocation: article by Ortega-Ramchandran in Nov. ’98 issue of IEEE Signal Processing Magazine. More reference: Jain’s book Section 2.13 & 11.4
– KLT: Jain’s book 5.11, 5.6, 5.14; 2.7-2.9



More on Rate-Distortion Based Bit Allocation in Image Coding



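Details on Reverse Water-filling Solution

$$\min \sum_{i=1}^{n} \frac{1}{2} \ln \frac{\sigma_i^2}{D_i} \quad \text{subj. to} \quad \sum_{i=1}^{n} D_i = D$$

Construct func. using Lagrange multiplier:

$$J(D) = \sum_{i=1}^{n} \frac{1}{2} \ln \frac{\sigma_i^2}{D_i} + \lambda \sum_{i=1}^{n} D_i$$

Necessary condition (keep the same marginal gain):

$$\frac{\partial J}{\partial D_i} = -\frac{1}{2} \frac{1}{D_i} + \lambda = 0 \ \text{ for all } i, \ \text{ and } \sum_{i} D_i = D$$

Taking into account that no rate is spent once $D_i$ reaches $\sigma_i^2$:

$$\min \sum_{i=1}^{n} \max\left\{ \frac{1}{2} \ln \frac{\sigma_i^2}{D_i},\ 0 \right\} \quad \text{subj. to} \quad \sum_{i=1}^{n} D_i = D$$

$$\frac{\partial J}{\partial D_i} = -\frac{1}{2} \frac{1}{D_i} + \lambda \ \begin{cases} = 0, & \text{if } D_i < \sigma_i^2 \\ \le 0, & \text{if } D_i \ge \sigma_i^2 \end{cases} \quad \Rightarrow \quad D_i = \begin{cases} \theta, & \text{if } \theta < \sigma_i^2 \\ \sigma_i^2, & \text{if } \theta \ge \sigma_i^2 \end{cases} \quad \text{s.t.} \ \sum_{i} D_i = D$$

Here $\theta = 1/(2\lambda)$ denotes the water level; the constraint $\sum_i D_i = D$ is the necessary condition for choosing it.

[Figure: R-D curve for a Gaussian r.v., R(D) versus D]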


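General Constrained Optimization Problem

$$\min f(x) \quad \text{subj. to} \quad g_j(x) \le 0 \ \text{ for } j = 1 \sim n$$

Define

$$J(x) = f(x) + \sum_{j=1}^{n} \lambda_j g_j(x)$$

Necessary condition: “Kuhn-Tucker conditions”

$$\nabla f(x^*) + \sum_{j=1}^{n} \lambda_j \nabla g_j(x^*) = 0$$

$$\lambda_j \, g_j(x^*) = 0 \ \text{ for all } j$$

$$\lambda_j \ge 0 \ \text{ for all } j$$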


Lagrangian Optimiz. for Indep. Budget Constraint

Previous: fix distortion, minimize total rate

Alternative: fix total rate (bit budget), minimize distortion

(Discrete) Lagrangian optimization on general source
– “Constant slope optimization” (e.g. in Box 6 Fig. 17 of Ortega’s tutorial)
– Need to determine the quantizer q(i) for each coding unit i:

$$\min \sum_{i=1}^{n} \left( d_{i,q(i)} + \lambda \, r_{i,q(i)} \right)$$

– Lagrangian cost for each coding unit: $d_{i,q(i)} + \lambda \, r_{i,q(i)}$; use a line with slope $-\lambda$ to intersect with each operating point (Fig. 14)
– For a given operating quality $\lambda$, the minimum can be computed independently for each coding unit:

$$\min \sum_{i=1}^{n} \left( d_{i,q(i)} + \lambda \, r_{i,q(i)} \right) = \sum_{i=1}^{n} \min \left( d_{i,q(i)} + \lambda \, r_{i,q(i)} \right)$$

=> Find operating quality $\lambda$ satisfying the rate constraint


[Figures: from Box 6 Fig. 17 and from Fig. 14 of Ortega’s tutorial]
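The discrete Lagrangian procedure can be sketched as follows. This is a hypothetical illustration rather than the algorithm from Ortega’s tutorial as written: the function name `allocate`, the bisection over $\lambda$, the upper bound `lam_max`, and the two small tables of (rate, distortion) operating points are all assumptions made here. For each trial $\lambda$, every coding unit independently picks the operating point minimizing $d + \lambda r$; $\lambda$ is then adjusted until the total rate fits the budget.

```python
def allocate(units, budget, lam_max=1e6, iters=100):
    """units: one list of (rate, distortion) operating points per coding unit.
    Returns (lam, chosen points) with total rate <= budget.
    Assumes lam_max is large enough that the lowest-rate points fit the budget."""
    def pick(lam):
        # Independent minimization per unit; ties go to the lower rate.
        return [min(points, key=lambda p: (p[1] + lam * p[0], p[0]))
                for points in units]

    lo, hi = 0.0, lam_max              # total rate is non-increasing in lam
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(r for r, _ in pick(mid)) > budget:
            lo = mid                   # over budget: penalize rate more
        else:
            hi = mid
    return hi, pick(hi)

# Illustrative operating points (assumed), as (rate in bits, distortion).
units = [
    [(0, 10.0), (1, 4.0), (2, 2.0), (3, 1.5)],   # coding unit A
    [(0, 4.0), (1, 2.0), (2, 1.0), (3, 0.8)],    # coding unit B
]
lam, chosen = allocate(units, budget=3)
print(lam, chosen)
```

A Lagrangian search of this kind only reaches operating points on the convex hull of each unit’s R-D characteristic, so for some budgets it may leave part of the bit budget unused.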
