M. Wu: ENEE631 Digital Image Processing (Spring'09)

Basics on Video Coding

Spring ’09 Instructor: Min Wu
Electrical and Computer Engineering Department, University of Maryland, College Park
bb.eng.umd.edu (select ENEE631 S’09) [email protected]

ENEE631 Spring’09, Lecture 15 (3/30/2009)


M. Wu: ENEE631 Digital Image Processing (Spring'09) Lec15 – Hybrid Video Coding [2]

Overview and Logistics

Last Time:
– Bit allocation issues in image compression
– Optimal transform KLT ~ unitary transform; decorrelates data; optimal MMSE approximation under basis restriction

Comments on issues arising from mid-term exam
– Linearity and shift invariance: check by definition
  Is piecewise linear stretching a linear operation?
  If ignoring boundary effects, are median filtering and point operations (including histogram-based processing) shift invariant? Give examples of shift-variant operations.
– Quantization: MMSE criterion vs. minimax criterion

Today:
– Image interpolation
– Video coding: explore temporal and spatial redundancy

UMCP ENEE631 Slides (created by M.Wu © 2004)


Image Interpolation: A Quick Extension from 1-D Interpolation

Useful in image enlargement, rotation, motion estimation, etc.


Examples of Image Interpolation

[Figure: 4x zoom (nearest neighbor) vs. 4x zoom (bilinear)]


Interpolation / Zooming

How to make up the new pixels?

Replication according to the nearest neighbor
– Simple but leaves zig-zag boundaries (reflects spectrum artifacts; equiv. to interlacing zeros & LPF with a constant mask)

Bilinear interpolation
– Extend 1-D linear interpolation: (1-a) f1 + a f2
– Do two horizontal and one vertical 1-D interpolation:
  F( p’, q’ ) = (1-a) [ (1-b) F(p, q) + b F(p, q+1) ] + a [ (1-b) F(p+1, q) + b F(p+1, q+1) ]
  where (p’, q’) lies inside the square with corners (p,q), (p,q+1), (p+1,q), (p+1,q+1), at fractional offset a from p and b from q
– For zoom in by 2 in each dimension:
  F(p’, q’) = 0.5 [0.5 F(p,q) + 0.5 F(p,q+1)] + 0.5 [0.5 F(p+1,q) + 0.5 F(p+1,q+1)]
=> equiv. to F(x, y) = r x + s y + u xy + v; solve the parameters using the 4 known pixels
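The bilinear formula on this slide can be tried out with a minimal sketch. The function name `bilinear`, the clamping at the image border, and the tiny 2x2 test image are illustrative assumptions, not part of the slides.

```python
import numpy as np

def bilinear(F, x, y):
    """Interpolate image F at fractional position (x, y).

    x runs along the first index (p), y along the second index (q).
    """
    p, q = int(np.floor(x)), int(np.floor(y))
    # Assumption: clamp so that (p+1, q+1) stays inside the image.
    p = min(p, F.shape[0] - 2)
    q = min(q, F.shape[1] - 2)
    a, b = x - p, y - q  # fractional offsets from (p, q)
    return ((1 - a) * ((1 - b) * F[p, q] + b * F[p, q + 1])
            + a * ((1 - b) * F[p + 1, q] + b * F[p + 1, q + 1]))

F = np.array([[0.0, 4.0],
              [8.0, 12.0]])
center = bilinear(F, 0.5, 0.5)  # a = b = 0.5: average of the four pixels
```

With a = b = 0.5 the result is the plain average of the four neighbors, which is the zoom-by-2 case written out on the slide.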

UMCP ENEE631/408G Slides (created by M.Wu © 2001/2002)


Review: 1-D Frequency-Domain Interpretation

From Crochiere-Rabiner “Multirate DSP” book Fig.2.15-16


Frequency-Domain Interpretation

Review multirate signal processing (ENEE630)

For images: extend to the 2-D transform

Downsampling
– Aliasing as spectrum replicas become closer
– LPF to avoid aliasing

Upsampling
– Upsampling with zero interlacing ~ replicated spectrum
– LPF to filter out the spectrum replicas in the high-frequency part
– Ideal filter vs. practical filters
  nearest-neighbor approach for 2x zoom uses the mask [1 1; 1 1]
  [think] what equivalent filter is used for bilinear interpolation?

Sampling rate conversion with rational rate M / N
– Upsample with zero interlacing by M => LPF => downsample by N

UMCP ENEE631 Slides (created by M.Wu © 2001)

Equivalent mask for bilinear interpolation (2x zoom): [1/4 1/2 1/4; 1/2 1 1/2; 1/4 1/2 1/4]
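The "zero interlacing followed by a small mask" view of 2x zoom can be checked numerically. The sketch below is only an illustration: the helper `zoom2x`, its `origin` argument (which mask entry is aligned with the output pixel), and the zero padding at the borders are assumptions made here, not something the slides specify.

```python
import numpy as np

def zoom2x(img, mask, origin):
    """2x zoom: interlace zeros, then convolve with `mask` (zero padding)."""
    h, w = img.shape
    up = np.zeros((2 * h, 2 * w))
    up[::2, ::2] = img                      # zero interlacing
    P = max(mask.shape)
    padded = np.pad(up, P)                  # constant (zero) padding
    out = np.zeros_like(up)
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            # out[m, n] += mask[i, j] * up[m - (i - oi), n - (j - oj)]
            r = P - (i - origin[0])
            c = P - (j - origin[1])
            out += mask[i, j] * padded[r:r + 2 * h, c:c + 2 * w]
    return out

img = np.array([[1.0, 3.0],
                [5.0, 7.0]])

# Constant 2x2 mask reproduces pixel replication (nearest neighbor).
nearest = zoom2x(img, np.ones((2, 2)), origin=(0, 0))

# Separable triangle mask [1/2 1 1/2] x [1/2 1 1/2] gives bilinear zoom.
tri = np.array([0.5, 1.0, 0.5])
bilin = zoom2x(img, np.outer(tri, tri), origin=(1, 1))
```

The original samples survive unchanged at the even positions of `bilin`, and each new sample is the average of its two or four nearest original neighbors (border samples are attenuated by the zero padding).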


More on Interpolation

Other filters
– Bi-cubic interpolation (3rd-order polynomial on index variables), based on a combination of a 16-pixel neighborhood
– Can build p-th order interpolation by recursive filtering: after upsampling by p, convolve with the linear interpolation filter p times

Interpolation that avoids blurred edges and textures
– Sharpening
– Edge-preserving interpolation (recent research papers in ICIP and Trans. on Image Proc.)

=> Will discuss more on 2-D sampling and frequency-domain interpretation in a few lectures

UMCP ENEE631 Slides (created by M.Wu © 2001/2004)


From Image Coding to Video Coding


Review

Basic tools for compression
– PCM coding, entropy coding, run-length coding
– Quantization and truncation
– Predictive coding
– Transform coding: DCT-based

JPEG image compression
– 8x8 block-DCT based transform coding
– Use predictive coding, quantization, run-length coding, and entropy coding

Today: digital video and video compression

UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)


Bring in Motion: Video (Motion Pictures)

Capturing video
– Video as a 3-D signal: 2 spatial dimensions & time dimension; continuous I( x, y, t ) => discrete I( m, n, tk )
– Frame by frame => image sequence

Encode digital video
– Simplest way ~ compress each frame image individually, e.g., “motion-JPEG”; only spatial redundancy is explored and reduced
– How about temporal redundancy? Is differential coding good? Pixel-by-pixel difference could still be large due to motion
  => Need better prediction


Video Examples

1. NASA shuttle

2. “Talking Head”


Explore Temporal Redundancy – 1st try

– Difference between corresponding pixels of two video frames

From Gonzalez-Woods 3/e Fig. 8.34-8.35


Explore Motion

From Gonzalez-Woods 3/e Fig. 8.37


Motion Estimation

Helps in understanding the content of an image sequence
– Useful for surveillance

Stabilizing video by detecting and removing small, noisy global motions
– For building stabilizer in camcorder

Reduce temporal redundancy of video for compression
[What estimation accuracy and resolution are necessary for this purpose?]
– one motion displacement vector per picture? (extreme case: DPCM)
– one vector per pixel?
=> Tradeoff: (1) effectiveness & complexity in approximating commonly seen motions; (2) overhead in describing the motion model.

UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002; 2007)


Block-Matching by Exhaustive Search

Modeling: assume movements are block-based translation

Search every possibility over a specified range for the best matching block
– MAD (mean absolute difference) often used for simplicity

=> Flash Demo (by Dr. Ken Lam @ Hong Kong PolyTech Univ.)

From Wang’s Preprint Fig.6.6
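A minimal sketch of exhaustive block matching with the MAD criterion follows. The names `mad` and `full_search`, the default block size and search range, and the synthetic test frames (random texture with a circular shift) are illustrative assumptions; candidates that fall outside the reference frame are simply skipped.

```python
import numpy as np

def mad(a, b):
    """Mean absolute difference between two equally sized blocks."""
    return np.mean(np.abs(a.astype(float) - b.astype(float)))

def full_search(cur, ref, top, left, N=8, R=4):
    """Best displacement (dy, dx) within +/-R for the NxN block of `cur`
    whose top-left corner is (top, left), searched in `ref`."""
    block = cur[top:top + N, left:left + N]
    best, best_mv = np.inf, (0, 0)
    for dy in range(-R, R + 1):
        for dx in range(-R, R + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + N > ref.shape[0] or x + N > ref.shape[1]:
                continue  # candidate block leaves the frame
            err = mad(block, ref[y:y + N, x:x + N])
            if err < best:
                best, best_mv = err, (dy, dx)
    return best_mv, best

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(32, 32))
# cur[y, x] = ref[y - 2, x + 3]: content moved down by 2 and left by 3.
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))
mv, err = full_search(cur, ref, top=12, left=12)
```

Here the vector points from the block in the current frame to its match in the reference frame, so the search recovers (dy, dx) = (-2, 3) with zero matching error.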


Motion Compensation

– Helps reduce temporal redundancy of video

[Figure panels: previous frame, current frame, predicted frame, prediction error frame]


Revised from R.Liu Seminar Course ’00 @ UMD
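To make the four panels concrete, the sketch below builds a predicted frame block by block from the previous frame and compares the prediction error with a plain frame difference. It is a toy illustration under stated assumptions: the function `motion_compensate`, the block size, the search range, and the synthetic frames are all hypothetical, and the frame dimensions are assumed to be multiples of the block size.

```python
import numpy as np

def motion_compensate(cur, ref, N=8, R=4):
    """Predict `cur` from `ref`: each NxN block is replaced by its
    best-matching (minimum MAD) block in `ref` within +/-R pixels."""
    H, W = cur.shape
    pred = np.zeros_like(cur, dtype=float)
    for top in range(0, H, N):
        for left in range(0, W, N):
            block = cur[top:top + N, left:left + N].astype(float)
            best = np.inf
            for dy in range(-R, R + 1):
                for dx in range(-R, R + 1):
                    y, x = top + dy, left + dx
                    if y < 0 or x < 0 or y + N > H or x + N > W:
                        continue
                    cand = ref[y:y + N, x:x + N].astype(float)
                    err = np.mean(np.abs(block - cand))
                    if err < best:
                        best = err
                        pred[top:top + N, left:left + N] = cand
    return pred

rng = np.random.default_rng(1)
ref = rng.integers(0, 256, size=(32, 32)).astype(float)   # previous frame
cur = np.roll(ref, shift=(1, 2), axis=(0, 1))              # current frame
residual_mc = cur - motion_compensate(cur, ref)            # prediction error frame
residual_diff = cur - ref                                  # plain frame difference
```

For this synthetic translation, the motion-compensated residual vanishes on all blocks whose match lies inside the reference frame, while the plain difference stays large everywhere.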


Complexity of Exhaustive Block-Matching

Assumptions
– Block size N x N and image size S = M1 x M2
– Search step size is 1 pixel ~ “integer-pel accuracy”
– Search range +/–R pixels both horizontally and vertically

Computation complexity
– # Candidate matching blocks = (2R+1)^2
– # Operations for computing MAD for one block ~ O(N^2)
– # Operations for MV estimation per block ~ O((2R+1)^2 N^2)
– # Blocks = S / N^2
– Total # operations for entire frame ~ O((2R+1)^2 S)
  i.e., overall computation load is independent of block size! Block size affects encoding bit rate and effectiveness of motion compensation.

E.g., M=512, N=16, R=16, 30 fps => on the order of 8.55 x 10^9 operations per second!
– Was difficult for real-time estimation, but possible with parallel hardware
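The example figure can be reproduced with a few lines of arithmetic. This sketch assumes a square 512 x 512 frame and counts one operation per pixel comparison; both are assumptions made here, since the slide only gives M, N, R, and the frame rate.

```python
M, N, R, fps = 512, 16, 16, 30

S = M * M                                  # image size in pixels (square frame assumed)
candidates = (2 * R + 1) ** 2              # candidate blocks per motion vector
ops_per_block = candidates * N * N         # one op per pixel per candidate (assumed)
blocks = S // (N * N)                      # number of blocks per frame
ops_per_frame = ops_per_block * blocks     # equals candidates * S: N cancels out
ops_per_second = ops_per_frame * fps

print(f"{ops_per_second:.3e} operations per second")
```

Under these assumptions the count comes to roughly 8.56 x 10^9 operations per second, in line with the order of magnitude quoted on the slide, and the block size N drops out of the per-frame total.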


Exhaustive Search: Cons and Pros

Pros
– Guaranteed optimality within search range and motion model

Cons
– Can only search among finitely many candidates: what if the motion is “fractional”?
– High computation complexity: on the order of [search-range-size x image-size] for 1-pixel step size

How to improve accuracy?
– Include blocks at fractional translation as candidates => require interpolation

How to improve speed?
– Try to exclude unlikely candidates


Fractional Accuracy Search for Block Matching

For motion accuracy of 1/K pixel
– Upsample (interpolate) reference frame by a factor of K
– Search for the best matching block in the upsampled reference frame

Half-pel accuracy ~ K=2
– Significant accuracy improvement over integer-pel (esp. for low-resolution)
– Complexity increase

(From Wang’s Preprint Fig.6.7)


Fractional Accuracy for Motion: Example

[Figure panels: no motion compensation; 1-pixel precision; ½-pixel precision; ¼-pixel precision]

From Gonzalez-Woods 3/e Fig. 8.38


Fast Algorithms for Block Matching

Basic ideas
– Matching errors near the best match are generally smaller than far away
– Skip candidates that are unlikely to give good match


(From Wang’s Preprint Fig.6.6)


Fast Algorithm: 3-Step Search

Search candidates at 8 neighbor positions

Step size cut down by 2 after each iteration
– Start with step size approx. half of max. search range

[Figure: search grid with dx, dy axes from -6 to +6 and candidate points labeled M1–M9, M11–M19, M24; motion vector {dx, dy} = {1, 6}]

Total number of computations: 9 + 8 x 2 = 25 (3-step) vs. (2R+1)^2 = 169 (full search)

(Fig. from Ken Lam – HK Poly Univ. short course in summer 2001)


=> See Flash demo by Jane Kim (UMD)
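A rough sketch of the 3-step search is given below. The function name, the rule for shrinking the step (ceiling of half the previous step, giving 3, 2, 1 for R = 6), and the test frames are assumptions chosen for illustration. The test motion is deliberately picked to coincide with a first-step candidate; on random texture like this the greedy search is otherwise easily misled, since it is not guaranteed to find the best match.

```python
import math
import numpy as np

def three_step_search(cur, ref, top, left, N=8, R=6):
    """Greedy coarse-to-fine search; returns (mv, error, #candidates evaluated)."""
    block = cur[top:top + N, left:left + N].astype(float)

    def cost(dy, dx):
        y, x = top + dy, left + dx
        if y < 0 or x < 0 or y + N > ref.shape[0] or x + N > ref.shape[1]:
            return np.inf                       # outside the reference frame
        return np.mean(np.abs(block - ref[y:y + N, x:x + N]))

    center, best = (0, 0), cost(0, 0)
    evaluated = 1
    step = math.ceil(R / 2)                     # about half the search range
    while True:
        cy, cx = center
        for dy in (-step, 0, step):             # the 8 neighbors of the center
            for dx in (-step, 0, step):
                if (dy, dx) == (0, 0):
                    continue
                evaluated += 1
                c = cost(cy + dy, cx + dx)
                if c < best:
                    best, center = c, (cy + dy, cx + dx)
        if step == 1:
            break
        step = math.ceil(step / 2)              # cut the step size down by 2
    return center, best, evaluated

rng = np.random.default_rng(2)
ref = rng.integers(0, 256, size=(32, 32))
# cur[y, x] = ref[y + 3, x - 3], so the true vector is (dy, dx) = (3, -3).
cur = np.roll(ref, shift=(-3, 3), axis=(0, 1))
mv, err, n = three_step_search(cur, ref, top=12, left=12)
```

The candidate count matches the slide: 9 positions in the first step plus 8 in each of the two refinements, 25 in total, against 169 for a full search with R = 6.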


Hierarchical Block Matching

Problem with fast search at full resolution
– Small mis-alignment may give high displacement error (EDFD), esp. for texture and edge blocks

Hierarchical (multi-resolution) block matching
– Match with coarse resolution to narrow down search range
– Match with high resolution to refine motion estimation

[Figure: lowest resolution, medium resolution, original resolution]

(From Wang’s Preprint Fig.6.19)


Summary of Today’s Lecture

Interpolation

Block-based motion estimation and compensation

Next Lecture: video compression through hybrid coding
=> Given what we discussed, how to design a video codec?
– Exploit spatial redundancy via transform coding
– Exploit temporal redundancy via predictive coding ~ motion estimation and compensation

Reading assignment
– Gonzalez’s 3/e book 2.4.4 (interpolation); 8.2.9 (motion compensation)
– To explore further: Wang’s video textbook 9.3.1, 6.4


Hybrid Coding for Video


DCT-M.E. Hybrid Video Coding

“Hybrid” ~ combined transform coding & predictive coding

Spatial redundancy removal
– Use DCT-based transform coding for reference frame

Temporal redundancy removal
– Use motion-based predictive coding for next frames: estimate motion and use reference frame to predict; only encode MV & prediction residue (“motion compensation residue”)

(From Princeton EE330 S’01 by B.Liu)


Review: Predictive Coding with Quantization

Consider: high correlation between successive samples

Predictive coding
– Basic principle: remove redundancy between successive pixels and only encode the residual between actual and predicted values
– Residue usually has much smaller dynamic range; allows fewer quantization levels for the same MSE => get compression
– Compression efficiency depends on intersample redundancy

First try:
[Block diagram. Encoder: e(n) = u(n) – u’P(n) with u’P(n) = f[u(n-1)]; the quantizer maps e(n) to eQ(n). Decoder: uQ(n) = uP(n) + eQ(n) with uP(n) = f[uQ(n-1)].]

Any problem with this codec?


Predictive Coding (cont’d)

Problem with 1st try
– Inputs to the predictor are different at encoder and decoder: decoder doesn’t know u(n)!
– Mismatch error could propagate to future reconstructed samples

Solution: Differential PCM (DPCM)
– Use quantized sequence uQ(n) for prediction at both encoder and decoder
– Simple predictor f[ x ] = x
– Prediction error e(n)
– Quantized prediction error eQ(n)
– Distortion d(n) = e(n) – eQ(n)

[Block diagram. Encoder: e(n) = u(n) – uP(n), quantized to eQ(n); uQ(n) = uP(n) + eQ(n) is fed back to the predictor, uP(n) = f[uQ(n-1)]. Decoder: uQ(n) = uP(n) + eQ(n) with uP(n) = f[uQ(n-1)].]


Note: “Predictor” contains one-step buffer as input to the prediction
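The difference between the first try and DPCM shows up clearly on a slow ramp. The sketch below is an illustration under assumptions: a uniform rounding quantizer with step size `step`, the simple predictor f[x] = x, an initial state uQ(-1) = 0 at both ends, and hypothetical function names.

```python
import numpy as np

def quantize(e, step):
    """Uniform quantizer (assumed here): round to the nearest multiple of step."""
    return step * np.round(e / step)

def dpcm_encode(u, step):
    """Closed loop: predict from the quantized reconstruction uQ(n-1)."""
    eq = np.zeros(len(u))
    u_q_prev = 0.0                       # uQ(-1), assumed 0 at encoder and decoder
    for n, sample in enumerate(u):
        e = sample - u_q_prev            # e(n) = u(n) - uP(n), with f[x] = x
        eq[n] = quantize(e, step)        # eQ(n): what gets transmitted
        u_q_prev = u_q_prev + eq[n]      # uQ(n) = uP(n) + eQ(n)
    return eq

def open_loop_encode(u, step):
    """First try: predict from the unquantized u(n-1)."""
    prev = np.concatenate(([0.0], u[:-1]))
    return quantize(u - prev, step)

def decode(eq):
    """Decoder for both schemes: uQ(n) = uQ(n-1) + eQ(n)."""
    return np.cumsum(eq)

step = 1.0
u = 0.4 * np.arange(50)                  # slow ramp, slope below step / 2
closed = decode(dpcm_encode(u, step))
open_ = decode(open_loop_encode(u, step))
```

With the closed loop, the reconstruction error equals d(n) = e(n) – eQ(n) and stays within half a quantizer step. With the first try, every difference of 0.4 is quantized to zero, so the decoder output never leaves zero and the error keeps growing.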


Hybrid MC-DCT Video Encoder

(From R.Liu’s Handbook Fig.2.18)

• Intra-frame: encoded without prediction
• Inter-frame: predictively encoded => use quantized frames as ref for residue


Hybrid MC-DCT Video Decoder

(From R.Liu’s Handbook Fig.2.18)


Hybrid Video Coding: Problems to Be Solved

Not all regions are easily inferable from previous frame
– Occlusion ~ solvable by backward prediction using future frames as ref.
– Adaptively decide whether to use prediction or not

Drifting and error propagation
– Solution: encode reference regions or frames from time to time (“intra coding”)

Random access: e.g., want to get the 95th frame
– Solution: encode frame without prediction from time to time

How to allocate bits?
– Based on visual model and statistics: JPEG-like quant. steps; entropy coding
– Consider constant or variable bit-rate requirement: constant-bit-rate (CBR) vs. variable-bit-rate (VBR)

Wrap up all solutions ~ MPEG-like codec


Background Reviews on Video Acquisition and Display


Video Camera

Frame-by-frame capturing

CCD sensors (Charge-Coupled Devices)
– 2-D array of solid-state sensors
– Each sensor corresponding to a pixel
– Store in a buffer and sequentially read out
– Small and light => widely used


Video Display

CRT (Cathode Ray Tube)
– Large dynamic range
– Bulky for large display: CRT physical depth has to be similar to screen width

LCD flat-panel display
– Use electrical field to change the optical properties, hence the brightness/color, of liquid crystal
– Generating the electrical field by an array of transistors (active-matrix thin-film transistors) or by plasma

“Active-matrix display” (also known as TFT) has a transistor located at each pixel, allowing the display to be switched more frequently and with less current to control pixel luminance. A passive-matrix LCD has a grid of conductors with pixels located at the grid intersections.


Composite vs. Component Video

Component video
– Three separate signals for tristimulus color representation or luminance-chrominance representation
– Pro: higher quality
– Con: need high bandwidth and synchronization

Composite video
– Multiplex into a single signal
– Historical reason: transmitting color TV through a monochrome channel
– Pro: save bandwidth
– Con: cross talk

S-video: luminance sig. + single multiplexed chrominance sig.


Analog Video Raster

Line-by-line “Raster Scan”
– Represent line-by-line image frame with 1-D analog waveform
– Synchronization signal for horizontal and vertical retrace


Forming Picture on TV Tube (Monochrome)

How many lines?

From B.Liu EE330S’01 Princeton


How Many TV Lines?

Determined by spatial freq. response of HVS (recall Lecture 2)

[Figure: two dots viewed from a distance] Cannot resolve the two dots if distance > 2000 x separation (~ 0.03 degree viewing angle)

N = 500 for D = 4H

From B.Liu EE330S’01 Princeton
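The figure N = 500 follows directly from the 2000:1 rule. In the short derivation below, D is read as the viewing distance, H as the picture height, and s_min as the smallest resolvable line separation; the slide does not define these symbols, so this reading is an assumption.

```latex
\[
s_{\min} = \frac{D}{2000},
\qquad
N = \frac{H}{s_{\min}} = \frac{2000\,H}{D}
\quad\Rightarrow\quad
N\big|_{D = 4H} = \frac{2000\,H}{4H} = 500,
\]
\[
\theta = \arctan\frac{1}{2000} \approx 0.029^\circ \approx 0.03^\circ .
\]
```

Sitting closer than 4H would, by the same rule, call for proportionally more lines.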


Progressive vs. Interlaced Scan

From B.Liu EE330S’01 Princeton


Analog Color TV Systems

Historical notes
– Color TV system had to be compatible with earlier monochrome TV system

3 formats
– NTSC ~ North America + Japan/Taiwan
– PAL ~ Western Europe + Asia (China) + Middle East
– SECAM ~ Eastern Europe + France
– What format in your home country?

From Wang’s Preprint Fig.1.5


Comparison of Three Analog TV Systems

– Spatial and temporal resolution
– Color coordinate
– Signal bandwidth
– Multiplexing of luminance, chrominance, and audio

(From Wang’s Book Preprint)


NTSC

4:3 aspect ratio (width:height)

525 lines/frame, 2:1 interlace at field rate 59.94 Hz
– 483 active lines per frame; vertical retrace takes time of 9 lines
– rest for broadcaster’s info. like closed caption

YIQ color coordinate for transmission
– RGB primary slightly different from PAL
– Orthogonal chrominance: I ~ orange-to-cyan; Q ~ green-to-purple (needs less bandwidth)

Multiplexing over 6 MHz total bandwidth
– Artifacts due to cross talk between luminance and chrominance


NTSC 6 MHz Bandwidth

From Wang’s Book Preprint Fig.1.6(b)


Analog Video Recording

Comparison of common formats

From Wang’s Book Preprint Table 1.2


Digital Video Formats

ITU-R BT.601 recommendation (ITU-R: International Telecommunication Union, Radiocommunication Sector)

Downsampled chrominance
– Y Cb Cr coordinate and four subsampling formats

Wang’s Book Preprint Fig.1.8


Summary: Source Video Formats


Channel Bandwidths


Channel Bandwidth (cont’d)


Application Requirements
