Download - Image and Video Coding I · BasicDesignofImageandVideoCodecs HybridVideoEncoder coding mode decision in-loop filtering motion estimation intra mode decision intra-picture predictionÛ

Image and Video Coding I

Thomas Wiegand

Technische Universität Berlin

Introduction

encoder decoderbitstream

Motivation

Motivation for Source Coding

Source Coding / Data Compression

Efficient transmission or storage of dataUse less throughput for given dataTransmit more data for given throughput

Source Coding: Enabling TechnologyEnables new applicationsMakes applications economically feasible

ExamplesDistribution of digital imagesDistribution of digital audio (first mobile audio players)Digitial television (DVB-T, DVB-T2, DVB-S, DVB-S2)Internet video streaming (YouTube, Netflix, Amazon, ...)

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 2 / 31

Motivation

Practical Source Coding Examples

File CompressionCompress large document archivesGZIP, WinZIP, WinRar, ...

Audio CompressionCompress audio for storage on mobile deviceFLAC, MP3, AAC, ...

Image CompressionCompress images for storage and distributionRaw camera formats, JPEG, JPEG-2000, JPEG-XR, ...

Video CompressionCompress video for streaming, broadcast, or storageMPEG-2, H.264 | AVC, VP9, H.265 | HEVC, ...


Motivation

Types of Compression

Lossless Compression

Invertible / reversible form of data compressionOriginal input data can be completely recoveredRequired for document compression

Examples: Lempel-Ziv coding (GZIP), FLAC, JPEG-LS

Lossy CompressionNot invertibleOnly approximation of original input data can be recoveredAchieves much higher compression ratiosDominant form of compression for media data (audio, images, video)

Examples: MP3, AAC for audioJPEG, JPEG-2000 for imagesMPEG-2, H.264 | AVC, H.265 | HEVC for video


Source Data Sampling and Quantization

Source Data

Analog Signals

Continuous-time/space and continuous-amplitude signalsTypically resulting from physical measurementsMost signals in reality (audio and visual signals)

Digital Signals

Discrete-time/space and discrete-amplitude signalsOften generated from analog signalSometimes directly measured

Input Data for Source CodingRequire digital signalsNeed to be stored and processed with a computerCompression methods are computer programs



Analog-to-Digital Conversion

Sampling

Maps continuous-time/space signal into discrete-time/space signal

QuantizationMaps continuous-amplitude signal into discrete-amplitude signalSimplest case: Rounding



Impact of Sampling (Spatial Resolution)

400×300 samples 200×150 samples

100×75 samples 50×38 samples



Impact of Quantization (Bit Depth)

8 bits per component 4 bits per component

3 bits per component 2 bits per component



Raw Images and Videos

Single-Component ImageMatrix of integer samples

s[x , y ]

Characterized byNumber of samples in horizontaland vertical direction W×HSample bit depth B

x

y

Color ImagesThree color components (typically RGB or YCbCr)Additionally characterized by color sampling format

VideosSequence of imagesAdditionally characterized by frame rate



Color Sampling Formats

RGB YCbCr 4:4:4 YCbCr 4:2:2 YCbCr 4:2:0

mostcommonformat



Raw Data Rate

Raw Data RateBit rate R of raw data format

R = (samples per time unit) · (bit depth per sample) (1)

Example: Full High Definition (HD) Video

1920×1080 luma samples, 50 frames per second (Europe)4:2:0 chroma format (chroma components with 1/4 luma resolution)8 bits per sampleraw data rate = 50Hz · 1920 · 1080 · (1 + 2/4) · 8 bits ≈ 1.24Gbit/s

Example: Ultra High Defintion (UHD) Video

3840×2160 luma samples, 60 frames per second (USA, Japan)4:2:0 chroma format, 10 bits per sampleraw data rate = 60Hz · 3840 · 2160 · (1 + 2/4) · 10 bits ≈ 7.5Gbit/s


The Source Coding Problem

Typical Video Communication Scenario

capture

raw inputvideo samples

raw inputvideo samples

pre-processing

videoencodervideo

encoder

transmission channel (can be replaced by storage)

channelencodermodulatorchanneldemodulatorchannel

decoder

videodecodervideo

decoderpost-

processing

raw outputvideo samplesraw output

video samples

display andperception

encodedbitstream

receivedbitstream

no transmission errorsbitstream



Application Examples

HD movie UHD broadcaston Blu-ray Disc over DVS-S2

raw video format

1920×1080 luma samples 3840×2160 luma samples4:2:0 chroma format 4:2:0 chroma format8 bits per sample 10 bits per sample24 frames per second 60 frames per second

raw data rate ca. 600 Mbit/s ca. 7.5 Gbit/s

channel bit rate 36 Mbit/s (read speed) 58 Mbit/s (8PSK 2/3)

video bit rate ca. 20 Mbit/s ca. 25 Mbit/s

required compression ca. 30:1 ca. 300:1



The Basic Communication / Source Coding Problem

Basic Source Coding ProblemTwo equivalent formulations:

Representing source data with highest fidelity possiblewithin an available bit rate

and

Representing source data using lowest bit rate possiblewhile maintaining a specified reproduction quality

Source CodecSource codec: System of encoder and decoder

encoder decoderbitstream



Practical Communication / Source Coding Problem

Characteristics of Source Codecs / Overall Communication

Bit rate: Throughput of the communication channelQuality: Fidelity of the reconstructed signalDelay: Start-up latency, end-to-end delayComplexity: Computation, memory, memory access

Practical Source Coding Problem

Given a maximum allowed complexity and a maximum delay,achieve an optimal trade-off between bit rate and reconstructionquality for the transmission problem in the targeted application

In this course:Will concentrate on source codec

Ignore aspects of transmission channel (e.g., transmission errors)


Coding Efficiency

Image Compression Example

100%

Original Image (1024×678 image points, 945 KB)Lossless Compressed: GZIP 67.03%Lossless Compressed: PNG 46.50%Lossy Compressed: JPEG (Quality 95) 12.03%Lossy Compressed: JPEG (Quality 75) 3.18%Lossy Compressed: JPEG (Quality 50) 1.94%Lossy Compressed: JPEG (Quality 25) 1.11%Lossy Compressed: JPEG (Quality 1) 0.24%


Coding Efficiency

Trade-Off between Quality and Compression Ratio

Coding EfficiencyAbility to trade-off bit rate and reconstruction qualityWant best reconstruction quality for a given bitrate (or vice versa)

codec A

bettercodec B

bit rate

quality


Coding Efficiency

Coding Efficiency Example

1:100 Compression: JPEG1:100 Compression: H.265 | HEVC


Coding Efficiency

Measuring Coding Efficiency

Average Bit Rate

R =number of bits in bitstream for a video sequence

nominal duration of the video sequence(2)

Quality / Distortion

Typically: Mean square error (MSE) and peak-signal-to-noise ratio (PSNR)For a W×H array s[x , y ] of original samples, the array s ′[x , y ] ofreconstructed samples, and a sample bit depth B

MSE =1

W · H∑∀(x,y)

(s[x , y ]− s ′[x , y ]

)2 (3)

PSNR = 10 · log10

((2B − 1)2

MSE

)(4)

For videos: Average PSNR over entire video sequence


Basic Design of Image and Video Codecs

Image Compression Example: JPEG

Partition color components into 8× 8 blocks

Transform coding of 8× 8 blocks of samples

2D

transform

scalar

quantization

entropy

coding

block of

samples

sequence

of bits

2D inverse

transform

decoder

mapping

entropy

decoding

reconstructed

block



JPEG: Transform of Sample Blocks

Linear Transform: Matrix multiplication of the form c = A · sOrthogonal Transform: Rotation and reflection in signal spaceSeparable Transform: Reduced complexityIn practice: Discrete Cosine Transform (DCT) or approximation thereof

original block

horizontalDCT

verticalDCT

after 2d DCT

Effect of transformCompaction of signal energyQuantization is more efficient in transform domain



JPEG: Quantization

s

s'0 s'1 s'2 s'3 s'4s'-1s'-2s'-3s'-4

0 Δ 1·Δ 3·Δ 4·Δ-Δ-2·Δ-3·Δ-4·Δ

u-3 u-2 u-1 u0 u1 u2 u3 u4

Uniform reconstruction quantizersEqually spaced reconstruction levels (indicated by step size ∆)Simple decoder mapping

t ′ = ∆ · qEncoder has freedom to adapt decision thresholds to sourceSimplest encoder: Rounding

q = round(t/∆)

Quantization step size ∆ determines tradeoff between quality and bit rateT. Wiegand (TU Berlin) — Image and Video Coding: Introduction 22 / 31


JPEG: Entropy Coding

Scanning: Traverse quantization indexes from low to high frequency positions

0.242 0.108 0.053 0.009

0.105 0.053 0.022 0.002

0.046 0.017 0.006 0.001

0.009 0.002 0.001 0.000

probabilities P(qk 6= 0) zig-zag scan (JPEG)

Map list of numbers into bitstreamSimplest approach: Codeword tableOptimization problem:Minimize average codeword length

¯ =∑k

pk · `k

number codeword0 01 1002 1013 110004 110015 110106 11011· · · · · ·



Similarities between Successive Pictures in a Video Sequence

t



Video Codecs: Motion-Compensated Prediction

already coded reference picture

moving object

current picture

displaced object

𝑥

𝑦

𝒎 =𝑚𝑥

𝑚𝑦 displacement vector

for current block

current block

best-matching

block in

reference

picture FOOD FRESH

FOOD FRESH

FOOD FRESH

Predict current block using a displaced block in an already coded picture

s[x , y ] = s ′ref [x + mx , y + my ]

Displacement is characterized by displacement vector or motion vector

m = (mx ,my )T

Estimate suitable motion vector in encoder =⇒ Motion estimationTransmit motion vector to decoder as part of the bitstream



Example for Motion-Compensated Prediction



Hybrid Video Encoder

coding mode

decision

in-loop

filtering

motion

estimation

intra mode

decision

intra-picture

prediction

motion-comp.

prediction

transform &

quantization

scaling &

inv. transform

entropy

coding

encoder

control

bitstream

coding modes,

intra pred. modes,

motion data

𝑢𝑘[𝑥, 𝑦]

𝑢𝑘′ [𝑥, 𝑦]

𝑠𝑘∗[𝑥, 𝑦]

𝑠 𝑘[𝑥, 𝑦]

𝑠𝑘[𝑥, 𝑦]

input video pictures

(in coding order and

partitioned into blocks)

𝑠𝑘′ [𝑥, 𝑦]

buffer for

current picture

decoded

picture buffer


reconstructed video

(for monitoring)

𝑠𝑟′[𝑥+𝑚𝑥, 𝑦+𝑚𝑦]

𝑓( 𝑠𝑘∗ 𝑥, 𝑦 )

transform coefficient levels

and quantization parameters



Hybrid Video Decoder

in-loop

filtering

motion-comp.

prediction

intra-picture

prediction

scaling &

inv. transform

entropy

decoding

bitstream

transform coefficient levels

and quantization parameters

𝑢𝑘′ [𝑥, 𝑦]


𝑠 𝑘[𝑥, 𝑦]

𝑠𝑘′ [𝑥, 𝑦]

buffer for

current

picture

decoded

picture buffer


decoded video

𝑠𝑟′[𝑥+𝑚𝑥, 𝑦+𝑚𝑦]

𝑓( 𝑠𝑘∗ 𝑥, 𝑦 )

motion data

intra prediction

modes

coding modes


Outline

Outline of Course

Image and Video Coding I: FundamentalsLossless CodingRate-Distortion TheoryQuantizationPredictive CodingTransform Coding

Image and Video Coding II: Algorithms and ApplicationsAcquisition, Representation, Display, and Perception of ImagesVideo Coding OverviewVideo Encoder ControlIntra-Picture CodingInter-Picture CodingVideo Coding Standards


Outline

Literature I

Source CodingWiegand, T. and Schwarz, H., “Source Coding: Part I of Fundamentals ofSource and Video Coding,” Foundations and Trends in Signal Processing,Now Publishers, vol. 4, no. 1–2, 2011. (pdf available on web site)Cover, T. M. and Thomas, J. A., “Elements of Information Theory,” JohnWiley & Sons, New York, 2006.Gersho, A. and Gray, R. M., “Vector Quantization and Signal Compression,”Kluwer Academic Publishers, Boston, Dordrecht, London, 1992.Jayant, N. S. and Noll, P., “Digital Coding of Waveforms,” Prentice-Hall,Englewood Cliffs, NJ, 1994.

Human VisionWandell, B. A., “Foundations of Vision,” Sinauer Associates, Sunderland,1995.Hauske, G. “Systemtheorie der visuellen Wahrnehmung,” B. G. Teubner,Stutgart, 1994.


Outline

Literature II

Image and Video CodingSchwarz, H. and Wiegand, T., “Video Coding: Part II of Fundamentals ofSource and Video Coding,” Foundations and Trends in Signal Processing,Now Publishers, vol. 10, no. 1–3, 2016. (pdf available on web site)Bull, D. R., “Communicating Pictures: A Course in Image and VideoCoding,” Elsevier, 2014.Ohm, J.-R., “Multimedia Signal Coding and Transmission,” Springer, 2015.Pennebaker, W. and Mitchell, J., “JPEG Still Image Data CompressionStandard,” Van Nostrand Reinhold, New York, 1993.Vetterli, M. and Kovacevic, J., “Wavelets and Subband Coding,”Prentice-Hall, 1995.Wien, M., “High Efficiency Video Coding — Coding Tools andSpecifications,” Springer 2014.Sze, V., Budagavi, M., and Sullivan, G. J. (eds.), “High Efficiency VideoCoding (HEVC): Algorithm and Architectures,” Springer, 2014.


Download - Image and Video Coding I · BasicDesignofImageandVideoCodecs HybridVideoEncoder coding mode decision in-loop filtering motion estimation intra mode decision intra-picture predictionÛ

Top Related