Image and Video Coding I
Thomas Wiegand
Technische Universität Berlin
Introduction
encoder decoderbitstream
Motivation
Motivation for Source Coding
Source Coding / Data Compression
Efficient transmission or storage of dataUse less throughput for given dataTransmit more data for given throughput
Source Coding: Enabling TechnologyEnables new applicationsMakes applications economically feasible
ExamplesDistribution of digital imagesDistribution of digital audio (first mobile audio players)Digitial television (DVB-T, DVB-T2, DVB-S, DVB-S2)Internet video streaming (YouTube, Netflix, Amazon, ...)
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 2 / 31
Motivation
Practical Source Coding Examples
File CompressionCompress large document archivesGZIP, WinZIP, WinRar, ...
Audio CompressionCompress audio for storage on mobile deviceFLAC, MP3, AAC, ...
Image CompressionCompress images for storage and distributionRaw camera formats, JPEG, JPEG-2000, JPEG-XR, ...
Video CompressionCompress video for streaming, broadcast, or storageMPEG-2, H.264 | AVC, VP9, H.265 | HEVC, ...
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 3 / 31
Motivation
Types of Compression
Lossless Compression
Invertible / reversible form of data compressionOriginal input data can be completely recoveredRequired for document compression
Examples: Lempel-Ziv coding (GZIP), FLAC, JPEG-LS
Lossy CompressionNot invertibleOnly approximation of original input data can be recoveredAchieves much higher compression ratiosDominant form of compression for media data (audio, images, video)
Examples: MP3, AAC for audioJPEG, JPEG-2000 for imagesMPEG-2, H.264 | AVC, H.265 | HEVC for video
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 4 / 31
Source Data Sampling and Quantization
Source Data
Analog Signals
Continuous-time/space and continuous-amplitude signalsTypically resulting from physical measurementsMost signals in reality (audio and visual signals)
Digital Signals
Discrete-time/space and discrete-amplitude signalsOften generated from analog signalSometimes directly measured
Input Data for Source CodingRequire digital signalsNeed to be stored and processed with a computerCompression methods are computer programs
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 5 / 31
Source Data Sampling and Quantization
Analog-to-Digital Conversion
Sampling
Maps continuous-time/space signal into discrete-time/space signal
QuantizationMaps continuous-amplitude signal into discrete-amplitude signalSimplest case: Rounding
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 6 / 31
Source Data Sampling and Quantization
Impact of Sampling (Spatial Resolution)
400×300 samples 200×150 samples
100×75 samples 50×38 samples
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 7 / 31
Source Data Sampling and Quantization
Impact of Quantization (Bit Depth)
8 bits per component 4 bits per component
3 bits per component 2 bits per component
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 8 / 31
Source Data Sampling and Quantization
Raw Images and Videos
Single-Component ImageMatrix of integer samples
s[x , y ]
Characterized byNumber of samples in horizontaland vertical direction W×HSample bit depth B
x
y
Color ImagesThree color components (typically RGB or YCbCr)Additionally characterized by color sampling format
VideosSequence of imagesAdditionally characterized by frame rate
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 9 / 31
Source Data Sampling and Quantization
Color Sampling Formats
RGB YCbCr 4:4:4 YCbCr 4:2:2 YCbCr 4:2:0
mostcommonformat
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 10 / 31
Source Data Sampling and Quantization
Raw Data Rate
Raw Data RateBit rate R of raw data format
R = (samples per time unit) · (bit depth per sample) (1)
Example: Full High Definition (HD) Video
1920×1080 luma samples, 50 frames per second (Europe)4:2:0 chroma format (chroma components with 1/4 luma resolution)8 bits per sampleraw data rate = 50Hz · 1920 · 1080 · (1 + 2/4) · 8 bits ≈ 1.24Gbit/s
Example: Ultra High Defintion (UHD) Video
3840×2160 luma samples, 60 frames per second (USA, Japan)4:2:0 chroma format, 10 bits per sampleraw data rate = 60Hz · 3840 · 2160 · (1 + 2/4) · 10 bits ≈ 7.5Gbit/s
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 11 / 31
The Source Coding Problem
Typical Video Communication Scenario
capture
raw inputvideo samples
raw inputvideo samples
pre-processing
videoencodervideo
encoder
transmission channel (can be replaced by storage)
channelencodermodulatorchanneldemodulatorchannel
decoder
videodecodervideo
decoderpost-
processing
raw outputvideo samplesraw output
video samples
display andperception
encodedbitstream
receivedbitstream
no transmission errorsbitstream
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 12 / 31
The Source Coding Problem
Typical Video Communication Scenario
capture
raw inputvideo samples
raw inputvideo samples
pre-processing
videoencodervideo
encoder
transmission channel (can be replaced by storage)
channelencodermodulatorchanneldemodulatorchannel
decoder
videodecodervideo
decoderpost-
processing
raw outputvideo samplesraw output
video samples
display andperception
encodedbitstream
receivedbitstream
no transmission errorsbitstream
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 12 / 31
The Source Coding Problem
Typical Video Communication Scenario
capture
raw inputvideo samples
raw inputvideo samples
pre-processing
videoencodervideo
encoder
transmission channel (can be replaced by storage)
channelencodermodulatorchanneldemodulatorchannel
decoder
videodecodervideo
decoderpost-
processing
raw outputvideo samplesraw output
video samples
display andperception
encodedbitstream
receivedbitstream
no transmission errorsbitstream
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 12 / 31
The Source Coding Problem
Application Examples
HD movie UHD broadcaston Blu-ray Disc over DVS-S2
raw video format
1920×1080 luma samples 3840×2160 luma samples4:2:0 chroma format 4:2:0 chroma format8 bits per sample 10 bits per sample24 frames per second 60 frames per second
raw data rate ca. 600 Mbit/s ca. 7.5 Gbit/s
channel bit rate 36 Mbit/s (read speed) 58 Mbit/s (8PSK 2/3)
video bit rate ca. 20 Mbit/s ca. 25 Mbit/s
required compression ca. 30:1 ca. 300:1
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 13 / 31
The Source Coding Problem
The Basic Communication / Source Coding Problem
Basic Source Coding ProblemTwo equivalent formulations:
Representing source data with highest fidelity possiblewithin an available bit rate
and
Representing source data using lowest bit rate possiblewhile maintaining a specified reproduction quality
Source CodecSource codec: System of encoder and decoder
encoder decoderbitstream
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 14 / 31
The Source Coding Problem
Practical Communication / Source Coding Problem
Characteristics of Source Codecs / Overall Communication
Bit rate: Throughput of the communication channelQuality: Fidelity of the reconstructed signalDelay: Start-up latency, end-to-end delayComplexity: Computation, memory, memory access
Practical Source Coding Problem
Given a maximum allowed complexity and a maximum delay,achieve an optimal trade-off between bit rate and reconstructionquality for the transmission problem in the targeted application
In this course:Will concentrate on source codec
Ignore aspects of transmission channel (e.g., transmission errors)
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 15 / 31
Coding Efficiency
Image Compression Example
100%
Original Image (1024×678 image points, 945 KB)Lossless Compressed: GZIP 67.03%Lossless Compressed: PNG 46.50%Lossy Compressed: JPEG (Quality 95) 12.03%Lossy Compressed: JPEG (Quality 75) 3.18%Lossy Compressed: JPEG (Quality 50) 1.94%Lossy Compressed: JPEG (Quality 25) 1.11%Lossy Compressed: JPEG (Quality 1) 0.24%
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 16 / 31
Coding Efficiency
Image Compression Example
100%
Original Image (1024×678 image points, 945 KB)Lossless Compressed: GZIP 67.03%Lossless Compressed: PNG 46.50%Lossy Compressed: JPEG (Quality 95) 12.03%Lossy Compressed: JPEG (Quality 75) 3.18%Lossy Compressed: JPEG (Quality 50) 1.94%Lossy Compressed: JPEG (Quality 25) 1.11%Lossy Compressed: JPEG (Quality 1) 0.24%
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 16 / 31
Coding Efficiency
Image Compression Example
100%
Original Image (1024×678 image points, 945 KB)Lossless Compressed: GZIP 67.03%Lossless Compressed: PNG 46.50%Lossy Compressed: JPEG (Quality 95) 12.03%Lossy Compressed: JPEG (Quality 75) 3.18%Lossy Compressed: JPEG (Quality 50) 1.94%Lossy Compressed: JPEG (Quality 25) 1.11%Lossy Compressed: JPEG (Quality 1) 0.24%
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 16 / 31
Coding Efficiency
Image Compression Example
100%
Original Image (1024×678 image points, 945 KB)Lossless Compressed: GZIP 67.03%Lossless Compressed: PNG 46.50%Lossy Compressed: JPEG (Quality 95) 12.03%Lossy Compressed: JPEG (Quality 75) 3.18%Lossy Compressed: JPEG (Quality 50) 1.94%Lossy Compressed: JPEG (Quality 25) 1.11%Lossy Compressed: JPEG (Quality 1) 0.24%
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 16 / 31
Coding Efficiency
Image Compression Example
100%
Original Image (1024×678 image points, 945 KB)Lossless Compressed: GZIP 67.03%Lossless Compressed: PNG 46.50%Lossy Compressed: JPEG (Quality 95) 12.03%Lossy Compressed: JPEG (Quality 75) 3.18%Lossy Compressed: JPEG (Quality 50) 1.94%Lossy Compressed: JPEG (Quality 25) 1.11%Lossy Compressed: JPEG (Quality 1) 0.24%
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 16 / 31
Coding Efficiency
Image Compression Example
100%
Original Image (1024×678 image points, 945 KB)Lossless Compressed: GZIP 67.03%Lossless Compressed: PNG 46.50%Lossy Compressed: JPEG (Quality 95) 12.03%Lossy Compressed: JPEG (Quality 75) 3.18%Lossy Compressed: JPEG (Quality 50) 1.94%Lossy Compressed: JPEG (Quality 25) 1.11%Lossy Compressed: JPEG (Quality 1) 0.24%
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 16 / 31
Coding Efficiency
Image Compression Example
100%
Original Image (1024×678 image points, 945 KB)Lossless Compressed: GZIP 67.03%Lossless Compressed: PNG 46.50%Lossy Compressed: JPEG (Quality 95) 12.03%Lossy Compressed: JPEG (Quality 75) 3.18%Lossy Compressed: JPEG (Quality 50) 1.94%Lossy Compressed: JPEG (Quality 25) 1.11%Lossy Compressed: JPEG (Quality 1) 0.24%
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 16 / 31
Coding Efficiency
Image Compression Example
100%
Original Image (1024×678 image points, 945 KB)Lossless Compressed: GZIP 67.03%Lossless Compressed: PNG 46.50%Lossy Compressed: JPEG (Quality 95) 12.03%Lossy Compressed: JPEG (Quality 75) 3.18%Lossy Compressed: JPEG (Quality 50) 1.94%Lossy Compressed: JPEG (Quality 25) 1.11%Lossy Compressed: JPEG (Quality 1) 0.24%
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 16 / 31
Coding Efficiency
Trade-Off between Quality and Compression Ratio
Coding EfficiencyAbility to trade-off bit rate and reconstruction qualityWant best reconstruction quality for a given bitrate (or vice versa)
codec A
bettercodec B
bit rate
quality
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 17 / 31
Coding Efficiency
Coding Efficiency Example
1:100 Compression: JPEG1:100 Compression: H.265 | HEVC
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 18 / 31
Coding Efficiency
Coding Efficiency Example
1:100 Compression: JPEG1:100 Compression: H.265 | HEVC
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 18 / 31
Coding Efficiency
Measuring Coding Efficiency
Average Bit Rate
R =number of bits in bitstream for a video sequence
nominal duration of the video sequence(2)
Quality / Distortion
Typically: Mean square error (MSE) and peak-signal-to-noise ratio (PSNR)For a W×H array s[x , y ] of original samples, the array s ′[x , y ] ofreconstructed samples, and a sample bit depth B
MSE =1
W · H∑∀(x,y)
(s[x , y ]− s ′[x , y ]
)2 (3)
PSNR = 10 · log10
((2B − 1)2
MSE
)(4)
For videos: Average PSNR over entire video sequence
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 19 / 31
Basic Design of Image and Video Codecs
Image Compression Example: JPEG
Partition color components into 8× 8 blocks
Transform coding of 8× 8 blocks of samples
2D
transform
scalar
quantization
entropy
coding
block of
samples
sequence
of bits
2D inverse
transform
decoder
mapping
entropy
decoding
reconstructed
block
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 20 / 31
Basic Design of Image and Video Codecs
JPEG: Transform of Sample Blocks
Linear Transform: Matrix multiplication of the form c = A · sOrthogonal Transform: Rotation and reflection in signal spaceSeparable Transform: Reduced complexityIn practice: Discrete Cosine Transform (DCT) or approximation thereof
original block
horizontalDCT
verticalDCT
after 2d DCT
Effect of transformCompaction of signal energyQuantization is more efficient in transform domain
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 21 / 31
Basic Design of Image and Video Codecs
JPEG: Quantization
s
s'0 s'1 s'2 s'3 s'4s'-1s'-2s'-3s'-4
0 Δ 1·Δ 3·Δ 4·Δ-Δ-2·Δ-3·Δ-4·Δ
u-3 u-2 u-1 u0 u1 u2 u3 u4
Uniform reconstruction quantizersEqually spaced reconstruction levels (indicated by step size ∆)Simple decoder mapping
t ′ = ∆ · qEncoder has freedom to adapt decision thresholds to sourceSimplest encoder: Rounding
q = round(t/∆)
Quantization step size ∆ determines tradeoff between quality and bit rateT. Wiegand (TU Berlin) — Image and Video Coding: Introduction 22 / 31
Basic Design of Image and Video Codecs
JPEG: Entropy Coding
Scanning: Traverse quantization indexes from low to high frequency positions
0.242 0.108 0.053 0.009
0.105 0.053 0.022 0.002
0.046 0.017 0.006 0.001
0.009 0.002 0.001 0.000
probabilities P(qk 6= 0) zig-zag scan (JPEG)
Map list of numbers into bitstreamSimplest approach: Codeword tableOptimization problem:Minimize average codeword length
¯ =∑k
pk · `k
number codeword0 01 1002 1013 110004 110015 110106 11011· · · · · ·
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 23 / 31
Basic Design of Image and Video Codecs
Similarities between Successive Pictures in a Video Sequence
t
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 24 / 31
Basic Design of Image and Video Codecs
Video Codecs: Motion-Compensated Prediction
already coded reference picture
moving object
current picture
displaced object
𝑥
𝑦
𝒎 =𝑚𝑥
𝑚𝑦 displacement vector
for current block
current block
best-matching
block in
reference
picture FOOD FRESH
FOOD FRESH
FOOD FRESH
Predict current block using a displaced block in an already coded picture
s[x , y ] = s ′ref [x + mx , y + my ]
Displacement is characterized by displacement vector or motion vector
m = (mx ,my )T
Estimate suitable motion vector in encoder =⇒ Motion estimationTransmit motion vector to decoder as part of the bitstream
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 25 / 31
Basic Design of Image and Video Codecs
Example for Motion-Compensated Prediction
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 26 / 31
Basic Design of Image and Video Codecs
Hybrid Video Encoder
coding mode
decision
in-loop
filtering
motion
estimation
intra mode
decision
intra-picture
prediction
motion-comp.
prediction
transform &
quantization
scaling &
inv. transform
entropy
coding
encoder
control
bitstream
coding modes,
intra pred. modes,
motion data
𝑢𝑘[𝑥, 𝑦]
𝑢𝑘′ [𝑥, 𝑦]
𝑠𝑘∗[𝑥, 𝑦]
𝑠 𝑘[𝑥, 𝑦]
𝑠𝑘[𝑥, 𝑦]
input video pictures
(in coding order and
partitioned into blocks)
𝑠𝑘′ [𝑥, 𝑦]
buffer for
current picture
decoded
picture buffer
𝑠𝑘∗[𝑥, 𝑦]
reconstructed video
(for monitoring)
𝑠𝑟′[𝑥+𝑚𝑥, 𝑦+𝑚𝑦]
𝑓( 𝑠𝑘∗ 𝑥, 𝑦 )
transform coefficient levels
and quantization parameters
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 27 / 31
Basic Design of Image and Video Codecs
Hybrid Video Decoder
in-loop
filtering
motion-comp.
prediction
intra-picture
prediction
scaling &
inv. transform
entropy
decoding
bitstream
transform coefficient levels
and quantization parameters
𝑢𝑘′ [𝑥, 𝑦]
𝑠𝑘∗[𝑥, 𝑦]
𝑠 𝑘[𝑥, 𝑦]
𝑠𝑘′ [𝑥, 𝑦]
buffer for
current
picture
decoded
picture buffer
𝑠𝑘∗[𝑥, 𝑦]
decoded video
𝑠𝑟′[𝑥+𝑚𝑥, 𝑦+𝑚𝑦]
𝑓( 𝑠𝑘∗ 𝑥, 𝑦 )
motion data
intra prediction
modes
coding modes
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 28 / 31
Outline
Outline of Course
Image and Video Coding I: FundamentalsLossless CodingRate-Distortion TheoryQuantizationPredictive CodingTransform Coding
Image and Video Coding II: Algorithms and ApplicationsAcquisition, Representation, Display, and Perception of ImagesVideo Coding OverviewVideo Encoder ControlIntra-Picture CodingInter-Picture CodingVideo Coding Standards
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 29 / 31
Outline
Literature I
Source CodingWiegand, T. and Schwarz, H., “Source Coding: Part I of Fundamentals ofSource and Video Coding,” Foundations and Trends in Signal Processing,Now Publishers, vol. 4, no. 1–2, 2011. (pdf available on web site)Cover, T. M. and Thomas, J. A., “Elements of Information Theory,” JohnWiley & Sons, New York, 2006.Gersho, A. and Gray, R. M., “Vector Quantization and Signal Compression,”Kluwer Academic Publishers, Boston, Dordrecht, London, 1992.Jayant, N. S. and Noll, P., “Digital Coding of Waveforms,” Prentice-Hall,Englewood Cliffs, NJ, 1994.
Human VisionWandell, B. A., “Foundations of Vision,” Sinauer Associates, Sunderland,1995.Hauske, G. “Systemtheorie der visuellen Wahrnehmung,” B. G. Teubner,Stutgart, 1994.
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 30 / 31
Outline
Literature II
Image and Video CodingSchwarz, H. and Wiegand, T., “Video Coding: Part II of Fundamentals ofSource and Video Coding,” Foundations and Trends in Signal Processing,Now Publishers, vol. 10, no. 1–3, 2016. (pdf available on web site)Bull, D. R., “Communicating Pictures: A Course in Image and VideoCoding,” Elsevier, 2014.Ohm, J.-R., “Multimedia Signal Coding and Transmission,” Springer, 2015.Pennebaker, W. and Mitchell, J., “JPEG Still Image Data CompressionStandard,” Van Nostrand Reinhold, New York, 1993.Vetterli, M. and Kovacevic, J., “Wavelets and Subband Coding,”Prentice-Hall, 1995.Wien, M., “High Efficiency Video Coding — Coding Tools andSpecifications,” Springer 2014.Sze, V., Budagavi, M., and Sullivan, G. J. (eds.), “High Efficiency VideoCoding (HEVC): Algorithm and Architectures,” Springer, 2014.
T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 31 / 31