4 imtc wiegand 131009
DESCRIPTION
Presentation about Video Communications with a focus on Video Coding. Covers H.264/MPEG-AVC, Lagrangian Coder Control, H.265/MPEG-HEVC, Immersive Video Communication, and Human Perception Measurement. Delivered at the IMTC 20th Anniversary Forum.
TRANSCRIPT
slide 2
slide 3
slide 4
§ H.264/MPEG-AVC
§ Lagrangian Coder Control
§ H.265/MPEG-HEVC
§ Immersive Video Communication
§ Human Perception Measurement
slide 5
[Figure: PSNR [dB] (28 to 40) versus Rate [kbit/s] (0 to 300); Foreman, 10 Hz, QCIF, 100 frames]
Curves:
Intra frame DCT coding (JPEG, 1990)
Integer-pel motion compensation (H.261, 1991)
Half-pel motion compensation (MPEG-1, 1993; MPEG-2, 1994)
Variable block size (16x16 – 8x8) (H.263, 1996) + quarter-pel motion compensation (MPEG-4, 1998)
Variable block size (16x16 – 4x4) + quarter-pel + multi-frame motion compensation (H.264/AVC, 2003)
Annotations: 35; Bit-rate Reduction: 75%
slide 6
[Figure: PSNR [dB] (28 to 40) versus Rate [kbit/s] (0 to 300); labels: H.264/AVC (2003), ?]
slide 7
1989: Digital TV – Digital Broadcast, DVD
1999: Birth of H.26L in Berlin
Today: >3 Billion devices with H.264/AVC; 50% of all bits on the Internet
Every HDTV Receiver
Every Blu-Ray Player
Most Internet Video
Countless Mobile Video
slide 8
[Figure: video encoder block diagram with embedded decoder]
Blocks: Input Video Signal (Split into Macroblocks, 16x16 samples), Coder Control, Transform/Scal./Quant., Scaling & Inv. Transform, De-blocking Filter, Intra-frame Prediction, Motion-Compensation, Intra/Inter, Motion Estimation, Entropy Coding, Decoder, Output Video Signal
Signals: Control Data, Quant. Transf. coeffs, Motion Data
slide 9
§ How to run the video encoder?
§ Decision between many options denoted with vector p
§ Constrained Problem:
$\min_{p} D(p) \quad \text{s.t.} \quad R(p) \le R_T$
§ Unconstrained Lagrangian Formulation:
$\min_{p} D(p) + \lambda \cdot R(p)$
with $\lambda$ controlling the rate-distortion trade-off
D - Distortion, R - Rate, $R_T$ - Target Rate, p - Parameter Vector
§ Minimization tests the various modes in video coding [Wiegand, et al., 1996]
slide 10
[Shoham & Gersho, 1989]
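The Lagrangian formulation from slide 9 can be illustrated with a minimal sketch. The mode names and the distortion and rate numbers below are hypothetical and chosen only to show how λ moves the decision; a real encoder would measure D(p) and R(p) by actually coding each candidate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    distortion: float  # D(p), e.g. sum of squared differences
    rate: float        # R(p), in bits


def lagrangian_choice(modes, lam):
    """Unconstrained problem: pick the mode minimizing J = D + lam * R."""
    return min(modes, key=lambda m: m.distortion + lam * m.rate)


def constrained_choice(modes, target_rate):
    """Constrained problem: minimize D subject to R <= target_rate."""
    feasible = [m for m in modes if m.rate <= target_rate]
    if not feasible:
        return None
    return min(feasible, key=lambda m: m.distortion)


# Hypothetical candidates for one macroblock (illustrative numbers only).
MODES = [
    Mode("SKIP", 900.0, 1.0),
    Mode("INTER_16x16", 400.0, 40.0),
    Mode("INTER_8x8", 250.0, 95.0),
    Mode("INTRA_4x4", 200.0, 180.0),
]

if __name__ == "__main__":
    for lam in (0.5, 1.5, 3.0, 20.0):
        print(lam, lagrangian_choice(MODES, lam).name)
    print(constrained_choice(MODES, target_rate=100.0).name)
```

A small λ favours low distortion and a large λ favours low rate. In this toy data set, λ = 1.5 selects the same mode as the constrained problem with a target rate of 100 bits.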
slide 12
[Figure: encoder block diagram as on slide 8 (Input Video Signal Split into Macroblocks, 16x16 pixels), shown with the block partitions below]
MB Types: 16x16 (0), 16x8 (0, 1), 8x16 (0, 1), 8x8 (0, 1, 2, 3)
8x8 Types: 8x8 (0), 8x4 (0, 1), 4x8 (0, 1), 4x4 (0, 1, 2, 3)
slide 13
§ Division of a picture into square blocks
§ Blocks are assigned to quadtrees
§ Maximum block size is signalled (e.g. 64x64)
§ Quadtree-based subdivision of tree block into prediction and transform blocks
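The quadtree subdivision listed above can be sketched as a recursive split decision. The cost model here is a toy assumption, not the standard's: each leaf is "coded" by its mean value, costs a fixed hypothetical 16 bits, and each split decision costs one flag bit, with the split chosen by comparing Lagrangian costs.

```python
def leaf_cost(img, x, y, size, lam, bits_per_leaf=16):
    """Toy cost of coding one block by its mean: SSD + lam * bits (assumed model)."""
    samples = [img[y + j][x + i] for j in range(size) for i in range(size)]
    mean = sum(samples) / len(samples)
    distortion = sum((s - mean) ** 2 for s in samples)
    return distortion + lam * bits_per_leaf


def split_quadtree(img, x, y, size, min_size, lam):
    """Return (cost, leaves); leaves is a list of (x, y, size) blocks."""
    if size <= min_size:
        # Smallest block: no split flag is needed.
        return leaf_cost(img, x, y, size, lam), [(x, y, size)]

    # Option 1: keep the block whole (one flag bit says "no split").
    whole = leaf_cost(img, x, y, size, lam) + lam * 1

    # Option 2: split into four quadrants (one flag bit says "split").
    half = size // 2
    split = lam * 1
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            cost, sub = split_quadtree(img, x + dx, y + dy, half, min_size, lam)
            split += cost
            leaves.extend(sub)

    if whole <= split:
        return whole, [(x, y, size)]
    return split, leaves


def make_test_image():
    """8x8 picture: value 10 everywhere, top-left 4x4 quadrant set to 200."""
    img = [[10] * 8 for _ in range(8)]
    for j in range(4):
        for i in range(4):
            img[j][i] = 200
    return img


if __name__ == "__main__":
    print(split_quadtree(make_test_image(), 0, 0, 8, min_size=2, lam=1))
```

With this picture the 8x8 block is split once into four flat 4x4 blocks, and no quadrant is split further because splitting a flat block only adds rate.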
slide 14
§ Transform sizes range from 4x4 to 32x32
§ Fast integer transforms specified
§ Additional new rectangular transforms proposed
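To illustrate what an integer transform is, the sketch below scales an orthonormal DCT-II basis and rounds it to integers. The scale factor of 128 is an assumption made for illustration, and the resulting matrix is not claimed to equal the standardized one; the matrices in the specification are tuned and may differ from plain rounding.

```python
import math


def integer_dct(n, scale=128):
    """Integer approximation of the n x n orthonormal DCT-II (scale is assumed)."""
    rows = []
    for k in range(n):
        norm = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([
            round(scale * norm * math.cos((2 * i + 1) * k * math.pi / (2 * n)))
            for i in range(n)
        ])
    return rows


def gram(matrix):
    """Return matrix times its transpose; off-diagonal zeros mean orthogonal rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix] for r1 in matrix]


if __name__ == "__main__":
    t4 = integer_dct(4)
    for row in t4:
        print(row)
    for row in gram(t4):
        print(row)
```

For the 4x4 case the rounded rows remain exactly orthogonal, while their norms differ slightly, which a codec would have to absorb in its scaling and quantization stage.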
slide 15
[Figure: PSNR (dB) (28 to 40) versus bit rate (kbit/s) (0 to 300); Foreman, 10 Hz, QCIF, 100 frames]
Curves: H.265/MPEG-HEVC; H.264/MPEG-AVC; MPEG-2; H.261; H.263 + MPEG-4 Visual; JPEG
Annotations: 35; Bit-rate Reduction: 50%
slide 16
[Figure: PSNR [dB] versus Bit Rate [kbit/s]; annotations: 50%, 50%]
slide 17
[Figure: PSNR [dB] versus Bit Rate [kbit/s]; annotations: 2.5 dB, 3.5 dB]
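The two kinds of annotation on these plots, a bit-rate reduction at equal PSNR and a PSNR gain at equal bit rate, can be read off a pair of rate-distortion curves as sketched below. The curve points are hypothetical, and interpolating PSNR linearly over the logarithm of the rate is an assumption of this sketch, not something stated on the slides.

```python
import math


def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the measured range")
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])


def rate_at_psnr(curve, psnr):
    """Rate [kbit/s] needed to reach a given PSNR [dB]."""
    rates, psnrs = zip(*curve)
    return math.exp(interp(psnr, list(psnrs), [math.log(r) for r in rates]))


def psnr_at_rate(curve, rate):
    """PSNR [dB] reached at a given rate [kbit/s]."""
    rates, psnrs = zip(*curve)
    return interp(math.log(rate), [math.log(r) for r in rates], list(psnrs))


def bitrate_reduction(ref, new, psnr):
    """Fraction of rate saved by `new` relative to `ref` at equal PSNR."""
    return 1.0 - rate_at_psnr(new, psnr) / rate_at_psnr(ref, psnr)


# Hypothetical (rate [kbit/s], PSNR [dB]) points, sorted by rate.
REF = [(50, 30.0), (100, 33.0), (200, 36.0), (400, 39.0)]
NEW = [(25, 30.0), (50, 33.0), (100, 36.0), (200, 39.0)]

if __name__ == "__main__":
    print(bitrate_reduction(REF, NEW, psnr=35.0))
    print(psnr_at_rate(NEW, 100) - psnr_at_rate(REF, 100))
```

Because the hypothetical new curve needs exactly half the rate everywhere, the sketch reports a 50% reduction at 35 dB and a 3 dB gain at 100 kbit/s.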
slide 18
slide 19
Source: D. Grois et al.
slide 20
Source: D. Grois et al.
slide 21
Final approval of version 1: April 14, 2013
What comes after version 1 of H.265/MPEG-HEVC?
→ The following H.265/MPEG-HEVC extensions are work in progress:
• Range Extensions (January 2014)
Higher bit depths (>10 bit), more chroma formats (4:4:4, 4:2:2), ...
• Scalable Coding (mid-2014)
• 3D Multiview and Depth (January 2014 and 2015)
slide 23
§ Whole conference situation is not sufficiently natural
§ Provision of eye-contact is limited
§ Awareness of gestures and body language is not fully supported
slide 24
Courtesy:
slide 25
slide 26
§ Multi-view video analysis
§ Calculation of a 3D model
§ Rendering of a novel virtual view
slide 27
slide 28
slide 29
slide 30
slide 31
8k theater (NHK) at IBC 2011
Virtual Stadium (NTT) at IBC 2005
Laser Dream Theatre (Sony) at Expo 2005
5k system (HHI) at NAB 2007
slide 32
Omnidirectional 6k camera system (OMNICAM)
Processing steps: Stitching, Segmentation, Coding, Transmission, Decoding, Warping, Blending
Multiprojection system: 7 HD Projectors, 6k Video
slide 34
§ Steady-state visual evoked potentials (VEPs) to objectively evaluate Visual Cortex response
§ Event-Related Potentials (ERPs) to objectively evaluate subjective processes
§ ERPs eventually lead to judgments and evaluation
slide 35
[Figure: path from Display to Human: Cornea, Lens, Retina, V1, V2, P3]
Subjective Assessment: MOS (excellent, good, fair, poor, bad)
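The MOS scale on this slide can be turned into a number as sketched below. The slide lists only the five labels; mapping them to the values 5 down to 1 is the usual convention for a five-level quality scale and is treated here as an assumption, as are the example ratings.

```python
from statistics import mean

# Assumed numeric values for the five labels (5 = best, 1 = worst).
SCALE = {"excellent": 5, "good": 4, "fair": 3, "poor": 2, "bad": 1}


def mean_opinion_score(ratings):
    """Average the viewers' category ratings into a single MOS value."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return mean([SCALE[r] for r in ratings])


if __name__ == "__main__":
    # Hypothetical ratings from four viewers of one test sequence.
    print(mean_opinion_score(["excellent", "good", "good", "fair"]))
```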
slide 36
slide 37
§ H.264/MPEG-AVC: More than 3 Billion devices and 50% of all bits on the Internet
§ H.265/MPEG-HEVC:
  § Lagrangian approach to coder control
  § 50% bit-rate reduction relative to H.264
§ Immersive rooms
  § Seamless integration and eye contact
  § Walls are becoming displays
§ H.266 – Research Frontier
  § Even higher resolutions and 3D
  § Improved subjective measures
slide 38
ITU-T VCEG & ISO/IEC MPEG Colleagues:
• Gary J. Sullivan
• Gisle Bjontegaard
• …
Vidyo
• Alex Eleftheriadis
• Ofer Shapiro
• …
HHI/TUB members and research associates
§ H. Schwarz, D. Marpe & D. Grois
§ P. Kauff & R. Schäfer
§ K.-R. Müller & A. Norcia
§ …
slide 39