exploiting temporal redundancy of visual structures for

Post on 13-Apr-2022

1 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Exploiting Temporal Redundancy

of Visual Structures for Video Compression

Georgios Georgiadis Stefano Soatto

Department of Computer Science,

University of California, Los Angeles,

Los Angeles, CA 90095, USA

{giorgos,soatto}@cs.ucla.edu

We present a video coding system that partitions the scene into “visual structures”and a residual “background” layer. The system exploits the temporal redundancy ofvisual structures to compress video sequences. We construct a dictionary of track-templates, which correspond to a representation of visual structures. We subsequentlychoose a subset of the dictionary’s elements to encode video frames using a MarkovRandom Field (MRF) formulation that places the track-templates in “depth” layers(Fig. 1). Our video coding system o↵ers an improvement over H.265/H.264 and othermethods in a rate-distortion comparison (Fig. 2) in four sequences from MOSEG1.

+

Input Visual Structures Layers (Excluding Background Layer) Reconstructed

Visual Structures

Input Frame

Reconstructed Visual Structures

Background Layer

Reconstructed Frame

≈ =

Figure 1: Reconstructing a frame. Visual structures are decomposed into depth layers andreconciled by overlaying them. The input frame is reconstructed by adding back to thevisual structures the background layer.

0 50 100 150 200

25

30

35

40

45

Bit Rate (kbps)

PSNR

(dB)

VS+h.265h.265VS+h.264h.264JPEG

0 50 100 150 200

25

30

35

40

45

Bit Rate (kbps)

PSNR

(dB)

VS+h.265h.265VS+h.264h.264JPEG

0 50 100 150 200

25

30

35

40

45

Bit Rate (kbps)

PSNR

(dB)

VS+h.265h.265VS+h.264h.264JPEG

0 50 100 150 200

25

30

35

40

45

Bit Rate (kbps)

PSNR

(dB)

VS+h.265h.265VS+h.264h.264JPEG

Figure 2: Typical results on four sequences (Cars-1,2, People-1,2). Top: PSNR against bitrate. “VS+H.265” (black) and “VS+H.264” (blue) outperform H.265 (yellow) and H.264(red) respectively. Bottom: Propagated and newly-created tracks. Non-transparent trackscorrespond to tracks that are motion-predicted from previous frames. Semi-transparenttracks are tracks that start in this frame.

1http://lmb.informatik.uni-freiburg.de/resources/datasets/

top related