Post on 29-Mar-2015

Parallel Scalability and Efficiency of HEVC Parallelization Approaches

Chi Ching Chi, Mauricio Alvarez-Mesa, Ben Juurlink, Gordon Clare, Félix Henry, Stéphane Pateux and Thomas Schierl

IEEE Transactions on Circuits and Systems for Video Technology

Outline

• Introduction

• Video codec parallelization approaches

• Coding efficiency analysis

• Experimental evaluation

• Conclusions

Introduction

• While a single-core processor can decode 1080p H.264/AVC video in real time, it is very unlikely that single-core performance will be enough to decode 2160p50 HEVC video in real time.

• To obtain real-time HEVC decoding performance, parallelism is no longer an option but a necessity.

Introduction

• H.264/AVC supports slice-level parallelization.

• A decoder that relies on slices may not reach real-time performance if it receives a video with only one or a few slices per frame.

• The main parallelization approaches currently included in the HEVC draft are Tiles and Wavefront Parallel Processing (WPP).

• This paper presents an approach called Overlapped Wavefront (OWF).

Previous parallelization strategies

• Frame-level parallelism

• Slice-level parallelism

• Macroblock-level parallelism

Frame-level parallelism

• Frame-level parallelism consists of processing multiple frames at the same time.

• Frame-level parallelism is sufficient for multicore systems with just a few cores.

• If motion vectors are long, for example due to fast motion, there is little parallelism, because a frame must then wait for a larger part of its reference frame to be decoded.

Slice-level Parallelism

• Each frame can be partitioned into one or more slices.

• Slices in a frame are completely independent from each other and therefore they can also be used for parallel processing.

• Slice-level parallelism helps only when a frame contains several slices; it offers no parallelism when a frame is coded as a single slice.
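
The dependence on slice count can be illustrated with a small calculation. This is only a sketch under a simplifying assumption that is not in the slides: all slices of a frame cost the same to decode and there is no scheduling overhead. The function name `ideal_slice_speedup` is hypothetical.

```python
import math


def ideal_slice_speedup(num_slices: int, num_cores: int) -> float:
    """Upper bound on speedup from slice-level parallelism in one frame.

    Assumes equal-cost, fully independent slices (an idealization).
    The frame needs ceil(num_slices / num_cores) rounds of slice decoding,
    compared with num_slices rounds on a single core.
    """
    if num_slices < 1 or num_cores < 1:
        raise ValueError("num_slices and num_cores must be positive")
    rounds = math.ceil(num_slices / num_cores)
    return num_slices / rounds


if __name__ == "__main__":
    for slices in (1, 4, 12):
        print(slices, "slices on 12 cores:", ideal_slice_speedup(slices, 12))
```

Under this model a frame with one slice gains nothing from extra cores, and a frame with four slices cannot use more than four cores, however many are available.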

Macroblock-level Parallelism

Parallelization Strategies in HEVC

• Tiles

• Wavefront Parallel Processing (WPP)

• Overlapped Wavefront (OWF)

Tiles

• The number of tiles and the location of their boundaries can be defined for the entire sequence or changed from picture to picture.

• Compared to slices, Tiles have better coding efficiency.

• The rate-distortion loss increases with the number of tiles.

Wavefront Parallel Processing (WPP)

Overlapped Wavefront (OWF)

• When a thread has finished a CTB row in the current picture and no more rows are available it can start processing the next picture instead of waiting for the current picture to finish.

• To support this approach, the motion vector length is constrained to ¼ of the picture height.
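
The effect of overlapping pictures can be sketched with a toy scheduling model. Everything below is an illustrative assumption rather than the authors' implementation: every CTB costs one time step, a CTB may start once its left neighbour and the above-right CTB are done (the usual two-CTB wavefront lag), and row r of a picture may start once row r + ceil(rows/4) of the previous picture is complete, which stands in for the ¼-picture-height motion vector constraint. Interpolation margins and in-loop filtering are ignored, and the name `simulate` is hypothetical.

```python
def simulate(num_pictures, rows, cols, threads, overlap):
    """Return the time steps needed to decode all pictures in a toy model.

    overlap=False: plain WPP, a picture starts only after the previous
                   picture is completely decoded.
    overlap=True:  OWF-style, an idle thread may take a row of the next
                   picture while the current picture is still in flight.
    """
    mv_rows = -(-rows // 4)  # ceil(rows / 4): assumed motion vector reach in CTB rows
    done = [[0] * cols and [0] * rows for _ in range(num_pictures)]  # CTBs done per row
    total_rows = num_pictures * rows
    next_row = 0               # next unassigned row in (picture, row) order
    active = [None] * threads  # (picture, row) currently held by each thread
    finished_rows = 0
    steps = 0

    while finished_rows < total_rows:
        # Hand out rows, in order, to idle threads.
        for t in range(threads):
            if active[t] is None and next_row < total_rows:
                p, r = divmod(next_row, rows)
                prev_done = p == 0 or done[p - 1][rows - 1] == cols
                if overlap or prev_done:
                    active[t] = (p, r)
                    next_row += 1

        # All threads decide based on the state at the start of the step.
        snapshot = [row[:] for row in done]
        for t in range(threads):
            if active[t] is None:
                continue
            p, r = active[t]
            c = snapshot[p][r]  # next CTB column in this row
            above_ok = r == 0 or snapshot[p][r - 1] >= min(c + 2, cols)
            ref_row = min(r + mv_rows, rows - 1)
            ref_ok = p == 0 or snapshot[p - 1][ref_row] == cols
            if above_ok and ref_ok:
                done[p][r] += 1
                if done[p][r] == cols:
                    active[t] = None
                    finished_rows += 1
        steps += 1
    return steps


if __name__ == "__main__":
    wpp = simulate(num_pictures=4, rows=8, cols=16, threads=8, overlap=False)
    owf = simulate(num_pictures=4, rows=8, cols=16, threads=8, overlap=True)
    print("WPP steps:", wpp, "OWF steps:", owf)
```

In this model the gain comes from the ramp-up and ramp-down phases of each picture's wavefront: under plain WPP threads sit idle there, whereas with overlap they are already working on rows of the next picture. With a single thread both modes take the same number of steps.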

Coding efficiency analysis

Experimental evaluation

• Environment

Conclusions

• We present a detailed performance comparison of the main approaches, namely WPP, Tiles and OWF.

• Tiles performance is 7% higher than WPP on average at 12 cores.

• The proposed OWF performs 28% higher on average than Tiles.

• Real-time performance is achieved for 1080p50 videos, but “only” 25.4 fps for 2160p.
