High Performance GPU Video Encoding

Upload: ngotruc

Post on 02-Jan-2019


Page 1: High Performance GPU Video Encoding | GTC 2013 | on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf

High Performance GPU Video Encoding


Agenda

• Why GPU Video Encoding
• NVIDIA H.264 Video Encoding Solutions
• Hardware Architecture
• Software Architecture
• Performance


Why GPU Video Encoding?


Benefits of Encoding on GPU

• Low power
  - Fixed-function hardware
  - Reduced memory transfers
• Low latency
• High performance
• Higher density
  - 2x channel density @ ~50% power consumption
• Scalability
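The density bullet can also be read as a per-watt figure. The sketch below is only a back-of-the-envelope reading: it assumes both slide numbers are relative to the same CPU-encode baseline, which the slide does not state, and the function name is illustrative.

```python
def relative_channels_per_watt(density_ratio: float, power_ratio: float) -> float:
    """Channels per watt relative to a baseline normalized to 1.0."""
    return density_ratio / power_ratio

# Slide figures: 2x channel density at roughly 50% of the power.
gain = relative_channels_per_watt(density_ratio=2.0, power_ratio=0.5)
print(f"~{gain:.0f}x channels per watt versus the assumed baseline")
```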


Cloud Streaming – Encode on CPU

Pipeline stages (diagram labeled CPU, GPU, CPU):
• App: textures & vertices in sys-mem
• Textures & vertices in vid-mem
• Render & Present
• Captured image in vid-mem
• Transfer image to sys-mem
• Encode
• Packetize & transmit

Callouts:
• Power, latency: large memory transfers
• Power: CPU-intensive tasks
• Cost/seat, channel density: CPU-intensive tasks
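To get a feel for the "large memory transfers" callout, the rough estimate below compares moving raw captured frames to system memory with moving only an encoded bitstream. The frame format (8-bit YUV 4:2:0), the 1080p30 stream and the 10 Mbps bitrate are assumptions chosen for illustration; none of them is taken from this slide.

```python
def raw_frame_bytes(width: int, height: int, bits_per_pixel: float = 12.0) -> int:
    """Size of one uncompressed frame; 12 bits/pixel corresponds to 8-bit YUV 4:2:0."""
    return int(width * height * bits_per_pixel / 8)

def raw_stream_mb_per_s(width: int, height: int, fps: float) -> float:
    """Bandwidth needed to copy every raw frame, in megabytes per second."""
    return raw_frame_bytes(width, height) * fps / 1e6

def encoded_stream_mb_per_s(bitrate_mbps: float) -> float:
    """Bandwidth of the compressed bitstream (megabits/s converted to megabytes/s)."""
    return bitrate_mbps / 8

raw = raw_stream_mb_per_s(1920, 1080, 30)   # assumed 1080p at 30 fps
enc = encoded_stream_mb_per_s(10)           # assumed 10 Mbps bitstream
print(f"raw frames: {raw:.1f} MB/s, bitstream: {enc:.2f} MB/s, ratio: {raw / enc:.0f}x")
```

Under these assumptions the raw-frame copy is tens of times larger per stream than the bitstream, and the gap grows with every additional stream.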


Cloud Streaming – Encode on CPU

Callouts:
• Power, latency: large memory transfers
• Power: CPU-intensive tasks
• Limited scalability: fixed number of CPUs
• Cost/seat, channel density: CPU-intensive tasks

[Figure: the CPU-encode pipeline (App: textures & vertices in sys-mem; textures & vertices in vid-mem; Render & Present; captured image in vid-mem; transfer image to sys-mem; Encode; Packetize & transmit) repeated four times, followed by a row of "?" marks.]


Cloud Streaming – Encode on GPU

Pipeline stages (diagram labeled CPU, GPU, CPU):
• App: textures & vertices in sys-mem
• Textures & vertices in vid-mem
• Render & Present
• Captured image in vid-mem
• Encode
• Packetize & transmit

Callouts:
• Low power, low latency: no large memory transfers
• Low power: use CPU only where needed
• Cost/seat, channel density: use CPU only where needed


Cloud Streaming – Encode on GPU

Callouts:
• Low power, low latency: no large memory transfers
• Low power: use CPU only where needed
• Cost/seat, channel density: use CPU only where needed
• Excellent scalability: add GPUs as needed

[Figure: the GPU-encode pipeline (App: textures & vertices in sys-mem; textures & vertices in vid-mem; Render & Present; captured image in vid-mem; Encode; Packetize & transmit), with the vid-mem, render, capture and Encode stages repeated several times.]


NVIDIA H.264 Video Encoding Solutions


NVIDIA H.264 Video Encoding Solutions

CUDA Encoding
• Hybrid processing (CPU + CUDA)
• ME, intra-prediction, mode decision in CUDA
• VLE on CPU
• Performance scales with CUDA cores
• Works on all GPUs (Tesla, Fermi, Kepler, …)

NVENC
• Fully hardware accelerated
• ME, intra-prediction, mode decision, VLE
• High performance, low power
• Kepler+ GPUs


NVIDIA H.264 Video Encoding Solutions

CUDA Encoding
• Distributed with CUDA SDK libraries
• No low-latency streaming
• All platforms: GeForce, Quadro, Tesla, GRID
• Windows only

NVENC
• Proprietary software API
• Optimized for low-latency streaming
• Better visual quality
• Quadro, GRID and Tesla
• Windows & Linux


Power vs. Performance

[Figure: performance (n × HD, axis ticks 1 to 8) plotted against power (watts). Series: NVENC (HQ), NVENC (HP), CUDA (GK107), CUDA (GK104), CUDA (GF110), CUDA (GF104).]


NVENC Features

Feature | What it enables
H.264 base, main, high profiles | Wide range of use cases
Up to 8x HD encode (1080p @ 240 fps) | Faster than real-time encoding
Flexible ME, QP maps | Customizable quality, region-of-interest encoding
YUV 4:2:0 and planar 4:4:4 support | High-quality encoding without chroma subsampling
MVC | Full-resolution stereo encode
Up to 4096 × 4096 in HW | High-resolution encode
API | NVENC SDK (flexible API, Win/Linux, x86); GRID SDK (capture + encode; Windows now, Linux in future)
NVENC and CUDA parallelism | Simultaneous and parallel HW and CUDA encoding for increased performance
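As an illustration of what a QP map for region-of-interest encoding can look like, here is a minimal sketch that builds one QP offset per 16×16 macroblock and lowers the QP (finer quantization) inside a rectangle. The row-major layout, the function names and the delta values are assumptions for illustration; this is not the NVENC SDK's actual map format.

```python
MB_SIZE = 16  # H.264 macroblock size in pixels

def mb_grid(width: int, height: int) -> tuple:
    """Number of macroblock columns and rows needed to cover the frame."""
    cols = (width + MB_SIZE - 1) // MB_SIZE
    rows = (height + MB_SIZE - 1) // MB_SIZE
    return cols, rows

def roi_qp_delta_map(width, height, roi, roi_delta=-6, background_delta=0):
    """Row-major list of per-macroblock QP offsets (hypothetical layout).

    roi is (x, y, w, h) in pixels; every macroblock overlapping it gets roi_delta.
    """
    cols, rows = mb_grid(width, height)
    x, y, w, h = roi
    first_col, last_col = x // MB_SIZE, (x + w - 1) // MB_SIZE
    first_row, last_row = y // MB_SIZE, (y + h - 1) // MB_SIZE
    qp_map = []
    for r in range(rows):
        for c in range(cols):
            inside = first_col <= c <= last_col and first_row <= r <= last_row
            qp_map.append(roi_delta if inside else background_delta)
    return qp_map

cols, rows = mb_grid(1920, 1080)
qp_map = roi_qp_delta_map(1920, 1080, roi=(640, 360, 640, 360))
print(f"{cols} x {rows} macroblocks, {qp_map.count(-6)} inside the region of interest")
```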


NVENC Hardware Architecture


NVENC Arch: Microcontroller

[Block diagram: Microcontroller, DMA Controller, Motion Estimation, Mode Decision, Intra search & recon loop, Entropy Coding, Video memory (FB) interface, Memory, Host]

• NVIDIA proprietary microcontroller
• Runs firmware
• Programs encoder blocks
• Rate control


NVENC Arch: Motion Estimation

[Block diagram: same NVENC blocks as on the previous slide]

• Exhaustive full-pel search (L0, L1, Bi)
• Temporal
• Spatial
• Coloc
• Constant
• External
• Half-pel and quarter-pel refinement
• Motion compensation
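The "exhaustive full-pel search" bullet can be illustrated in software. The sketch below tests every integer displacement in a small window and keeps the one with the lowest sum of absolute differences (SAD). SAD as the cost, the block size and the search range are assumptions; the slide does not say which cost function or window the hardware uses.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a block in cur and a block in ref."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total

def full_search(cur, ref, cx, cy, size=4, search_range=2):
    """Try every integer displacement within +/- search_range; return (dx, dy, cost)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best

# Toy frames: the current frame is the reference shifted right by one pixel.
ref = [[(7 * x + 13 * y) % 31 for x in range(12)] for y in range(12)]
cur = [[ref[y][max(x - 1, 0)] for x in range(12)] for y in range(12)]
mv = full_search(cur, ref, cx=4, cy=4)
print(mv)  # (dx, dy, cost) of the best full-pel match
```

A half-pel and quarter-pel refinement stage would then interpolate the reference around this winner, which the sketch leaves out.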


NVENC Arch: Mode Decision

[Block diagram: same NVENC blocks as on the previous slides]

• Calculates inter-MB cost
• Compares to intra-MB cost and decides final winner
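The comparison described above might look like the following sketch. The Lagrangian form (distortion plus lambda times bits) is a common encoder cost and is assumed here; the slide does not say how NVENC computes either cost, and all names and numbers are illustrative.

```python
def rd_cost(distortion: float, bits: float, lam: float) -> float:
    """Assumed rate-distortion cost: distortion + lambda * bits."""
    return distortion + lam * bits

def decide_mode(inter, intra, lam=4.0):
    """inter and intra are (distortion, bits) pairs; ties go to inter in this sketch."""
    inter_cost = rd_cost(inter[0], inter[1], lam)
    intra_cost = rd_cost(intra[0], intra[1], lam)
    return ("inter", inter_cost) if inter_cost <= intra_cost else ("intra", intra_cost)

print(decide_mode(inter=(100, 20), intra=(90, 40)))   # good temporal prediction
print(decide_mode(inter=(500, 20), intra=(90, 40)))   # poor temporal prediction
```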


NVENC Arch: Intra Search & Recon Loop

[Block diagram: same NVENC blocks as on the previous slides]

• H.264 intra search
• Forward DCT & quantization
• Recon loop (IDCT, IQT, deblocking)
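The transform, quantization and reconstruction steps can be sketched on a single 4×4 residual block. This is a simplified stand-in: it uses a floating-point orthonormal DCT and a uniform quantizer rather than H.264's integer transform and QP tables, and it omits deblocking. The block values and step size are made up.

```python
import math

N = 4

def dct_matrix(n=N):
    """Orthonormal DCT-II basis (floating-point stand-in for the integer transform)."""
    m = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        m.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)])
    return m

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

C = dct_matrix()
CT = transpose(C)

def encode_block(residual, qstep):
    """Forward transform, then uniform quantization to integer levels."""
    coeffs = matmul(matmul(C, residual), CT)
    return [[round(v / qstep) for v in row] for row in coeffs]

def reconstruct_block(levels, qstep):
    """Recon loop: inverse quantization, then inverse transform, as a decoder would."""
    coeffs = [[lv * qstep for lv in row] for row in levels]
    return matmul(matmul(CT, coeffs), C)

residual = [[5, -3, 0, 2], [4, -2, 1, 1], [3, -1, 0, 0], [2, 0, -1, -1]]
qstep = 2.0
levels = encode_block(residual, qstep)
recon = reconstruct_block(levels, qstep)
max_err = max(abs(recon[j][i] - residual[j][i]) for j in range(N) for i in range(N))
print(levels)
print(f"max reconstruction error: {max_err:.3f}")
```

The encoder runs this reconstruction itself so that its reference frames match what the decoder will produce.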


NVENC Arch: Entropy Coding

[Block diagram: same NVENC blocks as on the previous slides]

• CAVLC & CABAC


NVENC Software Architecture


Using NVENC

NVENC SDK (Direct Encode)
• No capture
• Transcoding
• Archiving
• Video editing
• CUDA + encoding
• D3D, CUDA interop
• Exhaustive encoder settings

GRID SDK (Capture + Encode)
• Capture + encode
• Optimized for low-latency apps
• Limited encoder settings


Direct NVENC Encode (NVENC SDK)

[Diagram. Layers: Client application; NVENC API; NVENC Driver, DirectX Driver, CUDA Driver; NVENC firmware + hardware. Flow labels: Initialize, Configure, Encode; Configure HW; HW Encode; Encoded bitstream.]


Capture and Encode (GRID SDK)

[Diagram. Layers: Client application; GRID SDK; NVENC Driver, DirectX Driver; NVENC Hardware. Flow labels: GPU 3D Engine; DX/OGL Present; Capture; YUV; Encode; Encoded Bitstream.]


NVENC SDK

• Available on NVIDIA developer zone
  - https://developer.nvidia.com/nvidia-video-codec-sdk
• .DLL/.so, interface header, documentation, sample apps
• Unified API for Windows and Linux
• Works on x86/x64
• Various presets and APIs for
  - Transcoding
  - Video conferencing
  - Remote graphics (cloud gaming, remote desktop, capture & stream)
• Supports CBR, VBR rate control


NVENC SDK (Contd.)

• Advanced features
  - Dynamic resolution change
  - Dynamic bitrate change
  - Reference picture invalidation
  - Temporal SVC
  - Intra-refresh
  - Two-pass rate control for constant quality


GRID SDK Encode

• Licensed from NVIDIA
• .DLL/.so, interface header, documentation, sample apps
• Windows (now) and Linux (future)
• Works on x86/x64
• Various presets and APIs for
  - Remote graphics (cloud gaming, remote desktop, capture & stream)
• Optimized for low latency and high quality


NVENC Performance

Preset | Video 1080p | Gaming 1080p | Gaming 720p | Gaming 720p
HP | 223.21 fps | 216.45 fps | 483.09 fps | 485.44 fps
HQ (with 1 B-frame) | 116.69 fps | 122.55 fps | 263.85 fps | 290.70 fps
HQ (no B-frames) | 144.30 fps | 130.72 fps | 311.53 fps | 336.70 fps

[Figure: chart of the same data; axis from 0.00 fps to 600.00 fps; series HP, HQ (with 1 B-frame), HQ (no B-frames).]
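One way to read these throughput figures is as a number of real-time streams. The sketch below divides each measured rate by an assumed 30 fps per stream; that frame rate is an assumption, and the result ignores any per-session overhead, so treat it as an upper-bound estimate rather than a measured capacity.

```python
# Throughput figures copied from the slide, in the slide's column order.
measured_fps = {
    "HP": [223.21, 216.45, 483.09, 485.44],
    "HQ, 1 B-frame": [116.69, 122.55, 263.85, 290.70],
    "HQ, no B-frames": [144.30, 130.72, 311.53, 336.70],
}

def realtime_streams(fps: float, stream_fps: float = 30.0) -> int:
    """Whole number of stream_fps streams that the measured throughput could cover."""
    return int(fps // stream_fps)

for preset, row in measured_fps.items():
    print(preset, [realtime_streams(f) for f in row])
```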


Performance (n simultaneous HD encodes)

[Figure: encode throughput (axis from 0 fps to 160 fps) for four test cases: gijoe_highmotion.yuv (VC IPP), gijoe_highmotion.yuv (VC IBP), ghost_rider_long.yuv (VC IPP), ghost_rider_long.yuv (VC IBP). Legend: Target perf, N/A, 1 context through 10 contexts.]

• Encode bit rate = 30 Mbps
• Performance shown with more than 1 context is the sum of fps obtained from each context


Gaming Sequence (720p) – 5 Mbps

[Figure: Target vs. Actual curves; axis ticks 0 to 10 and 1 to 301.]


Gaming Sequence (720p) – 10 Mbps

[Figure: Target vs. Actual curves; axis ticks 0 to 15 and 1 to 701.]


Questions?