High Performance Video Encoding Using NVIDIA GPUs

Abhijit Patait, Sr. Manager, GPU Multimedia SW

Post on 30-Jan-2022

TRANSCRIPT

Page 1: HIGH PERFORMANCE VIDEO ENCODING USING NVIDIA GPUS

HIGH PERFORMANCE VIDEO ENCODING USING NVIDIA GPUS

Abhijit Patait
Sr. Manager, GPU Multimedia SW

Page 2

AGENDA

• Overview of GPU video encoding
• NVIDIA video encoding capabilities
  - Kepler vs. Maxwell GPU capabilities
  - Roadmap
• Software API
• Performance & quality

Page 3

WHY GPU VIDEO ENCODING?

Page 4

BENEFITS OF ENCODING ON GPU

• Low power
  - Fixed-function hardware
  - Reduced memory transfers
• Low latency
• High performance
• Higher density
• Scalability
• Ease of programming
  - Linux, Windows, C/C++, application portability

Page 5

NVIDIA GPU VIDEO ENCODING CAPABILITIES

Page 6

NVIDIA GPU ENCODING CAPABILITIES

Feature                            Benefits
H.264 base, main, high profiles    Wide range of use cases
High performance (up to 16x HD)    “Blazing-speed” encoding
YUV 4:2:0 and 4:4:4 support        High-quality encoding without chroma subsampling
QP maps                            Customizable quality, region-of-interest encoding
MVC                                Full-resolution stereo encode
Up to 4096 × 4096 in HW            High-resolution encode
API - NV Encode SDK & GRID SDK     Flexible, Win/Linux, DirectX/CUDA
Independent of CUDA                Use CUDA and encode simultaneously
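The "QP maps" row can be made concrete with a small sketch. It builds a grid of per-macroblock QP offsets for a 1280 × 720 frame and lowers the QP (finer quantization) inside a rectangular region of interest. The function name, the offset values, and the row-major layout are assumptions made for illustration; the slides do not describe the layout NVENC expects, so the SDK documentation remains the authority. The only fixed fact used is that H.264 macroblocks are 16 × 16 pixels.

```python
MB_SIZE = 16  # H.264 macroblock edge length in pixels


def build_qp_delta_map(width, height, roi, roi_delta=-6, background_delta=0):
    """Return a row-major grid of QP offsets, one entry per macroblock.

    roi is (x, y, w, h) in pixels. Every macroblock the rectangle touches
    gets roi_delta (negative means finer quantization, i.e. higher quality).
    The default offsets are illustrative values, not SDK defaults.
    """
    mbs_x = (width + MB_SIZE - 1) // MB_SIZE   # round up partial macroblocks
    mbs_y = (height + MB_SIZE - 1) // MB_SIZE
    x, y, w, h = roi
    first_col, last_col = x // MB_SIZE, (x + w - 1) // MB_SIZE
    first_row, last_row = y // MB_SIZE, (y + h - 1) // MB_SIZE
    return [
        [
            roi_delta
            if first_row <= r <= last_row and first_col <= c <= last_col
            else background_delta
            for c in range(mbs_x)
        ]
        for r in range(mbs_y)
    ]


qp_map = build_qp_delta_map(1280, 720, roi=(480, 200, 320, 320))
print(len(qp_map), "rows x", len(qp_map[0]), "columns")
print(sum(v != 0 for row in qp_map for v in row), "macroblocks inside the ROI")
```

Rounding the ROI outward to whole macroblocks is a deliberate choice here: a partially covered macroblock is treated as inside the region, so the region of interest never loses quality at its edges.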

Page 7

VIDEO ENCODING — KEPLER VS. MAXWELL

Kepler (GK104, GK107, GK106, GK110, GK208)      Maxwell (GM107)
Planar 4:4:4                                    Standard 4:4:4 and H.264 lossless encoding
~240 fps 2-pass encoding @ 720p                 ~500 fps 2-pass encoding @ 720p
GRID K340/K520, K1/K2, Quadro, Tesla K10/K20    Current and future Maxwell GPU boards
GeForce – 2 full-speed encode sessions/GPU      GeForce – 2 full-speed encode sessions/GPU
NV Encode SDK 1.0, 2.0, 3.0 (now)               NV Encode SDK 4.0+ (May 2014)
GRID SDK 1.x, 2.2, 2.3 (now)                    GRID SDK 3.0+ (June 2014)
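As a rough reading of the throughput figures above, the sketch below converts an aggregate encode rate into a count of concurrent real-time streams. It assumes throughput divides evenly across sessions with no per-session overhead, which the slides do not state; `realtime_streams` is a hypothetical helper name, and the 30 and 60 fps targets are assumed stream rates.

```python
def realtime_streams(encoder_fps, stream_fps=30):
    """Whole number of streams at stream_fps that fit in encoder_fps of throughput.

    Assumes perfectly even sharing of the encoder between sessions.
    """
    return int(encoder_fps // stream_fps)


# Approximate 2-pass 720p throughput figures from the comparison above.
for name, fps in [("Kepler", 240), ("Maxwell", 500)]:
    print(
        f"{name}: ~{fps} fps at 720p -> "
        f"{realtime_streams(fps)} streams at 30 fps, "
        f"{realtime_streams(fps, 60)} streams at 60 fps"
    )
```

This is a throughput budget only. The session limit listed for GeForce (2 full-speed encode sessions per GPU) is a separate constraint that applies regardless of the available frame rate.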

Page 8

NVIDIA VIDEO ENCODING ROADMAP

• Performance improvements
• Quality improvements
  - 4:4:4 & lossless encoding
  - Rate control enhancements
  - Adaptive quantization
  - ROI, ME-only mode
• New video standards

Page 9

NVENC SOFTWARE APIS

Page 10

USING NVENC

NVENC SDK (Direct Encode)
• No capture
• Transcoding
• Archiving
• Video editing
• CUDA pre-process + encoding
• Granular encoder settings
• D3D, CUDA interop

GRID SDK (Capture + Encode)
• Capture + encode
• Optimized for low-latency apps
• Capture + CUDA pre-process + encoding
• Encoder settings optimized for streaming
• D3D, CUDA interop

Page 11

DIRECT ENCODE (NVENC SDK)

[Diagram. Blocks: Client application; NVENC API; NVENC driver, DirectX driver, CUDA driver; NVENC firmware + hardware. Labels: "Configure, Encode"; "Initialize, Configure HW"; "HW Encode"; "Encoded bitstream".]

Page 12

CAPTURE AND ENCODE (GRID SDK)

[Diagram. Blocks: Client application; NvFBC/NvIFR; NVENC driver, DirectX/OGL driver; NVENC hardware; GPU 3D engine. Labels: "Capture"; "YUV"; "DX/OGL Present"; "Encode"; "Encoded bitstream".]

Page 13

NVENC SDK

• Available on NVIDIA developer zone
  - https://developer.nvidia.com/nvidia-video-codec-sdk
  - Current release 3.0
  - Release 4.0 in May 2014 with Maxwell support
• Interface header, documentation, sample application
  - .dll/.so included in the driver
• Unified API for Windows and Linux
• Works on x86/x64
• Various APIs, presets, rate control modes for
  - Transcoding
  - Video conferencing
  - GTC Session S4654

Page 14

NVENC SDK (CONTD.)

• Advantages
  - Flexibility
    • Dynamic resolution/bitrate change
    • CABAC vs. CAVLC; low-level encoder settings, B-frames, sync vs. async, custom QP
    • Linux, Windows, DirectX, CUDA, OGL (via CUDA)
    • Also works on GeForce hardware (2 sessions/GPU)
  - Error concealment
    • Reference picture invalidation
    • Intra-refresh
  - Quality
    • Two-pass modes for higher quality
    • Various presets with quality/performance trade-off
    • 4:4:4 & lossless encoding (Maxwell only)

Page 15

GRID SDK ENCODE

• Available on NVIDIA developer zone
  - https://developer.nvidia.com/grid-app-game-streaming
  - Current release: 2.2
• Interface header, documentation, sample apps
  - .dll/.so included in the driver
• Windows and Linux
• Works on x86/x64
• Various presets and APIs for
  - Remote graphics (cloud gaming, remote desktop, capture & stream)
• Optimized for low latency

Page 16

GRID SDK (CONTD.)

• Advantages
  - Simplicity
    • Very simple API; single function call for capture + H.264 encode
  - Low latency, high performance
    • Optimized API
  - Error concealment
    • Reference picture invalidation
    • Intra-refresh
  - Quality
    • Two-pass modes for higher quality
    • 4:4:4 & lossless encoding (Maxwell only)

Page 17

PERFORMANCE AND QUALITY

Page 18

PERFORMANCE – 720P

[Chart: NVENC Performance at 720p, Low-Latency HP preset. Axis: 720p Performance (fps), 100 to 600. Rate control modes: 2_PASS_QUALITY, 2_PASS_FRAMESIZE_CAP, CBR_IFRAME_2PASS. Series: Maxwell at 505, 503, and 504 fps; Kepler (GRID) at 232, 232, and 231 fps.]

Performance measured on GRID K520 with GRID SDK NVENC performance benchmarking application

Page 19

PERFORMANCE – 1080P

[Chart: NVENC Performance at 1080p, Low-Latency HP preset. Axis: 1080p Performance (fps), 50 to 250. Rate control modes: 2_PASS_QUALITY, 2_PASS_FRAMESIZE_CAP, CBR_IFRAME_2PASS. Series: Maxwell at 238, 240, and 239 fps; Kepler (GRID) at 119, 118, and 118 fps.]

Performance measured on GRID K520 with GRID SDK NVENC performance benchmarking application
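The frame rates in the 720p and 1080p charts can be turned into a generation-to-generation speedup and an average encode time per frame. The sketch below does that arithmetic. It assumes the higher series in each chart is Maxwell, which matches the ~240 fps versus ~500 fps figures quoted in the Kepler vs. Maxwell comparison; the dictionary layout and the `summarize` helper are illustrative names, not part of any SDK.

```python
from statistics import mean

# Frame rates read from the two charts, one value per rate control mode.
# Assigning the higher series to Maxwell is an assumption (see lead-in).
FPS = {
    "720p": {"Maxwell": [505, 503, 504], "Kepler": [232, 232, 231]},
    "1080p": {"Maxwell": [238, 240, 239], "Kepler": [119, 118, 118]},
}


def summarize(fps_by_gpu):
    """Average the three modes, then derive speedup and per-frame encode time."""
    maxwell = mean(fps_by_gpu["Maxwell"])
    kepler = mean(fps_by_gpu["Kepler"])
    return {
        "speedup": maxwell / kepler,
        "maxwell_ms_per_frame": 1000.0 / maxwell,
        "kepler_ms_per_frame": 1000.0 / kepler,
    }


for resolution, data in FPS.items():
    s = summarize(data)
    print(
        f"{resolution}: Maxwell/Kepler speedup {s['speedup']:.2f}x, "
        f"{s['maxwell_ms_per_frame']:.2f} ms vs "
        f"{s['kepler_ms_per_frame']:.2f} ms per frame"
    )
```

The per-frame time is the reciprocal of throughput, so it is an average over a saturated encoder and not necessarily the latency a single low-latency stream would see.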

Page 20

ENCODING QUALITY VS X264 – ASSUMPTIONS

• Infinite GOP, IPPP…
• VBV buffer = bitrate/framerate
• x264
  - Zero latency
  - CRF = 24
  - Preset = faster
• NVENC
  - Preset = LOW_LATENCY_HQ
  - RC = 2-pass-quality
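The VBV rule in the list above (buffer = bitrate/framerate) sizes the buffer to hold one average frame. A minimal sketch of that arithmetic follows, using the 5 Mbps and 12 Mbps bitrates from the quality comparison clips. The frame rates are not given in the slides, so 30 and 60 fps are assumed purely for illustration, and `vbv_buffer_bits` is a hypothetical helper name.

```python
def vbv_buffer_bits(bitrate_bps, framerate):
    """VBV buffer size in bits when the buffer holds one average frame."""
    return bitrate_bps / framerate


# Bitrates from the comparison clips; the frame rates are assumed values.
for label, bitrate in [("720p, 5 Mbps", 5_000_000), ("1080p, 12 Mbps", 12_000_000)]:
    for fps in (30, 60):
        bits = vbv_buffer_bits(bitrate, fps)
        print(
            f"{label} @ {fps} fps: {bits / 1000:.1f} kbit "
            f"({bits / 8 / 1024:.1f} KiB) per frame"
        )
```

A one-frame buffer means no frame can borrow bits from its neighbours, which is why this setting suits low-latency streaming and also why it is a demanding test for rate control.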

Page 21

NVENC/X264 QUALITY COMPARISON

[Chart: Titan Fall 720p, 5 Mbps, Low-latency HQ. Series: PSNR NVENC, PSNR x264, SSIM NVENC, SSIM x264. Axes: PSNR Y (dB), 0 to 45; SSIM Y, 0.5 to 1.2; horizontal ticks 1 to 901.]

Page 22

NVENC/X264 QUALITY COMPARISON

[Chart: Bunny 1080p, 12 Mbps, Low-latency HQ. Series: PSNR NVENC, PSNR x264, SSIM NVENC, SSIM x264. Axes: PSNR Y (dB), 0 to 60; SSIM Y, 0.7 to 1.5; horizontal ticks 1 to 501.]

Page 23

QUALITY COMPARISON – PSNR

PSNR Comparison - x264 vs NVENC (PSNR Y, dB)

Clip                     PSNR NVENC   PSNR x264   PSNR Difference
Bunny 1080p              47.24 dB     43.71 dB     3.52 dB
NFS Rivals 720p          34.05 dB     33.18 dB     0.87 dB
NFS Rivals 1080p         35.51 dB     34.39 dB     1.12 dB
Titan Fall 720p          30.58 dB     29.78 dB     0.80 dB
Titan Fall 1080p         28.13 dB     30.63 dB    -2.50 dB
WoT - 3, 1280 × 768      34.15 dB     33.41 dB     0.74 dB
WoT - 12, 1280 × 768     35.60 dB     34.72 dB     0.87 dB
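For readers interpreting the PSNR figures above, the sketch below shows the standard PSNR definition for 8-bit samples, 10·log10(255² / MSE), and how a PSNR gap in dB maps to a ratio of mean squared errors. It operates on flat lists of luma samples; how the slide's numbers were actually measured (tool, frame averaging) is not stated, so this is only the textbook formula. The function names are illustrative.

```python
import math


def psnr(reference, encoded, peak=255):
    """PSNR in dB between two equal-length sequences of samples."""
    if len(reference) != len(encoded):
        raise ValueError("sequences must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, encoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak * peak / mse)


def mse_ratio(psnr_gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10 ** (psnr_gain_db / 10)


# Tiny luma example: every sample is off by exactly 1, so MSE = 1.
print(f"{psnr([10, 20, 30], [11, 21, 31]):.2f} dB")

# Gaps taken from the PSNR Difference column.
for clip, gain in [("Bunny 1080p", 3.52), ("Titan Fall 1080p", -2.50)]:
    print(f"{clip}: {gain:+.2f} dB -> MSE ratio {mse_ratio(gain):.2f}")
```

Because the scale is logarithmic, a difference of about 3 dB corresponds to roughly halving (or doubling) the mean squared error, which gives a sense of scale for the gaps listed per clip.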

Page 24

QUALITY COMPARISON – SSIM

SSIM Comparison - x264 vs NVENC (SSIM Y)

Clip                     SSIM NVENC   SSIM x264   SSIM Difference
Bunny 1080p              0.9874       0.9808       0.01
NFS Rivals 720p          0.9217       0.9103       0.01
NFS Rivals 1080p         0.9388       0.9269       0.01
Titan Fall 720p          0.8350       0.8073       0.03
Titan Fall 1080p         0.8309       0.8567      -0.03
WoT - 3, 1280 × 768      0.9101       0.8930       0.02
WoT - 12, 1280 × 768     0.9169       0.9027       0.01
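As background for the SSIM values above, here is a simplified sketch of the SSIM formula computed over a single window covering the whole signal. Standard SSIM averages this quantity over many small local windows, and the slides do not say which variant or tool produced the reported numbers, so this is a teaching simplification and will not reproduce them. The constants follow the commonly used choices K1 = 0.01 and K2 = 0.03 for 8-bit data.

```python
def global_ssim(x, y, peak=255):
    """Single-window SSIM between two equal-length sample sequences.

    Simplification: one window over all samples instead of the usual
    sliding local windows averaged across the image.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * peak) ** 2  # stabilizes the luminance term
    c2 = (0.03 * peak) ** 2  # stabilizes the contrast/structure term
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )


original = [10, 50, 90, 130]
distorted = [12, 48, 95, 125]
print(f"identical: {global_ssim(original, original):.4f}")
print(f"distorted: {global_ssim(original, distorted):.4f}")
```

SSIM reaches 1 only for identical signals, which gives some context for per-clip differences of 0.01 to 0.03 near the top of the scale.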

Page 25

QUESTIONS?