4k video processing and streaming platform on tx1 · 2017-05-11 · 4k video processing and...

21
Tobias Kammacher Dr. Matthias Rosenthal Institute of Embedded Systems / High Performance Multimedia Research Group Zurich University of Applied Sciences 4K Video Processing and Streaming Platform on TX1

Upload: others

Post on 01-Aug-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Tobias Kammacher

Dr. Matthias Rosenthal

Institute of Embedded Systems /

High Performance Multimedia Research Group

Zurich University of Applied Sciences

4K Video Processing and Streaming

Platform on TX1

Page 2: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Goal

1. Live video streaming

o In 5 minutes

2. Bottlenecks

o GPU

o Kernel

2

Page 3: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Live Video StreamingConsumer-level YUV420. Seconds-latency.

3

Video Sources Mixing Encoding Stream

Color Space

Conversion

4K → 4Gbps

Scaling

Target

Bandwidth

Transport

Protocol

Gbps Mbps

Page 4: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Live Video Streaming - Dynamic

4

Video Sources Mixing Encoding Stream

Gbps Mbps

HLS

RTP

Page 5: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Classic Approach

5

FPGA

Video Sources Mixing Encoding Stream

CPU

PipelineEncoding

IP

Streaming

Application

Fixed

Implementation

Interface

Page 6: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Approach with TX1

6

Interfaces

CSI

PCIe

USB

Ethernet

Processing

GStreamer

MM API

CPU

GPU

DMAs

CODECs

H.264

H.265

VP8

Streaming

HLS

Mpeg-TS

RT(S)P

Page 7: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Software Frameworks

• GStreamer

– Pipeline-based Multimedia Framework

– Very easy to use (one-liner)

– Open-Source

• L4T Multimedia API (since L4T 24.2)

– Low-level APIs for application development

• GPU Integration

– CUDA

– OpenGL / EGL

7GStreamer is free software available under the terms of the LGPL license

OpenGL® and the oval logo are trademarks or registered trademarks of Silicon Graphics, Inc

Page 8: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Software Stack for Streaming

8

CPU

Video Source

Linux Kernel

GPU

OpenGL, EGL, Vulkan CUDA

V4L2, videobuf2

Drivers / Modules

ALSA

Display Ctrl Eth PHY

DRM/KMS/FB

Host1x / Graphics Host Eth Driver

TCP/IP/UDP

Sources Sinks Processing CODECs Stream

OpenMAX (omx)

GPU Driver

CODECs PCIe Ctrl

Sockets

GStreamer

Multimedia API

v4l2, alsa, tcp/udpxvideo, overlay

(omx), tcp/udp mix, scale, convert,

cuda, openGL

omx h264/h265,

libav, mp3

rtp, rtsp, hls,

mpeg-ts

libargus, V4L2 API NVOSD

Buffer utility

High-Level: VisionWorks/OpenCV, TensorRT, cuDNN, Custom Application

X11

VI (CSI)

v4l2-subdev

Convert

cuda, openGL

NvVideoEncoder,

NvVideoDecoder

HW

Kernel

Space

Libraries

User

Space

Page 9: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Modularity HW & SW

9

CPU

Video Source

GPU

Display Ctrl Eth PHY

Sources Sinks Processing CODECs Stream

CODECs PCIe Ctrl

GStreamerv4l2, alsa, tcp/udp

xvideo, overlay

(omx), tcp/udp mix, scale, convert,

cuda, openGL

omx h264/h265,

libav, mp3

rtp, rtsp, hls,

mpeg-ts

VI (CSI)

Convert

Page 10: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Kernel Driver for 4K Input

10

CPU

Video Source

Linux Kernel

V4L2, videobuf2

Modules /

Drivers

Graphics Host

Sources Sinks Processing CODECs Stream

GStreamer

v4l2, alsa, tcp/udp

VI (CSI)

v4l2-subdev

HW

Kernel

Space

User

Space

Nvidia Jetson

TX1

Development

Board

HDMI2CSI

module

Page 11: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Video Input

11

Capture HDMI sources

– 4K* requires 8 CSI lanes in our case (hardware limitation)

Jetson

TX1

Dev

Board

Toshiba

TC358840

HDMI 4K

HDMI 1080p

* 2160p30 YUV422

Ethernet

Stream

Process

Video

CSI

TC358840

Page 12: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Simple Video Streaming PipelineHLS

12

V4L2

Source

HLS

Sink

Gstreamer Pipeline

ConvertMPEG-

TS Mux

$ gst-launch-1.0 v4l2src !

videoconvert !

omxh265enc

bitrate=5000000 !

mpegtsmux !

hlssink

playlist-location=/var/www/playlist.m3u8

location=/var/www/segment%05d.ts

playlist-root=http://192.168.0.1

Encode

H.265

WebServer (lighttpd)

Page 13: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Video ProcessingScaling, Mixing

13

Mixing two sources (4K and 1080p)

V4L2

Source

Format

Convert

Render

HDMI

Gstreamer Pipeline

ScaleMix

(PiP)

V4L2

Source

Page 14: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Video ProcessingExample: Scaling, Mixing

14

4K Video

1080p Video

Logo

Images: CC BY-SA Wikimedia

Page 15: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Video ProcessingExample: Scaling, Mixing

15

Mixing two sources (4K and 1080p)

• CPU: Using compositor element: 1.2 FPS

• OpenGL (glvideomixer & glimagesink): 6.8 FPS

• Need a solution with better performance => GPU

V4L2

Source

Format

Convert

Render

HDMI

Gstreamer Pipeline

ScaleMix

(PiP)

V4L2

Source

gst-launch-1.0 v4l2src ! 'video/x-raw, format=UYVY, framerate=30/1, width=3840, height=2160' ! compositorname=comp sink_0::alpha=1 sink_1::alpha=0.5 ! xvimagesinksync=false videotestsrc pattern=1 ! 'video/x-raw,format=UYVY, framerate=30/1, width=1000, height=1000' ! comp.

0

5

10

15

20

25

30

35

PiP Pipeline FPS

CPU OpenGL Required

?

Page 16: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

GPU ProcessingGPU Memory Access Methods

16

TX1

CPU GPU

DRAM 4GB

Memory Controller

L2

Cache

Unified Virtual Addressing

L2

Cache

CPU

Buffer

GPU

Buffer

TX1

CPU GPU

DRAM 4GB

Memory Controller

L2

Cache

Zero Copy

L2

Cache

Shared

Buffer

TX1

CPU GPU

DRAM 4GB

Memory Controller

L2

Cache

Managed Memory

L2

Cache

Shared

Buffer

Page 17: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

GPU ProcessingPiP Test (GPU Data Transfer and Kernel Execution)

17* Upload 4K + 1080p, Download 4K

** One time only operation

Unified Virtual Addressing

Step 1: cudaMemcpy() to GPU * 12.5 ms

Step 2: Execute kernel 9-11 ms

Step 3: cudaMemcpy() to host * 7.2 ms

Zero Copy

Step 1: cudaMallocHost(): Allocate memory on host **

-

Step 2: Execute kernel 23.5 – 25.7 ms

Managed Memory

Step 1: cudaMallocManaged(): Allocate shared memory **

-

Step 2: Execute kernel 9-11 ms

Step 3: synchronize with CPU 0.2 ms

Total: 30 ms

Total: 25 ms

Total: 10 ms

Page 18: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

GPU ProcessingResults

• PiP pipeline achieves 30 FPS

– Using managed memory

Additional:

• Consecutive kernels executed

faster

18

0

5

10

15

20

25

30

35

PiP Pipeline FPS

CPU OpenGL GPU

Page 19: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

ConclusionHardware Mapping

19

Color

Space

Conversion

Scaling

Picture

in

Picture

Audio/Video

MuxEncryption

Transport

Protocol

Packer

Forward

Error

Correction

Recorder

Video

Input

Ethernet

Output

Audio

2nd Video Source

GPU

HW Block

CPU

H.264/H.265

Encoder

Gbps Mbps

Page 20: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Conclusion

• Streaming Pipeline in 5 Minutes!

• GStreamer is modular and easy to use

• Video Processing Bottleneck?

– GPU to the rescue!

20

Page 21: 4K Video Processing and Streaming Platform on TX1 · 2017-05-11 · 4K Video Processing and Streaming Platform on TX1 . Zürcher Fachhochschule Goal 1. Live video streaming o In 5

Zürcher Fachhochschule

Get started with video streaming now!

21

Blog: https://blog.zhaw.ch/high-performance/

4K Driver: https://github.com/ines-hpmm

Hardware Board: http://pender.ch/products_zhaw.shtml

[email protected]

[email protected]