Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

XILINX TECHNOLOGY DAY. Yuan Gang (原钢), Xilinx AI Solutions Marketing Specialist, March 19, 2009. Xilinx Machine Learning-Based Autonomous Driving and ADAS Solutions


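Page 1: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

XILINX TECHNOLOGY DAY

Yuan Gang (原钢), Xilinx AI Solutions Marketing Specialist, March 19, 2009

Xilinx Machine Learning-Based Autonomous Driving and ADAS Solutions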

Page 2: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

© Copyright 2019 Xilinx

Xilinx ADAS Market Leadership

Over 12 years of semiconductor supplier heritage

CY13 - CY17: > 60% CAGR

40M+ cumulative units shipped

CY     Makes   Models
2013   9       13
2014   14      29
2015   19      64
2016   23      85
2017   26      96

Page 3: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Applications Already Covered by Xilinx Solutions

Auto Trailer Hitch
Full Display Mirror
Surround View
Front Camera - Mono
Front Camera - Stereo
EV Car Charger System
Heads Up Display
Driver Monitoring System
LiDAR

Page 4: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Xilinx Expands Its Automotive-Grade (XA) Product Family

Page 5: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Hardware Programmability Enables Higher-Performance Architectures

for (i = 0; i < num; ++i) {
    classification_process();
    hashing_process();
    encryption_process();
}

GPU implementation (parallelizing)
– The same kernel is run many times using multiple small cores.
– To run different applications, the GPU requires loading and unloading a different kernel.

FPGA implementation (parallelizing and pipelining)
– Thanks to pipelining, no kernel loading or unloading is required to run different applications.

Diagram: each implementation is drawn as four parallel rows of stages A, B, C.
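The contrast between the two implementations can be sketched in software terms. The Python model below is an illustration only: the three stage functions are hypothetical stand-ins for the slide's classification, hashing, and encryption steps, and the kernel-load counter is an assumed cost model, not a measurement of any GPU or FPGA.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical stand-ins for the three processes named on the slide.
def classification_process(x: int) -> int:
    return x % 7

def hashing_process(x: int) -> int:
    return (x * 31 + 17) % 1009

def encryption_process(x: int) -> int:
    return x ^ 0x5A

STAGES: List[Callable[[int], int]] = [
    classification_process,
    hashing_process,
    encryption_process,
]

def run_kernel_at_a_time(items: List[int]) -> Tuple[List[int], int]:
    """GPU-style model: one kernel is resident at a time and is applied
    to the whole batch before the next kernel is swapped in."""
    kernel_loads = 0
    data = list(items)
    for stage in STAGES:
        kernel_loads += 1                  # assumed cost: load the next kernel
        data = [stage(x) for x in data]    # same kernel over every item
    return data, kernel_loads

def run_pipelined(items: Iterable[int]) -> Iterator[int]:
    """FPGA-style model: all stages exist side by side, so each item
    streams through A -> B -> C with no kernel swap in between."""
    stream: Iterator[int] = iter(items)
    for stage in STAGES:
        stream = map(stage, stream)        # lazily chain the stages
    return stream

if __name__ == "__main__":
    batch = list(range(8))
    batch_out, loads = run_kernel_at_a_time(batch)
    print("kernel loads in the batch model:", loads)
    print("outputs agree:", list(run_pipelined(batch)) == batch_out)
```

Both functions compute the same results; the difference the model highlights is that the batch version pays one kernel swap per stage, while the pipelined version keeps every stage in place and lets data flow through.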

Page 6: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

OTA Silicon and Dynamic Functions

˃ Dynamic Function eXchange (DFX)
– Uses the same FPGA for mutually exclusive functions
– E.g., driver monitoring and valet parking
– Time-multiplexing the hardware allows a smaller FPGA
– System cost and size reduction with fewer silicon chips

˃ OTA Silicon
– OEMs require OTA updates to enable upgradability for new innovation in emerging applications like automated driving
– We provide software updates just like other SoC vendors, but we can go further by also updating the hardware: OTA Silicon

Page 7: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Automotive Modules

˃ 2D Object Detection
– Vehicle: car, SUV, bus…
– Pedestrian, cyclist, rider
– Traffic sign, traffic light

˃ 3D Object Detection

˃ Pose Estimation

˃ Lane Detection

˃ Drivable Space Detection

˃ Semantic Segmentation

Page 8: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

2D Object Detection

˃ 2D Object Detection
– Detection algorithms: SSD, Tiny YOLOv2, YOLOv2, Tiny YOLOv3, YOLOv3, Light-head RCNN, etc.
– Datasets: KITTI, Cityscapes, BDD100K, private data, etc.

Page 9: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

2D Object Detection

˃ SSD
– Dataset: BDD100K and private data
– Categories: Pedestrian, Car, Cyclist

GOPs (480*360)   Compress ratio   mAP (GPU)   FPS (DPU, dual core, ZCU102)
117              -                46.8        -
93.5             20%              46.7        -
69.7             40%              46.3        -
61               50%              48.6        -
46.9             60%              48.1        -
35.2             70%              49.4        -
28.9             75%              48.6        -
17.8             85%              47.5        -
12.1             90%              46.2        -
8.7              93%              44.3        -
6.3              95%              42.7        ~110 fps

Page 10: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

2D Object Detection: Small Object Detection

˃ RefineDet: small pedestrian detection
– The original SSD model's performance on the small pedestrian dataset is 24% (mAP)
– The RefineDet model's performance is now 31.8% (mAP)

˃ RefineDet pruning

˃ FPS: baseline 210G 25 fps, pruned 9.4G 101 fps (ZCU102, triple B4096@330MHz)

Chart: RefineDet compression

Pruning step   Operations (G)   mAP (%)
baseline       210              31.8
1              103.6            31.78
2              51.9             31.39
3              17.7             30.27
4              9.4              27.79

Page 11: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

2D Object Detection

˃ YOLOv2 performance after compression on customer's data

Chart: Pruning Speedup on Hardware (2x DPU @ ZU9), YOLOv2 single-class detection on customer's data

Pruning loop   Operations (G)   mAP (%)   FPS
Baseline       173              56.7      11.6
1              122              57.9      14.4
2              86               58.3      18.4
3              69               58.7      21.2
4              52               58.1      26.8
5              45               57.9      28.4
6              34               57.8      32.4
7              26.4             56.9      37.2
8              21.2             56.6      41.6
9              16.8             55.4      43.6
10             12.8             54.4      46.4

Speedups annotated on the chart: 2.8x and 4x
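The speedup annotations can be checked against the chart's series with a few lines of arithmetic. The sketch below assumes the three series are ordered baseline first, then pruning loops 1 to 10, as the x-axis suggests; under that reading, the 2.8x and 4x labels appear to line up with loops 6 and 10.

```python
# Series transcribed from the chart: baseline first, then pruning loops 1-10.
operations_g = [173, 122, 86, 69, 52, 45, 34, 26.4, 21.2, 16.8, 12.8]
map_percent = [56.7, 57.9, 58.3, 58.7, 58.1, 57.9, 57.8, 56.9, 56.6, 55.4, 54.4]
fps = [11.6, 14.4, 18.4, 21.2, 26.8, 28.4, 32.4, 37.2, 41.6, 43.6, 46.4]

def relative_to_baseline(series):
    """Divide every value by the first (baseline) value."""
    base = series[0]
    return [value / base for value in series]

speedups = relative_to_baseline(fps)               # fps gain over baseline
ops_fraction = relative_to_baseline(operations_g)  # remaining share of operations

if __name__ == "__main__":
    for loop, (s, o, m) in enumerate(zip(speedups, ops_fraction, map_percent)):
        label = "baseline" if loop == 0 else f"loop {loop}"
        print(f"{label:>8}: {s:.1f}x fps, {o:.0%} of baseline ops, mAP {m:.1f}%")
```

Note that the frame rate grows more slowly than the operation count shrinks: at loop 10 the model keeps about 7% of the baseline operations but runs 4x faster, not 13x.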

Page 12: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

2D Object Detection

˃ YOLOv3 performance after compression
– Dataset: Cityscapes
– Categories: Pedestrian, Car, Cyclist
– Platform: ZCU102, triple B4096@330MHz

GOPs (512*256)   Compress ratio   mAP (DarkNet)   mAP (DPU)   FPS (DPU)
53.7             -                53.7            53.1        43
24.5             54%              53.7            53.7        61
17.0             68%              54.0            53.4        74
13.7             75%              56.1            55.4        82
10.7             80%              55.4            52.9        86
7.5              86%              57.0            55.3        93
5.7              89%              55.2            53.0        97
4.0              93%              51.2            49.3        100
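The compress-ratio column is consistent with the fraction of baseline operations removed, that is, 1 - pruned GOPs / baseline GOPs. That definition is an assumption inferred from the numbers, not stated on the slide. The sketch below recomputes the ratio and adds a hypothetical helper, `smallest_model_meeting`, that picks the lightest model whose DPU mAP still meets a chosen floor.

```python
# Rows transcribed from the table:
# (GOPs, stated compress ratio in %, mAP DarkNet, mAP DPU, FPS DPU)
rows = [
    (53.7, None, 53.7, 53.1, 43),
    (24.5, 54, 53.7, 53.7, 61),
    (17.0, 68, 54.0, 53.4, 74),
    (13.7, 75, 56.1, 55.4, 82),
    (10.7, 80, 55.4, 52.9, 86),
    (7.5, 86, 57.0, 55.3, 93),
    (5.7, 89, 55.2, 53.0, 97),
    (4.0, 93, 51.2, 49.3, 100),
]

def compress_ratio_percent(gops, baseline_gops):
    """Assumed definition: share of baseline operations removed."""
    return 100.0 * (1.0 - gops / baseline_gops)

def smallest_model_meeting(table, min_map_dpu):
    """Return the row with the fewest GOPs whose DPU mAP is at least
    min_map_dpu, or None if no row qualifies (hypothetical helper)."""
    candidates = [row for row in table if row[3] >= min_map_dpu]
    return min(candidates, key=lambda row: row[0]) if candidates else None

if __name__ == "__main__":
    baseline = rows[0][0]
    for gops, stated, map_darknet, map_dpu, fps_dpu in rows[1:]:
        computed = compress_ratio_percent(gops, baseline)
        gap = map_darknet - map_dpu   # accuracy difference between the two columns
        print(f"{gops:5.1f} GOPs: computed {computed:.1f}% vs stated {stated}%, "
              f"DarkNet-to-DPU mAP gap {gap:.1f}")
    print("lightest model with DPU mAP >= 53.0:", smallest_model_meeting(rows, 53.0))
```

The recomputed ratios agree with the stated ones to within about one percentage point, which suggests the table values were rounded.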

Page 13: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

2D Object Detection

˃ SSD Lite
– Backbone: Mobilenet_v2 (ReLU version)
– Dataset: BDD100K
– Input size: 480*360
– Operations: 6.57G
– mAP: 32.9
– DPU (one core) FPS: 36 (ZU9), 21 (ZU2)

˃ Tiny YOLOv3
– Datasets: KITTI, Cityscapes, BDD100K, private data, etc.
– Input size: 416*416
– Operations: 5.9G
– DPU FPS: 170 (ZU9, dual core)

Page 14: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

3D Object Detection

˃ 3D Object Detection
– Reproduce the latest advanced 3D detection methods (F-PointNet and AVOD), which combine LiDAR point-cloud and RGB image information
– Optimize post-processing

Page 15: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Pose Estimation

˃ Driver Monitoring, Gesture Recognition

˃ Single-Person Pose Estimation (after person detection)
– Keypoints: head, neck, shoulder, elbow, wrist, hip, knee, ankle
– Model: CNN with coordinate regression
– 300k training images, 70k test images; PCKh@0.5: 90.25%

˃ Multi-Person Pose Estimation
– This model uses heatmaps to regress the joint locations and the connections between related joints
– The OKS of this model on the AI Challenger dataset is 0.32609
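For readers unfamiliar with the two metrics quoted above, the sketch below shows their usual definitions: PCKh@0.5 counts a keypoint as correct when it lies within half the head size of the ground truth, and OKS averages a Gaussian-shaped similarity over labeled keypoints. The slide does not give its evaluation code, so the function names, the per-keypoint constants `k`, and the toy coordinates are all illustrative assumptions.

```python
import math

def pckh(pred, gt, head_size, alpha=0.5):
    """Fraction of keypoints whose distance to the ground truth is at most
    alpha * head_size (alpha = 0.5 gives PCKh@0.5)."""
    correct = 0
    for (px, py), (gx, gy) in zip(pred, gt):
        if math.hypot(px - gx, py - gy) <= alpha * head_size:
            correct += 1
    return correct / len(gt)

def oks(pred, gt, visible, area, k):
    """Object keypoint similarity: mean over labeled keypoints of
    exp(-d^2 / (2 * area * k_i^2)), where area is the object scale squared
    and k_i is a per-keypoint constant (values here are assumptions)."""
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, ki in zip(pred, gt, visible, k):
        if v:
            d2 = (px - gx) ** 2 + (py - gy) ** 2
            total += math.exp(-d2 / (2.0 * area * ki ** 2))
            count += 1
    return total / count if count else 0.0

if __name__ == "__main__":
    # Toy example with four keypoints (made-up coordinates).
    gt = [(0, 0), (10, 13), (20, 20), (40, 30)]
    pred = [(0, 0), (10, 10), (20, 20), (30, 30)]
    print("PCKh@0.5:", pckh(pred, gt, head_size=10))
    print("OKS:", oks(pred, gt, visible=[1, 1, 1, 1], area=400.0, k=[0.1] * 4))
```

PCKh is a hard threshold per keypoint, while OKS degrades smoothly with distance and scales with object size, which is why the two numbers on the slide are not directly comparable.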

Page 16: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Lane Detection

˃ Motivation: detect lanes even when they are occluded by vehicles

˃ Algorithm: SCNN (left) and VPGNet (right)

˃ Dataset
– SCNN: 9,600 training and 1,300 test images from the SCNN dataset
– VPGNet: 1,000 training and 200 test images from the Caltech-lane dataset
– Input size: SCNN (800x288), VPGNet (640x480)

Page 17: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Lane Detection: Pruning

˃ VPGNet compression
– Dataset: 960 training and 240 test images captured from different scenes
– Evaluation metric: F1 score
– Compressed to 10% of the baseline, with a 2% drop in F1 score

Pruning step   Operation (G)   F1 score (%)
baseline       100             90
1              40              88.9
2              30              88.8
3              20              88.5
4              10              88
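As a reminder of how the evaluation metric is formed, the sketch below computes the F1 score from true positives, false positives, and false negatives. How a predicted lane is matched to a ground-truth lane is not described on the slide, so the counts in the example are made up.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when there are
    no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)

# F1 series transcribed from the chart (baseline, then pruning steps 1-4).
f1_percent = [90, 88.9, 88.8, 88.5, 88]
f1_drop = f1_percent[0] - f1_percent[-1]   # percentage points lost by step 4

if __name__ == "__main__":
    # Hypothetical counts: 90 matched lanes, 10 spurious, 10 missed.
    print("F1:", f1_score(tp=90, fp=10, fn=10))
    print("F1 drop from baseline to step 4:", f1_drop, "points")
```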

Page 18: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Semantic Segmentation

˃ Semantic Segmentation
– Use state-of-the-art algorithms for high performance
– Compress large models and try lightweight models to ensure both efficiency and performance

Algorithm                   Input size    Model backbone    Operations   IOU (%)   FPS @ input size (ZCU9)
WiderRes38                  1024 * 2048   wider-Resnet-38   10T          77.68     -
SegNet                      1024 * 2048   VGG 16            2.4T         56        -
FPN-Deephi                  1024 * 2048   Google_v1         136G         71.25     -
Deeplabv3+                  1024 * 2048   Mobilenet_v2      49G          70.88     -
ESPNet                      512 * 1024    -                 9.4G         63.64     21.48 @ 256 * 512
ENet                        512 * 1024    -                 9.36G        57.9      54.86 @ 256 * 512
FPN-Deephi (light weight)   256 * 512     Google_v1         9G           56.45     119 @ 256 * 512
Tiny-FPN                    512 * 512     -                 1.8G         60.2      117 @ 256 * 512
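The IOU column is presumably a mean intersection-over-union across classes; that reading is an assumption, since the slide does not define it. A minimal sketch of the metric on flattened label maps, with toy labels, looks like this:

```python
from typing import List, Sequence

def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean of per-class intersection / union over flattened label maps.
    Classes that appear in neither map are skipped."""
    intersection: List[int] = [0] * num_classes
    union: List[int] = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            intersection[p] += 1
            union[p] += 1
        else:
            union[p] += 1   # pixel predicted as p but labeled t
            union[t] += 1   # pixel labeled t but predicted as p
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0

if __name__ == "__main__":
    # Four pixels, two classes (made-up labels).
    print(mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2))
```

In the toy case, class 0 scores 1/2 and class 1 scores 2/3, so a single mislabeled pixel costs both classes, which is why IoU is stricter than pixel accuracy.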

Page 19: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Semantic Segmentation

˃ Semantic Segmentation

(a) Result of WiderRes38

(b) Result of FPN-Deephi (light weight)

Page 20: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Multi-task Learning

˃ Multi-task learning
– Shared feature-extraction backbone
– Improve accuracy through model architecture optimization
– Multi-task model including 2D box detection, orientation, and semantic segmentation (left)
– Multi-task model including object detection, lane detection, and drivable space detection (right)

Page 21: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Multi-task Learning: Pruning

˃ Multi-task: 2D box detection, orientation, and semantic segmentation
– Dataset: BDD100K (train: 6967, test: 988)

Network    Input size   Compression rate   Detection mAP (IOU>0.5)   Segmentation mIOU   Ops
VGG        288x512      0                  29%                       46.9%               106.5G
Resnet50   480x640      0                  42.4%                     48.3%               72.7G
Resnet50   480x640      0.5                42.1%                     47.1%               34.2G
Resnet50   480x640      0.6                40.5%                     45.8%               27.5G
Resnet50   480x640      0.8                32.0%                     39.6%               22.3G
Resnet18   480x640      0                  26.5%                     39.9%               27.7G
Resnet18   480x640      0.5                24.2%                     37.0%               14.0G

Page 22: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Multi-task Learning: Pruning

˃ Multi-task: object detection, orientation, lane detection, and drivable space detection
– Dataset: BDD100K (train: 6967, test: 988)

Network    Input size   Compression rate   Detection mAP (IOU>0.5)   Segmentation mIOU   Ops
VGG        288x512      0                  34.51%                    57.43%              103.5G
VGG        288x512      0.4                31.62%                    56.35%              60.9G
VGG        288x512      0.6                31.51%                    55.42%              40.8G
Resnet18   288x512      0                  24.80%                    56.26%              13.8G
Resnet18   288x512      0.4                24.00%                    54.83%              7.6G
Resnet18   288x512      0.5                23.27%                    54.30%              6.3G
Resnet18   288x512      0.6                23.42%                    53.58%              5.1G
Resnet50   288x512      0                  35.55%                    58.81%              34.1G
Resnet50   288x512      0.4                35.55%                    58.09%              18.9G
Resnet50   288x512      0.5                35.29%                    57.61%              15.7G
Resnet50   288x512      0.6                33.41%                    56.80%              12.5G

Page 23: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Multi-task Learning: Deployment on ZCU102

˃ 1CH multi-task model
– Platform: ZU9
– Network: ResNet18 + 2D box detection, orientation, and semantic segmentation
– Input size: detection 480 * 360
– Operations: detection 27.7G
– FPS: ~29 fps

Page 24: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Existing Customer Cases

Columns: Major application; Functions; Device; CNN demands; Target perf.

Front camera
– 2D object detection & classification; Zynq7020, ZU2/3/4/5; Yolo and Tiny Yolo, SSD, ResNet, Mobilenet v2; 5 FPS ~ 15 FPS
– Semantic segmentation; Zynq7020, ZU3/4; SegNet, FPN, ENet, ESPNet; 5 FPS ~ 15 FPS

SurroundView & Parking
– Multi-channel object detection; ZU5/ZU9; Yolo, SSD, Light-head RCNN; 10 FPS/CH ~ 30 FPS/CH

LiDAR
– Object detection; ZU3; SegNet, AVOD, F-PointNet; 15 FPS ~ 25 FPS

L2-L4 ECU
– 2D and 3D object detection; ZU9/ZU11; Yolo and Tiny Yolo, SSD, Complex Yolo; 10 FPS/CH ~ 30 FPS/CH
– Semantic segmentation; ZU9/ZU11; SegNet, FPN, ENet, ESPNet; 10 FPS/CH

Driver Monitoring
– Pose estimation; ZU9; OpenPose; 20 FPS

Page 25: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

ADAS Domain Controller: Cameras

Rear camera x1

Surround-view fisheye camera x4

Front camera x2

Front Cam (near): Detection & Segmentation

D Mode

R Mode

Fisheye Cam (1CH) @ turning: Segmentation

Fisheye Cam (4CH): Segmentation

Page 26: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

XILINX TECHNOLOGY DAY

Machine Learning-Based Consumer Electronics Solutions

Page 27: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Machine Learning in Consumer Electronics

Drones: obstacle recognition
Smart appliance: intelligent control
Set-top box: content recognition
Multi-function printer: quality enhancement
Projector: quality enhancement, SR
Camcorder: scenario recognition

Page 28: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Why Xilinx?

Software programmability of an ARM®-based processor combined with the hardware programmability of an FPGA

Easy-to-design single-chip solution

Programmable hardware for diverse interfaces

Fusion of multiple functions

Page 29: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Conceptual Zynq-Based Consumer Electronics System Architecture

Block diagram (labels only): PS (dual-core A9); peripherals (USB 2.0 x2, GPIOs, SPIs/I2Cs); DDRC; DVP interface (x2); DMA; PS (motor control); display controller; MIPI DSI; UI LCD; PL; POD motors (x3); motors, POD & others; QSPI flash; DDR3 (x2); LEDs; POD camera; face camera; AXI bus fabric; DPU

Page 30: Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Adaptable. Intelligent.

XILINX TECHNOLOGY DAY