Ming Li | Senior Software Manager – Autonomous Machines
THE FUTURE OF ROBOTICS WITH JETSON AGX XAVIER
COMPUTERS WRITING SOFTWARE
[Diagram labels: DATA, PROGRAM, DEEP NEURAL NETWORK; PRE, POST]
ACCELERATED COMPUTING: 1000X EVERY 10 YEARS
[Chart: GPU performance vs. CPU performance, 1980 to 2020; vertical axis ticks at 10^3, 10^5, 10^7]
BILLIONS OF AUTONOMOUS MACHINES
Industrial Aerospace/Defense Healthcare Construction Agriculture Smart City
Retail Logistics Inventory Mgmt Delivery Inspection Service
ROBOTICS CONTROL LOOP
REASON & PLAN
PERCEPTION
ACT
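The perception, reason-and-plan, act loop named on this slide can be illustrated with a minimal sketch. Everything here is hypothetical: the `Observation` type, the 1.0 m stop distance, and the 0.5 m/s speed are illustrative assumptions, not part of any NVIDIA SDK or of the talk itself.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    obstacle_distance_m: float  # hypothetical range reading, in metres


def perceive(raw_distance_m: float) -> Observation:
    """Perception: turn a raw sensor value into a world-state estimate."""
    return Observation(obstacle_distance_m=max(0.0, raw_distance_m))


def reason_and_plan(obs: Observation, stop_distance_m: float = 1.0) -> str:
    """Reason & plan: choose a high-level action from the estimate."""
    return "stop" if obs.obstacle_distance_m < stop_distance_m else "forward"


def act(command: str) -> float:
    """Act: map the planned action to an assumed wheel speed in m/s."""
    return {"stop": 0.0, "forward": 0.5}[command]


def control_loop(sensor_readings):
    """Run one perceive -> plan -> act pass per sensor reading."""
    return [act(reason_and_plan(perceive(raw))) for raw in sensor_readings]


speeds = control_loop([5.0, 2.0, 0.8])
print(speeds)  # [0.5, 0.5, 0.0]
```

On a real robot each stage would be far heavier (DNN inference for perception, a planner for the middle stage, a motor controller for the last), which is where the compute budgets on the following slides come from.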
EXAMPLE: AI DELIVERY
Total: 20-30 TOPS
Sensor Fusion
HMI
Stereo Depth
Obstacle Detection
Tracking
Path Planning
Controller
Localization Mapping
VISION NETWORKS: Compute Demand
                     Input Size   GOPs/Frame   GOPs@30Hz
IMAGE RECOGNITION
  MobileNet          224x224      0.6          17
  AlexNet            227x227      0.7          22
  GoogleNet          224x224      2            60
  ResNet-50          224x224      4            120
  VGG19              224x224      20           600
OBJECT DETECTION
  YOLO-v3            416x416      65           1,950
  SSD-VGG            512x512      91           2,730
  Faster-RCNN        600x850      172          5,160
SEGMENTATION
  FCN-8S             384x384      125          3,750
  DeepLab-VGG        513x513      202          6,060
  SegNet             640x360      286          8,580
POSE ESTIMATION
  PRM                256x256      46           1,380
  Multipose          368x368      136          4,080
STEREO DEPTH DNN     1280x640     260          7,800
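The GOPs@30Hz column is the per-frame cost multiplied by a 30 Hz frame rate. The sketch below reproduces that arithmetic for a few rows and compares each against a 32 TOPS peak budget (the Jetson AGX Xavier headline figure). Treat the comparison as a rough upper bound only: it assumes peak throughput is fully achievable and that the slide's GOPs and the device's DL TOPS count the same kind of operation. Because the slide rounds the per-frame values, small rows such as MobileNet will not reproduce the listed figure exactly.

```python
FRAME_RATE_HZ = 30

# GOPs per frame, copied from a few rows of the slide's table.
GOPS_PER_FRAME = {
    "ResNet-50": 4,
    "YOLO-v3": 65,
    "Faster-RCNN": 172,
    "SegNet": 286,
    "Stereo Depth DNN": 260,
}


def sustained_gops(gops_per_frame: float, hz: int = FRAME_RATE_HZ) -> float:
    """Compute needed to run one network continuously at `hz` frames/s."""
    return gops_per_frame * hz


def fraction_of_budget(gops_per_frame: float, budget_tops: float = 32.0) -> float:
    """Share of a peak TOPS budget one network would consume (1 TOPS = 1000 GOPS)."""
    return sustained_gops(gops_per_frame) / (budget_tops * 1000)


for name, gops in GOPS_PER_FRAME.items():
    print(f"{name:18s} {sustained_gops(gops):7.0f} GOPS "
          f"({fraction_of_budget(gops):.1%} of 32 TOPS)")
```

Summing several rows shows why a full perception stack (detection plus segmentation plus stereo depth) lands in the tens of TOPS.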
RECENT DEPLOYMENTS OF AI IN ROBOTS
Optical Inspection: Musashi Seimitsu
Construction: Komatsu
Agriculture (Weeding): Blue River
Agriculture (Harvesting): Agrobot
Last Mile Delivery: Robby Technologies
Inventory Management: Fellow Robots
Drone: Aerialtronics
Logistics: IAM Robotics
GROUNDBREAKING RESEARCH
Object Detection & Pose Estimation Trained in Simulation
NVIDIA
Complex In-hand Manipulation Trained in Simulation
OpenAI
Reactive Path Planning in Dynamic Environment
NVIDIA, University of Washington
Physics Based Simulation of Flexible Gripper
NVIDIA
JETSON AGX XAVIER
World’s First AI Computer for Autonomous Machines
AI Server Performance in 30W | 15W | 10W
512 Volta CUDA Cores | 2x NVDLA
8 Core CPU
32 DL TOPS
XAVIER
World’s First Autonomous Machine Processor
Volta Tensor Core GPU: FP32 / FP16 / INT8 Multi-Precision | 512 CUDA Tensor Cores | 2.8 CUDA TFLOPS (FP16) | 22.6 Tensor Core DL TOPS
ISP: 2.4 GPIX/s | Native Full-Range HDR | Tile-based Processing
Vision Accelerator: 1.7 TOPS
Stereo & Optical Flow Engine: 2x 3.1 TOPS
Multimedia Engines: 1.2 GPIX/s Encode | 1.8 GPIX/s Decode
Video Image Compositor: 4 GPIX/s
16 Lane CSI: 109 Gbps CPHY 1.1
1 Gb Ethernet
DLA: 5.7 TFLOPS FP16 | 11.4 TOPS INT8
Carmel ARM64 CPU: 8 Cores | 10-wide Superscalar | 21 SpecInt2K6
Industry Standard High-Speed IO: PCIe Gen4 Root and Endpoint | USB 3.1 Gen2 Host and Device | UFS 2.1 Embedded Storage
256-Bit LPDDR4X: 137 GB/s
Most Complex SOC Ever Made | 9 Billion Transistors, 350 mm², 12nm FFN | ~8,000 Engineering Years
JETSON AGX XAVIER
                   | JETSON TX2 | JETSON AGX XAVIER
GPU                | 256 Core Pascal | 512 Core Volta
DL Accelerator     | - | NVDLA x 2
Vision Accelerator | - | VLA: 7-way VLIW Processor
CPU                | 6 core Denver and A57 CPUs | 8 core Carmel CPUs
Memory             | 8 GB 128 bit LPDDR4, 58.4 GB/s | 16 GB 256 bit LPDDR4x, 137 GB/s
Storage            | 32 GB eMMC | 32 GB eMMC
Video Encode       | 2x 4K @ 30, HEVC | 2x 4K @ 60 / 4x 4K @ 30, HEVC
Video Decode       | 2x 4K @ 30, 12 bit support | 2x 8K @ 30 / 8x 4K @ 30, 12 bit support
Camera             | Up to 6 cameras, CSI2 D-PHY 1.2, 2.5 Gbps/lane | Up to 8 cameras, CSI2 D-PHY 1.2, 2.5 Gbps/lane
Mechanical         | 50mm x 87mm, 400 pin connector | 100mm x 87mm, 699 pin connector
[Slide callout badges on the Jetson AGX Xavier column: New!, +2, x4, x2, x2, New!, New!, New!]
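The Memory row of the comparison gives both a bus width and a bandwidth for each module, so the implied per-pin data rate follows from simple arithmetic (bandwidth in GB/s times 8 bits per byte, divided by bus width in bits). This is a sketch of that calculation only; the function name is an illustrative choice and the result is derived from the slide's figures, not from a datasheet.

```python
def per_pin_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Implied data rate per memory pin, in Gbit/s."""
    return bandwidth_gb_s * 8 / bus_width_bits


tx2_rate = per_pin_gbps(58.4, 128)      # Jetson TX2: 128-bit LPDDR4
xavier_rate = per_pin_gbps(137.0, 256)  # Jetson AGX Xavier: 256-bit LPDDR4x

print(f"TX2:    {tx2_rate:.2f} Gbit/s per pin")
print(f"Xavier: {xavier_rate:.2f} Gbit/s per pin")
print(f"Bandwidth gain: {137.0 / 58.4:.2f}x")
```

Most of the bandwidth gain comes from doubling the bus width; the per-pin rate rises only modestly.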
JETSON AGX XAVIER: 20X PERFORMANCE IN 18 MONTHS
                               | Jetson TX2 | Jetson AGX Xavier | Gain
DL / AI (TOPS)                 | 1.3 | 32 | 24x
CUDA (TFLOPS, FP16)            | 1.3 | 11 | 8x
CPU (Cum. DMIPS)               | 55 | 112 | 2x
DRAM BW (GB/s)                 | 58 | 137 | 2.4x
CODEC (4K Encode and Decode)   | 2 | 8 | 4x
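The gain figures on this slide are the ratio of the Jetson AGX Xavier value to the Jetson TX2 value for each metric. A short sketch that recomputes them from the chart's bar values (the dictionary keys are labels chosen here for readability):

```python
# (Jetson TX2, Jetson AGX Xavier) bar values read from the slide.
SPECS = {
    "DL / AI (TOPS)": (1.3, 32),
    "CUDA (TFLOPS FP16)": (1.3, 11),
    "CPU (Cum. DMIPS)": (55, 112),
    "DRAM BW (GB/s)": (58, 137),
    "CODEC (4K streams)": (2, 8),
}


def speedup(old: float, new: float) -> float:
    """Generation-over-generation gain as a plain ratio."""
    return new / old


speedups = {name: speedup(tx2, xavier) for name, (tx2, xavier) in SPECS.items()}

for name, ratio in speedups.items():
    print(f"{name:20s} {ratio:5.1f}x")
```

The computed ratios round to the slide's 24x, 8x, 2x, 2.4x and 4x labels.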
JETSON AGX XAVIER
GPU Workstation Perf | 1/10th Power, Size
[Chart: AI Inference Performance, ResNet-50 images/sec, Core i7 + GTX 1070 vs. Jetson AGX Xavier: 1.4x]
[Chart: AI Inference Efficiency, ResNet-50 images/sec/W, Core i7 + GTX 1070 vs. Jetson AGX Xavier: 14x]
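The "1/10th Power" claim is consistent with the two chart ratios: efficiency is throughput divided by power, so the power ratio is the performance ratio divided by the efficiency ratio. A minimal sketch of that check, using only the 1.4x and 14x figures from the slide (the function name is illustrative):

```python
def relative_power(perf_ratio: float, efficiency_ratio: float) -> float:
    """Power ratio implied by a performance ratio and a perf-per-watt ratio.

    efficiency = perf / power, so power_ratio = perf_ratio / efficiency_ratio.
    """
    return perf_ratio / efficiency_ratio


# Jetson AGX Xavier relative to Core i7 + GTX 1070 on ResNet-50.
power_ratio = relative_power(perf_ratio=1.4, efficiency_ratio=14.0)
print(f"Implied power ratio: {power_ratio:.2f}")  # about one tenth
```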
AUTONOMOUS MACHINES PROTOTYPE
Intel Core i7 Skylake + GTX 1070
>200 W
28 TOPS
4000 cm³
Consumer grade
JETSON AGX XAVIER PRODUCTION PLATFORM
1 Jetson AGX Xavier
30 W
32 TOPS
600 cm³
Commercial grade
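The prototype and production figures can be reduced to two density metrics, TOPS per watt and TOPS per cubic centimetre. The sketch below assumes exactly 200 W for the workstation, although the slide says ">200 W", so the workstation's efficiency is an upper bound and the Xavier advantage a lower bound. The `Platform` class is a hypothetical helper, not part of any SDK.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    name: str
    tops: float
    watts: float
    volume_cm3: float

    @property
    def tops_per_watt(self) -> float:
        return self.tops / self.watts

    @property
    def tops_per_cm3(self) -> float:
        return self.tops / self.volume_cm3


# 200 W is an assumed floor for the slide's ">200 W".
workstation = Platform("Core i7 Skylake + GTX 1070", 28, 200, 4000)
xavier = Platform("Jetson AGX Xavier", 32, 30, 600)

power_gain = xavier.tops_per_watt / workstation.tops_per_watt
volume_gain = xavier.tops_per_cm3 / workstation.tops_per_cm3

print(f"TOPS/W gain:   at least {power_gain:.1f}x")
print(f"TOPS/cm3 gain: {volume_gain:.1f}x")
```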
NVIDIA ISAAC: PLATFORM FOR ROBOTICS
SIMULATION | DEVELOPMENT | DEPLOYMENT
[Diagram labels: ISAAC SIM; Virtual Sensors; Plugins; Physics Engine; Renderer; NVIDIA GPU Cloud; Mapping & Localization; Situation Understanding; Perception; Path & Task Planning; 32 DL TOPS; Hi-Bandwidth SerDes; Diversity & Redundancy; 7 Different Processors]
NVIDIA ISAAC SDK
Perception: Stereo DNN, 3D Pose Estimation, Object Detection, Model-based Tracking, Human Pose Detection
Navigation: Path Planning, Obstacle Avoidance, Mapping
Visualization: Plots, Custom Drawings, 3D Renderings, Inspection
Drivers and API: Camera, LIDAR, IMU, Pan/Tilt, Sound Source Loc.
Isaac Framework: Applications, Message Passing, Configuration, Node/Component Model, Behaviors, Record / Replay, Task Scheduling, Distributed Compute
Jetson AGX Xavier
Jetpack SDK ∙ CUDA ∙ TensorRT ∙ TensorFlow ∙ ONNX ∙ ROS
NVIDIA ISAAC GEMS
Object / People Detection | Global Localization | LQR Path Planner | Depth Estimation | Human Pose Estimation
Map Editor | Visual Odometry | Physical Simulation | Gesture Recognition | ASR
JETSON SDKS OVERVIEW
JETPACK SDK: FOR AI AT THE EDGE
DEEPSTREAM SDK: FOR VIDEO ANALYTICS
ISAAC SDK: FOR ROBOTICS
JETSON AGX XAVIER
EXAMPLE – VIDEO ANALYTICS
Typical application: 30+ TOPS
NVIDIA DEEPSTREAM
Zero Memory Copies
Typical multi-stream application: 30+ TOPS
JETSON AGX XAVIER ECOSYSTEM
SENSORS
SYSTEM DESIGN
SYSTEM SOFTWARE/TOOLS
MuJoCo
QUICK START PLATFORMS
AI SOFTWARE
RESEARCH
DISTRIBUTORS WORLDWIDE
JETSON AGX XAVIER DEVKIT
World’s First AI Computer for Autonomous Machines
AVAILABLE NOW
AI Server Performance in 30W | 15W | 10W
512 Volta CUDA Cores | 2x NVDLA
8 Core CPU
32 DL TOPS