![Page 1: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/1.jpg)
Autonomous Driving on Benchmarks
Xiaodi Hou
![Page 2: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/2.jpg)
TWO DECADES OF BENCHMARKING
![Page 3: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/3.jpg)
Two decades of benchmarking
• MNIST
– 1998
– Character recognition
– 60,000 images
• Inspired Convolutional Neural Net
![Page 4: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/4.jpg)
Two decades of benchmarking
• PASCAL-VOC
– 2005
– Object detection & classification
– 3787 images
• Inspired Deformable Part-based Model
![Page 5: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/5.jpg)
Two decades of benchmarking
• ImageNet
– 2010
– Object classification
– 1,000,000 images
• Inspired deep learning
![Page 6: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/6.jpg)
LIMITATIONS OF BENCHMARKS
![Page 7: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/7.jpg)
Upper bounds of benchmarks
• Objective tasks
– Measuring physical reality
– Bounded by measurement accuracy
– Stereo/Optical flow/Face recognition
• Intermediate tasks
• Subjective tasks
– Measuring human cognition
– Bounded by subject agreement
– Saliency/Memorability/Image captioning
![Page 8: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/8.jpg)
Imperfect benchmarks
• Marriage market in China
– Tall, rich, and handsome
• 80% of girls are forced to choose among
– tall poor ugly guy
– short rich ugly guy
– short poor handsome guy
• Dimensionality reduction
– Guaranteed information loss!
– A projection of $\mathbb{R}^n \to \mathbb{R}$
• Red or Blue?
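A minimal sketch of the information loss, assuming a hypothetical equal-weight linear score over three made-up attributes (the weights and candidate vectors are illustrative, not from the talk). It shows distinct points in a 3-dimensional space collapsing onto the same scalar:

```python
# Hypothetical candidates scored on (height, wealth, looks), each in [0, 1].
candidates = {
    "tall_poor_ugly": (1.0, 0.0, 0.0),
    "short_rich_ugly": (0.0, 1.0, 0.0),
    "short_poor_handsome": (0.0, 0.0, 1.0),
}

# Assumed equal weights; any fixed projection R^n -> R has the same problem.
weights = (1 / 3, 1 / 3, 1 / 3)


def score(features, weights):
    """Project an n-dimensional feature vector onto a single scalar."""
    return sum(f * w for f, w in zip(features, weights))


scores = {name: score(vec, weights) for name, vec in candidates.items()}

# Count how many distinct inputs and how many distinct scores there are.
distinct_vectors = len(set(candidates.values()))
distinct_scores = len({round(s, 9) for s in scores.values()})
print(distinct_vectors, distinct_scores)
```

Once the score is computed, nothing in it says which of the three candidates produced it, so a leaderboard built on that single number cannot rank them.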
![Page 9: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/9.jpg)
Signs of a fading benchmark
• Saturated competition
– Labeled Faces in the Wild (0.9978 ± 0.0007)
• Weak transferability
– Middlebury Optical Flow → KITTI Optical Flow
• Poor inter-subject consistency
– Image captioning and BLEU scores
• A man throwing a frisbee in a park.
• A man holding a frisbee in his hand.
• A man standing in the grass with a frisbee.
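To make the consistency problem concrete, here is a simplified sketch that scores the three captions above against each other with clipped n-gram precision. This is only the core ingredient of BLEU, not the full metric (no brevity penalty, no geometric mean over n-gram orders), and the tokenizer is a deliberately naive assumption:

```python
from collections import Counter

captions = [
    "A man throwing a frisbee in a park.",
    "A man holding a frisbee in his hand.",
    "A man standing in the grass with a frisbee.",
]


def tokenize(text):
    """Naive tokenizer: lowercase, drop the final period, split on spaces."""
    return text.lower().rstrip(".").split()


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference."""
    cand = ngrams(tokenize(candidate), n)
    ref = ngrams(tokenize(reference), n)
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0


for i, cand in enumerate(captions):
    for j, ref in enumerate(captions):
        if i != j:
            p1 = ngram_precision(cand, ref, 1)
            p2 = ngram_precision(cand, ref, 2)
            print(f"caption {i} vs {j}: unigram={p1:.2f} bigram={p2:.2f}")
```

All three sentences are plausible descriptions of one image, yet none reaches a precision of 1.0 against another, so the score a system receives depends heavily on which annotator's caption is used as the reference.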
![Page 10: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/10.jpg)
BENCHMARKS AND AUTONOMOUS DRIVING
![Page 11: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/11.jpg)
Vision-based autonomous driving benchmarks
• KITTI & CityScapes
– Detection
– Tracking
– Stereo/Flow
– SLAM
– Semantic segmentation
• 100% traditional vision challenges
• Are we ready?
![Page 12: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/12.jpg)
Not yet…
![Page 13: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/13.jpg)
Challenge 1: Data distribution
• Academia
– Average performance
• Silicon valley startup
– Demo oriented
– Best case performance
• Real products
– Murphy’s law
– Worst case performance
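The three viewpoints can be read as three different statistics over the same per-scenario results. A small sketch with made-up accuracy numbers (the scenario names and values are purely hypothetical):

```python
from statistics import mean

# Hypothetical per-scenario accuracy of one perception model (made-up numbers).
scenario_accuracy = {
    "clear_day": 0.98,
    "night": 0.91,
    "heavy_rain": 0.74,
    "low_sun_glare": 0.62,
}

average_case = mean(scenario_accuracy.values())  # what a benchmark table reports
best_case = max(scenario_accuracy.values())      # what a demo shows
worst_case = min(scenario_accuracy.values())     # what a product must survive
worst_scenario = min(scenario_accuracy, key=scenario_accuracy.get)

print(f"average={average_case:.3f} best={best_case:.2f} "
      f"worst={worst_case:.2f} ({worst_scenario})")
```

The same model looks very different depending on which of the three numbers is reported, which is why the distribution of the evaluation data matters as much as the metric.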
![Page 14: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/14.jpg)
Challenge 2: Ground-truth representation
• Bbox
– Almost no bbox in the real world!
– Missing hidden variables (distance & velocity)
• Semantic segmentation
– “pixel classification”
– How to assemble all the pixels?
• Stixels
– Representing the world using matchsticks
– Distance and 3D shape
– Missing the notion of whole objects
![Page 15: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/15.jpg)
Challenge 3: Structured prior
• What’s wrong with end-to-end learning?
![Page 16: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/16.jpg)
Challenge 3: Structured prior
• Two types of priors:
– Implicit prior
• Data driven (e.g. images)
• Good for deep learning models
– Explicit prior
• Rule driven (e.g. cars cannot fly)
• Good for probabilistic models
• The road ahead
– An image based problem with strong explicit priors
![Page 17: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/17.jpg)
TUSIMPLE CHALLENGES! WORKSHOP @ CVPR 2017
![Page 18: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/18.jpg)
TuSimple Challenge 1: Lane challenge
![Page 19: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/19.jpg)
TuSimple Challenge 1: Lane challenge
• Deep learning for lane?
– Parametrization of pixels
• Strong structure priors
– ~ 3.75m lane width
– Parallel lines
– (almost) flat road surface
• Over-representing corner cases
– 20% hard cases (heavy occlusion, strong lighting changes, bad markings), which would be unlikely to occur if sampled uniformly
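One way to read the corner-case point is as stratified sampling. The sketch below assumes a hypothetical clip pool in which hard cases are only 2% of recorded driving (that rate, the pool size, and the field names are all made up for illustration) and compares a uniform draw with a draw that fixes the hard share at 20%:

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical clip pool: assume hard cases make up only 2% of recorded driving.
pool = [{"id": i, "hard": i % 50 == 0} for i in range(10_000)]
hard = [c for c in pool if c["hard"]]
easy = [c for c in pool if not c["hard"]]


def stratified_sample(hard, easy, size, hard_fraction):
    """Draw `size` clips with a fixed share of hard cases."""
    n_hard = round(size * hard_fraction)
    return random.sample(hard, n_hard) + random.sample(easy, size - n_hard)


uniform = random.sample(pool, 500)
benchmark = stratified_sample(hard, easy, 500, hard_fraction=0.20)

uniform_share = sum(c["hard"] for c in uniform) / len(uniform)
benchmark_share = sum(c["hard"] for c in benchmark) / len(benchmark)
print(f"uniform: {uniform_share:.1%} hard, stratified: {benchmark_share:.1%} hard")
```

Under these assumptions a uniformly sampled test set would barely exercise the hard cases, while the stratified one forces every entry to be judged on them.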
![Page 20: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/20.jpg)
TuSimple Challenge 2: Velocity estimation
• Representing the world with cam + LiDAR
![Page 21: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/21.jpg)
TuSimple Challenge 2: Velocity estimation
• Object-level representation for motion planning
– Stereo map?
– SLAM?
– Estimation based on bbox size?
• LiDAR vs Camera
– No LiDAR solution for 200m perception
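As a rough illustration of the "estimation based on bbox size" option, the sketch below uses the standard pinhole-camera relation Z = f · H / h and a finite difference between two frames. The focal length, vehicle height, and pixel measurements are assumed values, not figures from the talk, and this is not presented as the challenge's baseline method:

```python
def distance_from_bbox(bbox_height_px, real_height_m, focal_length_px):
    """Pinhole-camera range estimate: Z = f * H / h."""
    return focal_length_px * real_height_m / bbox_height_px


def relative_velocity(dist_prev_m, dist_curr_m, dt_s):
    """Finite-difference range rate; negative means the gap is closing."""
    return (dist_curr_m - dist_prev_m) / dt_s


# Assumed values, not from the talk: 1000 px focal length, 1.5 m tall car.
FOCAL_PX = 1000.0
CAR_HEIGHT_M = 1.5

d_prev = distance_from_bbox(30.0, CAR_HEIGHT_M, FOCAL_PX)  # 50 m
d_curr = distance_from_bbox(31.0, CAR_HEIGHT_M, FOCAL_PX)  # about 48.4 m
v = relative_velocity(d_prev, d_curr, dt_s=0.5)

# At 200 m the same car is only 7.5 px tall, so a 1 px bbox error
# shifts the range estimate by tens of metres.
d_far = distance_from_bbox(7.5, CAR_HEIGHT_M, FOCAL_PX)
d_far_off_by_one = distance_from_bbox(6.5, CAR_HEIGHT_M, FOCAL_PX)
print(d_prev, round(d_curr, 1), round(v, 2), d_far, round(d_far_off_by_one, 1))
```

The sensitivity at long range is one reason a single-frame bbox estimate is a weak answer on its own, and why aggregating evidence over a clip is attractive.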
![Page 22: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/22.jpg)
TuSimple challenges
• Video clip based
– We expect non-trivial temporal aggregation!
• Confidence based
– Each entry has a “confidence” field
– We evaluate the most confident 80% of entries
• Run-time
– Must report single GPU runtime speed
– Slow algorithms (< 3 fps) will not be included in the leaderboard
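The confidence and run-time rules can be sketched as follows. The entry fields (`clip`, `confidence`, `error`), the mean-error metric, and the function names are hypothetical; only the 80% cut and the 3 fps threshold come from the slide:

```python
def evaluate_top_confident(entries, keep_fraction=0.8):
    """Mean error over the most confident `keep_fraction` of entries."""
    ranked = sorted(entries, key=lambda e: e["confidence"], reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    kept = ranked[:n_keep]
    return sum(e["error"] for e in kept) / len(kept), kept


def eligible_for_leaderboard(fps, min_fps=3.0):
    """Entries slower than the minimum frame rate are excluded."""
    return fps >= min_fps


# Hypothetical submission: field names and the error metric are assumptions.
submission = [
    {"clip": "a", "confidence": 0.95, "error": 0.1},
    {"clip": "b", "confidence": 0.90, "error": 0.2},
    {"clip": "c", "confidence": 0.80, "error": 0.1},
    {"clip": "d", "confidence": 0.60, "error": 0.2},
    {"clip": "e", "confidence": 0.10, "error": 2.0},  # low confidence, large error
]

score, kept = evaluate_top_confident(submission)
print(round(score, 3), [e["clip"] for e in kept], eligible_for_leaderboard(2.5))
```

Under this reading, a method that knows when it is unsure is rewarded: its least confident 20% of predictions never enter the score.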
![Page 23: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/23.jpg)
HTTP://BENCHMARK.TUSIMPLE.AI
Available now!!
![Page 24: Autonomous Driving on Benchmarks](https://reader036.vdocument.in/reader036/viewer/2022081613/5fb5fcaaf8ac8529f92eee47/html5/thumbnails/24.jpg)
Xiaodi Hou