Copyright © 2016 Embedded Vision Alliance
Computer Vision 2.0:
Where We Are and Where We’re Going
Jeff Bier Founder, Embedded Vision Alliance | President, BDTI
May 3, 2016
The Evolution of Vision Technology
Computer vision: research and fundamental technology for extracting meaning from images
Machine vision: factory applications
Embedded vision: thousands of applications
• Consumer, automotive, medical, defense, retail, gaming, security, education, transportation, …
• Embedded systems, mobile devices, PCs and the cloud
Applications
Applications: Natural User Interface
Source: 3rd-strike.com
Source: engadget.com
Source: stuff.tv
Applications: Automotive Safety
Source: Subaru
Source: digitaltrends.com
Source: technologyreview.com
“Now, to win top overall safety scores from the IIHS, a car needs to have a forward-collision warning system with automatic braking. In addition, any autobrake system has to function effectively in formal track tests…”
“A 2009 study conducted by the IIHS found a 7 percent reduction in crashes for vehicles with a basic forward-collision warning system, and a 14 to 15 percent reduction for those with automatic braking.”
Software-Defined Sensor
Source: videantis
Source: proctorcars.com
Source: optalert.com
Source: teslaliving.net
Mercedes Magic Body Control
https://www.youtube.com/watch?v=940wGYCeQ68
Applications: Keeping an Eye on Our Stuff
Source: bestbuy.ca
Source: Tend Insights
Source: Camio
Keeping an Eye on Our Stuff (Industrial Version)
Source: govtech.com
Source: exacq.com
Source: technologyreview.com
Source: Kespry
Autonomous Vehicles Come in Many Varieties
Source: nimblechapps.com
Source: digitaltrends.com
Source: forbes.com
Source: linuxgizmos.com
DJI Phantom 4
https://www.youtube.com/watch?v=JJPSSqMQajA
Applications: What Does the Future Hold?
• I can’t tell you for sure what the killer app is for computer vision.
• (If I could, I’d be rich … or at least I’d be Carnac the Magnificent.)
• But we do know a couple of things.
Computer Vision is an Enabling Technology, Not an End in Itself
One thing we know about computer vision is that it will eventually be “invisible”
Shoebox speech recognition system, 1960s, IBM
Dragon, 1990s-2000s
iPhone, 2015
We Also Know Apps Don’t Live Alone
• We’ve been here before, lots of times before, actually
• Example: RISC in the 1980s, digital signal processing (DSP) in the 1990s
• Search for applications enabled by a new technology …
• … leads to a scramble to figure out common algorithms and algorithmic building blocks …
• … which in turn drives processor architecture (“what do we do in hardware?”) …
• … which in turn drives what apps are possible or easy
[Diagram: a cycle linking Applications, Algorithms and Processors]
Algorithms
Vision Algorithms Are Challenging
• Infinitely varying inputs in many applications
• Uncontrolled conditions: lighting, orientation, motion, occlusion
• Leads to ambiguity…
• Leads to the need for complex, multi-layered algorithms to extract meaning from pixels
• Plus:
  • Lack of analytical models means exhaustive experimentation is required
  • Numerous algorithms and algorithm parameters to choose from
Source: www.selectspecs.com
Source: xkcd.com
Source: hitl.washington.edu/artoolkit
Source: xkcd.com
Deep Neural Networks: Learning Machines
Source: NVIDIA
Expanding Applicability of Deep Learning
• Originally used solely for classification, convnets are now also being used for:
  • Detection
  • Segmentation
  • Sequences (e.g., video captioning)
  • Visual motor control
Source: Long, Shelhamer, Darrell. CVPR’15
Source: Levine, Finn, Darrell, Abbeel, UC Berkeley
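For readers new to convnets, the building block shared by the classification, detection and segmentation networks mentioned here is a small filter slid across the image, followed by a nonlinearity. The sketch below is a minimal illustration in plain Python, not taken from the talk: the function names `conv2d_valid` and `relu` are hypothetical, the filter is hand-written (a real network learns its filter weights from training data), and the operation is cross-correlation, which is what most deep learning frameworks call "convolution".

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` with no padding and stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out: Matrix = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # Weighted sum of the image patch under the kernel.
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


def relu(feature_map: Matrix) -> Matrix:
    """Keep positive responses and zero out the rest."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Illustrative input: a dark-to-bright vertical edge.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked filter that responds to dark-to-bright transitions.
edge_filter = [
    [-1, 1],
    [-1, 1],
]
feature_map = relu(conv2d_valid(image, edge_filter))
```

The feature map is largest where the filter pattern matches the image, here along the edge column. A convnet stacks many such layers and learns the filters instead of having an engineer design them.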
What Changes… and What Doesn’t?
Then:
• Needed many algorithm engineers
• Needed lots of compute for runtime
• We lacked an underlying theory of visual perception
• We struggled to implement what we could describe
Now:
• Need lots of training data
• Need lots of compute for runtime… and more for training
• We still lack the theory, but now have more general solutions
• We are increasingly able to implement what we can show
Where Does That Leave Us?
• For many applications algorithms will converge around deep neural networks
• Some applications will include multiple deep learning modules
• We’ll also converge on a small set of other algorithms (i.e., not deep learning) for specific tasks
  • E.g., SLAM, stereo correspondence, panoramic image stitching, …
[Diagram: a cycle linking Applications, Algorithms and Processors]
System Architecture
Every Computer Vision System Looks Something Like This
Camera → Local Processor → Network Connection → Cloud Backend
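The camera, local processor, network and cloud chain can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a design from the talk: `Frame`, `Result`, `edge_classifier`, `cloud_classifier` and `process` are hypothetical names, the two "models" are trivial brightness checks standing in for real vision algorithms, and the network hop is simulated by a plain function call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Frame:
    frame_id: int
    pixels: List[List[int]]  # grayscale rows, values 0-255


@dataclass
class Result:
    frame_id: int
    label: str
    confidence: float
    where: str  # "edge" or "cloud"


def edge_classifier(frame: Frame) -> Tuple[str, float]:
    """Hypothetical cheap on-device model: thresholds mean brightness."""
    values = [p for row in frame.pixels for p in row]
    mean = sum(values) / len(values)
    label = "bright" if mean >= 128 else "dark"
    confidence = abs(mean - 128) / 128  # 0.0 at the decision boundary
    return label, confidence


def cloud_classifier(frame: Frame) -> Tuple[str, float]:
    """Hypothetical backend model: thresholds the median, fixed confidence."""
    values = sorted(p for row in frame.pixels for p in row)
    median = values[len(values) // 2]
    return ("bright" if median >= 128 else "dark"), 0.99


def process(
    frame: Frame,
    min_edge_confidence: float = 0.5,
    upload: Callable[[Frame], Tuple[str, float]] = cloud_classifier,
) -> Result:
    """Answer locally when the edge model is confident, else send to the cloud."""
    label, confidence = edge_classifier(frame)
    if confidence >= min_edge_confidence:
        return Result(frame.frame_id, label, confidence, "edge")
    label, confidence = upload(frame)  # stands in for the network connection
    return Result(frame.frame_id, label, confidence, "cloud")
```

Raising `min_edge_confidence` moves work (and bandwidth) toward the cloud; lowering it keeps more decisions on the local processor.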
Cloud, Edge or Both? Yes.
Lots of Options, with Tradeoffs, Depending on What You’re Trying To Do
[Chart: local processing power (low to high) versus cloud use, compute and/or bandwidth (low to high). Examples plotted: CubeWorks; Camio; NAUTO; Facebook, Google, Clarifai, …; ADAS]
Processors for Deploying Vision
The Old Days:
“Any color you want so long as it’s beige”
Today: General-Purpose Chips Used for Vision
Lots of options:
• PC CPU
• PC CPU + discrete or integrated GPU
• Mobile application processor (e.g., Qualcomm Snapdragon)
• CPU + discrete or integrated FPGA (Xilinx, Altera)
• DSPs (e.g., Texas Instruments ’C6x)
Trend: Vision-specific Processors
The options multiply like crazy (get it?)
Processor Chips:
• Analog Devices BF609
• Inuitive NU3000
• Mobileye EyeQ4
• Movidius Myriad 2
• NXP S32V
• Texas Instruments TDA3x, TDA2Ec, Jacinto 6 Entry
Processor Cores:
• Apical Spirit
• Cadence Vision P5, Vision P6
• CEVA XM-4
• Synopsys DesignWare EV
• Vivante VIP7000, GC7000-XSVX
Trend: Heterogeneity
• Heterogeneity is great! It gives:
  • Most efficient use of your resources (cost, speed, power)
  • Insurance (you’re not committed to a particular platform or technology)
• But it comes at a cost: hard to program
• Where we are now: “deal with it”
Future: Heterogeneity
• Long term:
  • Heterogeneity in hardware becomes increasingly hidden through higher level abstractions
  • More vision-specific co-processors, which are specialized for the “winning” algorithms
  • A winnowing of architectures reduces diversity
Development
Development: The Old Days…
• Development centered around the PC
• Algorithms implemented from scratch
• Hand-optimized
Development: Today
• OpenCV enables fast algorithm experimentation
• Toolkits from technology suppliers
• Functionality encapsulated in software modules
  • Object detection, emotion analysis, SLAM, AR
  • In OpenCV and elsewhere
• If you need to optimize: CUDA, OpenCL, NEON compiler intrinsics, etc.
Development: Future
• Heterogeneity of hardware becomes hidden
  • OpenVX: Abstracts hardware, not the algorithm
  • Higher-level APIs: Abstract the algorithm and hardware
• Higher-level deep learning abstractions
  • Automated optimization of neural networks
  • Automated design and training of neural networks
• Development shifts from implementation to integration
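To make "abstracts hardware, not the algorithm" concrete, here is a rough sketch of the idea: the application names an operation, and a dispatch layer picks whichever target has an implementation. It is not the OpenVX API. `KernelRegistry`, the target names and both threshold functions are hypothetical, and the "GPU" kernel is an ordinary Python stand-in.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Kernel = Callable[[List[int]], List[int]]


class KernelRegistry:
    """Maps an operation name to per-target implementations (hypothetical API)."""

    def __init__(self) -> None:
        self._impls: Dict[str, Dict[str, Kernel]] = {}

    def register(self, op: str, target: str, fn: Kernel) -> None:
        self._impls.setdefault(op, {})[target] = fn

    def run(
        self,
        op: str,
        data: List[int],
        preferred: Sequence[str] = ("accelerator", "gpu", "cpu"),
    ) -> Tuple[str, List[int]]:
        """Run `op` on the first preferred target that has an implementation."""
        impls = self._impls.get(op, {})
        for target in preferred:
            if target in impls:
                return target, impls[target](data)
        raise LookupError(f"no implementation registered for {op!r}")


def threshold_cpu(pixels: List[int]) -> List[int]:
    """Reference implementation: binarize at 128."""
    return [255 if p >= 128 else 0 for p in pixels]


def threshold_gpu_stand_in(pixels: List[int]) -> List[int]:
    """Stand-in for an accelerated kernel: same result via a lookup table."""
    lut = [255 if v >= 128 else 0 for v in range(256)]
    return [lut[p] for p in pixels]


registry = KernelRegistry()
registry.register("threshold", "cpu", threshold_cpu)
registry.register("threshold", "gpu", threshold_gpu_stand_in)
target, binary = registry.run("threshold", [0, 127, 128, 255])
```

The calling code asks for "threshold" and never mentions the hardware, so adding a faster target later does not change the application. A higher-level API would go one step further and hide the choice of algorithm as well.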
The Business of Computer Vision
Analogy to Wireless (Thanks, Raj!)
• Ubiquitous
• Invisible
• A gigantic creator of value
  • Both for suppliers
  • … and those who use it
Facebook stats:
• 1.5B monthly mobile active users
• 989M daily mobile active users
• 54% log in ONLY from mobile
• 79% of ad revenue from mobile
Intel’s Public Computer Vision Investments
Timeline of investments and acquisitions, 2009 to 2015 (amounts are estimates):
• Imagination: mobile GPU, $38M investment, 6/2009
• CognoVision: digital signage, $30M, 11/2010
• InVision Biometrics: 3D sensors, $50M, 11/2011
• Olaworks: face recognition, $30.7M, 4/2012
• Tyzx: stereo vision, est. $50M, 2012
• Omek Interactive: gesture recognition, $40M, 7/2013
• Prism Skylabs: retail people tracking, $25M investment, 10/2013
• Emotient: facial expression recognition, $6M, 2/2014
• Tobii: eye tracking, $21M, 3/2014
• 3Gear: gesture recognition, $1.9M, 4/2014
• EyeFluence: eye tracking, undisclosed, 11/2014
• Avegant: Glyph VR headset, $9.4M, 11/2014
• InVisage: quantum film sensors, $32.5M, 12/2014
• Vuzix: digital eyewear, $24.8M, 1/2015
Conclusions
• Computer vision will become ubiquitous and invisible
• It will be a huge creator of value, both for suppliers and for those who leverage the technology in their applications
• Deep learning will become a dominant technique (but not the only technique)
• Computation will be distributed between the cloud and the edge
• Heterogeneity in hardware becomes increasingly hidden
• Development shifts from implementation to integration
• …Until the next disruptive technology emerges
Embedded Vision Alliance Member Companies