tinyml meetup talk v2 · microsoft powerpoint - tinyml_meetup_talk_v2.pptx author: ashah created...
TRANSCRIPT
µW, kB, MHz, and ¢: TinyML in the Real World
Scott Hanson, Founder and CTO, September 2019
1
Cars, Google, and Alexa Stealing Headlines at the Edge
2
And There Has Been a Flood of New AI Chips
3
https://blog.hardwareclub.co/theres-an-investment-frenzy-over-ai-chip-startups-d9b5ea42b5c4
But These Are BIG Processors
4
6B transistors, 250 mm², 14 nm
36W power consumption
W, GHz, MB: This will NOT scale to billions of edge devices!
What If AI Is Cost Effective and Battery Powered?
5
ULP
• Intelligent Personal (Preventative Medicine): Smart watches act as 24/7 doctors that flag conditions BEFORE they become serious
• Intelligent Home (AI in the Home): Alexa “stickers” respond to our every need wherever we are in the home
• Intelligent Factory (Predictive Maintenance): Machines identify looming failure BEFORE line down
A µW, MHz, kB, and ¢ Problem
• 0.3-2mW average power budget
• 48-120MHz max clock frequency
• 1-2MB of on-chip flash, 300-600kB on-chip SRAM
• ~$5 total for main processor, co-processor, PMIC, radio, sensors
6
To truly scale to billions of devices, the economics need to BEAT those of a typical smartwatch
Typical Smartwatch Teardown
There’s “Edge” and There’s “Bleeding Edge”
7
Metric | Edge | Battery-Powered Edge
Power Budget | 36 Watts | 0.0003-0.002 Watts
Processor Speed | 2 GHz | 0.048-0.12 GHz
On-Chip Memory | 32 MB | <3 MB
Chip BoM | ~$100 | ~$5
An entirely different class of chip is required for true intelligence at the battery-powered edge
Ambiq Micro: Intelligence Everywhere
8
• 10X lower power
• >50M units shipped
• 2X sustained annual revenue growth
• >30 blocking patents
RAPID GROWTH, GROUND BREAKING, UNIQUE & DISRUPTIVE, TRUSTED & PROVEN
Attacking the Edge with Sub-Threshold Technology
Sub-threshold circuits enable >10X energy savings with standard CMOS – enabling >10X more compute without compromising power!
Conventional Circuit Design: 0 Volts to 1.2 Volts
Sub-threshold Circuit Design: 0 Volts to 0.3 Volts
Energy ~ (Voltage)²
9
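The >10X claim can be sanity-checked against the Energy ~ (Voltage)² relation on this slide. The Python sketch below assumes the simplified switching-energy model E = C·V² with the same switched capacitance at both supply voltages. The function name and the capacitance value are illustrative and not from the talk, and leakage and the slower clock of sub-threshold operation are ignored.

```python
def switching_energy(capacitance_f: float, voltage_v: float) -> float:
    """Dynamic switching energy per transition, E = C * V^2 (simplified model)."""
    return capacitance_f * voltage_v ** 2

# Illustrative capacitance only (assumed); it cancels out of the ratio.
C = 1e-12  # farads

conventional = switching_energy(C, 1.2)   # 1.2 V swing from the slide
sub_threshold = switching_energy(C, 0.3)  # 0.3 V swing from the slide

ratio = conventional / sub_threshold
print(f"Energy ratio: {ratio:.1f}x")
```

Under this model the ratio is (1.2 / 0.3)² = 16, which is consistent with the ">10X" figure quoted on the slide.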
A Practical Example: Virtual Assistants Everywhere
12
1 Week Life on LiPo Battery: 1000mWh / (7d*24h) = 6mW average
1 Year Life on Coin Cell Battery: 600mWh / (1y*365d*24h) = 68µW average
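The two power budgets above follow from dividing battery capacity by the target lifetime. A minimal Python sketch of that arithmetic follows. The function name is illustrative, and the model assumes the full rated capacity is usable, with no self-discharge or regulator losses.

```python
def average_power_uw(capacity_mwh: float, lifetime_hours: float) -> float:
    """Average power budget in microwatts for a given capacity and lifetime."""
    return capacity_mwh / lifetime_hours * 1000.0  # mW -> µW

lipo_week = average_power_uw(1000, 7 * 24)    # 1000 mWh LiPo, 1 week
coin_year = average_power_uw(600, 365 * 24)   # 600 mWh coin cell, 1 year

print(f"LiPo, 1 week:      {lipo_week / 1000:.1f} mW")
print(f"Coin cell, 1 year: {coin_year:.0f} µW")
```

The results round to the 6 mW and 68 µW figures on the slide, a gap of roughly two orders of magnitude between the two use cases.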
Pieces of the Virtual Assistant Puzzle
13
Puzzle pieces: Quiet Sound Detection, Beam Forming, Noise Reduction, Keyword Detection, Audio Compression, Cloud/Voice Service…, Bluetooth Transmission
• Local Keyword Detection on the Battery Powered Device
• Keyword Confirmation and Transmission to Cloud
• Keyword Confirmation and Determination of Meaning
Sizing Up the Puzzle Pieces
14
MB and GHz will not lead to µW and ¢!
Wang, et al., “Small Footprint Keyword Spotting Using Deep Neural Network and Connectionist Temporal Classifier,” arXiv:1709.03665, 2017.
“Right Sizing” the Puzzle Pieces
• Keep the frequency under 48MHz typical, 96MHz max
• Keep instructions/fixed data under 1MB
• Keep dynamic data under 384kB
• Stick with 2 mics in a fixed endfire configuration
15
48MHz “typical” clock
Limited Memory
2 Channel Mic Interface
“Right Sizing” the Puzzle Pieces
16
Component | Flash (kB) | RAM (kB) | MCPS
Audio/NN Chain | ~250 | ~100 | ~1 (quiet), ~35 (voice)
Compress/Cloud/Bluetooth | ~120 | ~60 | 0 (quiet), ~10 (KW detected)
TOTAL | ~370 | ~160 | ~1 (quiet), ~45 (KW detected)
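The totals can be checked against the "Right Sizing" limits (instructions/fixed data under 1MB, dynamic data under 384kB, 48MHz typical clock). The Python sketch below is illustrative only: the dictionary keys and function names are made up for this example, 1MB is taken as 1024kB, and MCPS is compared directly with the clock frequency in MHz, which assumes one cycle per clock tick.

```python
# Limits from the "Right Sizing" bullets (1 MB taken as 1024 kB, an assumption).
LIMITS = {"flash_kb": 1024, "ram_kb": 384, "mcps": 48}

# Approximate worst-case (keyword detected) figures from the table.
COMPONENTS = {
    "audio_nn_chain": {"flash_kb": 250, "ram_kb": 100, "mcps": 35},
    "compress_cloud_bluetooth": {"flash_kb": 120, "ram_kb": 60, "mcps": 10},
}

def totals(components):
    """Sum each resource across all components."""
    return {key: sum(c[key] for c in components.values()) for key in LIMITS}

def headroom(components, limits):
    """Fraction of each limit left unused (negative means over budget)."""
    used = totals(components)
    return {key: 1 - used[key] / limits[key] for key in limits}

used = totals(COMPONENTS)
for key, frac in headroom(COMPONENTS, LIMITS).items():
    print(f"{key}: {used[key]} of {LIMITS[key]} used, {frac:.0%} headroom")
```

Memory has comfortable headroom under these assumptions, while the keyword-detected compute load of about 45 MCPS sits close to the 48MHz typical clock.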
Sobering Development Realities
• Audio/NN development is hard – especially w/ resource constraints
• No widespread availability of NN frameworks
• Close collaboration with specialty algorithm/SW houses is vital
• Standard architectures ensure the largest possible ecosystem
17
[Figure: puzzle pieces: Quiet Sound Detection, Beam Forming, Noise Reduction, Keyword Detection, Audio Compression, Cloud/Voice Service…, Bluetooth Transmission]
µW, MHz, kB, and ¢ Can Deliver Great Performance
18
Mode | System Power With 2 Mics, BF, SCNR, MCU, radio (µW) | Active Time (%) | Contribution to Average Power (µW)
QSD Mode | 1230 | 66% | 812
Keyword Mode | 2670 | 33% | 881
TOTAL | -- | -- | 1,693
High detection accuracy is possible in a power budget that works well for headphones, watches, and other rechargeable use cases
[Figure: Normalized Detection Accuracy vs. SNR. x-axis: Signal to Noise Ratio (dB), -15 to 20; y-axis: Normalized Detection Accuracy (a.u.), 0.00 to 2.00; series: No Mic Proc, SCNR, 2cm BF, 2cm BF + SCNR]
Getting Creative on Power
19
Mode | System Power With 1 Mic, SCNR, MCU, radio (µW) | Active Time (%) | Contribution to Average Power (µW)
ZPL Mode | 99 | 70% | 69
QSD Mode | 833 | 20% | 167
Keyword Mode | 2060 | 10% | 206
TOTAL | -- | -- | 442
When dropping to 1 mic and adding a specialized “comparator” mode on the mic, we can achieve power in line with a remote control
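The totals in the 2-mic and 1-mic power tables are duty-cycle-weighted averages: each mode's system power multiplied by the fraction of time spent in that mode, then summed. A minimal Python sketch of that calculation, using the figures from the two tables, is given here. The function and variable names are illustrative.

```python
def duty_cycle_average_uw(modes):
    """Sum of (mode power in µW * fraction of time spent in that mode)."""
    return sum(power_uw * fraction for power_uw, fraction in modes.values())

# (system power in µW, active-time fraction) per mode, from the slides.
two_mic = {"QSD": (1230, 0.66), "Keyword": (2670, 0.33)}
one_mic = {"ZPL": (99, 0.70), "QSD": (833, 0.20), "Keyword": (2060, 0.10)}

two_mic_avg = duty_cycle_average_uw(two_mic)
one_mic_avg = duty_cycle_average_uw(one_mic)

print(f"2 mics: {two_mic_avg:.0f} µW")
print(f"1 mic:  {one_mic_avg:.0f} µW")
```

The results round to the 1,693 µW and 442 µW totals. Most of the saving in the 1-mic case comes from spending 70% of the time in the 99 µW ZPL mode.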
[Figure: Normalized Detection Accuracy vs. SNR. x-axis: Signal to Noise Ratio (dB), -15 to 20; y-axis: Normalized Detection Accuracy (a.u.), 0.00 to 2.00; series: No Mic Proc, SCNR, 2cm BF, 2cm BF + SCNR]
It’s a System Problem!
20
[Figure: puzzle pieces: Quiet Sound Detection, Beam Forming, Noise Reduction, Keyword Detection, Audio Compression, Cloud/Voice Service…, Bluetooth Transmission]
The neural network consumes <10% of total system power while mics consume most of the remaining system power
What Comes Next?
21
There Is No Moore’s Law for Batteries
22
Moore’s Law Is Barely an Option for Chips
23
Cost is minimized in older nodes with a mix of digital, analog, and RF – Moore’s Law alone cannot solve the AI energy problem
Improving Hardware for AI at the Mobile Edge
• SoCs are making HUGE strides forward – but they must be balanced
• Innovation is also happening in sensors and sensor interfaces
• And all of this innovation is still subject to serious cost constraints
24
Improving Tools for AI at the Mobile Edge
25
Give customers the tools they need to develop their own neural networks
Improving Algorithms for AI at the Mobile Edge
• Algorithms and neural network architectures are evolving daily
• Seeing NNs applied to pre-processing and noise reduction
• Seeing more and more end-to-end algorithms
• Seeing many startups and large companies with exciting offerings
26
Ambiq Will Be Setting New Standards Soon
27
The next generation in SPOT-enabled hardware offers unmatched AI performance and energy efficiency – WITHOUT compromising on system cost
[Figure: Energy Efficiency, Performance (y-axis) vs. Time (x-axis); labeled products: Apollo MCU, Apollo2 MCU, Apollo3 Blue BLE SoC, Next Gen Apollo – Details Forthcoming, Next Gen SoC – Details Forthcoming]
Attacking the µW, MHz, kB, and ¢ Problem
• There are big opportunities at the mobile edge
• But W, GHz, MB, and $ cannot solve the problem
• Researchers and companies must be pragmatic and solve REAL problems
• Let’s all focus on building complete solutions
28
ULP