AI in IoT at the Edge: MRAM’s Golden Opportunity
Andy Walker
October 2019
Spin Memory Corporate Overview
• US Company based in Fremont, CA
• Strong Corporate Partners
  - Arm
  - Applied Materials
  - www.spinmemory.com/spin-memory-announces-52-million-series-b-funding-round/
• Highly differentiated MRAM IP and Expertise
  - Design Techniques and MRAM Management
  - 10+ Years MRAM Design Expertise
  - Magnetics / MTJ technology
  - Selector/Process Expertise
• Complete MRAM Teams
  - Magnetics/Physics
  - Device Fabrication
  - CMOS Design
  - Test & Reliability Engineering
• 200mm MRAM Prototype Line: Spin Technology Center, Fremont, CA
Arm Tech Con 2019 – © Spin Memory, Inc. October 2019
Contents
• AI, IoT and Colossal Energy Demand
• Physics of Charging and Discharging Capacitors
• Relevance to Integrated Circuits, Systems and AI
• The Main Hog – Energy Demand in Memory
• What Can We Do About This for AI?
  - But First – What is STT-MRAM?
  - Minimize Static Energy Loss/Maximize On-Chip Memory Capacity
  - The Engine for Energy Efficiency
    • Application-Targeted MRAM for SoCs / MRAM Macros / Markets for SRAM-like MRAM
  - The Selector
  - Fault Tolerance and Voltage Manipulation
• MRAM in AI
IoT, AI and Colossal Energy Demand
• Energy efficiency is a key constraint for IoT and AI-in-IoT proliferation
  - 1 trillion IoT devices by 2035 (1)
  - Large growth in AI-at-the-Edge
  - Widespread innovations in IoT power sources
  - What can be done at the silicon level?
(1) P. Sparks, “The Route to a Trillion Devices”, Arm White Paper, June 2017
(2) “AI Chip Architectures Race to the Edge”, Semiconductor Engineering, Nov. 2018
(3) A. Raj et al., J. Electrochemical Soc., 2018
[Figures from references (1), (2) and (3)]
IoT, AI and Colossal Energy Demand
• Energy efficiency is a key constraint for AI proliferation
  - Training a single AI model can emit as much CO2 as five cars in their lifetimes (1)
  - AI data centers are projected to consume > 10% of world energy capacity by 2025 (1)
(1) G. Dickerson, Applied Materials AI Design Forum, July 9 2019 and E. Strubell et al., “Energy and Policy Considerations for Deep Learning in NLP”, arXiv.org, June 5 2019
The Main Hog – Energy Demand in Memory
• Fetching/storing data in solid state memory uses about 60% or more of system energy (1)
  - On-chip SRAM access ~ 10X the energy of CPU data manipulation
  - Off-chip DRAM access ~ 1000X the energy of CPU data manipulation
• On-chip SRAM has fundamental leakage – wasted energy (2) (3)
(1) A. Boroumand et al., “Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks”, ASPLOS’18, March 2018
(2) A. Pedram et al., “Dark Memory and Accelerator-Rich System Optimization in the Dark Silicon Era”, IEEE Design & Test, vol.34, April 2017
(3) Simulations/estimates from 7nm transistor data in IEDM 2017
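The ratios on this slide can be turned into a rough energy budget. The sketch below is illustrative only: the unit of energy is one CPU data manipulation, the ~10X and ~1000X factors are the ones quoted above, and the workload counts and the function name `energy_breakdown` are made-up assumptions, not data from the presentation.

```python
# Relative energy costs, in units of one CPU data manipulation.
# The 10X and 1000X factors come from the slide; everything else is assumed.
RELATIVE_COST = {"cpu_op": 1.0, "sram_access": 10.0, "dram_access": 1000.0}

def energy_breakdown(counts):
    """Return total relative energy and each category's share of it."""
    energy = {k: counts.get(k, 0) * cost for k, cost in RELATIVE_COST.items()}
    total = sum(energy.values())
    shares = {k: (e / total if total else 0.0) for k, e in energy.items()}
    return total, shares

# Hypothetical workload: 1,000,000 CPU ops, 100,000 SRAM accesses,
# 2,000 off-chip DRAM accesses.
workload = {"cpu_op": 1_000_000, "sram_access": 100_000, "dram_access": 2_000}
total, shares = energy_breakdown(workload)
memory_share = shares["sram_access"] + shares["dram_access"]
print(f"memory share of energy: {memory_share:.0%}")
```

In this hypothetical workload, DRAM accesses are only about 0.2% of all operations yet account for half of the energy, which is why reducing off-chip transactions matters so much.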
Physics of Charging and Discharging Capacitors
• Charging
  - Total energy stored in capacitor = $\tfrac{1}{2} C V_b^2$
  - Total energy converted into heat = $\tfrac{1}{2} C V_b^2$
• Discharging
  - Total energy stored in capacitor = $\tfrac{1}{2} C V_0^2$
  - Total energy converted into heat = $\tfrac{1}{2} C V_0^2$
[Figures: Charging; Discharging. Source: http://hyperphysics.phy-astr.gsu.edu]
• During charging, half** of the electrical energy is converted into heat and half** is stored on the capacitor
• During discharging, all** of the stored capacitor energy is converted into heat
** About right, since this does not count EM radiation
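The half-and-half split during charging can be checked numerically. The sketch below assumes an ideal series RC circuit charged from 0 V by a constant supply (no EM radiation, no leakage); the component values and the function name `charge_capacitor` are arbitrary choices for illustration, not figures from the talk.

```python
import math

def charge_capacitor(C, R, V_b, steps=200_000, n_tau=20.0):
    """Integrate an RC charging transient from 0 V for n_tau time constants.

    Returns (energy stored on the capacitor, energy dissipated in the resistor).
    """
    tau = R * C
    dt = n_tau * tau / steps
    heat = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt                    # midpoint of the time slice
        i = (V_b / R) * math.exp(-t / tau)    # charging current at time t
        heat += i * i * R * dt                # I^2 R loss in this time slice
    v_final = V_b * (1.0 - math.exp(-n_tau))  # capacitor voltage at the end
    stored = 0.5 * C * v_final ** 2
    return stored, heat

C, V_b = 1e-12, 1.0                           # assumed: 1 pF charged to 1 V
for R in (1e2, 1e4):                          # assumed resistances, in ohms
    stored, heat = charge_capacitor(C, R, V_b)
    print(f"R = {R:g} ohm: stored = {stored:.3e} J, heat = {heat:.3e} J")
```

The heat term comes out the same for both resistances: a larger resistance slows the charging but does not change how much energy is lost.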
Relevance to Integrated Circuits, Systems and AI
• Any IC and system is an electrical power supply and a network of capacitors and resistors
• Data movements require charging and discharging of wires
• Wires are capacitors with $C \propto L_{\mathrm{wire}}$
• Energy conversion into heat $\propto L_{\mathrm{wire}}$
• Long wires between data in on-chip SRAM and processor
• Longest wires between data in off-chip DRAM and processor
• Most energy conversion into heat takes place in data transactions with memory
• AI requires extremely intensive store and recall between processor and memory
• Memory requirements pose a huge challenge in energy efficiency for deep learning models
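The proportionality between wire length and heat can be sketched as follows. The capacitance per millimetre (0.2 pF/mm), the supply voltage and the wire lengths are assumed ballpark values chosen for illustration, and `wire_transition_energy` is a hypothetical helper; none of these come from the presentation.

```python
def wire_transition_energy(length_mm, v_supply, cap_per_mm=0.2e-12):
    """Energy (J) converted into heat by one charge/discharge cycle of a wire.

    cap_per_mm is an ASSUMED capacitance per millimetre (0.2 pF/mm),
    not a figure from the presentation.
    """
    c_wire = cap_per_mm * length_mm   # capacitance grows linearly with length
    # Charging dissipates 1/2 C V^2 and stores 1/2 C V^2;
    # discharging then dissipates the stored 1/2 C V^2.
    return c_wire * v_supply ** 2

# Hypothetical wire lengths: short local wire, cross-chip wire, off-chip trace.
for length_mm in (0.1, 1.0, 10.0):
    energy = wire_transition_energy(length_mm, v_supply=1.0)
    print(f"{length_mm:5.1f} mm wire: {energy:.2e} J per full cycle")
```

Under this simple model, a wire ten times longer converts ten times more energy into heat per toggle, regardless of the assumed capacitance per unit length.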
What Can We Do About This for AI?
• Domain Specific Architectures for AI accelerator chips
• Algorithms that minimize data flow interactions with off-chip DRAM
• Package solutions that minimize inter-chip impedances (capacitances)
• In-memory compute
• Near memory compute
• Principle of locality in time and space – cache structure and control
• Data compression to minimize weight populations
• Reduced precision arithmetic
• Maximize stand-alone main memory single chip capacity
• Minimize static energy loss
• Maximize on-chip memory capacity
• Fault tolerance and voltage manipulation
What is Spin doing about this?
Minimize Static Energy Loss/Maximize On-Chip Memory Capacity
Spin’s Solution: What it is / What it does / Importance
• ENGINE
  - What it is: Circuit tuned to the physics of the magnetic element
  - What it does: Allows MRAM to behave RAM-like (high endurance and symmetric R/W)
  - Importance: High density, ~zero leakage, and persistent memory
• SELECTOR
  - What it is: Manufacturable, scalable semiconductor device using existing Fab tools, materials and switching mechanisms
  - What it does: Allows dramatic shrinks of MRAM cells (≲ 10F²) to enable SRAM replacement and persistent (e)DRAM
  - Importance: Very high density, ~zero leakage and persistent memory for embedded and stand-alone solutions
  - Importance: Reduce transactions with off-chip DRAM for dramatic energy efficiency as embedded memory
  - Importance: Low energy Storage Class Memory
The Engine for Energy Efficiency
Enables RAM-like Performance with Energy Efficient MRAM
• Engine allows reduced electrical stress
  - Results in large endurance increase (5 to 6 orders of magnitude)
• Engine deals with resultant WER (write error rate) increase
  - Managed transparently to the user
  - No change in latency
  - Allows for faster pulses at high endurance
  - Symmetric Read/Write
[Figure: Log(Write Error Rate) versus MTJ Voltage (V). Annotations: Reduced electrical stress; Increased WER; ENGINE on-chip]
Application-Targeted MRAM Design for SoCs
[Figure: NVM Retention versus Speed & Endurance, with regions labeled Foundry NVM MTJ and Foundry SRAM MTJ and a 10+ years retention marker]
• eNVM: eFlash replacement
  - 25ns Rd / 50-500ns Wrt
  - 10^6 to 10^8 cycles
  - >10 years retention
• SRAM: SRAM Replacement (LLC, AI, DDI, many others)
  - 10-15ns R/W
  - >10^13 cycles
  - Days-months retention
• HS&E NVM: High Speed & Endurance NVM (IoT, Edge AI)
  - 25-50ns R/W
  - >10^11 cycles
  - >10 years retention
MRAM Compiler and Macro Availability
• Arm and Spin are creating MRAM compilers
  - HS&E NVM first
  - SRAM replacement in the near future
• Arm and Spin can create custom macros
  - Especially SRAM-replacement at advanced nodes
Markets for SRAM-like MRAM
• Display Driver IC (DDI)
• CMOS Image Sensor (CIS)
• MCUs
• CPUs & Networking
• Datacenter AI
• SSD Controller
[Figure: markets placed along a process-node axis labeled 5-7nm and 22-28nm. Annotation: Especially Edge AI]
The Selector
Disruptive Non-Disruptive Technology – The Key to Manufacturability
[Diagram: 3-D NAND + high voltage vertical NMOS transistor using selective epitaxy; Adapt, Optimize, Combine; result: Spin’s Selector]
• Allows Embedded Persistent DRAM (< 10F²)
  - Maximize on-chip memory capacities
  - Minimize off-chip DRAM transactions
  - Dramatic increase in energy efficiency
  - Useful for any switchable element
Fault Tolerance and Voltage Manipulation
• Energy stored on a capacitor $\propto V^2$
• Voltage V can be supply voltage, maximum bit line voltage and so on
• Trade off classification accuracy for energy efficiency
• SRAM voltage scaling for energy efficiency in convolutional neural nets (1)
• SRAM/DRAM/Flash voltage scaling in deep neural net resilience study (2)
• Traditional memories tend to have chaotic bit behavior with reducing V
• MRAM is stochastic but with predictable bit error behavior with V (read and write)
• Match neural net fault tolerance with MRAM bit error rates using V (3)
  - Quadratic reduction in energy conversion into heat
  - Large MRAM endurance boost due to reduced wearout of the thin magnesium oxide layer
  - Improve performance through fast reads with predictable read disturb rates
  - Improve MRAM density with smaller cells with predictable retention errors
(1) L. Yang and B. Murmann, ISQED 2017
(2) B. Reagen et al., DAC 2018
(3) M. Tzoufras, M. Gajek and A. Walker, arXiv 2019
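The quadratic dependence on V can be made concrete with a few numbers. This is a minimal sketch assuming a fixed capacitance and energy strictly proportional to V²; the voltage values and the helper name `relative_switching_energy` are illustrative assumptions, not operating points from the talk.

```python
def relative_switching_energy(v_scaled, v_nominal):
    """Energy at v_scaled relative to v_nominal, assuming energy ∝ V^2 at fixed C."""
    return (v_scaled / v_nominal) ** 2

# Hypothetical voltages, compared against an assumed 1.0 V nominal.
for v in (1.0, 0.9, 0.8, 0.7):
    saving = 1.0 - relative_switching_energy(v, 1.0)
    print(f"V = {v:.1f} V: {saving:.0%} less energy than at 1.0 V")
```

For example, a 20% reduction in voltage cuts the energy converted into heat by about 36%, which is the kind of gain being traded against bit error rate.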
MRAM in AI (1)
• Stochasticity is linked in a fundamental way to neural networks. At the same time it is an inherent property of MRAM that has hampered it for more than a decade
• The convergence between neural networks and MRAM presents a unique opportunity for research and for improving the performance of many ANN applications
• SRAM is on-chip and provides flawless but untuneable precision
• DRAM is off chip and provides flawless but untuneable precision
• MRAM can be integrated on-chip to provide dense and tuneable precision
(1) M. Tzoufras, M. Gajek and A. Walker, arXiv 2019.
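One way to picture "tuneable precision" is to inject random bit flips into stored values and watch the error grow with the bit error rate. The sketch below is a generic bit-flip model written for illustration; it is not the method of the cited arXiv paper, it does not model MRAM physics, and the 8-bit word width, the bit error rates and the function names are all assumptions.

```python
import random

def flip_bits(values, bit_error_rate, rng, n_bits=8):
    """Return a copy of unsigned n_bits-wide integers with each bit flipped
    independently with probability bit_error_rate."""
    noisy = []
    for v in values:
        for b in range(n_bits):
            if rng.random() < bit_error_rate:
                v ^= 1 << b               # flip bit b
        noisy.append(v)
    return noisy

def mean_abs_error(bit_error_rate, n_values=10_000, seed=0):
    """Average absolute change in stored 8-bit values at a given bit error rate."""
    rng = random.Random(seed)             # fixed seed for repeatable runs
    clean = [rng.randrange(256) for _ in range(n_values)]
    noisy = flip_bits(clean, bit_error_rate, rng)
    return sum(abs(a - b) for a, b in zip(clean, noisy)) / n_values

# Hypothetical bit error rates, e.g. as selected by an operating voltage.
for ber in (0.0, 1e-4, 1e-3, 1e-2):
    print(f"BER {ber:g}: mean abs error {mean_abs_error(ber):.3f}")
```

A network that tolerates the error seen at a given bit error rate could, in principle, run its memory at the corresponding operating point; finding that match is the trade-off described on the previous slide.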
Conclusions
• Energy Use in AI and IoT calls for Energy Efficiency techniques
• The Main Hog – Energy Demand in Memory
• Physics of Charging and Discharging Capacitors
• Relevance to Integrated Circuits, Systems and AI
• How Can STT-MRAM Help?
  - The Engine to enable RAM-like performance
    • High endurance and symmetric read/write
    • Application-Targeted MRAM Designs for SoCs / MRAM Macros / Markets
  - The Selector
    • Very high density persistent embedded and stand-alone memory with ~zero leakage
  - Fault Tolerance and Voltage Manipulation
• MRAM in AI
Thank You
spinmemory.com
Spin Memory Inc.
45500 Northpoint Loop West
Fremont, CA 94538