neuromorphic c magneto-metallic n & s : p p · 2017. 5. 18. · kaushik roy abhronil sengupta,...
TRANSCRIPT
KAUSHIK ROYABHRONIL SENGUPTA, KARTHIK YOGENDRA, DELIANG FAN, SYED
SARWAR, PRIYA PANDA, GOPAL SRINIVASAN, JASON ALLRED, S. VENKATRAMANI, ZUBAIR AZIM, A. RAGHUNATHAN
ELECTRICAL & COMPUTER ENGINEERING
PURDUE UNIVERSITY
WEST LAFAYETTE, IN 47906, USA
NEUROMORPHIC COMPUTING WITH MAGNETO-METALLICNEURONS & SYNAPSES: PROSPECTS AND PERSPECTIVES
The Computational Efficiency Gap
2
20 W20 W
~200000 WIBM Watson playing Jeopardy, 2011
IBM Blue Gene supercomputer, equipped with 147456 CPUs and 144TB of memory, consumed 1.4MW of power to simulate 5 secs of brain activity of a cat at 83 times slower firing rates
Neuromorphic Computing Technologies
33
Hardware Accelerators
Approximate Computing, Semantic Decomposition, Conditional DLN
Post‐CMOS Technologies
• QUORA, MICRO ’13• ….
• Spin neuron, IJCNN ’12, APL’15, TNANO, DAC, DRC, IEDM
• Spintronic Deep Learning Engine, ISLPED ’14
• Spin synapse, APL ’15• ….
• Approximate Neural Nets, ISLPED ’14
• Conditional Deep Learning, DATE 2016
• ….
SW (Multicores/GPUs)1 uJ/neuron
Device/Circuit/Algorithm Co-Design: Spin/ANN
4
BUILDING BLOCKS: MEMORY, NEURONS, SYNAPSESLateral Spin Valve(Local & Non‐local)
Domain Wall “transistor”DI G
DWMm1 m2
I
Spin current
I+ = +∑vigiExcitory Input
LCH+
m
LCH–
I– = –∑vigiInhibitory Input
Preset
NM
FM Clock
Synaptic current∆In = (In+ – In–)
SHM
MgO
FL
PL
IREF
IREF ± |Iin |
Integrator
VOUT
DW-MTJ: Domain Wall Motion/MTJ
6
Three terminal device structure provides decoupled “write” and “read” current paths Write current flowing through heavy metal programs domain wall position Read current is modulated by device conductance which varies linearly with domain wall
position
Universal device: Suitable for memory, neuron, synapse, interconnects
DW-MTJ for Interconnects/Memory
7
Interconnect Design Memory Bit-Cell
Energy-efficient interconnect design can circumvent the energy and delay penalties in CMOS based global interconnects for scaled technology nodes
DW-MTJ memory bit cell with decoupled “write” and “read” current paths
Thresholding (Activation)
8
w2
w1
wn
nucleolus
axon ∑
synapses
transmittingneuron
Artificial NN
nucleolus
Signaltransmission
Summation of weighted inputs
Thresholdingfunction
Spin Hall based Switching DW‐MTJ
Switch a magnet using spin current, read using TMR effect
SHM
MgO
FL
PL
Step and Analog ANN Neurons
9
Neuron, acting as the computing element, provides an output current (IOUT) which is a function of the input current (IIN)
Axon functionality is implemented by the CMOS transistor Note: Stochastic nature of switching of MTJ can be in Stochastic Neural
nets
IN
OUT
IN
OUT
Step Neuron Analog Neuron
Sum of Weighted Inputs (Dot Product)
10
w2
w1
wn
nucleolus
axon ∑
synapses
transmittingneuron
Artificial NN
Thresholdingfunction
nucleolus
Signaltransmission
Summation of weighted inputs
V1
V2
V3
Programmable resistors/DWM
w11 w12 w13
w21 w22 w23
w31 w32 w33
∑Vi1wi1 ∑Vi2wi2 ∑Vi3wi2
Input Voltages
All-Spin Artificial Neural Network
11
All-spin ANN where spintronic devices directly mimic neuron and synapse functionalities and axon (CMOS transistor) transmits the neuron’s output to the next stage
Ultra-low voltage (~100mV) operation of spintronic synaptic crossbar array made possible by magneto-metallic spin-neurons
System level simulations for character recognition shows maximum energy consumption of 0.32fJ per neuron which is ~100x lower in comparison to analog and digital CMOS neurons (45nm technology)
Spin-synapse Spin-neuron
Biological Neural Network
Spintronic Neural Network
All-spin Neuromorphic Architecture
Benchmarking with CMOS Implementation
12
Neurons Power Speed Energy Function technology
CMOS Analog neuron 1 [1]
~12µW (assume 1V
supply)65ns 780fJ Sigmoid /
CMOS Analog neuron 2 [2] 15µW / / Sigmoid 180nm
CMOS Analog neuron 3 [5] 70µW 10ns 700fJ Step 45nm
Digital Neuron [3] 83.62µW 10ns 832.6fJ 5-bit tanh 45nm
Hard-Limiting Spin-Neuron 0.81µW 1ns 0.81fJ Step /
Soft-LimitingSpin-Neuron 1.25µW 3ns 3.75fJ Rational/
Hyperbolic /
[1]: A. J. Annema, “Hardware realisation of a neuron transfer function and its derivative”, Electronics Letters, 1994[2]: M. T. Abuelma’ati, etc, “A reconfigurable satlin/sigmoid/gaussian/triangular basis functions”, APCCAS, 2006[3]: S. Ramasubramanian, et al., "SPINDLE: SPINtronic Deep Learning Engine for large‐scale neuromorphic computing", ISLPED, 2014[4]: D. Coue, etc “A four-quadrant subthreshold mode multiplier for analog neural network applications”, TNN, 1996[5]: M. Sharad, etc, “Spin-neurons: A possible path to energy-efficient neuromorphic computers”, JAP, 2013
Compared with analog/ digital CMOS based neuron design, spin based neuron designs have the potential to achieve more than two orders lower energy consumption
SPIKING NEURAL NETWORKS(SELF LEARNING)
Spiking Neuron Membrane Potential
14The leaky fire and integrate can be approximated by an MTJ – the magnetization dynamics mimics the leaky fire and integrate operation
Biological Spiking Neuron MTJ Spiking Neuron
LIF Equation: LLGS Equation:
Spiking Neurons
15
LLGS Based Spiking Neuron
LLG Equation Mimicking Spiking Neurons
DW-MTJ base IF Neurons
DW Integrating Property Mimicking IF Neuron
Input Spikes
MembranePotential
Output SpikesInput Spikes MTJ conductance
Arrangement of DW-MTJ Synapses in Array for STDP Learning
16
• Spintronic synapse in spiking neural networks exhibits spike timing dependent plasticity observed in biological synapses
• Programming current flowing through heavy metal varies in a similar nature as STDP curve
• Decoupled spike transmission and programming current paths assist online learning• 15fJ energy consumption per synaptic event which is ~10-100x lower in
comparison to SRAM based synapses /emerging devices like PCM
Spike-Timing Dependent Plasticity
Stochastic SNN
19Sengupta, Roy et al., IEEE Trans. On Electron Devices, 2016
We propose ANN-SNN conversion where the neural transfer function is interpreted as the spiking probability of the neuron in a particular time-step
Such a functionality is enabled by the stochastic device physics of switching in a Magnetic Tunnel Junction in presence of thermal noise
System-level simulations indicate energy consumption of 19.5nJ per image classification at the end of 50 time-steps of SNN simulation (>97% accuracy on MNIST dataset)
Computing with Coupled STNOs
Spin‐Neurons & Synapses: Coupled Spin‐Torque Oscillators
STNOs can be used to provide thresholding functionality with tunable threshold
Coupled
IREF
IREF ± |Iin |
Reference(STNO_R)
Processor(STNO_P)
Integrator
VOUT
When |Iin | is within the locking range (neuron threshold), two oscillators are locked. Otherwise, they are unlocked.
VOUT = VHIGH when unlockedVLOW when locked
Edge Detection using STNOs
Image
Ibias based on pixel intensity
Gilbert damping constant (alpha) = 0.01Saturation magnetization = 800 emu/ccMagnet volume (IMA) = 20x20x2 nm3Eb = 30kTlambda = 2epsilon prime = 0P = 0.9Hext = 11k Oe at 0.45 degrees from normal to the plane; Ibias range = 10uA to 50uA (i.e 10uA for black and 50uA for white pixels); Distance between STOs = 70nm (for coupling)
Summary
• Spintronics do show promise for low-power non-Boolean/brain-inspired computin– Need for new leaning techniques suitable for
emerging devices– Materials research, new physics, new devices,
simulation models• A long (but interesting) path ahead…