TRANSCRIPT
J E Smith
June 2018
Missoula, MT
Space-Time Algebra:
A Model for Neocortical Computation
A Design Challenge
❑ Consider designing networks that compute functions as follows:
• Input values: encoded as patterns of transient events (e.g., "voltage spikes")
  Spike times encode values
• Computation: a wave of spikes passes from inputs to outputs
  Feedforward flow
  At most one spike per line
• Output values: the output spike pattern is decoded into a useful form
[Figure: a feedforward network. Input values x1…xm are encoded to spikes, pass through functional blocks F1…Fn, and the output spikes are decoded into output values z1…zp.]
Asynchronous, Local Computation
❑ Assign values according to spike times
• Non-spike is denoted as "∞"
❑ Each functional block operates asynchronously and locally:
• Begins computation when its first input spike arrives
• Operates on input values relative to first spike time
• Eventually produces an output spike or no spike ("∞")
[Figure: the same network annotated with example spike times. Inputs x1…xm spike at times 0, 1, 2, 3, …; internal lines carry spikes at times such as 4, 6, 7, 9, 12; outputs z1…zp spike at times such as 57, 60, 62, 69, 75.]
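To make the value assignment concrete, here is a minimal sketch (mine, not from the talk) representing a volley as a vector of spike times, with Python's math.inf standing in for the non-spike value ∞:

```python
import math

# A volley: one spike time per line; math.inf means "no spike" (∞).
volley = [7.0, 4.0, math.inf, 9.0]

# A block operates on input values relative to its first input spike.
t_first = min(volley)
print([t - t_first for t in volley])  # [3.0, 0.0, inf, 5.0]
```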
Why Would Anyone Want To Do This?
❑ Conceptually simple, but this really seems like a difficult way to design a
computing device, even if we assume ideal delays.
[Figure: the same annotated network as on the previous slide.]
Why Would Anyone Want To Do This?
❑ Call the functional blocks “neurons”, and we have an important class of
biologically plausible spiking neural networks
• Potentially, perform brain-like functions in a brain-like way
[Figure: the same network, here classifying an image labeled "parrot looking over its right wing".]
Spiking Neuron Model
❑ A volley of spikes is applied at inputs
❑ At each input’s synapse, a spike generates a pre-defined response function
• Synaptic weight (wi) determines amplitude: larger weight ⇒ higher amplitude
• Excitatory response shown; inhibitory response has opposite polarity
❑ Responses are summed linearly at neuron body
❑ Fire output spike if/when potential exceeds threshold value (θ)
❑ Basic Spike Response Model (SRM0) -- Kistler, Gerstner, & van Hemmen (1997)
[Figure: SRM0 neuron. Input spikes x1…xp arrive at synapses with weights w1…wp; each spike evokes a response function (higher weight ⇒ higher amplitude), the responses are summed (Σ) into the body potential, and the neuron emits output spike y when the potential crosses the threshold.]
Spiking Neural Networks
Communicating with Spikes: Rate Coding
❑ Rate Coding: The name just about says it all
• The first thing a typical computer engineer would think of doing
❑ A defining feature: Changing the value on a given line does not affect values on other lines
[Figure: two animation frames of four parallel spike trains over time. First frame: rates 8, 2, 5, 4. Second frame: the first line's rate changes to 3; the other lines are unaffected.]
Temporal Coding
❑ Use relative timing relationships across multiple parallel lines to encode information
❑ A defining feature: Changing the value on a given line may affect values on any/all the other lines
[Figure: two animation frames of parallel lines with precise timing relationships, values encoded as spike times relative to t = 0. First frame: spike times 0, 6, 3, 4. Second frame: spike times 2, 3, 0, 1.]
Plot on Same (Plausible) Time Scale
[Figure: the rate-coded and temporally coded examples plotted on the same plausible time scale, spanning 10 msec.]
❑ The temporal method is:
• An order of magnitude faster
• An order of magnitude more efficient (in number of spikes)
Neural Network Taxonomy
❑ RNNs and TNNs are two very different models, both with SNN implementations, and they should not be conflated
[Figure: taxonomy. Neural Networks divide into rate theory and temporal theory. Rate theory yields ANNs (a rate abstraction, with rate or spike implementations) and RNNs (spike implementation); temporal theory yields TNNs (spike implementation). The spike implementations are SNNs. ANNs and RNNs are subspecies that can hybridize; TNNs are an entirely different genus. Insets: a rate-coded spike train (rates 8, 2, 5, 4) versus spike times relative to t = 0 encoding values 0, 6, 3, 4 with precise timing relationships.]
Space-Time Algebra
Space-Time Computing Network (informal)
A Space-Time Computing Network is a feedforward composition of functions, Fi, where:
1) Each Fi is a total function w/ finite implementation
2) Each Fi operates asynchronously
• Begins computing when the first input spike arrives
3) Each Fi is causal
• The output spike time is independent of later input spike times
4) Each Fi is invariant
• If all the input spikes are delayed by some constant amount, then the output spike is delayed by the same constant amount
[Figure: the feedforward network from the earlier slides, annotated with example spike times.]
Temporal Invariance
❑ A key feature for supporting localized, asynchronous operation
❑ Normalize a volley by subtracting the first spike time from all the spike times
• A normalized volley has at least one spike at time = 0
• Each functional block observes only normalized volleys
❑ A functional block interprets inputs and outputs according to its own local time frame
• The network designer works with small non-negative integers
[Figure: by invariance, block F3 sees the global-time input volley 4, 1, 3, …, ∞, 9 as the normalized local-time volley 3, 0, 2, …, ∞, 8.]
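As an illustration (my sketch, not from the talk), normalization and an invariance check can be written directly; min over a volley serves as an example of an invariant function:

```python
import math

def normalize(volley):
    """Subtract the first spike time from all the spike times."""
    t_first = min(volley)
    return [t - t_first for t in volley]

def is_invariant(f, volley, delay):
    """Delaying all inputs by `delay` should delay the output by `delay`."""
    return f([t + delay for t in volley]) == f(volley) + delay

volley = [4.0, 1.0, 3.0, math.inf, 9.0]
print(normalize(volley))               # [3.0, 0.0, 2.0, inf, 8.0]
print(is_invariant(min, volley, 5.0))  # True
```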
Space-Time Algebra
❑ Space-time algebra is a bounded distributive lattice
• Values: 0, 1, 2, …, ∞
• Interpretation: points in time
• Not complemented
[Figure: Hasse diagram of the chain 0, 1, 2, 3, … with 0 at the Bottom and ∞ at the Top.]
Primitive S-T Operations
❑ inc ("atomic delay"): b = a + 1
❑ min ("atomic excitation"): if a < b then c = a, else c = b
❑ lt, written ≺ ("atomic inhibition"): if a < b then c = a, else c = ∞
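The three primitives translate directly into code; a minimal sketch using math.inf for ∞ (st_min avoids shadowing Python's built-in min):

```python
import math

INF = math.inf  # ∞: "no spike"

def inc(a, n=1):
    """Atomic delay: b = a + n (∞ stays ∞)."""
    return a + n

def st_min(a, b):
    """Atomic excitation: if a < b then a, else b."""
    return a if a < b else b

def lt(a, b):
    """Atomic inhibition: if a < b then a, else ∞."""
    return a if a < b else INF

print(st_min(3, 5))  # 3
print(lt(5, 3))      # inf
print(inc(INF))      # inf
```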
Space-Time Networks
❑ Theorem: Any feedforward composition of s-t functions is an s-t function
⇒ Build networks by composing s-t primitives
• Example: [Figure: a small feedforward network of inc, min, and lt elements computing an output y from inputs x1–x4, with multi-step delays on the wires.]
• Note: n increments in series is shorthand for a delay of n, giving a + n
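Since the slide's example network does not survive extraction, here is a hypothetical composition (reusing inc, st_min, and lt from the sketch above) to illustrate the theorem:

```python
# A hypothetical feedforward composition: y = (x1 + 2) ∧ (x2 ≺ (x3 + 1))
def example_network(x1, x2, x3):
    return st_min(inc(x1, 2), lt(x2, inc(x3, 1)))

print(example_network(0, 1, 2))  # st_min(2, lt(1, 3)) = st_min(2, 1) = 1
```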
Elementary Functions
❑ Increment and identity are the only one-input functions
❑ Table of all two-input s-t functions
• Many are non-commutative

function                                        name              symbol
if a < b then a; else b                         min               ∧
if a ≠ b then a; else ∞                         not equal         ≢
if a < b then a; else if b < a then b; else ∞   exclusive min
if a > b then a; else b                         max               ∨
if a > b then a; else if b > a then b; else ∞   exclusive max
if a = b then a; else ∞                         equal             ≡
if a ≤ b then a; else ∞                         less or equal     ≼
if a < b then a; else ∞                         less than         ≺
if a ≥ b then a; else ∞                         greater or equal  ≽
if a > b then a; else ∞                         greater than      ≻
Implementing Some Elementary Functions
❑ Greater or equal, using lt (ack: Cristóbal Camarero):
   c = a ≽ b = a ≺ (a ≺ b)
   if a ≥ b then c = a; else c = ∞
❑ Max, using ge and min:
   c = a ∨ b = (a ≽ b) ∧ (b ≽ a)
❑ Equals, using ge and max:
   c = a ≡ b = (a ≽ b) ∨ (b ≽ a)
   if a = b then c = a; else c = ∞
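These constructions check out in code; a sketch layered on the primitives above (INF, st_min, lt):

```python
def ge(a, b):
    """a ≽ b = a ≺ (a ≺ b): a if a ≥ b, else ∞."""
    return lt(a, lt(a, b))

def st_max(a, b):
    """a ∨ b = (a ≽ b) ∧ (b ≽ a)."""
    return st_min(ge(a, b), ge(b, a))

def eq(a, b):
    """a ≡ b = (a ≽ b) ∨ (b ≽ a): a if a = b, else ∞."""
    return st_max(ge(a, b), ge(b, a))

assert ge(5, 3) == 5 and ge(3, 5) == INF
assert st_max(2, 7) == 7
assert eq(4, 4) == 4 and eq(4, 5) == INF
```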
Completeness
❑ Theorem: min, inc, and lt are functionally complete for the set of s-t functions
❑ Proof: by construction of a normalized function table
• Equals implemented w/ min and lt
• Example: a table with 3 entries

x1 x2 x3 | y
 0  1  2 | 3
 1  0  2 | ∞
 2  2  0 | 2

[Figure: a network of min, inc, and lt elements implementing this three-entry table.]
Implement TNNs With S-T Primitives
❑ Corollary: All TNNs can be implemented with min, inc, and lt
• Theorem/Corollary say we can do it… so let’s construct basic TNN elements
• Focus on excitatory neurons and lateral (winner-take-all) inhibition
[Figure: a TNN column of excitatory neurons with lateral (winner-take-all) inhibition, from Bichler et al. 2012]
Quick Review: SRM0 Neuron Model
[Figure: the SRM0 neuron from the earlier slide: input spikes x1…xp, weighted synaptic response functions w1…wp, summation (Σ) into the body potential, output spike y.]
SRM0 Neurons
❑ Response Functions
• Can be specified as times of up and down steps
• This is not a toy example; it is realistically low resolution
[Figure: a piecewise-constant response function with amplitude 0–5 over times 0–12. For an input spike at time x, the up steps occur at x+1, x+1, x+2, x+2, x+3 and the down steps at x+5, x+7, x+8, x+10, x+12.]
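In code, such a response function is just two lists of step offsets; a minimal sketch (the offsets are those in the figure above):

```python
# Step offsets relative to the input spike time x.
UP_OFFSETS   = [1, 1, 2, 2, 3]    # each up step raises the potential by 1
DOWN_OFFSETS = [5, 7, 8, 10, 12]  # each down step lowers it by 1

def step_times(x, offsets):
    """Absolute step times for an input spike arriving at time x."""
    return [x + o for o in offsets]

print(step_times(4, UP_OFFSETS))  # [5, 5, 6, 6, 7]
```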
Bitonic Sort (Batcher 1968)
❑ A key component of the SRM0 construction
❑ A Bitonic Sorting Network is a composition of max/min elements
❑ max/min are s-t functions, so Sort is an s-t function
[Figure: a Bitonic Sort Array with inputs x1…xm and sorted outputs z1…zm; log2(m)·(log2(m)+1)/2 layers of m/2 compare elements, each producing max(a, b) and min(a, b).]
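For reference, a compact recursive sketch of Batcher's bitonic sort (standard algorithm, power-of-two input length); every operation is a compare-exchange, i.e., a max/min pair, so the whole network is an s-t function:

```python
import math

def bitonic_sort(xs, ascending=True):
    """Batcher bitonic sort; len(xs) must be a power of two."""
    if len(xs) <= 1:
        return list(xs)
    half = len(xs) // 2
    up = bitonic_sort(xs[:half], True)
    down = bitonic_sort(xs[half:], False)
    return bitonic_merge(up + down, ascending)

def bitonic_merge(xs, ascending):
    if len(xs) <= 1:
        return list(xs)
    half = len(xs) // 2
    xs = list(xs)
    for i in range(half):
        a, b = xs[i], xs[i + half]
        lo, hi = (a, b) if a < b else (b, a)   # compare element: min/max
        xs[i], xs[i + half] = (lo, hi) if ascending else (hi, lo)
    return bitonic_merge(xs[:half], ascending) + bitonic_merge(xs[half:], ascending)

print(bitonic_sort([6.0, 2.0, math.inf, 0.0]))  # [0.0, 2.0, 6.0, inf]
```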
Complete SRM0 neuron
❑ Up Network: sort the times of all up steps
❑ Down Network: sort the times of all down steps
❑ A ≺ & min tree determines the first time the no. of up steps ≥ no. of down steps + θ
• This is the first time the threshold is reached
• Note: error in paper; q − θ + 1, not q − (θ + 1)
❑ Supports all possible finite response functions
• I.e., all response functions that can be physically implemented
• Includes inhibitory (negative) response functions
[Figure: the complete SRM0 construction. The up steps u from all synapse response functions feed an Up Network that sorts smallest to largest; the down steps d feed a Down Network that sorts likewise. Each ≺ element compares an up-network output with the corresponding down-network output, and a min tree selects the earliest time at which the up-step count leads the down-step count by θ, producing the output spike z.]
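A behavioral sketch of the same computation (mine): rather than building the sort networks and the ≺/min tree, it scans the sorted step events and returns the first time the up-step count leads the down-step count by θ:

```python
import math

INF = math.inf

def srm0_fire_time(input_spikes, up_offsets, down_offsets, theta):
    """First time (#up steps − #down steps) reaches theta, else ∞."""
    events = []  # (time, +1) for up steps, (time, -1) for down steps
    for x, ups, downs in zip(input_spikes, up_offsets, down_offsets):
        if x == INF:
            continue  # no input spike ⇒ no response at this synapse
        events += [(x + o, +1) for o in ups] + [(x + o, -1) for o in downs]
    potential = 0
    for t, delta in sorted(events):  # at equal times, down steps apply first
        potential += delta
        if potential >= theta:
            return t
    return INF  # threshold never reached: no output spike

ups, downs = [1, 1, 2, 2, 3], [5, 7, 8, 10, 12]  # response from the slide above
print(srm0_fire_time([0, 1], [ups, ups], [downs, downs], theta=6))  # 2
```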
Lateral Inhibition
❑ Inhibitory neurons modeled en masse
❑ Winner-Take-All (WTA)
• Only the first spike in a volley is allowed to pass from inputs to outputs
• If there is a tie for first, all first spikes pass
• Aside: this is an example of physical feedback, but no functional feedback
[Figure: WTA lateral inhibition. Inputs x1…xp with spike times such as 5, 7, 6 produce outputs y1…yp in which only the first spike (time 5) passes.]
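A one-function sketch of WTA lateral inhibition (mine), matching the tie rule above:

```python
import math

def wta(volley):
    """Pass only the first spike(s); ties for first all pass."""
    t_first = min(volley)
    return [t if t == t_first else math.inf for t in volley]

print(wta([5, 7, 6]))  # [5, inf, inf]
```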
TNN Summary
❑ We have implemented the primary TNN components as s-t functions
• Neurons: [Figure: the complete SRM0 sort-network construction, repeated from the earlier slide]
• Lateral inhibition: [Figure: the WTA lateral inhibition network, repeated; column figure from Bichler et al. 2012]
Race Logic: A Direct Implementation
Race Logic: Shortest Path
❑ Find the shortest path in a weighted directed graph
• Weights implemented as shift-register delays
• Min functions implemented as AND gates
❑ By Madhavan, Sherwood, Strukov (2015)
[Figure: a weighted directed graph with edge weights between 1 and 8. All inputs are asserted at t = 0; each interior node fires at its earliest arrival time (t = 1, 2, 4, …), and the output fires at t = 7, the shortest-path distance.]
❑ S-T primitives implemented directly with conventional digital circuits
• Signal via 1 → 0 transitions rather than spikes
⇒ We can implement SRM0 neurons and WTA inhibition with off-the-shelf CMOS
⇒ Very fast and efficient TNNs
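A software analogue of race-logic shortest path (my sketch, for a DAG processed in topological order): each node fires at the min over its incoming edges of the predecessor's firing time plus the edge delay, which is what the delay and min elements compute:

```python
import math

def race_shortest_path(edges, sources, target, topo_order):
    """edges: {node: [(predecessor, delay), ...]}; sources fire at t = 0."""
    t = {v: (0 if v in sources else math.inf) for v in topo_order}
    for v in topo_order:
        for u, delay in edges.get(v, []):
            t[v] = min(t[v], t[u] + delay)  # min element: first arrival wins
    return t[target]

edges = {"b": [("a", 3)], "c": [("a", 6), ("b", 2)], "d": [("b", 4), ("c", 1)]}
print(race_shortest_path(edges, {"a"}, "d", ["a", "b", "c", "d"]))  # 6
```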
Generalized Race Logic
❑ Construct TNN architectures in the neuroscience domain
❑ Implement directly with off-the-shelf CMOS:
• max(a, b): OR gate
• a + 1: D flip-flop (one clock of delay, D → Q)
• min(a, b): AND gate
• a ≺ b: a NOR-gate latch circuit with reset
Concluding Remarks
The Temporal Resource
❑ The flow of time has huge engineering advantages
• It is free – time flows whether we want it to or not
• It requires no space
• It consumes no energy
❑ Yet, we (humans) try to eliminate the effects of time when
constructing computer systems
• Assume worst case delays (w/ synchronizing clocks) & delay-insensitive asynchronous circuits
• This may be the best choice for conventional computing problems and
technologies
❑ How about natural evolution?
• Tackles completely different set of computing problems
• With a completely different technology
The flow of time can be used effectively as a
communication and computation resource.
The Box: The way we (humans) perform computation
June 2018 copyright JE Smith 35
❑ We try to eliminate temporal effects when implementing functions
• s-t uses the uniform flow of time as a key resource
❑ We use add and mult as primitives for almost all mathematical models
• Neither add nor mult is an s-t function
❑ We prefer high resolution (precision) data representations
• s-t practical only for very low-res direct implementations
❑ We strive for general functional completeness
• s-t primitives complete only for s-t functions
• There is no inversion, complementation, or negation
❑ We strive for algorithmic, top-down design
• TNNs are based on implied functions
• Bottom-up, empirically-driven design methodology
A Remarkable Conjecture
❑ Addition and Multiplication are not space-time functions
• They fail invariance:
(a + 1) + (b + 1) ≠ (a + b) + 1
(a + 1) · (b + 1) ≠ (a · b) + 1
❑ Conventional machine learning models depend on addition and
multiplication as primitives
• Vector inner product: (inputs · weights)
IF the brain’s cognitive functions depend on space-time operations
Then: conventional machine learning, on its current trajectory, will never
implement the brain’s cognitive functions
s-t algebra is more than a different set of primitives –
it describes a different set of functions
Acknowledgements
Raquel Smith, Mikko Lipasti, Mark Hill, Margaret Martonosi, Cristóbal
Camarero, Mario Nemirovsky, Tim Sherwood, Advait Madhavan, Shlomo Weiss,
Ido Guy, Ravi Nair, Joel Emer, Quinn Jacobson, Abhishek Bhattacharjee