Advances in Modeling Neocortex and its impact on machine intelligence
Jeff Hawkins
Numenta Inc.
VS265 Neural Computation
December 2, 2010
Documentation for the algorithms described in this talk can be found at www.numenta.com (papers)
Premise
1) The principles of brain function can be understood.
2) We can build machines that work on these principles.
3) Many machine learning, A.I., and robotics problems can only be solved this way.
Neocortex Is Our Focus
• 75% of volume of human brain
• All high level vision, audition, motor, language, thought
• Composed of a repetitive element
  - complex
  - hierarchical
Process
Algorithmic needs
Neurobiology (anatomy, physiology)
Biological model
Computer model
Empirical results
Our computer models are biologically and empirically driven.
Neocortex (large scale architecture)
Neocortex overall
- 1000 cm^2, 2 mm thick
- 30 billion cells
- 100 trillion synapses
Regions
- Nearly identical architecture
- Differentiated by connectivity
- Common algorithms
Hierarchy
- Convergence
- Temporal slowness
Hierarchical Temporal Memory (Basic)
Regions
- Learn common spatial patterns
  (Sparse Distributed Representations of input)
- Learn sequences of common spatial patterns
  (variable order transitions of SDRs)
- Pass stable representations up hierarchy
- Unfold sequences going down hierarchy
Hierarchy
- Reduces memory and training time
- Provides means of generalization
Sequence Memory is Key
Why is sequence memory so important?
- Prediction
- Motor behavior
- Time-based inference
- Spatial inference
Attributes
- High capacity
- Context dependent
- Robust
- Multiple simultaneous predictions
- Form stable representations of sequences
- On-line learning
Neocortical regions
Biology
- Five layers of cells
  Densely packed
  Massively interconnected
- Cells in columns have similar response properties
- Majority of connections are within layer
- Feed forward connections are few but strong
- Layers 4 and 3 are primary feed forward layers
  Layer 4 disappears as you ascend hierarchy
Hypothesis
- Common mechanism is used in each layer
- Each layer is a sequence memory
  Learns transitions of sparse distributed patterns
- Layer 4 learns first order transitions
  Ideal for spatial inference, “simple cells”
- Layer 3 learns variable order transitions
  Ideal for time-based inference, “complex cells”
- Layer 5 motor
  specific timing
- Layers 2, 6 feedback, attention
[Figure: layers 1, 2/3, 4, 5, 6, with projections to higher region and from lower region]
Neurons
Real neuron
Proximal dendrites
- Linear summation
- Feed forward connections
Distal dendrites
- Dozens of regions
- Non-linear integration
- Connections to other cells in layer
Synapses
- Thousands on distal dendrites
- Hundreds on proximal dendrites
- Numerous learning rules
- Forming and un-forming constantly
Output
- Variable spike rate
- Bursts of spikes
- Projects laterally and inter-layer
Not a neuron
- Sum of weighted synapses
- Non-linear function
- Scalar output
HTM neuron
Proximal dendrite
- Linear summation
- Feed forward connections
Distal dendrites
- Dozens of regions
- Threshold coincidence detectors
- Connections to other cells in layer
Synapses
- Thousands on distal dendrites (dozens per segment)
- Hundreds on proximal dendrite
- Scalar permanence
- Binary weight
Output
- Active state (fast or burst)
- Predictive state (slow)
- Projects laterally and inter-region
HTM Regions
What is an HTM region?
- A set of neurons arranged in columns
- Cells in column have same feed forward activation
- Cells in column have different response in context
What does a region do?
1) Creates a sparse distributed representation of input
2) Creates a representation of input in the context of prior inputs
3) Learns sequences of representations from 2)
4) Forms a prediction based on the current input in the context of previous inputs
   This prediction is a slow changing representation of sequence.
   It is the output of the region.
Cellular layer - 1 cell per column
Internal potential of cells (via feed forward input to proximal synapses)
Cells with highest potential fire first, inhibit neighbors
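The two steps above (feed forward potential, then winners inhibiting the rest) can be sketched in a few lines of Python. This is a minimal illustration, not Numenta's implementation: it assumes global top-k selection in place of the local neighbor inhibition named on the slide, and the sizes and the helper names `column_potentials` and `sparse_representation` are hypothetical.

```python
import random

def column_potentials(input_bits, proximal_synapses):
    """Linear summation: count the active input bits each column is connected to."""
    return [sum(input_bits[i] for i in synapses) for synapses in proximal_synapses]

def sparse_representation(potentials, num_active):
    """Keep only the num_active columns with the highest potential.

    Assumption: inhibition is global (top-k over all columns); the slide
    describes inhibition of neighbors, which would use a local radius.
    """
    ranked = sorted(range(len(potentials)), key=lambda i: potentials[i], reverse=True)
    return set(ranked[:num_active])

# Hypothetical sizes chosen only for illustration.
random.seed(0)
NUM_INPUTS, NUM_COLUMNS, SYNAPSES_PER_COLUMN, NUM_ACTIVE = 64, 50, 16, 2
proximal = [random.sample(range(NUM_INPUTS), SYNAPSES_PER_COLUMN)
            for _ in range(NUM_COLUMNS)]
input_bits = [1 if random.random() < 0.3 else 0 for _ in range(NUM_INPUTS)]

potentials = column_potentials(input_bits, proximal)
active_columns = sparse_representation(potentials, NUM_ACTIVE)
print(sorted(active_columns))
```

Only a small, fixed fraction of columns ends up active no matter how dense the input is, which is what makes the representation sparse.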
Sparse Distributed Representation of input (time = 1)
Sparse Distributed Representation of input (time = 2)
Prediction (via lateral connections to distal dendrite segments)
With 1 cell per column, all transitions are first order
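Why first order transitions are limiting can be shown with a toy example. The sketch below assumes single letters stand in for sparse distributed representations and uses a hypothetical helper `learn_first_order`; it is an illustration of the limitation, not the region's learning algorithm.

```python
from collections import defaultdict

def learn_first_order(sequences):
    """Record, for each pattern, every pattern that has directly followed it."""
    transitions = defaultdict(set)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            transitions[current].add(following)
    return transitions

# Two sequences that share the middle "BC" but end differently.
transitions = learn_first_order(["ABCD", "XBCY"])
print(sorted(transitions["C"]))  # ['D', 'Y']
```

After "C" the memory predicts both "D" and "Y", because it cannot tell whether the sequence began with "A" or "X". Resolving that requires representing "C" differently depending on what came before it.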
Cellular layer - 4 cells per column – no context
Sparse Distributed Representation of input (if unpredicted, all cells in a column fire)
Cellular layer - 4 cells per column – with prior context
Sparse Distributed Representation of input (if predicted, one cell in a column fires)
Represents input in the context of prior states (variable order sequence memory)
Cellular layer - 4 cells per column
Prediction in the context of prior states
Predicted columns
Unpredicted column
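The rule for multi-cell columns (all cells fire if the column was not predicted, only the predicted cell fires otherwise) might be sketched as follows. Cells are identified here by hypothetical `(column, cell)` pairs, and `activate_cells` is an illustrative name, not one from the talk.

```python
def activate_cells(active_columns, predictive_cells, cells_per_column=4):
    """Choose which cells fire in each active column.

    active_columns: set of column indices selected by feed forward input.
    predictive_cells: set of (column, cell) pairs that were in the
        predictive state before this input arrived.
    """
    active = set()
    for column in active_columns:
        predicted = {(c, i) for (c, i) in predictive_cells if c == column}
        if predicted:
            # Input was expected: only the predicted cell(s) fire,
            # which encodes the input in the context of prior states.
            active |= predicted
        else:
            # Input was unexpected: every cell in the column fires.
            active |= {(column, i) for i in range(cells_per_column)}
    return active

# Column 2 was predicted (cell 1); column 5 was not; column 7 was
# predicted but received no feed forward input, so it stays silent.
active = activate_cells({2, 5}, {(2, 1), (7, 0)})
print(sorted(active))  # [(2, 1), (5, 0), (5, 1), (5, 2), (5, 3)]
```

The same set of columns can therefore be expressed by many different cell patterns, one per context, which is what allows variable order sequences.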
HTM Neuron
Distal dendrite segments
- Act like coincidence detectors
  Recognize state of region
- When segment active, cell enters predictive state
- Typical threshold: 15 active synapses is sufficient
- Synapses formed from “potential synapse” pool
- Each segment can learn several patterns without error
- One cell can participate in many different sequences
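A distal segment acting as a threshold coincidence detector could look like the sketch below. The threshold of 15 and the 0.2 validity cutoff come from the slides; representing a segment as a dict from presynaptic cell id to permanence, and the function names, are assumptions made for illustration.

```python
ACTIVATION_THRESHOLD = 15   # "typical threshold" from the slide
CONNECTED_PERMANENCE = 0.2  # ">0.2 = valid" from the learning rules

def segment_is_active(segment, active_cells, threshold=ACTIVATION_THRESHOLD):
    """segment maps presynaptic cell id -> permanence (hypothetical layout).

    The segment fires when enough of its valid synapses see an active cell.
    """
    connected_and_active = sum(
        1 for cell, permanence in segment.items()
        if permanence > CONNECTED_PERMANENCE and cell in active_cells
    )
    return connected_and_active >= threshold

def cell_is_predictive(segments, active_cells):
    """A cell enters the predictive state if any one of its segments is active."""
    return any(segment_is_active(s, active_cells) for s in segments)

segment = {cell: 0.5 for cell in range(20)}          # 20 valid synapses
print(segment_is_active(segment, set(range(15))))    # True
print(segment_is_active(segment, set(range(14))))    # False
```

Because a segment needs only a subset of its synapses to match, it tolerates noise and can recognize more than one pattern.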
Learning rules
- If a segment is active
  Modify synapses on active segment
  Modify synapses on segment best matching t=-1
- Modify permanence for all potential synapses
  Increase for active synapses
  Decrease for inactive synapses
  Permanence range 0.0 to 1.0, >0.2 = valid
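The permanence update can be sketched as a single pass over a segment's potential synapses. The 0.0 to 1.0 range and the 0.2 validity cutoff are from the slide; the step size of 0.05 and the names `reinforce_segment` and `valid_synapses` are assumptions for illustration only.

```python
CONNECTED_PERMANENCE = 0.2  # ">0.2 = valid" from the slide

def reinforce_segment(segment, active_cells, increment=0.05, decrement=0.05):
    """Return a new segment with permanences nudged toward the current input.

    segment maps presynaptic cell id -> permanence. The increment and
    decrement values are hypothetical; the slide does not specify them.
    """
    updated = {}
    for cell, permanence in segment.items():
        if cell in active_cells:
            permanence += increment
        else:
            permanence -= decrement
        updated[cell] = min(1.0, max(0.0, permanence))  # keep within 0.0..1.0
    return updated

def valid_synapses(segment):
    """Synapses whose permanence exceeds the cutoff act with binary weight 1."""
    return {cell for cell, permanence in segment.items()
            if permanence > CONNECTED_PERMANENCE}

segment = {1: 0.18, 2: 0.22, 3: 0.98, 4: 0.02}
updated = reinforce_segment(segment, active_cells={1, 3})
print(sorted(valid_synapses(segment)))  # [2, 3]
print(sorted(valid_synapses(updated)))  # [1, 3]
```

In this example synapse 1 crosses the cutoff and becomes valid while synapse 2 drops below it, so synapses form and un-form through small scalar changes even though each one's effect on the cell is binary.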
Feed forward input
Lateral connections from other cells in region
Proximal dendrite
- Shared by all cells in a column
- Linear summation of input
- Boosted by duty cycle
- Synapses formed from “potential synapse” pool
- Leads to self-adjusting representations
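One way "boosted by duty cycle" could work is sketched below. The slide gives no formula, so everything here is an assumption: the moving-average duty cycle, the linear boost shape, the `max_boost` value, and the function names are hypothetical and chosen only to show the idea that rarely active columns get their input amplified.

```python
def update_duty_cycle(duty_cycle, was_active, period=100):
    """Moving average of how often a column has been active (assumed form)."""
    return (duty_cycle * (period - 1) + (1.0 if was_active else 0.0)) / period

def boost_factor(duty_cycle, min_duty_cycle, max_boost=3.0):
    """Multiplier applied to a column's summed feed forward input.

    Assumption: columns at or above the minimum duty cycle are not boosted;
    below it, the boost rises linearly up to max_boost for a silent column.
    min_duty_cycle must be greater than zero.
    """
    if duty_cycle >= min_duty_cycle:
        return 1.0
    return 1.0 + (max_boost - 1.0) * (1.0 - duty_cycle / min_duty_cycle)

print(boost_factor(0.05, 0.02))  # 1.0, column is active often enough
print(boost_factor(0.0, 0.02))   # 3.0, column has never been active
```

Under these assumptions a column that never wins the inhibition step gradually becomes more competitive, so all columns end up participating in the representation.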
HTM Cortical Learning Algorithm
Variable order sequence memory
Time-based and static inference
Massive predictive ability
Uses sparse distributed representations
High capacity
High noise immunity
On-line learning
Deep biological mapping
Self adjusting representations
What’s next
Commercial Applications
Numenta is applying these new algorithms to data analytics problems, e.g.
- Credit card fraud
- Large sensor environments
- Web click prediction
Research
Full documentation plus pseudo code at www.numenta.com (papers)
We will release software in 2011
All use is free for research purposes
Engage Numenta for further discussions
Employment
Interns and full time [email protected]
The Future of Machine Intelligence