Neural Turing Machines
This is a brief summary of the paper
“Neural Turing Machines” (http://arxiv.org/abs/1410.5401)
Written by A. Graves, G. Wayne, and I. Danihelka
Google DeepMind, London, UK
“Neural Turing Machines” are, in a single phrase, Neural Networks that can be
coupled to external memory.
The combined system is analogous to a Turing Machine.
Neural Network
・A Neural Network (NN) learns from a large amount of observational data. (Each data point is a tuple of [External Input, External Output].)
[Figure: Neural Network]
・A Recurrent Neural Network (RNN) introduces directed cycles into the NN, which work as a sort of internal memory.
(The current state is determined by the previous state and the External Input.)
[Figure: Recurrent Neural Network, with a directed cycle]
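The recurrence described above (current state determined by previous state and External Input) can be sketched as follows. This is a minimal illustration, assuming a tanh activation and hand-picked weight matrices; the names rnn_step, W_STATE and W_INPUT are hypothetical, and a real RNN would learn its weights from data.

```python
import math

def rnn_step(prev_state, external_input, w_state, w_input):
    """One recurrent step: the new state mixes the previous state and the input."""
    new_state = []
    for i in range(len(prev_state)):
        total = sum(w_state[i][j] * prev_state[j] for j in range(len(prev_state)))
        total += sum(w_input[i][k] * external_input[k] for k in range(len(external_input)))
        new_state.append(math.tanh(total))
    return new_state

# Hypothetical hand-picked weights (assumption: not learned, for illustration only).
W_STATE = [[0.5, 0.0], [0.0, 0.5]]
W_INPUT = [[1.0], [-1.0]]

state = [0.0, 0.0]
for x in ([1.0], [0.0], [0.0]):
    # The state is fed back in, so the first input still affects later states.
    state = rnn_step(state, x, W_STATE, W_INPUT)
```

After the loop, the state is still non-zero even though the last two inputs were zero, which is the "internal memory" effect of the directed cycle.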
・A “Neural Turing Machine” is an NN that can be coupled to external memory.
(The Controller is an NN with parameters for coupling to the external memory.)
[Figure: Neural Turing Machine coupled to an External Memory]
・ Read/Write heads use weights to access the external memory.
・ Weights are determined by the parameters of the controller.
・ Parameters are learned from a large amount of external I/O data.
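[Figure: the Controller (an NN with parameters for adjusting weights) accesses the External Memory through a Read head and a Write head using weighted access. The memory is an N × M matrix: N locations, each holding a vector of size M. The Write head uses e to erase vectors and a to add new vectors.]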
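Weighted access to the N × M memory can be sketched as below. This is a minimal illustration, assuming the read and write rules of the original paper: a read returns the weighted sum of the memory rows, and a write first erases with e and then adds a, both scaled by the weight at each location. The function names and toy values are illustrative, not taken from the slides.

```python
def read(memory, weights):
    """Weighted read: r[j] = sum over locations i of weights[i] * memory[i][j]."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]

def write(memory, weights, erase, add):
    """Weighted write: M[i][j] <- M[i][j] * (1 - w[i] * e[j]) + w[i] * a[j]."""
    return [
        [row[j] * (1.0 - w * erase[j]) + w * add[j] for j in range(len(row))]
        for w, row in zip(weights, memory)
    ]

# Toy memory with N = 4 locations and M = 3 values per location.
memory = [[0.0, 0.0, 0.0] for _ in range(4)]
w = [0.0, 1.0, 0.0, 0.0]  # all weight focused on location 1

memory = write(memory, w, erase=[1.0, 1.0, 1.0], add=[0.2, 0.5, 0.9])
r = read(memory, w)  # reads back the vector stored at location 1
```

Because every operation is a weighted sum, the weights can also be spread over several locations, which is what lets the whole system be trained by gradient descent.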
How to access external memories
[Figure labels: External Input, External Output]
Content Addressing: weight adjustment based on the content at each location.
Interpolation: determines how much of the previous weight state is carried over.
Convolutional Shift and Sharpening: weight adjustment based on the location in memory.
How to update weight
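The weight update stages listed above (content addressing, interpolation, convolutional shift, sharpening) can be sketched as one pipeline. This is a minimal illustration, assuming the formulation of the original paper: cosine similarity with a softmax for content addressing, a gate g for interpolation, a circular convolution for the shift, and a power gamma for sharpening. The parameter names beta, g, s and gamma follow the paper rather than the slides, and the function names are hypothetical.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def content_addressing(memory, key, beta):
    """Softmax over beta-scaled cosine similarity between the key and each row."""
    scores = [beta * cosine(key, row) for row in memory]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def interpolate(w_content, w_prev, g):
    """g = 1 uses only the content weights, g = 0 keeps the previous weights."""
    return [g * c + (1.0 - g) * p for c, p in zip(w_content, w_prev)]

def shift(w, s):
    """Circular convolution; s maps an integer shift to its probability."""
    n = len(w)
    out = [0.0] * n
    for j, wj in enumerate(w):
        for offset, prob in s.items():
            out[(j + offset) % n] += wj * prob
    return out

def sharpen(w, gamma):
    """Raise each weight to the power gamma (>= 1) and renormalise."""
    powered = [x ** gamma for x in w]
    total = sum(powered)
    return [p / total for p in powered]

def address(memory, w_prev, key, beta, g, s, gamma):
    w = content_addressing(memory, key, beta)
    w = interpolate(w, w_prev, g)
    w = shift(w, s)
    return sharpen(w, gamma)

# Toy example: ignore content (g = 0) and move the focus one location forward.
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_next = address(memory, w_prev=[1.0, 0.0, 0.0], key=[0.0, 1.0],
                 beta=1.0, g=0.0, s={1: 1.0}, gamma=1.0)
```

In the toy example the focus moves from location 0 to location 1, which is the kind of location-based step that a copy loop needs. Setting g = 1 with a large beta instead focuses on the row most similar to the key.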
Result of copy algorithm
・ The NTM learns some form of copy algorithm.
・ The NTM performs better than an LSTM (a kind of RNN).
・ Even the NTM's copy algorithm makes some mistakes on long sequences (as indicated by the red arrow).
NTM
・ Outputs are supposed to be a copy of targets.
Result of copy algorithm
LSTM
・ Outputs are supposed to be a copy of targets.
How the NTM uses external memory for the copy algorithm
・ All weights focus on a single location.
・ Read locations exactly match the write locations.
[Figure panels: External Inputs/Outputs; Adds/Reads (vectors to memory); Write/Read Weightings]
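The behaviour described above (weights focused on a single location, reads matching writes) can be imitated by hand. This is a minimal sketch, assuming hard one-hot weightings that advance by one location per step; the learned NTM uses soft weightings, and the function names here are hypothetical.

```python
def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def copy_with_memory(inputs, num_locations):
    """Write each input to its own location, then read the locations back in order."""
    if len(inputs) > num_locations:
        raise ValueError("not enough memory locations for the sequence")
    width = len(inputs[0])
    memory = [[0.0] * width for _ in range(num_locations)]
    write_locations, read_locations = [], []

    # Write phase: the write weighting is focused on one location per step.
    for t, vector in enumerate(inputs):
        w = one_hot(t, num_locations)
        for i in range(num_locations):
            for j in range(width):
                memory[i][j] += w[i] * vector[j]
        write_locations.append(w.index(max(w)))

    # Read phase: the read weighting visits the same locations in the same order.
    outputs = []
    for t in range(len(inputs)):
        w = one_hot(t, num_locations)
        outputs.append([sum(w[i] * memory[i][j] for i in range(num_locations))
                        for j in range(width)])
        read_locations.append(w.index(max(w)))
    return outputs, write_locations, read_locations

inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, write_locations, read_locations = copy_with_memory(inputs, num_locations=5)
```

Here the outputs equal the inputs and the read locations equal the write locations, mirroring the pattern reported for the trained NTM.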
How the NTM uses external memory for the repeat copy algorithm
・ All weights focus on a single location.
・ The read head repeatedly revisits the locations written by the write head.
Results of associative recall algorithm
・ The NTM correctly produces the red box item after it sees the green box item.
Results of Dynamic N-Grams
・ The NTM predicts the next bit almost as well as the optimal estimator.
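Optimal: P(B = 1 | N1, N0, c) = (N1 + 1/2) / (N1 + N0 + 1)
(c is the context formed by the previous bits; N1 and N0 are the numbers of 1s and 0s observed so far after that context)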
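The optimal estimator can be sketched as a running count per context. This is a minimal illustration, assuming the estimator from the original paper, P(B = 1 | N1, N0, c) = (N1 + 1/2) / (N1 + N0 + 1), where N1 and N0 count the 1s and 0s observed so far after context c. The function names and the context_length parameter are hypothetical.

```python
from collections import defaultdict

def optimal_estimate(n1, n0):
    """Probability that the next bit is 1, given the counts seen after this context."""
    return (n1 + 0.5) / (n1 + n0 + 1.0)

def predict_sequence(bits, context_length):
    """Predict each bit from the counts gathered so far for its preceding context."""
    counts = defaultdict(lambda: [0, 0])  # context -> [N0, N1]
    predictions = []
    for t in range(context_length, len(bits)):
        context = tuple(bits[t - context_length:t])
        n0, n1 = counts[context]
        predictions.append(optimal_estimate(n1, n0))
        counts[context][bits[t]] += 1  # update the counts after predicting
    return predictions

predictions = predict_sequence([1, 1, 1, 1], context_length=1)
```

With no observations the estimate is 0.5, and it moves towards 1 as more 1s are seen after the same context.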
Results of Priority Sort
・ The write head writes to locations according to a linear function of priority.
・ The read head reads from locations in increasing order.
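The strategy described above can be imitated by hand. This is a minimal sketch, assuming hard (integer) locations instead of the soft weightings the NTM learns; slope and intercept are hypothetical constants standing in for the learned linear function, and priority_sort is an illustrative name.

```python
def priority_sort(items, num_locations, slope, intercept):
    """items is a list of (vector, priority) pairs.

    Each vector is written to a location that is a linear function of its
    priority, then the locations are read back in increasing order.
    """
    memory = [[] for _ in range(num_locations)]
    for vector, priority in items:
        location = round(slope * priority + intercept)
        location = min(max(location, 0), num_locations - 1)  # stay inside memory
        memory[location].append(vector)

    outputs = []
    for slot in memory:  # read locations in increasing order
        outputs.extend(slot)
    return outputs

items = [([1, 1, 0], 0.5), ([0, 0, 1], -1.0), ([1, 0, 1], 0.0)]
result = priority_sort(items, num_locations=9, slope=4.0, intercept=4.0)
```

With a positive slope, reading in increasing location order returns the vectors in increasing priority order; a negative slope would reverse that order.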
・“Neural Turing Machines” are, in a single phrase, Neural Networks that can be coupled to external memory.
Conclusion
・ We have seen the NTM's ability to use external memory through its application to copy, repeat copy, associative recall, dynamic N-grams, and priority sort.
・ Readers who want more detail than this summary gives are referred to the original paper (http://arxiv.org/abs/1410.5401).