Neural Turing Machines (meetupfiles.meetup.com/1406240/2016-06-23_ml-ny-meetup.pdf)
TRANSCRIPT
Neural Turing Machines
Tristan Deleu (@tristandeleu), June 23, 2016
Deep Learning
The building blocks: Convolutional Layer, Fully Connected Layer, Recurrent Layer
[Figure: combining these blocks produces predictions for tasks such as Object Recognition, Object Detection, Image Segmentation, Speech Recognition, Language Processing, and others]
Examples
[Figures: building-block combinations and the predictions they produce]
• Object detection (e.g. face detection)
• Speech recognition (automatic speech recognition)
• Image segmentation
• Object recognition + language processing (e.g. image captioning)
• Language processing (e.g. sentiment analysis, machine translation)
Frameworks
Theano, Torch, TensorFlow, Keras, Chainer, Neon, CNTK, MXNet, Caffe, Lasagne
Theano + Lasagne
https://github.com/Lasagne/Lasagne/blob/master/examples/mnist.py
Neural Turing Machines
Recurrent Neural Network
[Figure: an RNN (e.g. an LSTM) unrolled over time: at each step t the cell takes the input x_t and the previous hidden state h_{t-1}, and produces the new hidden state h_t and the output y_t]
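The recurrence in the figure can be sketched with a minimal vanilla-RNN cell in NumPy. This is an illustration only: the sizes and the tanh cell are assumptions for the sketch, not the LSTM cell shown on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not from the talk)
n_in, n_hidden, n_out = 4, 8, 3

# Randomly initialized parameters
W_xh = rng.normal(scale=0.1, size=(n_hidden, n_in))
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
W_hy = rng.normal(scale=0.1, size=(n_out, n_hidden))
b_h = np.zeros(n_hidden)
b_y = np.zeros(n_out)

def rnn_forward(xs):
    """Unroll the recurrence h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)
    over a sequence of inputs, emitting y_t = W_hy h_t + b_y at each step."""
    h = np.zeros(n_hidden)                        # h_{t-1} starts at zero
    ys = []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)    # new hidden state h_t
        ys.append(W_hy @ h + b_y)                 # output y_t
    return np.stack(ys), h

xs = rng.normal(size=(5, n_in))                   # a length-5 input sequence
ys, h_final = rnn_forward(xs)
print(ys.shape)  # (5, 3): one output per time step
```

The hidden state is the only thing carried between steps, which is exactly why plain RNNs struggle to remember information over long sequences; the external memory introduced below addresses this.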
Memory-augmented Networks
[Figure: a neural network asked a question about "BOAT" consults stored facts: "Boats float on water", "You can't sail against the wind", "Boats do not fly", …]
• Inspired by neuroscience
• Memory-augmented networks add an external memory to neural networks to act as a knowledge base
• They keep track of intermediate computations, e.g. the story needed to answer the question in QA problems (Memory Networks & Dynamic Memory Networks)
Memory-augmented Networks
• Memory Networks
• Dynamic Memory Networks
• Neural GPU
• Neural Stack/Queue/DeQue
• Stack-augmented RNN
Turing Machine
[Figure: a Turing machine stepping over a tape of symbols (… 0 1 1 0 1 0 1 0 1 0 …); a transition table maps (current state, read symbol) to (operation, new state, write symbol), with states q0, q1, …]
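A machine of this kind is easy to simulate. The sketch below uses a hypothetical transition table (invert each bit, move right, halt on blank), since the exact table on the slide is not recoverable from the transcript:

```python
# Minimal Turing machine simulator. The transition table is a hypothetical
# example chosen for illustration -- not the table shown on the slide.
delta = {
    # (state, read symbol) -> (write symbol, head move, new state)
    ("q0", "0"): ("1", +1, "q0"),
    ("q0", "1"): ("0", +1, "q0"),
    ("q0", "_"): ("_",  0, "halt"),   # "_" is the blank symbol
}

def run(tape, state="q0", pos=0, max_steps=100):
    tape = list(tape)
    for _ in range(max_steps):
        if state == "halt":
            break
        read = tape[pos] if pos < len(tape) else "_"
        write, move, state = delta[(state, read)]
        if pos < len(tape):
            tape[pos] = write
        pos += move
    return "".join(tape)

print(run("0110"))  # -> "1001"
```

The key point of the slide is that this control logic is a discrete lookup table; the NTM replaces it with a differentiable neural network so it can be learned by gradient descent.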
Neural Turing Machine
[Figure: the same tape-and-table picture, but the discrete transition table is replaced by a neural network mapping input to output, and the discrete head is replaced by heads that emit a soft attention weighting w_t over a memory matrix M_t]
Turing Machine vs. Neural Turing Machine
Neural Turing Machine
[Figure: an NTM unrolled over time: at each step t a feedforward controller FF_t receives the input x_t and the previous read vector r_{t-1}, emits the output y_t, and its heads update the memory from M_{t-1} to M_t]
1. Controller  2. Read heads  3. Write heads
Neural Turing Machine
[Figure: the same architecture with an LSTM controller (LSTM_{t-1}, LSTM_t, LSTM_{t+1}) in place of the feedforward network; the read heads, write heads, and memory update M_{t-1} → M_t are unchanged]
1. Controller  2. Read heads  3. Write heads
Neural Turing Machine
[Figure: block diagram of the NTM: a controller sits between input and output and interacts with an external memory through read heads and write heads]
1. Controller  2. Read heads  3. Write heads  4. Memory
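The heads interact with the memory through soft attention, which is what keeps the whole structure differentiable. A minimal NumPy sketch of content-based addressing plus the read and erase/add write operations (the memory sizes and the sharpening parameter beta are illustrative assumptions):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def content_weights(M, k, beta):
    """Content-based addressing: cosine similarity between the key k
    emitted by the controller and each memory row, sharpened by beta."""
    sims = M @ k / (np.linalg.norm(M, axis=1) * np.linalg.norm(k) + 1e-8)
    return softmax(beta * sims)

def read(M, w):
    """Read head: a convex combination of memory rows, r = sum_i w(i) M(i)."""
    return w @ M

def write(M, w, erase, add):
    """Write head: erase then add, both weighted by w (fully differentiable)."""
    M = M * (1.0 - np.outer(w, erase))
    return M + np.outer(w, add)

rng = np.random.default_rng(1)
N, W = 6, 4                              # 6 memory slots of width 4 (illustrative)
M = rng.normal(size=(N, W))
k = M[2] + 0.01 * rng.normal(size=W)     # a key very close to memory row 2
w = content_weights(M, k, beta=50.0)
print(w.argmax())                        # attention focuses on the matching row
r = read(M, w)                           # read vector passed back to the controller
M = write(M, w, erase=np.ones(W), add=np.zeros(W))  # mostly clears the focused row
```

Because every operation above is a smooth function of the weights, gradients flow through the reads and writes, so the addressing behavior itself is learned from input/output examples.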
NTM-Lasagne: Open-source Library
medium.com/snips-ai
github.com/snipsco/ntm-lasagne
Algorithmic Tasks
• Goal: learn full algorithms only from input/output examples; we can generate as much data as we need
• Strong generalization: generalize beyond the data the NTM has seen during training (longer sequences, for example)
[Figure: input/output pairs sampled from a distribution P(X, Y)]
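Since the data distribution is fully under our control, training examples can be generated on the fly. A sketch of a copy-task example generator in the usual NTM setup (the sequence width and the extra EOS channel are assumptions for the sketch):

```python
import numpy as np

rng = np.random.default_rng(2)

def copy_task_example(seq_len, width=8):
    """One copy-task example: the input is a random binary sequence followed
    by an end-of-sequence (EOS) marker on an extra channel; the target is the
    same sequence, to be reproduced after the EOS."""
    seq = rng.integers(0, 2, size=(seq_len, width)).astype(float)
    # Input: the sequence, then one EOS step; the last channel flags the EOS
    inputs = np.zeros((2 * seq_len + 1, width + 1))
    inputs[:seq_len, :width] = seq
    inputs[seq_len, width] = 1.0
    # Target: blank while the sequence is presented, then the copy
    targets = np.zeros((2 * seq_len + 1, width))
    targets[seq_len + 1:] = seq
    return inputs, targets

x, y = copy_task_example(5)
print(x.shape, y.shape)  # (11, 9) (11, 8)
```

Strong generalization is then tested by training on short sequences and evaluating on much longer ones, as the length-120 and length-150 plots below illustrate.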
Copy task
[Figure: inputs are a random binary sequence followed by an EOS marker; outputs are the same sequence reproduced after the EOS]
Training
[Figures: copy behavior on held-out sequences of length 120 and length 150]
Repeat Copy task
[Figure: inputs are a binary sequence followed by a repeat count (e.g. x5) and an EOS marker; outputs are the sequence repeated that many times]
Associative Recall task
[Figure: inputs are a list of items followed by a query item; outputs are the item that came after the query in the list]
Priority Sort task
bAbI tasks
[Figure: entities (Mary, John, Sandra) linked to locations (bathroom, garden, hallway) by the story below]
Mary went to the garden. John went to the garden. Mary went back to the hallway. Sandra journeyed to the bathroom. John went to the hallway. Mary went to the bathroom.
Conclusion
• The NTM is able to learn algorithms only from examples
• It shows better generalization performance than other recurrent architectures (for example, LSTMs)
• Fully differentiable structure; drawback: generalization is still not quite perfect
• A new take on Artificial Intelligence: trying to teach machines to do things the same way we would learn them
• Resources:
  • Theano: http://deeplearning.net/software/theano/
  • Lasagne: http://lasagne.readthedocs.io/en/latest/
  • NTM-Lasagne: https://github.com/snipsco/ntm-lasagne
Thank you
@tristandeleu, June 23, 2016