Introduction to Deep Learning with Python
![Page 1: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/1.jpg)
From multiplication to convolutional networks
How to do ML with Theano
![Page 2: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/2.jpg)
Today’s Talk
● A motivating problem
● Understanding a model based framework
● Theano
○ Linear Regression
○ Logistic Regression
○ Net
○ Modern Net
○ Convolutional Net
![Page 3: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/3.jpg)
Follow along
Tutorial code at: https://github.com/Newmu/Theano-Tutorials
Data at: http://yann.lecun.com/exdb/mnist/
Slides at: http://goo.gl/vuBQfe
![Page 4: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/4.jpg)
A motivating problem
How do we program a computer to recognize a picture of a handwritten digit as a 0-9?
What could we do?
![Page 5: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/5.jpg)
A dataset - MNIST
What if we have 60,000 of these images and their labels?
X = images
Y = labels
X = (60000 x 784) #matrix (list of lists)
Y = (60000) #vector (list)
Given X as input, predict Y
![Page 7: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/7.jpg)
An idea
For each image, find the “most similar” image and guess that as the label.
KNearestNeighbors ~95% accuracy
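The nearest-neighbour idea can be sketched in a few lines of plain Python. This is only an illustration on a made-up toy dataset of 4-pixel "images", not MNIST; the names `squared_distance` and `nearest_neighbor_label` are hypothetical, and squared Euclidean distance is an assumed choice for "most similar".

```python
def squared_distance(a, b):
    """Sum of squared pixel differences between two equal-length images."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_neighbor_label(train_X, train_Y, image):
    """Return the label of the training image closest to `image`."""
    best_index = min(range(len(train_X)),
                     key=lambda i: squared_distance(train_X[i], image))
    return train_Y[best_index]


# Toy data (hypothetical): each "image" is a list of 4 pixel intensities.
train_X = [[0.0, 0.0, 1.0, 1.0],
           [1.0, 1.0, 0.0, 0.0],
           [0.9, 0.8, 0.1, 0.0]]
train_Y = [0, 1, 1]

# The query is nearly identical to the first training image, so it gets label 0.
prediction = nearest_neighbor_label(train_X, train_Y, [0.1, 0.0, 0.9, 1.0])
print(prediction)
```

On real MNIST the same loop would compare each test image against all 60,000 training images, which is simple but slow.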
![Page 8: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/8.jpg)
Trying things
Make some functions computing relevant information for solving the problem
![Page 9: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/9.jpg)
What we can code
Make some functions computing relevant information for solving the problem
feature engineering
![Page 10: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/10.jpg)
What we can code
Hard coded rules are brittle and often aren’t obvious or apparent for many problems.
![Page 11: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/11.jpg)
A Machine Learning Framework
Figure: Inputs -> Computation (Model) -> Outputs, illustrated with the digit 8.
![Page 12: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/12.jpg)
A … model? - GoogLeNet
from arXiv:1409.4842v1 [cs.CV] 17 Sep 2014
![Page 13: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/13.jpg)
A very simple model
Input: 3
Computation: mult by x
Output: 12
![Page 19: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/19.jpg)
Theano intro
● imports
● theano symbolic variable initialization
● our model
● compiling to a python function
● usage
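The code these labels annotate is not preserved in this transcript. As a rough, library-free analogy of the same workflow (declare symbols, build a model from them, compile, call), here is a sketch in plain Python. `Scalar`, `Product`, and `compile_function` are hypothetical stand-ins invented for this illustration and are not Theano's API.

```python
class Scalar:
    """Placeholder for a number that is supplied later (a "symbolic" scalar)."""

    def __mul__(self, other):
        return Product(self, other)


class Product:
    """Graph node recording 'left * right' without computing anything yet."""

    def __init__(self, left, right):
        self.left = left
        self.right = right


def compile_function(inputs, output):
    """Turn a recorded expression into an ordinary Python function."""

    def evaluate(node, values):
        if isinstance(node, Product):
            return evaluate(node.left, values) * evaluate(node.right, values)
        return values[id(node)]  # a Scalar: look up the value bound to it

    def compiled(*args):
        values = {id(symbol): arg for symbol, arg in zip(inputs, args)}
        return evaluate(output, values)

    return compiled


# symbolic variable initialization
a = Scalar()
b = Scalar()

# our model: nothing is multiplied here, the expression is only recorded
y = a * b

# compiling to a python function
multiply = compile_function([a, b], y)

# usage
result = multiply(3, 4)
print(result)  # 12
```

The point of the separation is that the model is described once as an expression and only later turned into something callable, which is what lets a framework optimize it or derive gradients from it.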
![Page 30: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/30.jpg)
Theano
● imports
● training data generation
● symbolic variable initialization
● our model
● model parameter initialization
● metric to be optimized by model
● learning signal for parameter(s)
● how to change parameter based on learning signal
● compiling to a python function
● iterate through data 100 times and train model on each example of input, output pairs
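The steps listed on this slide can be sketched without Theano as a one-parameter linear regression trained by per-example gradient descent. This is a plain-Python approximation, not the slide's code: the true slope of 2, the noise level, the learning rate of 0.01, and the squared-error cost are all assumptions, and the gradient is written out by hand where Theano would derive it.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# training data generation: y = 2x plus a little Gaussian noise (assumed values)
train_X = [-1.0 + 2.0 * i / 100 for i in range(101)]
train_Y = [2.0 * x + random.gauss(0.0, 0.1) for x in train_X]

# model parameter initialization
w = 0.0
learning_rate = 0.01  # assumed step size


def model(x, w):
    """Our model: a single multiplication."""
    return x * w


def train(x, y, w):
    """One gradient step on the squared error (model(x, w) - y) ** 2."""
    gradient = 2.0 * (model(x, w) - y) * x   # learning signal for w
    return w - learning_rate * gradient      # how to change w


# iterate through the data 100 times, training on each (input, output) pair
for epoch in range(100):
    for x, y in zip(train_X, train_Y):
        w = train(x, y, w)

print(w)  # ends up close to the true slope of 2
```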
![Page 31: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/31.jpg)
Theano doing its thing
![Page 32: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/32.jpg)
Logistic Regression
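T.dot(X, w)
softmax(X)
Figure: the input image passes through T.dot(X, w) and softmax to give one probability per class (Zero, One, Two, Three, Four, Five, Six, Seven, Eight, Nine). In the example, one class gets 0.7, three classes get 0.1 each, and the rest get 0.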
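To make the two steps on this slide concrete, here is a small plain-Python sketch of a dot product followed by softmax. The helper names `dot` and `softmax`, the 3-pixel input, and the 4-class weight matrix are all made up for illustration; MNIST would use 784 inputs and 10 classes.

```python
import math


def softmax(scores):
    """Map raw class scores to probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(x, w):
    """Vector-matrix product: x has n entries, w is n rows by k columns."""
    return [sum(x_i * row[j] for x_i, row in zip(x, w))
            for j in range(len(w[0]))]


# Hypothetical tiny example: 3 input pixels and 4 classes.
x = [1.0, 0.0, 2.0]
w = [[0.5, -0.5, 0.0, 0.0],
     [9.0, 9.0, 9.0, 9.0],   # has no effect here because the matching pixel is 0
     [1.0, 0.0, -1.0, 0.0]]

scores = dot(x, w)                      # one raw score per class
probabilities = softmax(scores)         # scores squashed into probabilities
predicted_class = probabilities.index(max(probabilities))
print(probabilities, predicted_class)
```

The predicted class is simply the index of the largest probability, which is also the index of the largest raw score.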
![Page 42: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/42.jpg)
Back to Theano
● convert to correct dtype
● initialize model parameters
● our model in matrix format
● loading data matrices
● now matrix types
● probability outputs and maxima predictions
● classification metric to optimize
● compile prediction function
● train on mini-batches of 128 examples
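Two of the steps above lend themselves to a short sketch: the classification metric and the mini-batch loop. The slide does not name the metric in this transcript, so categorical cross-entropy is an assumption here, and `cross_entropy` and `minibatches` are hypothetical helper names.

```python
import math


def cross_entropy(probabilities, label):
    """Cost for one example: negative log of the probability of the true class."""
    return -math.log(probabilities[label])


def minibatches(n_examples, batch_size=128):
    """Yield (start, end) index pairs that cover the data in order."""
    for start in range(0, n_examples, batch_size):
        yield start, min(start + batch_size, n_examples)


# A confident correct prediction costs little, a confident wrong one costs a lot.
probs = [0.1, 0.7, 0.2]
cost_if_true_class_is_1 = cross_entropy(probs, 1)
cost_if_true_class_is_0 = cross_entropy(probs, 0)
print(cost_if_true_class_is_1, cost_if_true_class_is_0)

# 60,000 training examples in batches of 128; the last batch is smaller.
batches = list(minibatches(60000))
print(len(batches), batches[0], batches[-1])
```

Each `(start, end)` pair would be used to slice the image and label matrices, and one parameter update would be made per slice.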
![Page 44: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/44.jpg)
What it learns
0 1 2 3 4 5 6 7 8 9
Test Accuracy: 92.5%
![Page 45: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/45.jpg)
An “old” net (circa 2000)
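h = T.nnet.sigmoid(T.dot(X, wh))
y = softmax(T.dot(h, wo))
Figure: the input image passes through a sigmoid hidden layer and a softmax output layer to give one probability per class (Zero, One, Two, Three, Four, Five, Six, Seven, Eight, Nine). In the example, one class gets 0.9, one gets 0.1, and the rest get 0.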
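The two expressions for `h` and `y` can be traced by hand with a plain-Python forward pass. This is a sketch under assumed toy sizes (2 inputs, 3 hidden units, 2 classes) with made-up weights; `dot`, `sigmoid`, `softmax`, and `forward` are illustrative helpers rather than Theano functions.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(x, w):
    """Vector-matrix product: x has n entries, w is n rows by k columns."""
    return [sum(x_i * row[j] for x_i, row in zip(x, w))
            for j in range(len(w[0]))]


def forward(x, wh, wo):
    h = [sigmoid(z) for z in dot(x, wh)]   # input -> hidden (sigmoid)
    y = softmax(dot(h, wo))                # hidden -> output (softmax)
    return h, y


# Hypothetical sizes: 2 inputs, 3 hidden units, 2 classes.
x = [1.0, -1.0]
wh = [[0.5, -0.5, 0.0],
      [0.5, 0.5, 0.0]]
wo = [[1.0, 0.0],
      [0.0, 1.0],
      [0.0, 0.0]]

h, y = forward(x, wh, wo)
print(h, y)
```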
![Page 48: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/48.jpg)
Understanding SGD
2D moons dataset, courtesy of scikit-learn
![Page 50: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/50.jpg)
Understanding Sigmoid Units
![Page 52: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/52.jpg)
An “old” net in Theano
● generalize to compute gradient descent on all model parameters
● 2 layers of computation
○ input -> hidden (sigmoid)
○ hidden -> output (softmax)
● initialize both weight matrices
● updated version of updates
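Generalizing gradient descent to all model parameters amounts to applying the same update rule to every weight matrix in a list. The sketch below assumes the gradients are already available (Theano would derive them symbolically); the function name `sgd`, the learning rate, and the tiny matrices are hypothetical.

```python
def sgd(params, grads, learning_rate=0.05):
    """Return updated copies of every parameter matrix: p - learning_rate * g."""
    updated = []
    for p, g in zip(params, grads):
        updated.append([[p_ij - learning_rate * g_ij
                         for p_ij, g_ij in zip(p_row, g_row)]
                        for p_row, g_row in zip(p, g)])
    return updated


# Hypothetical 1x2 and 2x1 weight matrices with made-up gradients.
wh = [[1.0, 2.0]]
wo = [[0.5], [-0.5]]
grad_wh = [[10.0, -10.0]]
grad_wo = [[0.0], [2.0]]

# Both weight matrices are updated by the same rule in one call.
wh, wo = sgd([wh, wo], [grad_wh, grad_wo], learning_rate=0.1)
print(wh, wo)
```

Because the function only loops over a list, adding a third layer later means adding one more matrix to `params` and nothing else.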
![Page 53: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/53.jpg)
What an “old” net learns
Test Accuracy: 98.4%
![Page 54: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/54.jpg)
A “modern” net - 2012+
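h = rectify(T.dot(X, wh))
h2 = rectify(T.dot(h, wh))
y = softmax(T.dot(h2, wo))
Figure: the input image passes through two rectifier hidden layers and a softmax output layer to give one probability per class (Zero, One, Two, Three, Four, Five, Six, Seven, Eight, Nine). In the example, one class gets 0.9, one gets 0.1, and the rest get 0. Noise is injected at three points in the network, one of them labelled “Noise (or augmentation)”.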
![Page 57: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/57.jpg)
Understanding rectifier units
![Page 61: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/61.jpg)
Understanding RMSprop
2D moons dataset, courtesy of scikit-learn
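The RMSprop update keeps a running average of the squared gradient and divides each gradient by the root of that average. Here is a minimal scalar sketch in plain Python; the defaults for `rho` and `epsilon`, the learning rate, and the toy objective f(w) = w ** 2 are assumptions chosen for illustration, not values taken from the slides.

```python
import math


def rmsprop_step(w, grad, avg_sq, learning_rate=0.001, rho=0.9, epsilon=1e-6):
    """One RMSprop update for a single scalar parameter."""
    # running average of the squared gradient
    avg_sq = rho * avg_sq + (1.0 - rho) * grad ** 2
    # scale the gradient by the root of that running average
    scaled_grad = grad / math.sqrt(avg_sq + epsilon)
    return w - learning_rate * scaled_grad, avg_sq


# Minimise f(w) = w ** 2 (gradient 2 * w) from a hypothetical start of w = 1.
w, avg_sq = 1.0, 0.0
for step in range(2000):
    w, avg_sq = rmsprop_step(w, 2.0 * w, avg_sq, learning_rate=0.01)

print(w)  # ends up near the minimum at 0
```

Because the gradient is divided by its own recent magnitude, the step size depends mostly on the learning rate rather than on how steep the cost surface happens to be.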
![Page 63: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/63.jpg)
A “modern” net in Theano
● rectifier
● numerically stable softmax
● a running average of the magnitude of the gradient
● scale the gradient based on running average
● randomly drop values and scale rest
● Noise injected into model
● rectifiers now used
● 2 hidden layers
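Three of the ingredients listed above are small enough to sketch directly in plain Python: the rectifier, a softmax that stays numerically stable, and dropout-style noise. The function names, the drop probability of 0.5, and the sample values are assumptions for illustration only.

```python
import math
import random


def rectify(z):
    """Rectifier: pass positive values through, clamp negatives to 0."""
    return max(z, 0.0)


def stable_softmax(scores):
    """Softmax that subtracts the largest score first so exp cannot overflow."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dropout(values, p_drop, rng):
    """Randomly zero each value with probability p_drop and rescale the rest."""
    if p_drop <= 0.0:
        return list(values)          # no noise, e.g. at prediction time
    keep = 1.0 - p_drop
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)               # seeded so the example is repeatable
hidden = [rectify(z) for z in [-2.0, 0.5, 3.0, -0.1]]
noisy_hidden = dropout(hidden, p_drop=0.5, rng=rng)

# math.exp(1000.0) on its own would raise OverflowError; the shifted version is fine.
probabilities = stable_softmax([1000.0, 999.0, 0.0])
print(hidden, noisy_hidden, probabilities)
```

Rescaling the surviving values by `1 / keep` keeps the expected activation the same with and without noise, so the same weights can be used at prediction time with `p_drop=0.0`.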
![Page 64: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/64.jpg)
What a “modern” net learns
Test Accuracy: 99.0%
![Page 65: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/65.jpg)
Quantifying the difference
![Page 66: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/66.jpg)
What a “modern” net is doing
![Page 67: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/67.jpg)
Convolutional Networks
from deeplearning.net
![Page 76: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/76.jpg)
A convolutional network in Theano
● a “block” of computation: conv -> activate -> pool -> noise
● convert from 4tensor to normal matrix
● reshape into conv 4tensor (b, c, 0, 1) format
● now 4tensor for conv instead of matrix
● conv weights (n_kernels, n_channels, kernel_w, kernel_h)
● highest conv layer has 128 filters and a 3x3 grid of responses
● noise during training
● no noise for prediction
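The first three stages of one “block” (conv -> activate -> pool) can be illustrated on a single-channel image in plain Python; the noise stage is left out to keep the output deterministic. This is a sketch under assumptions: the 5x5 image, the 2x2 kernel, and the helper names are made up, and the sliding window is applied without flipping the kernel (cross-correlation), which differs from a strict mathematical convolution only by a flip of the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, no kernel flip) and sum products."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k_h) for j in range(k_w))
             for c in range(out_w)]
            for r in range(out_h)]


def rectify(feature_map):
    """Clamp negative responses to 0."""
    return [[max(v, 0.0) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [[max(feature_map[r][c], feature_map[r][c + 1],
                 feature_map[r + 1][c], feature_map[r + 1][c + 1])
             for c in range(0, len(feature_map[0]) - 1, 2)]
            for r in range(0, len(feature_map) - 1, 2)]


# Hypothetical 5x5 image with a vertical edge, and a 2x2 kernel that detects it.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
kernel = [[-1.0, 1.0],
          [-1.0, 1.0]]

features = conv2d_valid(image, kernel)        # conv: 5x5 -> 4x4
pooled = max_pool_2x2(rectify(features))      # activate -> pool: 4x4 -> 2x2
print(features[0], pooled)
```

The kernel responds only where the image changes from dark to bright, and pooling keeps that response while halving the grid size, which is how stacked blocks shrink an image down to a small grid of responses.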
![Page 77: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/77.jpg)
What a convolutional network learns
Test Accuracy: 99.5%
![Page 78: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/78.jpg)
Takeaways
● A few tricks are needed to get good results
○ Noise is important for regularization
○ Rectifiers for faster, better learning
○ Don’t use plain SGD - there are lots of cheap, simple improvements
● Models need room to compute.
● If your data has structure, your model should respect it.
![Page 79: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/79.jpg)
Resources
● More in-depth Theano tutorials
○ http://www.deeplearning.net/tutorial/
● Theano docs
○ http://www.deeplearning.net/software/theano/library/
● Community
○ http://www.reddit.com/r/machinelearning
![Page 80: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/80.jpg)
A plug
Keep up to date with indico: https://indico1.typeform.com/to/DgN5SP
![Page 81: Introduction to Deep Learning with Python](https://reader034.vdocument.in/reader034/viewer/2022052317/55d73eefbb61eb26488b456c/html5/thumbnails/81.jpg)
Questions?