Lecture 17: Neural Networks and Deep Learning
TRANSCRIPT
Lecture 17: Neural Networks and Deep Learning
Jack Lanchantin
Dr. Yanjun Qi
UVA CS 6316 / CS 4501-004 Machine Learning
Fall 2016
Neurons
1-Layer Neural Network
Multi-layer Neural Network
Loss Functions
Backpropagation
Nonlinearity Functions
NNs in Practice
[Diagram: a neural network mapping input x = (x1, x2, x3) through weight layers W1, W2, w3 to output ŷ]
Logistic Regression
Sigmoid function (aka logistic, logit, "S", soft-step):
P(Y=1|x) = e^(wx+b) / (1 + e^(wx+b))
Expanded Logistic Regression
z = wᵀ·x + b    (wᵀ: 1×p, x: p×1, z: 1×1; here p = 3)
ŷ = sigmoid(z) = e^z / (1 + e^z) = P(Y=1|x,w)
[Diagram: input x = (x1, x2, x3) is multiplied by the weights w1, w2, w3, summed with bias b1 (summing function Σ), then passed through the sigmoid function to give ŷ]
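A minimal sketch of this computation in NumPy (the input and weight values are made up, not from the slides):

```python
import numpy as np

def sigmoid(z):
    # Logistic function: e^z / (1 + e^z) = 1 / (1 + e^-z)
    return 1.0 / (1.0 + np.exp(-z))

def neuron_forward(x, w, b):
    # z = w^T x + b  (w: p x 1, x: p x 1, z: scalar)
    z = w @ x + b
    return sigmoid(z)   # yhat = P(Y=1 | x, w)

# Hypothetical numbers with p = 3:
x = np.array([1.0, 2.0, -1.0])
w = np.array([0.5, -0.25, 0.1])
b = 0.0
yhat = neuron_forward(x, w, b)   # z = -0.1, so yhat = sigmoid(-0.1)
```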
"Neuron"
[Diagram: the same unit, with inputs x1, x2, x3 weighted by w1, w2, w3 and summed with bias b1 (Σ) to give z, drawn as a single neuron]
Neurons
http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf
Neuron
z = wᵀ·x    (wᵀ: 1×p, x: p×1, z: 1×1)
ŷ = sigmoid(z) = e^z / (1 + e^z)
From here on, we leave out the bias for simplicity.
[Diagram: input x = (x1, x2, x3) weighted by w1, w2, w3, summed (Σ) to z, then passed through the sigmoid to give ŷ]
"Block View" of a Neuron
Input x → [Dot Product: *w] → z → [Sigmoid] → output ŷ
The dot product is a parameterized block (its parameters are the weights w).
z = wᵀ·x    (wᵀ: 1×p, x: p×1, z: 1×1)
ŷ = sigmoid(z) = e^z / (1 + e^z)
Neuron Representation
The linear transformation and nonlinearity together are typically considered a single neuron.
[Diagram: inputs x1, x2, x3, weights w1, w2, w3, the summation Σ, and the sigmoid collapsed into one neuron producing ŷ]
Neuron Representation
The linear transformation and nonlinearity together are typically considered a single neuron.
[Diagram: the block view x → *w → ŷ collapsed into one neuron]
Neurons
1-Layer Neural Network
Multi-layer Neural Network
Loss Functions
Backpropagation
Nonlinearity Functions
NNs in Practice
[Diagram: a neural network mapping input x = (x1, x2, x3) through weight layers W1, W2, w3 to output ŷ]
1-Layer Neural Network (with 4 neurons)
[Diagram: input x = (x1, x2, x3) fully connected through weight matrix W to four summing units Σ, each followed by a sigmoid, producing outputs ŷ1, ŷ2, ŷ3, ŷ4]
Linear (matrix W, vector z) followed by the sigmoid together count as 1 layer.
1-Layer Neural Network (with 4 neurons)
z = Wᵀ·x    (Wᵀ: d×p, x: p×1, z: d×1; here p = 3 inputs, d = 4 neurons)
ŷ = sigmoid(z) = e^z / (1 + e^z)    (element-wise on vector z; ŷ: d×1)
[Diagram: input x feeding the four Σ units through W, producing ŷ1 through ŷ4]
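A sketch of this 1-layer forward pass in NumPy (random weights and a hypothetical input; p = 3 and d = 4 as on the slide):

```python
import numpy as np

def sigmoid(z):
    # Element-wise logistic function on a vector z
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward(x, W):
    # z = W^T x  (W: p x d, so W^T: d x p; x: p x 1 -> z: d x 1)
    z = W.T @ x
    return sigmoid(z)

# Hypothetical numbers: p = 3 inputs, d = 4 neurons
rng = np.random.default_rng(0)
x = np.array([1.0, -2.0, 0.5])   # p = 3
W = rng.normal(size=(3, 4))      # p x d
yhat = layer_forward(x, W)       # d = 4 outputs, each in (0, 1)
```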
"Block View" of a Neural Network
Input x → [Dot Product: *W] → z → [Sigmoid] → output ŷ
W is now a matrix; z is now a vector.
z = Wᵀ·x    (Wᵀ: d×p, x: p×1, z: d×1)
ŷ = sigmoid(z) = e^z / (1 + e^z)    (element-wise, d×1)
Neurons
1-Layer Neural Network
Multi-layer Neural Network
Loss Functions
Backpropagation
Nonlinearity Functions
NNs in Practice
[Diagram: a neural network mapping input x = (x1, x2, x3) through weight layers W1, W2, w3 to output ŷ]
Multi-Layer Neural Network (Multi-Layer Perceptron (MLP) Network)
[Diagram: input x = (x1, x2, x3) → hidden layer (weights W1) → output layer (weights w2) → ŷ]
The weight subscript represents the layer number. This is a 2-layer NN.
Multi-Layer Neural Network (MLP)
[Diagram: input x → 1st hidden layer (W1) → 2nd hidden layer (W2) → output layer (w3) → ŷ]
This is a 3-layer NN.
Multi-Layer Neural Network (MLP)
z1 = W1ᵀ·x      h1 = sigmoid(z1)    (hidden layer 1 output)
z2 = W2ᵀ·h1     h2 = sigmoid(z2)
z3 = w3ᵀ·h2     ŷ = sigmoid(z3)
[Diagram: x → W1 → h1 → W2 → h2 → w3 → ŷ]
Multi-Class Output MLP
z1 = W1ᵀ·x      h1 = sigmoid(z1)
z2 = W2ᵀ·h1     h2 = sigmoid(z2)
z3 = W3ᵀ·h2     ŷ = sigmoid(z3)
(W3 is now a matrix, so ŷ is a vector with one entry per class.)
[Diagram: x → W1 → h1 → W2 → h2 → W3 → ŷ]
"Block View" of MLP
x → [*W1] → z1 → [sigmoid] → h1 → [*W2] → z2 → [sigmoid] → h2 → [*W3] → z3 → [sigmoid] → ŷ
(1st hidden layer, 2nd hidden layer, output layer)
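The forward pass above is just repeated linear + sigmoid steps, which can be sketched as follows (the layer sizes and random weights are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, weights):
    # One linear transform + nonlinearity per layer:
    # z_l = W_l^T h_{l-1},  h_l = sigmoid(z_l), with h_0 = x
    h = x
    for W in weights:
        h = sigmoid(W.T @ h)
    return h   # the final h is yhat

# Hypothetical sizes: 3 inputs -> 4 hidden -> 4 hidden -> 2 outputs
rng = np.random.default_rng(1)
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(4, 4))
W3 = rng.normal(size=(4, 2))
x = np.array([0.5, -1.0, 2.0])
yhat = mlp_forward(x, [W1, W2, W3])
```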
"Deep" Neural Networks (i.e. > 1 hidden layer)
Researchers have successfully used 1000 layers to train an object classifier.
Neurons
1-Layer Neural Network
Multi-layer Neural Network
Loss Functions
Backpropagation
Nonlinearity Functions
NNs in Practice
[Diagram: the network x → W1 → W2 → w3 → ŷ, now followed by a loss E(ŷ)]
Binary Classification Loss
ŷ = P(y=1|x,W)
E = loss = −log P(Y = y | X = x) = −y log(ŷ) − (1 − y) log(1 − ŷ)
where y is the true output. (This example is for a single sample x.)
[Diagram: x = (x1, x2, x3) → W1 → W2 → w3 → ŷ]
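This loss can be sketched directly (the predicted probabilities below are hypothetical):

```python
import math

def binary_loss(y, yhat):
    # E = -y log(yhat) - (1 - y) log(1 - yhat), for a single sample
    return -y * math.log(yhat) - (1 - y) * math.log(1 - yhat)

# If the true label is 1, a confident correct prediction gives a small loss,
# and a confident wrong prediction gives a large one:
low = binary_loss(1, 0.9)    # -log(0.9)
high = binary_loss(1, 0.1)   # -log(0.1)
```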
Regression Loss
E = loss = ½ (y − ŷ)²
where y is the true output.
[Diagram: the same network with a single output unit Σ producing ŷ]
Multi-Class Classification Loss
ŷi = softmax(z)i = P(ŷi = 1 | x)
"Softmax" function: a normalizing function which converts each class output zi to a probability.
E = loss = − Σ_{j = 1...K} yj ln ŷj
where y is "0" for all except the true class (one-hot). Here K = 3.
[Diagram: x = (x1, x2, x3) → W1 → W2 → W3 → z1, z2, z3 → softmax → ŷ1, ŷ2, ŷ3]
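A sketch of the softmax and this loss (the class scores are hypothetical, with K = 3 and true class 0):

```python
import numpy as np

def softmax(z):
    # Shift by max(z) for numerical stability; the output sums to 1
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(y, yhat):
    # E = -sum_j y_j ln(yhat_j), with y one-hot
    return -np.sum(y * np.log(yhat))

z = np.array([2.0, 1.0, 0.1])    # hypothetical class scores
yhat = softmax(z)                # probabilities, one per class
y = np.array([1.0, 0.0, 0.0])    # one-hot true class
loss = cross_entropy(y, yhat)
```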
Neurons
1-Layer Neural Network
Multi-layer Neural Network
Loss Functions
Backpropagation
Nonlinearity Functions
NNs in Practice
[Diagram: the network x → W1 → W2 → w3 → ŷ, followed by the loss E(ŷ)]
Training Neural Networks
How do we learn the optimal weights WL for our task?
● Gradient descent:
WL(t+1) = WL(t) − η ∂E/∂WL(t)
LeCun et al. Efficient BackProp. 1998
But how do we get the gradients of lower layers?
● Backpropagation!
○ Repeated application of the chain rule of calculus
○ Locally minimize the objective
○ Requires all "blocks" of the network to be differentiable
[Diagram: x → W1 → W2 → w3 → ŷ]
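The update rule can be sketched on a toy 1-D problem (the objective E(w) = w² and the learning rate η = 0.1 are made up for illustration):

```python
def gradient_step(w, grad_E, lr=0.1):
    # w(t+1) = w(t) - eta * dE/dw(t); lr is the learning rate eta
    return w - lr * grad_E

# For E(w) = w^2 the gradient is dE/dw = 2w, so each step shrinks w by 20%:
w = 1.0
for _ in range(50):
    w = gradient_step(w, 2 * w, lr=0.1)
# w is now very close to the minimum at w = 0
```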
Backpropagation Intro
Worked numeric example: http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf
Tells us: by increasing x by a scale of 1, we decrease f by a scale of 4 (i.e. ∂f/∂x = −4).
Backpropagation (binary classification example)
Example on a 1-hidden-layer NN for binary classification:
ŷ = P(y=1|x,W)
[Diagram: input x = (x1, x2, x3) → hidden layer (W1) → output (w2) → ŷ]
Block view: x → [*W1] → z1 → [sigmoid] → h1 → [*w2] → z2 → [sigmoid] → ŷ
E = loss = −y log(ŷ) − (1 − y) log(1 − ŷ)
Gradient descent to minimize the loss requires ∂E/∂W1 and ∂E/∂w2. Need to find these!
∂E/∂W1 = ?? and ∂E/∂w2 = ?? are not directly available, so exploit the chain rule:
∂E/∂w2 = (∂E/∂ŷ) · (∂ŷ/∂z2) · (∂z2/∂w2)
∂E/∂W1 = (∂E/∂ŷ) · (∂ŷ/∂z2) · (∂z2/∂h1) · (∂h1/∂z1) · (∂z1/∂W1)
The first two factors of ∂E/∂W1 were already computed for ∂E/∂w2; working backward through the network reuses everything already computed.
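A NumPy sketch of these chain-rule gradients (the sizes and numbers are hypothetical; ∂E/∂z2 = ŷ − y uses the standard simplification of the sigmoid output combined with this loss):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, w2):
    # x -> *W1 -> z1 -> sigmoid -> h1 -> *w2 -> z2 -> sigmoid -> yhat
    z1 = W1.T @ x
    h1 = sigmoid(z1)
    z2 = w2 @ h1
    yhat = sigmoid(z2)
    return z1, h1, z2, yhat

def backward(x, y, W1, w2):
    z1, h1, z2, yhat = forward(x, W1, w2)
    dz2 = yhat - y               # dE/dz2 for sigmoid output + this loss
    dw2 = dz2 * h1               # dE/dw2 = dE/dz2 * dz2/dw2
    dh1 = dz2 * w2               # reuse dz2 (already computed)
    dz1 = dh1 * h1 * (1 - h1)    # through the sigmoid's local gradient
    dW1 = np.outer(x, dz1)       # dE/dW1, matching W1's shape
    return dW1, dw2

rng = np.random.default_rng(2)
x = np.array([1.0, -1.0, 0.5])
y = 1.0
W1 = rng.normal(size=(3, 4))
w2 = rng.normal(size=4)
dW1, dw2 = backward(x, y, W1, w2)
```

A finite-difference check confirms these match numerical gradients of the loss.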
"Local-ness" of Backpropagation
[Diagram: a block f with inputs x and y. The forward pass computes activations; the backward pass combines the gradient flowing in from above with the block's "local gradients" ∂f/∂x and ∂f/∂y]
http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf
Example: Sigmoid Block
sigmoid(x) = σ(x) = 1 / (1 + e^(−x))
Its local gradient has a simple form: dσ/dx = σ(x)(1 − σ(x)).
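A minimal sketch of a sigmoid block with forward and backward methods (the class and method names are made up, not from the slides):

```python
import math

class SigmoidBlock:
    # Caches its activation on the forward pass and uses it to
    # compute the local gradient on the backward pass.
    def forward(self, x):
        self.out = 1.0 / (1.0 + math.exp(-x))
        return self.out

    def backward(self, grad_from_above):
        local = self.out * (1.0 - self.out)   # dsigma/dx = sigma(x)(1 - sigma(x))
        return grad_from_above * local

blk = SigmoidBlock()
y = blk.forward(0.0)      # sigma(0) = 0.5
dx = blk.backward(1.0)    # local gradient at 0 is 0.5 * 0.5 = 0.25
```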
Deep Learning = Concatenation of Differentiable Parameterized Layers (linear & nonlinearity functions)
x → [*W1] → z1 → h1 → [*W2] → z2 → h2 → [*W3] → z3 → ŷ
(1st hidden layer, 2nd hidden layer, output layer)
We want to find the optimal weights W to minimize some loss function E!
Backprop Whiteboard Demo
[Diagram: inputs x1, x2 (plus a constant 1 for the biases) → two sigmoid hidden units (f1, f2) → linear output unit (f3) → loss (f4)]
z1 = x1·w1 + x2·w3 + b1
z2 = x1·w2 + x2·w4 + b2
h1 = exp(z1) / (1 + exp(z1))
h2 = exp(z2) / (1 + exp(z2))
ŷ = h1·w5 + h2·w6 + b3
E = (y − ŷ)²
w(t+1) = w(t) − η ∂E/∂w(t), so we need ∂E/∂w = ??
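The demo network can be written out directly; working the chain rule backward from E = (y − ŷ)² gives every weight gradient (the input values, weights, and biases below are hypothetical):

```python
import math

def whiteboard_forward(x1, x2, w, b):
    # The exact network from the demo: two sigmoid hidden units, linear output
    w1, w2, w3, w4, w5, w6 = w
    b1, b2, b3 = b
    z1 = x1 * w1 + x2 * w3 + b1
    z2 = x1 * w2 + x2 * w4 + b2
    h1 = math.exp(z1) / (1 + math.exp(z1))
    h2 = math.exp(z2) / (1 + math.exp(z2))
    yhat = h1 * w5 + h2 * w6 + b3
    return h1, h2, yhat

def whiteboard_grads(x1, x2, y, w, b):
    # Chain rule, working backward from E = (y - yhat)^2
    w1, w2, w3, w4, w5, w6 = w
    h1, h2, yhat = whiteboard_forward(x1, x2, w, b)
    dyhat = -2 * (y - yhat)              # dE/dyhat
    dw5, dw6, db3 = dyhat * h1, dyhat * h2, dyhat
    dz1 = dyhat * w5 * h1 * (1 - h1)     # through sigmoid h1
    dz2 = dyhat * w6 * h2 * (1 - h2)     # through sigmoid h2
    dw1, dw3, db1 = dz1 * x1, dz1 * x2, dz1
    dw2, dw4, db2 = dz2 * x1, dz2 * x2, dz2
    return (dw1, dw2, dw3, dw4, dw5, dw6), (db1, db2, db3)

# Hypothetical numbers:
w = (0.1, -0.2, 0.3, 0.4, 0.5, -0.6)
b = (0.0, 0.1, -0.1)
grads_w, grads_b = whiteboard_grads(1.0, 2.0, 1.0, w, b)
```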
Neurons
1-Layer Neural Network
Multi-layer Neural Network
Loss Functions
Backpropagation
Nonlinearity Functions
NNs in Practice
Nonlinearity Functions (i.e. transfer or activation functions)
[Diagram: the neuron again: inputs x1, x2, x3 multiplied by the weights w1, w2, w3, summed with bias b1 (summing function Σ, giving z), followed by the sigmoid function]
Nonlinearity Functions (aka transfer or activation functions)
The slides compare common activation functions by name, plot, equation, and derivative (w.r.t. x):
https://en.wikipedia.org/wiki/Activation_function#Comparison_of_activation_functions
Sigmoid: σ(x) = 1 / (1 + e^(−x)); derivative σ(x)(1 − σ(x))
Tanh: tanh(x); derivative 1 − tanh²(x)
ReLU: max(0, x); derivative 0 for x < 0, 1 for x > 0
ReLU usually works best in practice.
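These functions and their derivatives can be sketched directly (standard definitions, not code from the slides), with a finite-difference check on one of them:

```python
import math

# Common activation functions and their derivatives w.r.t. x
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh(x):
    return math.tanh(x)

def d_tanh(x):
    return 1.0 - math.tanh(x) ** 2

def relu(x):
    return max(0.0, x)

def d_relu(x):
    return 1.0 if x > 0 else 0.0

# Sanity check: the closed-form derivative should match a finite difference
eps = 1e-6
approx = (sigmoid(0.3 + eps) - sigmoid(0.3)) / eps
```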
Neurons
1-Layer Neural Network
Multi-layer Neural Network
Loss Functions
Backpropagation
Nonlinearity Functions
NNs in Practice
Neural Net Pipeline
1. Initialize weights
2. For each batch S of input samples x:
   a. Run the network "forward" on S to compute outputs and loss
   b. Run the network "backward" using outputs and loss to compute gradients
   c. Update weights using SGD (or a similar method)
3. Repeat step 2 until loss convergence
[Diagram: x → W1 → W2 → w3 → ŷ → E]
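The pipeline above can be sketched on a toy problem (a single sigmoid neuron with squared loss; the data, labels, and learning rate are all made up):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(3)
X = rng.normal(size=(8, 3))            # a batch S of 8 samples
ytrue = (X[:, 0] > 0).astype(float)    # toy labels
w = np.zeros(3)                        # 1. initialize weights
losses = []
for step in range(200):                # 3. repeat until convergence
    yhat = sigmoid(X @ w)              # 2a. forward: outputs and loss
    loss = np.mean((ytrue - yhat) ** 2)
    # 2b. backward: dloss/dw via the chain rule
    dz = -2 * (ytrue - yhat) * yhat * (1 - yhat) / len(X)
    grad = X.T @ dz
    w = w - 1.0 * grad                 # 2c. gradient update (learning rate 1.0)
    losses.append(loss)
```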
Non-Convexity of Neural Nets
In very high dimensions, there exist many local minima which are about the same.
Pascanu et al. On the saddle point problem for non-convex optimization. 2014
76
Building Deep Neural Nets
http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf
x → f → y
77
Building Deep Neural Nets
“GoogLeNet” for Object Classification
78
Block Example Implementation
http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf
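The cs231n lectures build deep nets out of small blocks ("gates"), each implementing a forward pass (compute the output, cache what backward will need) and a backward pass (apply the chain rule). A minimal sketch with a hypothetical multiply gate:

```python
import numpy as np

class MultiplyGate:
    """One differentiable block: forward caches its inputs,
    backward returns gradients w.r.t. those inputs (chain rule)."""
    def forward(self, x, w):
        self.x, self.w = x, w
        return x * w

    def backward(self, grad_out):
        # local gradient of x*w is (w, x); scale by upstream gradient
        return grad_out * self.w, grad_out * self.x

gate = MultiplyGate()
out = gate.forward(np.array(3.0), np.array(-2.0))   # 3 * -2 = -6
dx, dw = gate.backward(np.array(1.0))               # dx = -2, dw = 3
```

Chaining such blocks and calling backward in reverse order is exactly backpropagation, which is what lets arbitrary architectures be trained.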
79
Advantage of Neural Nets
As long as the model is fully differentiable, we can train it end-to-end to learn features for us automatically.
Advanced Deep Learning Models:
Convolutional Neural Networks
& Recurrent Neural Networks
Most slides from http://cs231n.stanford.edu/
80
81
Convolutional Neural Networks(aka CNNs and ConvNets)
Challenges in Visual Recognition
Challenges in Visual Recognition
84
Problems with “Fully Connected Networks” on Images
1000×1000 image, 1M hidden units → 10^12 parameters
Spatial correlation is local! → Connect units locally
Each neuron corresponds to a specific pixel
85
Problems with “Fully Connected Networks” on Images
1000×1000 image, 1M hidden units → 10^12 parameters
Spatial correlation is local! → Connect units locally
Each neuron corresponds to a specific pixel
How do we deal with multiple dimensions in the input? Width, height, channel (R, G, B)
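The 10^12 figure, and the savings from connecting units locally with shared weights (i.e., convolution), can be checked with quick arithmetic. The convolutional layer sizes below are hypothetical, chosen only to show the order of magnitude:

```python
# Fully connected: every hidden unit sees every pixel
pixels = 1000 * 1000            # 10^6 inputs
hidden = 1_000_000              # 10^6 hidden units
fc_params = pixels * hidden     # 10^12 weights

# Locally connected with shared weights (a convolutional layer):
# each filter is small and reused across the whole image
filters, k, channels = 96, 11, 3                  # hypothetical sizes
conv_params = filters * (k * k * channels + 1)    # +1 bias per filter
# ~35K parameters instead of 10^12
```

Weight sharing is what makes the count independent of the image size: the same small filter slides over every location.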
86
Convolutional Layer
87
Convolutional Layer
88
Convolutional Layer
89
Convolutional Layer
90
Convolution
91
Convolution
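A naive "valid" convolution (strictly speaking a cross-correlation, which is what CNN layers actually compute) can be sketched as sliding the kernel over the image and taking dot products; the loop version below trades speed for clarity:

```python
import numpy as np

def conv2d(image, kernel):
    """Naive 'valid' convolution (no padding, stride 1):
    output[i, j] = sum of the kernel times the patch at (i, j)."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kH, j:j+kW] * kernel)
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
edge = np.array([[1.0, -1.0]])   # horizontal difference filter
out = conv2d(img, edge)          # shape (4, 3); every entry is -1
```

On this image each horizontal neighbor differs by exactly 1, so the difference filter produces a constant response, illustrating how a filter detects one specific local pattern everywhere.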
92
Neuron View of Convolutional Layer
93
Neuron View of Convolutional Layer
94
Convolutional Layer
95
Convolutional Layer
96
Neuron View of Convolutional Layer
97
Convolutional Neural Networks
98
Convolutional Neural Networks
99
Convolutional Neural Networks
100
Convolutional Neural Networks
http://www.wildml.com/2015/11/understanding-convolutional-neural-networks-for-nlp/
101
Pooling Layer
102
Pooling Layer (Max Pooling example)
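Max pooling downsamples each feature map by keeping only the largest activation in each window. A minimal 2×2, stride-2 sketch (the reshape trick assumes non-overlapping windows):

```python
import numpy as np

def max_pool(x, size=2):
    """2x2 max pooling with stride 2: keep the largest
    activation in each non-overlapping window."""
    H, W = x.shape
    x = x[:H - H % size, :W - W % size]           # trim ragged edges
    x = x.reshape(H // size, size, W // size, size)
    return x.max(axis=(1, 3))                     # max over each window

x = np.array([[1, 1, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]], dtype=float)
pooled = max_pool(x)   # [[6, 8], [3, 4]]
```

Pooling has no learned parameters; it only shrinks the spatial dimensions and adds a little translation invariance.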
103
History of ConvNets
1998: Gradient-Based Learning Applied to Document Recognition [LeCun, Bottou, Bengio, Haffner]
2012: ImageNet Classification with Deep Convolutional Neural Networks [Krizhevsky, Sutskever, Hinton]
Recurrent Neural Networks
109
Standard “Feed-Forward” Neural Network
110
Standard “Feed-Forward” Neural Network
111
Recurrent Neural Networks (RNNs): RNNs can handle sequential inputs of variable length.
112
Recurrent Neural Networks (RNNs): RNNs can handle sequential inputs of variable length.
Recurrent Neural Networks (RNNs)
Traditional "feed-forward" neural network: input → hidden → output, one prediction per fixed-size input.
Recurrent neural network: input → hidden → output at every timestep, with the hidden state fed back (predict a vector at each timestep).
114
Recurrent Neural Networks
h_t = f_W(h_{t-1}, x_t)
115
Recurrent Neural Networks
h_t = f_W(h_{t-1}, x_t)
116
“Vanilla” Recurrent Neural Network
h_t = tanh(W_hh h_{t-1} + W_xh x_t)
y_t = W_hy h_t
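One vanilla RNN step can be sketched directly from the recurrence above; the sizes and random initialization below are illustrative, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 4, 3                        # input and hidden sizes (made up)
W_xh = rng.normal(scale=0.1, size=(H, D))
W_hh = rng.normal(scale=0.1, size=(H, H))
b_h = np.zeros(H)

def rnn_step(h_prev, x):
    # h_t = tanh(W_hh h_{t-1} + W_xh x_t + b)
    return np.tanh(W_hh @ h_prev + W_xh @ x + b_h)

h = np.zeros(H)                    # initial hidden state
for x in rng.normal(size=(5, D)):  # a length-5 input sequence
    h = rnn_step(h, x)             # same weights reused at every step
```

Reusing the same W_hh and W_xh at every timestep is what lets the same network process sequences of any length.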
117
Character Level Language Model with an RNN
118
Character Level Language Model with an RNN
119
Character Level Language Model with an RNN
120
Character Level Language Model with an RNN
121
Character Level Language Model with an RNN
122
Character Level Language Model with an RNN
123
Character Level Language Model with an RNN
124
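A forward-only sketch of the character-level model: one-hot encode each character, step the RNN, and output a softmax distribution over the next character. Sizes and initialization are illustrative; training would add a cross-entropy loss and backpropagation through time:

```python
import numpy as np

rng = np.random.default_rng(0)
text = "hello"
vocab = sorted(set(text))               # ['e', 'h', 'l', 'o']
ix = {ch: i for i, ch in enumerate(vocab)}
V, H = len(vocab), 8                    # vocab and hidden sizes (made up)

W_xh = rng.normal(scale=0.1, size=(H, V))
W_hh = rng.normal(scale=0.1, size=(H, H))
W_hy = rng.normal(scale=0.1, size=(V, H))

def one_hot(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v

h = np.zeros(H)
for ch in text[:-1]:
    # step the RNN on the current character...
    h = np.tanh(W_hh @ h + W_xh @ one_hot(ix[ch], V))
    # ...and predict a distribution over the NEXT character
    scores = W_hy @ h
    probs = np.exp(scores) / np.exp(scores).sum()   # softmax
```

Sampling from `probs` at each step and feeding the sample back in is exactly how the Shakespeare and C-code examples on the next slides are generated.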
Example: Generating Shakespeare with RNNs
125
Example: Generating C Code with RNNs
Long Short-Term Memory Networks (LSTMs)
Recurrent networks suffer from the "vanishing gradient problem":
● they aren't able to model long-term dependencies in sequences
input
hidden
output
Long Short-Term Memory Networks (LSTMs)
Recurrent networks suffer from the "vanishing gradient problem":
● they aren't able to model long-term dependencies in sequences
Use “gating units” to learn when to remember
input
output
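The gating idea can be sketched as one LSTM step. The layer sizes, the dict-of-weights layout, and the omission of biases are my own simplifications, not the slides':

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 4, 3
# One weight pair per gate: input i, forget f, output o, candidate g
Wx = {g: rng.normal(scale=0.1, size=(H, D)) for g in "ifog"}
Wh = {g: rng.normal(scale=0.1, size=(H, H)) for g in "ifog"}

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(h_prev, c_prev, x):
    i = sigmoid(Wx["i"] @ x + Wh["i"] @ h_prev)   # write gate
    f = sigmoid(Wx["f"] @ x + Wh["f"] @ h_prev)   # forget/remember gate
    o = sigmoid(Wx["o"] @ x + Wh["o"] @ h_prev)   # read gate
    g = np.tanh(Wx["g"] @ x + Wh["g"] @ h_prev)   # candidate update
    c = f * c_prev + i * g     # cell state updated additively
    h = o * np.tanh(c)
    return h, c

h = c = np.zeros(H)
for x in rng.normal(size=(5, D)):
    h, c = lstm_step(h, c, x)
```

Because the cell state c is updated additively (gated, not repeatedly squashed through a nonlinearity), gradients can flow across many timesteps, which is how the gates "learn when to remember."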
128
RNNs and CNNs Together