deep neural networks are our friendslxmls.it.pt/2016/deep-neural-networks-are-our-friends.pdf ·...
TRANSCRIPT
![Page 1: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/1.jpg)
Deep Neural Networks Are Our Friends
Wang Ling
![Page 2: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/2.jpg)
Outline
● Part I - Neural Networks are our friends
○ Numbers are our friends
○ Operators are our friends
○ Functions are our friends
○ Parameters are our friends
○ Cost Functions are our friends
○ Optimizers are our friends
○ Gradients are our friends
○ Computation Graphs are our friends
![Page 3: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/3.jpg)
Outline
● Part I - Neural Networks are our friends
● Part II - Into Deep Learning
○ Nonlinear Neural Models
○ Multilayer Perceptrons
○ Using Discrete Variables
○ Example Applications
![Page 4: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/4.jpg)
Numbers are our friends
![Page 5: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/5.jpg)
Numbers are our friends
Abby Cadabby
How many apples does Abby have?
![Page 6: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/6.jpg)
Numbers are our friends
4
Abby Cadabby
![Page 7: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/7.jpg)
Numbers are our friends
● Types of Numbers:
○ Integers : 5
○ Rationals : 1/2
○ Reals : 1.4e10 ...
![Page 8: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/8.jpg)
Operators are our friends
4
Bert
![Page 9: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/9.jpg)
Operators are our friends
4 1
Bert
If Abby has 4 apples and gives Bert 1 apple, how many apples will Abby have?
![Page 10: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/10.jpg)
Operators are our friends
3 1
Bert
![Page 11: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/11.jpg)
Operators are our friends
● Arithmetic Operators
○ Addition : 23 + 12 = 35
○ Subtraction : 31 - 15 = 16
○ Multiplication : 4 × 5 = 20
○ Division : 20 / 5 = 4
![Page 12: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/12.jpg)
Functions are our friends
41
![Page 13: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/13.jpg)
Functions are our friends
4
5?
1
If Bert always returns 3 bananas for each apple, how many bananas will Abby receive for 2 apples?
![Page 14: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/14.jpg)
Functions are our friends
y = 3x
● Input, x - Number of Apples given by Abby
![Page 15: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/15.jpg)
Functions are our friends
y = 3x
● Input, x - Number of Apples given by Abby
● Output, y - Number of Bananas received by Abby
![Page 16: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/16.jpg)
Functions are our friends
4
5?
1
y = 3x
![Page 17: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/17.jpg)
Functions are our friends
4
5?
1
y = 3x , x =1
![Page 18: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/18.jpg)
Functions are our friends
4
53
1
y = 3x, x = 1
y = 3
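A minimal sketch of Bert's rule as a Python function. The name `bananas_from_bert` is made up for illustration and does not appear in the slides.

```python
def bananas_from_bert(apples):
    """Bert returns 3 bananas for each apple: y = 3x."""
    return 3 * apples


# One apple in, three bananas out, as on the slide.
print(bananas_from_bert(1))  # 3
```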
![Page 19: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/19.jpg)
Functions are our friends
y = 3x
![Page 20: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/20.jpg)
Functions are our friends
y = 3x
Cookie Monster
![Page 21: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/21.jpg)
Functions are our friends
y = 3x
y = ??
![Page 22: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/22.jpg)
Functions are our friends
y = ??
1 apple → 0 bananas
![Page 23: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/23.jpg)
Functions are our friends
y = ??
1 apple → 0 bananas
5 apples → 16 bananas
![Page 24: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/24.jpg)
Functions are our friends
y = ??
1 apple → 0 bananas
5 apples → 16 bananas
6 apples → 20 bananas
![Page 25: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/25.jpg)
Functions are our friends
y = ??
1 apple → 0 bananas
5 apples → 16 bananas
6 apples → 20 bananas
3 apples → ? bananas
If Abby gives Cookie Monster 3 apples, how many bananas does she get?
![Page 26: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/26.jpg)
Parameters are our friends
y = 3x + 1
● Input
● Output
![Page 27: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/27.jpg)
Parameters are our friends
y = wx + b
● Input
● Output
● Parameters
Input - Fixed, comes from data
Parameters - Need to be estimated
![Page 28: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/28.jpg)
Parameters are our friends
y = wx + b
1 apple → 0 bananas
5 apples → 16 bananas
6 apples → 20 bananas
3 apples → ? bananas
![Page 29: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/29.jpg)
Parameters are our friends
y = wx + b
Data:
1 apple → 0 bananas
5 apples → 16 bananas
6 apples → 20 bananas
3 apples → ? bananas
![Page 30: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/30.jpg)
Parameters are our friends
y = wx + b
3 apples → ? bananas
x y
1 0
5 16
6 20
![Page 31: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/31.jpg)
Parameters are our friends
Data:
x y
1 0
5 16
6 20
Model: y = wx + b
![Page 32: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/32.jpg)
Parameters are our friends
Data:
x y
1 0
5 16
6 20
Model: y = wx + b
How to find the parameters w and b?
![Page 33: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/33.jpg)
Parameters are our friends
Data:
x y
1 0
5 16
6 20
Model: y = wx + b
Model Candidate 1: y = 1x + 0
x y ŷ
1 0 1
5 16 5
6 20 6
![Page 34: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/34.jpg)
Parameters are our friends
Data:
x y
1 0
5 16
6 20
Model: y = wx + b
Model Candidate 1: y = 1x + 0
x y ŷ
1 0 1
5 16 5
6 20 6
Model Candidate 2: y = 2x + 2
x y ŷ
1 0 4
5 16 12
6 20 14
![Page 35: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/35.jpg)
Parameters are our friends
Data:
x y
1 0
5 16
6 20
Model: y = wx + b
Model Candidate 1: y = 1x + 0
x y ŷ
1 0 1
5 16 5
6 20 6
Model Candidate 2: y = 2x + 2
x y ŷ
1 0 4
5 16 12
6 20 14
Which one is better?
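The two candidate tables can be reproduced with a short sketch. The helper name `predict` and the list `data` are illustrative choices, not names from the slides.

```python
# Data from the slides: (apples given, bananas received).
data = [(1, 0), (5, 16), (6, 20)]


def predict(w, b, x):
    """Linear model y = wx + b."""
    return w * x + b


# Predictions (the ŷ column) for each candidate.
candidate_1 = [predict(1, 0, x) for x, _ in data]
candidate_2 = [predict(2, 2, x) for x, _ in data]

print(candidate_1)  # [1, 5, 6]
print(candidate_2)  # [4, 12, 14]
```

Neither list matches the observed y values 0, 16, 20, so some way of scoring the mismatch is needed to say which candidate is better.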
![Page 36: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/36.jpg)
Cost functions are our friends
Data:
n x y
0 1 0
1 5 16
2 6 20
Model: $y_n = w x_n + b$
Model Candidate 1: y = 1x + 0
x y ŷ
1 0 1
5 16 5
6 20 6
Model Candidate 2: y = 2x + 2
x y ŷ
1 0 4
5 16 12
6 20 14
![Page 37: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/37.jpg)
Cost functions are our friends
Data:
n x y
0 1 0
1 5 16
2 6 20
Model: $y_n = w x_n + b$
Cost: $C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2$
Model Candidate 1: y = 1x + 0
x y ŷ
1 0 1
5 16 5
6 20 6
Model Candidate 2: y = 2x + 2
x y ŷ
1 0 4
5 16 12
6 20 14
![Page 38: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/38.jpg)
Cost functions are our friends
Data:
n x y
0 1 0
1 5 16
2 6 20
Model: $y_n = w x_n + b$
Cost: $C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2$
Model Candidate 1: y = 1x + 0
n x y ŷ (y-ŷ)²
0 1 0 1 1
1 5 16 5
2 6 20 6
Model Candidate 2: y = 2x + 2
x y ŷ
1 0 4
5 16 12
6 20 14
![Page 39: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/39.jpg)
Cost functions are our friends
Data:
n x y
0 1 0
1 5 16
2 6 20
Model: $y_n = w x_n + b$
Cost: $C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2$
Model Candidate 1: y = 1x + 0
n x y ŷ (y-ŷ)²
0 1 0 1 1
1 5 16 5 121
2 6 20 6
Model Candidate 2: y = 2x + 2
x y ŷ
1 0 4
5 16 12
6 20 14
![Page 40: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/40.jpg)
Cost functions are our friends
Data:
n x y
0 1 0
1 5 16
2 6 20
Model: $y_n = w x_n + b$
Cost: $C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2$
Model Candidate 1: y = 1x + 0
n x y ŷ (y-ŷ)²
0 1 0 1 1
1 5 16 5 121
2 6 20 6 196
Model Candidate 2: y = 2x + 2
x y ŷ
1 0 4
5 16 12
6 20 14
![Page 41: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/41.jpg)
Cost functions are our friends
Data:
n x y
0 1 0
1 5 16
2 6 20
Model: $y_n = w x_n + b$
Cost: $C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2$
Model Candidate 1: y = 1x + 0
n x y ŷ (y-ŷ)²
0 1 0 1 1
1 5 16 5 121
2 6 20 6 196
C(1,0) = 318
Model Candidate 2: y = 2x + 2
x y ŷ
1 0 4
5 16 12
6 20 14
![Page 42: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/42.jpg)
Cost functions are our friends
Data:
n x y
0 1 0
1 5 16
2 6 20
Model: $y_n = w x_n + b$
Cost: $C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2$
Model Candidate 1: y = 1x + 0
n x y ŷ (y-ŷ)²
0 1 0 1 1
1 5 16 5 121
2 6 20 6 196
C(1,0) = 318
Model Candidate 2: y = 2x + 2
n x y ŷ (y-ŷ)²
0 1 0 4 16
1 5 16 12 16
2 6 20 14 36
C(2,2) = 68
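The cost of each candidate can be checked with a small sketch of the sum-of-squared-errors cost. The function name `cost` and the list `data` are illustrative names, not from the slides.

```python
# Data from the slides: (x, y) pairs.
data = [(1, 0), (5, 16), (6, 20)]


def cost(w, b):
    """Sum of squared errors between y and the prediction wx + b."""
    return sum((y - (w * x + b)) ** 2 for x, y in data)


print(cost(1, 0))  # 318  (1 + 121 + 196)
print(cost(2, 2))  # 68   (16 + 16 + 36)
```

The lower cost of candidate 2 is what makes it the better of the two models under this measure.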
![Page 43: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/43.jpg)
Cost functions are our friends
Data:
n x y
0 1 0
1 5 16
2 6 20
Model: $y_n = w x_n + b$
Cost: $C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2$
Model Candidate 1: y = 1x + 0, C(1,0) = 318
Model Candidate 2: y = 2x + 2, C(2,2) = 68
![Page 44: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/44.jpg)
Cost functions are our friends
Data:
n x y
0 1 0
1 5 16
2 6 20
Model: $y_n = w x_n + b$
Cost: $C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2$
![Page 45: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/45.jpg)
Cost functions are our friends
Data:
n x y
0 1 0
1 5 16
2 6 20
Model: $y_n = w x_n + b$
Cost: $C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2$
How to find the parameters w and b?
![Page 46: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/46.jpg)
Optimizers are our friends
Data:
n x y
0 1 0
1 5 16
2 6 20
Model: $y_n = w x_n + b$
Cost: $C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2$
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
![Page 47: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/47.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
[Plot: axes w and b]
![Page 48: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/48.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w0,b0 = 2,2 : C(w0,b0) = 68
[Plot: axes w and b]
![Page 49: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/49.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w0,b0 = 2,2 : C(w0,b0) = 68
[Plot: axes w and b; the point (w, b) = (2, 2) is marked with cost 68]
![Page 50: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/50.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w0,b0 = 2,2 : C(w0,b0) = 68
w1,b1 = 3,2 : C(w1,b1) = ?
[Plot: axes w and b]
![Page 51: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/51.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w0,b0 = 2,2 : C(w0,b0) = 68
w1,b1 = 3,2 : C(w1,b1) = 26
n x y ŷ (y-ŷ)²
0 1 0 5 25
1 5 16 17 1
2 6 20 20 0
C(3,2) = 26
[Plot: axes w and b]
![Page 52: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/52.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w0,b0 = 2,2 : C(w0,b0) = 68
w1,b1 = 3,2 : C(w1,b1) = 26
n x y ŷ (y-ŷ)²
0 1 0 5 25
1 5 16 17 1
2 6 20 20 0
C(3,2) = 26
[Plot: axes w and b]
![Page 53: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/53.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w1,b1 = 3,2 : C(w1,b1) = 26
w2,b2 = 4,2 : C(w2,b2) = ??
[Plot: axes w and b]
![Page 54: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/54.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w1,b1 = 3,2 : C(w1,b1) = 26
w2,b2 = 4,2 : C(w2,b2) = 108
n x y ŷ (y-ŷ)²
0 1 0 6 36
1 5 16 22 36
2 6 20 26 36
C(4,2) = 108
[Plot: axes w and b]
![Page 55: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/55.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w1,b1 = 3,2 : C(w1,b1) = 26
[Plot: axes w and b]
![Page 56: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/56.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w1,b1 = 3,2 : C(w1,b1) = 26
w2,b2 = 3,3 : C(w2,b2) = 41
n x y ŷ (y-ŷ)²
0 1 0 6 36
1 5 16 18 4
2 6 20 21 1
C(3,3) = 41
[Plot: axes w and b]
![Page 57: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/57.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w1,b1 = 3,2 : C(w1,b1) = 26
[Plot: axes w and b]
![Page 58: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/58.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w1,b1 = 3,2 : C(w1,b1) = 26
w2,b2 = 3,1 : C(w2,b2) = 17
n x y ŷ (y-ŷ)²
0 1 0 4 16
1 5 16 16 0
2 6 20 19 1
C(3,1) = 17
[Plot: axes w and b]
![Page 59: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/59.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w2,b2 = 3,1 : C(w2,b2) = 17
[Plot: axes w and b]
![Page 60: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/60.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w2,b2 = 3,1 : C(w2,b2) = 17
w3,b3 = 3,0 : C(w3,b3) = 14
n x y ŷ (y-ŷ)²
0 1 0 3 9
1 5 16 15 1
2 6 20 18 4
C(3,0) = 14
[Plot: axes w and b]
![Page 61: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/61.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w3,b3 = 3,0 : C(w3,b3) = 14
[Plot: axes w and b]
![Page 62: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/62.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w3,b3 = 3,0 : C(w3,b3) = 14
w4,b4 = 3,-1 : C(w4,b4) = 17
n x y ŷ (y-ŷ)²
0 1 0 2 4
1 5 16 14 4
2 6 20 17 9
C(3,-1) = 17
[Plot: axes w and b]
![Page 63: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/63.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w3,b3 = 3,0 : C(w3,b3) = 14
w4,b4 = 2,0 : C(w4,b4) = 104
n x y ŷ (y-ŷ)²
0 1 0 2 4
1 5 16 10 36
2 6 20 12 64
C(2,0) = 104
[Plot: axes w and b]
![Page 64: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/64.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w3,b3 = 3,0 : C(w3,b3) = 14
w4,b4 = 4,0 : C(w4,b4) = 48
n x y ŷ (y-ŷ)²
0 1 0 4 16
1 5 16 20 16
2 6 20 24 16
C(4,0) = 48
[Plot: axes w and b]
![Page 65: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/65.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w3,b3 = 3,0 : C(w3,b3) = 14
[Plot: axes w and b]
![Page 66: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/66.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w?,b? = 4,-2 : C(w?,b?) = ??
[Plot: axes w and b]
![Page 67: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/67.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w?,b? = 4,-2 : C(w?,b?) = 12
n x y ŷ (y-ŷ)²
0 1 0 2 4
1 5 16 18 4
2 6 20 22 4
C(4,-2) = 12
[Plot: axes w and b]
![Page 68: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/68.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w3,b3 = 3,0 : C(w3,b3) = 14
[Plot: axes w and b]
![Page 69: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/69.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w3,b3 = 3,0 : C(w3,b3) = 14
Search Problem
[Plot: axes w and b]
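The step-by-step search on the preceding slides can be sketched as a greedy local search that tries one unit step in w or b at a time. This is an illustrative reading of the slides, not necessarily the exact procedure the author had in mind; `cost` and `local_search` are made-up names. With the data above, cost(3, 0) evaluates to 9 + 1 + 4 = 14.

```python
# Data from the slides: (x, y) pairs.
data = [(1, 0), (5, 16), (6, 20)]


def cost(w, b):
    """Sum of squared errors for the model y = wx + b."""
    return sum((y - (w * x + b)) ** 2 for x, y in data)


def local_search(w, b, step=1):
    """Move to the best axis-aligned neighbour until none lowers the cost."""
    while True:
        neighbours = [(w + step, b), (w - step, b), (w, b + step), (w, b - step)]
        best = min(neighbours, key=lambda p: cost(*p))
        if cost(*best) >= cost(w, b):
            return w, b, cost(w, b)
        w, b = best


# Path taken from (2, 2): (3, 2) -> (3, 1) -> (3, 0), then no neighbour improves.
print(local_search(2, 2))  # (3, 0, 14)
```

The search stops at (3, 0) even though (4, -2) has a lower cost of 12, which is why finding good parameters is a genuine search problem.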
![Page 70: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/70.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w3,b3 = 3,0 : C(w3,b3) = 14
w4,b4 = 3.01,0 : C(w4,b4) ≈ 13.73
n x y ŷ (y-ŷ)²
0 1 0 3.01 9.06
1 5 16 15.05 0.90
2 6 20 18.06 3.76
C(3.01,0) ≈ 13.73
[Plot: axes w and b]
![Page 71: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/71.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w*,b* = 4,-2 : C(w*,b*) = 12
[Plot: axes w and b]
![Page 72: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/72.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w*,b* = 4,-2 : C(w*,b*) = 12
[Plot: axes w and b]
![Page 73: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/73.jpg)
Optimizers are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w*,b* = 4,-4 : C(w*,b*) = 0
[Plot: axes w and b]
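One hedged way to confirm that (4, -4) reaches zero cost is an exhaustive search over a small integer grid. The grid range of -10 to 10 is an arbitrary assumption for this sketch, and brute force is used here only as an illustration, not as the optimizer the slides go on to develop.

```python
# Data from the slides: (x, y) pairs.
data = [(1, 0), (5, 16), (6, 20)]


def cost(w, b):
    """Sum of squared errors for the model y = wx + b."""
    return sum((y - (w * x + b)) ** 2 for x, y in data)


# Assumed search range: every integer pair with w and b in [-10, 10].
candidates = [(w, b) for w in range(-10, 11) for b in range(-10, 11)]
best_w, best_b = min(candidates, key=lambda p: cost(*p))

print(best_w, best_b, cost(best_w, best_b))  # 4 -4 0
```

The model y = 4x - 4 fits all three data points exactly: it gives 0, 16 and 20 for x = 1, 5 and 6.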
![Page 74: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/74.jpg)
Gradients are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
Should be used sparingly
[Plot: axes w and b]
![Page 75: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/75.jpg)
Gradients are our friends
Optimizer: $\arg\min_{w,b \in [-\infty,\infty]} C(w,b)$
Model: y = wx + b
w0,b0 = 2,2 : C(w0,b0) = 68
[Plot: axes w and b; the point (w, b) = (2, 2) is marked with cost 68]
![Page 76: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/76.jpg)
Gradients are our friendsOptimizer
arg min C(w,b)w,b∈[-∞,∞]
w
b
y = wx + b
w0,b0 = 2,2 : C(w0,b0) = 68
2
2
68
hwhw = 1
![Page 77: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/77.jpg)
Gradients are our friendsOptimizer
arg min C(w,b)w,b∈[-∞,∞]
w
b
y = wx + b
w0,b0 = 2,2 : C(w0,b0) = 68
2
2
68
hwhw = 1C(w0+hw,b0) = C(3,2) = 26
![Page 78: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/78.jpg)
Gradients are our friendsOptimizer
arg min C(w,b)w,b∈[-∞,∞]
w
b
y = wx + b
w0,b0 = 2,2 : C(w0,b0) = 68
2
2
68
hwhw = 1C(w0+hw,b0) = C(3,2) = 26 (C(w0+1,b0)-C(w0,b0))
(C(3,2)-C(2,2))=-421
1
rw=
rw=
![Page 79: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/79.jpg)
Gradients are our friendsOptimizer
arg min C(w,b)w,b∈[-∞,∞]
w
b
y = wx + b
w0,b0 = 2,2 : C(w0,b0) = 68
2
2
68
hwhw = 1, r = -42hw = 0.1, r = -98hw = 0.01, r = -104hw = 0.001, r = -104
![Page 80: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/80.jpg)
Gradients are our friendsOptimizer
arg min C(w,b)w,b∈[-∞,∞]
w
b
y = wx + b
w0,b0 = 2,2 : C(w0,b0) = 68
2
2
68
hwhw = 1, r = -42hw = 0.1, r = -98hw = 0.01, r = -104hw = 0.001, r = -104 ∂C
∂w(w0,b0)hw → 0, r =
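The shrinking-step experiment can be sketched in Python as a forward finite difference. The data values (x = 1, 5, 6 with targets 0, 16, 20) are taken from the tables later in the deck; the function names `cost` and `rate_w` are hypothetical.

```python
# Toy data (assumed from the later slides).
XS = [1.0, 5.0, 6.0]
TARGETS = [0.0, 16.0, 20.0]


def cost(w, b):
    """Sum of squared errors of the line w*x + b on the toy data."""
    return sum((w * x + b - t) ** 2 for x, t in zip(XS, TARGETS))


def rate_w(w, b, h):
    """Rate of change of the cost when w moves by a step h (b held fixed)."""
    return (cost(w + h, b) - cost(w, b)) / h


# The rate settles near the partial derivative as the step shrinks.
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, rate_w(2.0, 2.0, h))
```

Because this cost is quadratic in w, the rate differs from the limiting value by a term proportional to the step, so it approaches the limit steadily as the step shrinks.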
![Page 81: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/81.jpg)
Gradients are our friends
Optimizer: $\arg\min_{w,b \in (-\infty,\infty)} C(w,b)$
Model: $y = wx + b$ (here $y_n$ is the model output and $\hat{y}_n$ is the target)
$w_0, b_0 = 2, 2$ : $C(w_0, b_0) = 68$
$\frac{\partial C}{\partial w} = \frac{\partial \sum_n (\hat{y}_n - y_n)^2}{\partial w} = \sum_n -2(\hat{y}_n - y_n)\,x_n$
$h_w \to 0$, $r_w = \frac{\partial C}{\partial w}(w_0, b_0) = -104$

| n | x | ŷ | y | (ŷ-y) | -2(ŷ-y)x |
|---|---|---|---|---|---|
| 0 | 1 | 0 | 4 | -4 | 8 |
| 1 | 5 | 16 | 12 | 4 | -40 |
| 2 | 6 | 20 | 14 | 6 | -72 |
![Page 84: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/84.jpg)
Gradients are our friends
Optimizer: $\arg\min_{w,b \in (-\infty,\infty)} C(w,b)$
Model: $y = wx + b$ (here $y_n$ is the model output and $\hat{y}_n$ is the target)
$w_0, b_0 = 2, 2$ : $C(w_0, b_0) = 68$
$\frac{\partial C}{\partial w} = \frac{\partial \sum_n (\hat{y}_n - y_n)^2}{\partial w} = \sum_n -2(\hat{y}_n - y_n)\,x_n$
$\frac{\partial C}{\partial b} = \frac{\partial \sum_n (\hat{y}_n - y_n)^2}{\partial b} = \sum_n -2(\hat{y}_n - y_n)$
$h_w \to 0$, $r_w = \frac{\partial C}{\partial w}(w_0, b_0) = -104$

| n | x | ŷ | y | (ŷ-y) | -2(ŷ-y) |
|---|---|---|---|---|---|
| 0 | 1 | 0 | 4 | -4 | 8 |
| 1 | 5 | 16 | 12 | 4 | -8 |
| 2 | 6 | 20 | 14 | 6 | -12 |

$h_b \to 0$, $r_b = \frac{\partial C}{\partial b}(w_0, b_0) = -12$
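The two sums can be checked directly. The sketch below assumes the same toy data and uses the names `prediction` and `target` instead of y and ŷ to keep the roles explicit; the function name `gradients` is hypothetical.

```python
# Toy data (from the tables on these slides).
XS = [1.0, 5.0, 6.0]
TARGETS = [0.0, 16.0, 20.0]


def gradients(w, b):
    """Analytic gradient of the squared-error cost with respect to w and b."""
    grad_w = 0.0
    grad_b = 0.0
    for x, target in zip(XS, TARGETS):
        prediction = w * x + b
        # Per-example derivative of (target - prediction)^2 w.r.t. the prediction.
        common = -2.0 * (target - prediction)
        grad_w += common * x  # the prediction changes by x per unit of w
        grad_b += common      # the prediction changes by 1 per unit of b
    return grad_w, grad_b


print(gradients(2.0, 2.0))
```

The per-example terms added to `grad_w` and `grad_b` are the entries in the last column of the two tables.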
![Page 86: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/86.jpg)
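Gradients are our friends
Optimizer: $\arg\min_{w,b \in (-\infty,\infty)} C(w,b)$
Model: $y = wx + b$
$w_0, b_0 = 2, 2$ : $C(w_0, b_0) = 68$
$h_w \to 0$, $r_w = \frac{\partial C}{\partial w}(w_0, b_0) = -104$
$h_b \to 0$, $r_b = \frac{\partial C}{\partial b}(w_0, b_0) = -12$
$w_1 = w_0 - \text{(learning rate)} \cdot r_w$
$b_1 = b_0 - \text{(learning rate)} \cdot r_b$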
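Repeating this update gives plain gradient descent. The sketch below is one possible reading, not the deck's own code: the learning rate of 0.01 and the step count are assumptions chosen so that this particular toy problem converges, and `gradient_descent` is a hypothetical name.

```python
# Toy data (from the tables on the earlier slides).
XS = [1.0, 5.0, 6.0]
TARGETS = [0.0, 16.0, 20.0]


def gradients(w, b):
    """Gradient of the squared-error cost with respect to w and b."""
    grad_w = sum(2.0 * (w * x + b - t) * x for x, t in zip(XS, TARGETS))
    grad_b = sum(2.0 * (w * x + b - t) for x, t in zip(XS, TARGETS))
    return grad_w, grad_b


def gradient_descent(w, b, learning_rate=0.01, steps=5000):
    """Apply w <- w - learning_rate * r_w and b <- b - learning_rate * r_b repeatedly."""
    for _ in range(steps):
        r_w, r_b = gradients(w, b)
        w -= learning_rate * r_w
        b -= learning_rate * r_b
    return w, b


print(gradient_descent(2.0, 2.0, steps=1))  # a single update from (2, 2)
print(gradient_descent(2.0, 2.0))           # many updates
```

With a noticeably larger learning rate the updates on this data overshoot and diverge, which is why the value is kept small here.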
![Page 87: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/87.jpg)
Gradients are our friends
$y = 4x - 4$
Data:

| x | y |
|---|---|
| 1 | 0 |
| 5 | 16 |
| 6 | 20 |
| 3 | ? → 8 |
![Page 89: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/89.jpg)
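Computation Graphs are our friends
$C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2$
$y = wx + b$
$\frac{\partial C}{\partial w} = \frac{\partial \sum_n (\hat{y}_n - y_n)^2}{\partial w} = \sum_n -2(\hat{y}_n - y_n)\,x_n$
$\frac{\partial C}{\partial b} = \frac{\partial \sum_n (\hat{y}_n - y_n)^2}{\partial b} = \sum_n -2(\hat{y}_n - y_n)$
Easy!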
![Page 90: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/90.jpg)
Computation Graphs are our friends
Harder!
y = wx + b + tanh(yx + b)2
![Page 91: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/91.jpg)
Computation Graphs are our friends
Computation Graphs can compute gradients for you!
y = wx + b + tanh(yx + b)2
![Page 93: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/93.jpg)
Computation Graphs are our friends
$C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2$
$y = wx + b$
$\frac{\partial C}{\partial w} = \sum_n \frac{\partial (\hat{y}_n - y_n)^2}{\partial y_n} \frac{\partial y_n}{\partial w} = \sum_n -2(\hat{y}_n - y_n)\,x_n$
$\frac{\partial C}{\partial b} = \sum_n \frac{\partial (\hat{y}_n - y_n)^2}{\partial y_n} \frac{\partial y_n}{\partial b} = \sum_n -2(\hat{y}_n - y_n)$
![Page 95: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/95.jpg)
Computation Graphs are our friends
$C(w,b) = \sum_{n \in \{0,1,2\}} c_n$
$c = d^2$
$d = y - \hat{y}$
$y = o + b$
$o = wx$
$\frac{\partial C}{\partial w} = \sum_n \frac{\partial c_n}{\partial d_n} \frac{\partial d_n}{\partial y_n} \frac{\partial y_n}{\partial o_n} \frac{\partial o_n}{\partial w}$
$\frac{\partial C}{\partial b} = \sum_n \frac{\partial c_n}{\partial d_n} \frac{\partial d_n}{\partial y_n} \frac{\partial y_n}{\partial b}$
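To see that the product of local derivatives gives the same numbers as the direct formula, the chain can be evaluated by hand for each data point. This is a minimal sketch; the function name `chain_rule_gradients` and the variable names for the local derivatives are hypothetical, and the data is the toy set from the earlier slides.

```python
def chain_rule_gradients(w, b, x, y_hat):
    """Multiply the local derivatives along o -> y -> d -> c for one example."""
    o = w * x       # o = wx
    y = o + b       # y = o + b
    d = y - y_hat   # d = y - y_hat, then c = d^2

    dc_dd = 2.0 * d  # derivative of d^2
    dd_dy = 1.0      # derivative of y - y_hat with respect to y
    dy_do = 1.0      # derivative of o + b with respect to o
    dy_db = 1.0      # derivative of o + b with respect to b
    do_dw = x        # derivative of w*x with respect to w

    dC_dw = dc_dd * dd_dy * dy_do * do_dw
    dC_db = dc_dd * dd_dy * dy_db
    return dC_dw, dC_db


DATA = [(1.0, 0.0), (5.0, 16.0), (6.0, 20.0)]  # (x, target) pairs
per_example = [chain_rule_gradients(2.0, 2.0, x, t) for x, t in DATA]
total_dw = sum(g[0] for g in per_example)
total_db = sum(g[1] for g in per_example)
print(per_example, total_dw, total_db)
```

Each factor depends only on one small equation, which is what makes it possible to automate the process operation by operation.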
![Page 99: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/99.jpg)
Computation Graphs are our friends
$C(w,b) = \sum_{n \in \{0,1,2\}} c_n$
Operations:
$c = d^2$ : Power 2
$d = y - \hat{y}$ : Sub
$y = o + b$ : Add
$o = wx$ : Product
Operation interface:
forward(x,y) → z
backward(x,y,dz) → dx,dy
Sub:
forward(x,y) : return x - y
backward(x,y,dz) : return dz, -dz
For $d = y - \hat{y}$ with dz = 1, backward returns 1, -1: $\frac{\partial d_n}{\partial y_n} = 1$ and $\frac{\partial d_n}{\partial \hat{y}_n} = -1$
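The forward/backward interface could look as follows in Python. Only `Sub` is spelled out on the slides; `Add`, `Product` and `Power2` below are assumptions written to follow the same pattern, and the single-argument signature of `Power2` is a choice made for this sketch.

```python
class Sub:
    """z = x - y, as defined on the slide."""

    @staticmethod
    def forward(x, y):
        return x - y

    @staticmethod
    def backward(x, y, dz):
        return dz, -dz


class Add:
    """z = x + y (assumed, by analogy with Sub)."""

    @staticmethod
    def forward(x, y):
        return x + y

    @staticmethod
    def backward(x, y, dz):
        return dz, dz


class Product:
    """z = x * y (assumed): each input's gradient is dz times the other input."""

    @staticmethod
    def forward(x, y):
        return x * y

    @staticmethod
    def backward(x, y, dz):
        return dz * y, dz * x


class Power2:
    """z = x^2 (assumed): a unary operation, so backward returns a 1-tuple."""

    @staticmethod
    def forward(x):
        return x * x

    @staticmethod
    def backward(x, dz):
        return (2.0 * x * dz,)


print(Sub.backward(12.0, 16.0, 1.0))     # local derivatives of d = y - y_hat
print(Product.backward(2.0, 5.0, -8.0))  # gradients for w and x when do = -8
```

Each `backward` receives the gradient of the output (`dz`) and returns one gradient per input, so an operation never needs to know what the rest of the graph looks like.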
![Page 104: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/104.jpg)
Computation Graphs are our friends
$C(w,b) = \sum_{n \in \{0\}} c_n$
[Figure: computation graph, built up one operation at a time]
o = Product(w, x)
y = Add(o, b)
d = Sub(y, ŷ)
c = Power 2(d)
C = Id(c)
Input: x, ŷ
Parameters: w, b
![Page 110: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/110.jpg)
Computation Graphs are our friends
[Figure: computation graph with o = Product(w, x), y = Add(o, b), d = Sub(y, ŷ), c = Power 2(d), C = Id(c)]
Forward:
1-Initialize inputs: ŷ = 16, x = 5, w = 2, b = 2
2-Initialize variables
Variables: 2 values, x and dx
o = 0,0
y = 0,0
d = 0,0
c = 0,0
C = 0,0
![Page 113: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/113.jpg)
Computation Graphs are our friends
Forward:
1-Initialize inputs
2-Initialize variables
3-Topological Sort variables
Order: o 1st, y 2nd, d 3rd, c 4th, C 5th
In this order: o = 10,0 and then y = 12,0
Out of order: y = 2,0 (o is still 0,0) and then o = 10,0
![Page 121: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/121.jpg)
Computation Graphs are our friends
Forward:
1-Initialize inputs
2-Initialize variables
3-Topological Sort variables
[Figure: the graph reduced to its variables o, y, d, c, C, ordered 1st to 5th; then a larger graph with two more variables, g and s, and extra Add operations, ordered 1st to 7th]
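One common way to obtain such an ordering is Kahn's algorithm, sketched below. The slides do not say which sorting algorithm is meant, so this choice is an assumption, as are the function name `topological_sort` and the spelling `y_hat` for ŷ.

```python
from collections import deque


def topological_sort(parents):
    """Kahn's algorithm. `parents` maps each computed variable to its inputs."""
    nodes = set(parents)
    for deps in parents.values():
        nodes.update(deps)

    # Number of not-yet-ordered inputs per node, and the reverse edges.
    remaining = {n: len(set(parents.get(n, ()))) for n in nodes}
    children = {n: [] for n in nodes}
    for node, deps in parents.items():
        for dep in set(deps):
            children[dep].append(node)

    ready = deque(sorted(n for n in nodes if remaining[n] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in sorted(children[node]):
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    if len(order) != len(nodes):
        raise ValueError("graph has a cycle")
    return order


GRAPH = {
    "o": ["w", "x"],
    "y": ["o", "b"],
    "d": ["y", "y_hat"],
    "c": ["d"],
    "C": ["c"],
}
order = topological_sort(GRAPH)
computed = [name for name in order if name in GRAPH]
print(computed)
```

Inputs and parameters come out first because they depend on nothing; filtering them out leaves the order in which the computed variables must be evaluated.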
![Page 125: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/125.jpg)
Computation Graphs are our friends
Forward:
1-Initialize inputs
2-Initialize variables
3-Topological Sort variables
4-For each variable in topological order, run the forward method of all operations that link to them
1st: o = 10,0
2nd: y = 12,0
3rd: d = -4,0
4th: c = 16,0
5th: C = 16,0
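Step 4 can be sketched as a loop over the operations in topological order. The list-of-tuples representation of the graph and the name `forward` are assumptions made for this example; the input values are the ones from step 1.

```python
# Each entry: (output variable, operation, input variables), in topological order.
GRAPH = [
    ("o", lambda w, x: w * x, ("w", "x")),              # Product
    ("y", lambda o, b: o + b, ("o", "b")),              # Add
    ("d", lambda y, y_hat: y - y_hat, ("y", "y_hat")),  # Sub
    ("c", lambda d: d ** 2, ("d",)),                    # Power 2
    ("C", lambda c: c, ("c",)),                         # Id
]


def forward(inputs):
    """Evaluate every variable once, reading inputs that are already computed."""
    values = dict(inputs)
    for name, op, args in GRAPH:
        values[name] = op(*(values[a] for a in args))
    return values


values = forward({"x": 5.0, "w": 2.0, "b": 2.0, "y_hat": 16.0})
print({name: values[name] for name in ("o", "y", "d", "c", "C")})
```

Because the list is already sorted, every lookup in `values` finds a finished value; evaluating `y` before `o` would fail here with a `KeyError` instead of silently using a stale 0.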
![Page 131: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/131.jpg)
Computation Graphs are our friends
Forward:
1-Initialize inputs
2-Initialize variables
3-Topological Sort variables
4-For each variable in topological order, run the forward method of all operations that link to them (Forward)
5-Set gradients to final variables
6-Run each operation's backward method in reverse order (Backward)
C = 16,1
$C = c$, $\frac{\partial C}{\partial c} = 1$
$dc = dC \frac{\partial C}{\partial c} = 1$
c = 16,1
![Page 134: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/134.jpg)
Computation Graphs are our friends
6-Run each operation's backward method in reverse order (Backward)
$c = d^2$, $\frac{\partial c}{\partial d} = 2d = 2 \times -4 = -8$
$dd = dc \frac{\partial c}{\partial d} = -8$
d = -4,-8
![Page 138: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/138.jpg)
Computation Graphs are our friends
Node values (value, gradient): o = 10, 0; y = 12, 0; d = -4, -8; c = 16, 1; C = 16, 1
$d = y - \hat{y}$, so $\frac{\partial d}{\partial y} = 1$
![Page 139: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/139.jpg)
Computation Graphs are our friends
Node values (value, gradient): o = 10, 0; y = 12, -8; d = -4, -8; c = 16, 1; C = 16, 1
$d = y - \hat{y}$, so $\frac{\partial d}{\partial y} = 1$
$dy = dd \, \frac{\partial d}{\partial y}$
![Page 140: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/140.jpg)
Computation Graphs are our friends
Node values (value, gradient): o = 10, -8; y = 12, -8; d = -4, -8; c = 16, 1; C = 16, 1
$y = o + b$, so $\frac{\partial y}{\partial o} = 1$
$do = dy \, \frac{\partial y}{\partial o}$
![Page 141: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/141.jpg)
Computation Graphs are our friends
Node values (value, gradient): o = 10, -8; y = 12, -8; d = -4, -8; c = 16, 1; C = 16, 1
$y = o + b$, so $\frac{\partial y}{\partial o} = 1$ and $\frac{\partial y}{\partial b} = 1$
$b_{t+1} = b - dy \, \frac{\partial y}{\partial b}$
![Page 143: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/143.jpg)
Computation Graphs are our friends
Node values (value, gradient): o = 10, -8; y = 12, -8; d = -4, -8; c = 16, 1; C = 16, 1
$y = o + b$, so $\frac{\partial y}{\partial o} = 1$ and $\frac{\partial y}{\partial b} = 1$
$b_{t+1} = b - \frac{\partial C}{\partial c} \frac{\partial c}{\partial d} \frac{\partial d}{\partial y} \frac{\partial y}{\partial b}$
![Page 144: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/144.jpg)
Computation Graphs are our friends
Node values (value, gradient): o = 10, -8; y = 12, -8; d = -4, -8; c = 16, 1; C = 16, 1
$y = o + b$, so $\frac{\partial y}{\partial o} = 1$ and $\frac{\partial y}{\partial b} = 1$
$b_{t+1} = b - \frac{\partial C}{\partial b}$
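As a worked check of the update above, the chain of local derivatives can be multiplied out with the values on the graph (d = -4). The factor for the Id operation is 1 because C = c; that reading of "Id" is an assumption based on the graph labels.

```latex
\frac{\partial C}{\partial b}
  = \frac{\partial C}{\partial c}
    \cdot \frac{\partial c}{\partial d}
    \cdot \frac{\partial d}{\partial y}
    \cdot \frac{\partial y}{\partial b}
  = 1 \cdot 2d \cdot 1 \cdot 1
  = 2 \times (-4)
  = -8
```

This matches the gradient stored on y and o, since every operation between d and b has a local derivative of 1.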
![Page 145: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/145.jpg)
Computation Graphs are our friends
Node values (value, gradient): o = 10, -8; y = 12, -8; d = -4, -8; c = 16, 1; C = 16, 1
$o = wx$, so $\frac{\partial o}{\partial w} = x$
$w_{t+1} = w - do \, \frac{\partial o}{\partial w}$
![Page 146: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/146.jpg)
Computation Graphs are our friends
7-Update parameters
Values shown on the graph after the update: 16, 5, 2.8, 2.2
Node values (value, gradient): o = 10, -8; y = 12, -8; d = -4, -8; c = 16, 1; C = 16, 1
$o = wx$, so $\frac{\partial o}{\partial w} = x$
$w_{t+1} = w - do \, \frac{\partial o}{\partial w}$
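The seven steps can be sketched end to end on this graph. This is a minimal illustration, not the implementation behind the slides: the class names, the `+=` gradient accumulation, listing the operations by hand in topological order (in place of an explicit sort), and the learning rate of 0.1 are all assumptions.

```python
class Var:
    """A node holding a value and the gradient of the cost with respect to it."""
    def __init__(self, value=0.0):
        self.value = value
        self.grad = 0.0


class Op:
    def __init__(self, output, *inputs):
        self.output, self.inputs = output, inputs


class Product(Op):
    def forward(self):
        a, b = self.inputs
        self.output.value = a.value * b.value

    def backward(self):
        a, b = self.inputs
        a.grad += self.output.grad * b.value
        b.grad += self.output.grad * a.value


class Add(Op):
    def forward(self):
        a, b = self.inputs
        self.output.value = a.value + b.value

    def backward(self):
        for v in self.inputs:
            v.grad += self.output.grad


class Sub(Op):
    def forward(self):
        a, b = self.inputs
        self.output.value = a.value - b.value

    def backward(self):
        a, b = self.inputs
        a.grad += self.output.grad
        b.grad -= self.output.grad


class Power2(Op):
    def forward(self):
        (a,) = self.inputs
        self.output.value = a.value ** 2

    def backward(self):
        (a,) = self.inputs
        a.grad += self.output.grad * 2 * a.value


class Id(Op):
    def forward(self):
        (a,) = self.inputs
        self.output.value = a.value

    def backward(self):
        (a,) = self.inputs
        a.grad += self.output.grad


# Steps 1-2: inputs and parameters, using the values on the slides.
x, y_hat, w, b = Var(5.0), Var(16.0), Var(2.0), Var(2.0)
o, y, d, c, C = Var(), Var(), Var(), Var(), Var()

# Step 3: operations listed by hand in topological order (assumption).
ops = [Product(o, w, x), Add(y, o, b), Sub(d, y, y_hat), Power2(c, d), Id(C, c)]

# Step 4: forward pass.
for op in ops:
    op.forward()

# Steps 5-6: seed the final variable, then run backward in reverse order.
C.grad = 1.0
for op in reversed(ops):
    op.backward()

# Step 7: gradient descent update; the learning rate is an assumed value.
learning_rate = 0.1
w.value -= learning_rate * w.grad
b.value -= learning_rate * b.grad

print(o.value, y.value, d.value, c.value, C.value)
print(d.grad, y.grad, o.grad, b.grad, w.grad)
```

The forward values and the gradients on d, y and o reproduce the (value, gradient) pairs on the slides; the updated parameters depend on the assumed learning rate.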
![Page 147: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/147.jpg)
Computation Graphs are our friends
Existing Tools:
-Tensorflow ( https://www.tensorflow.org )
-Torch ( https://github.com/torch/nn )
-CNN ( https://github.com/clab/cnn )
-JNN ( https://github.com/wlin12/JNN )
-Theano ( http://deeplearning.net/software/theano/ )
![Page 148: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/148.jpg)
Into Deep Learning
![Page 149: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/149.jpg)
Nonlinear Neural Models
y = 4x - 4
Data (x → y): 1 → 0, 5 → 16, 6 → 20, 3 → ?
![Page 150: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/150.jpg)
Nonlinear Neural Models
Data (x → y): 1 → 0, 5 → 16, 6 → 20, 3 → ?
There is a limit of bananas I can give you
![Page 151: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/151.jpg)
Nonlinear Neural Models
n x y
0 1 0
1 5 16
2 6 20
Data
[Plot of y against x, with the line y = 4x - 4]
![Page 152: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/152.jpg)
Nonlinear Neural Models
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
Data
[Plot of y against x, with the line y = 4x - 4]
![Page 153: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/153.jpg)
Nonlinear Neural Models
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
Data
[Plot of y against x, with the line y = 2x + 3]
Model Problem
![Page 154: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/154.jpg)
Nonlinear Neural Models
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
Data
[Plot of y against x, with the line y = 2x + 3]
Model Problem
Underfitting
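To make the underfitting concrete, the best straight line for the five data points can be computed in closed form. This is a small sketch using ordinary least squares; the function names are illustrative, and the slides do not say how y = 2x + 3 was obtained.

```python
xs = [1, 5, 6, 9, 11]
ys = [0, 16, 20, 20, 20]


def fit_line(xs, ys):
    """Ordinary least squares for y = w*x + b on one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


def mean_squared_error(w, b, xs, ys):
    """Average squared difference between the line and the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


w, b = fit_line(xs, ys)
print(w, b)                               # close to the slide's 2 and 3
print(mean_squared_error(w, b, xs, ys))   # error of the best line
print(mean_squared_error(4, -4, xs, ys))  # error of the old model y = 4x - 4
```

Even the best line leaves a clearly nonzero error, because no single straight line passes through both the rising part and the flat part of the data.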
![Page 155: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/155.jpg)
Nonlinear Neural Models
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
Data
[Plot of y against x]
y = ???
Can we learn arbitrary functions?
![Page 156: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/156.jpg)
Nonlinear Neural Models
$y = (w_1 x + b_1)s_1 + (w_2 x + b_2)s_2$
Use different linear functions depending on the value of x?
![Page 157: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/157.jpg)
Nonlinear Neural Models
$y = (w_1 x + b_1)s_1 + (w_2 x + b_2)s_2$
$s_1$ - 1 if x < 6 and 0 otherwise
$s_2$ - 1 if x >= 6 and 0 otherwise
![Page 158: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/158.jpg)
Nonlinear Neural Models
$y = (w_1 x + b_1)s_1 + (w_2 x + b_2)s_2$
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
Data
$y = (4x - 4)s_1 + (0x + 20)s_2$
$s_1$ - 1 if x < 6 and 0 otherwise
$s_2$ - 1 if x >= 6 and 0 otherwise
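The switched model on this slide can be written directly with hard indicator functions. A minimal sketch follows; the function names are illustrative, and the thresholds and coefficients are the ones on the slide.

```python
def s1(x):
    """1 if x < 6 and 0 otherwise."""
    return 1 if x < 6 else 0


def s2(x):
    """1 if x >= 6 and 0 otherwise."""
    return 1 if x >= 6 else 0


def model(x):
    """y = (4x - 4) s1 + (0x + 20) s2: one linear function per region."""
    return (4 * x - 4) * s1(x) + (0 * x + 20) * s2(x)


data = [(1, 0), (5, 16), (6, 20), (9, 20), (11, 20)]
for x, y in data:
    print(x, model(x), y)   # input, model output, target
```

Each row prints the same value for the model output and the target, so two linear pieces with a switch at x = 6 fit all five points.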
![Page 159: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/159.jpg)
Nonlinear Neural Models
$s = \sigma(wx + b)$
$\sigma(t) = \frac{1}{1 + e^{-t}}$
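The function on this slide is the logistic sigmoid, $\sigma(t) = 1 / (1 + e^{-t})$. A minimal sketch of it in code follows; the two-branch form is an implementation choice (not from the slides) that avoids overflow when |t| is large, which matters for weights such as 1000.

```python
import math


def sigmoid(t):
    """Logistic function 1 / (1 + e^(-t)), written to avoid overflow."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)          # t < 0, so e is in (0, 1) and cannot overflow
    return e / (1.0 + e)


def gate(x, w, b):
    """s = sigmoid(w*x + b): a soft switch on the input x."""
    return sigmoid(w * x + b)


print(sigmoid(0))             # midpoint of the curve
print(gate(0.1, 1000, 0))     # large positive argument
print(gate(-0.1, 1000, 0))    # large negative argument
```

With a large weight the gate behaves almost like a hard step: it is very close to 1 just above the threshold and very close to 0 just below it.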
![Page 160: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/160.jpg)
Nonlinear Neural Models
$s = \sigma(1000x)$
x = 0.1 then $\sigma(1000x) \approx 1$
x = -0.1 then $\sigma(1000x) \approx 0$
![Page 162: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/162.jpg)
Nonlinear Neural Models
$s = \sigma(1000x - 6000)$
x = 6.1 then $\sigma(1000x - 6000) \approx 1$
x = 5.9 then $\sigma(1000x - 6000) \approx 0$
![Page 163: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/163.jpg)
Nonlinear Neural Models
$y = (w_1 x + b_1)s_1 + (w_2 x + b_2)s_2$
$s_1 = \sigma(w_3 x + b_3)$
$s_2 = \sigma(w_4 x + b_4)$
![Page 164: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/164.jpg)
Nonlinear Neural Models
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
Data
$y = (4x - 4)s_1 + (0x + 20)s_2$
$s_1 = \sigma(-1000x + 6000)$
$s_2 = \sigma(1000x - 6000)$
Evaluating at x = 5 (row n = 1):
$y = (16)s_1 + (20)s_2$
$s_1 = \sigma(1000)$, $s_2 = \sigma(-1000)$
$y \approx (16)(1) + (20)(0) = 16$
![Page 172: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/172.jpg)
Nonlinear Neural Models
$y = (4x - 4)s_1 + (0x + 20)s_2$
$s_1 = \sigma(-1000x + 6000)$
$s_2 = \sigma(1000x - 6000)$
Evaluating at x = 9 (row n = 3):
$y = (32)s_1 + (20)s_2$
$s_1 = \sigma(-3000)$, $s_2 = \sigma(3000)$
$y \approx (32)(0) + (20)(1) = 20$
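The two worked evaluations can be reproduced for every data point with the sigmoid gates in place of hard switches. A minimal sketch, using the weights from the slides; the overflow-safe `sigmoid` and the function names are illustrative choices.

```python
import math


def sigmoid(t):
    """Logistic function, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def model(x):
    s1 = sigmoid(-1000 * x + 6000)   # close to 1 when x is below 6
    s2 = sigmoid(1000 * x - 6000)    # close to 1 when x is above 6
    return (4 * x - 4) * s1 + (0 * x + 20) * s2


data = [(1, 0), (5, 16), (6, 20), (9, 20), (11, 20)]
for x, y in data:
    print(x, model(x), y)   # input, model output, target
```

At x = 6 both gates equal 0.5, and since both linear pieces give 20 there, the output is still 20.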
![Page 179: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/179.jpg)
Nonlinear Neural Models
Data (x → y): 1 → 0, 5 → 16, 6 → 20, 3 → ?
If you give me too many apples, I will give them to... Count Von Count
![Page 181: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/181.jpg)
Multilayer Perceptrons
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
Data
[Plot of y against x, with the model $y = (4x - 4)s_1 + (0x + 20)s_2$]
![Page 182: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/182.jpg)
Multilayer Perceptrons
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
5 15 1
6 19 1
Data
[Plot of y against x, with the model $y = (4x - 4)s_1 + (0x + 20)s_2$]
![Page 183: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/183.jpg)
Multilayer Perceptrons
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
5 15 1
6 19 1
Data
$y = (4x - 4)s_1 + (0x + 20)s_2 + (0x + 1)s_3$
$s_1 = \sigma(-1000x + 6000)$
$s_2$ = ????
$s_3 = \sigma(1000x - 15000)$
Answer: $s_2$ = not $s_1$ and not $s_3$
![Page 185: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/185.jpg)
Multilayer Perceptrons
$y = (w_1 x + b_1)s_1 + (w_2 x + b_2)s_2 + (w_3 x + b_3)s_3$
$s_1 = \sigma(w_4 x + b_4)$ (Layer 1 Perceptron)
$s_3 = \sigma(w_7 x + b_6)$ (Layer 1 Perceptron)
$s_2 = \sigma(w_5 s_1 + w_6 s_3 + b_5)$ (Layer 2 Perceptron)
![Page 188: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/188.jpg)
Multilayer Perceptrons
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
5 15 1
6 19 1
Data
$y = (4x - 4)s_1 + (0x + 20)s_2 + (0x + 1)s_3$
$s_1 = \sigma(-1000x + 6000)$
$s_2$ = not $s_1$ and not $s_3$, written as $s_2 = \sigma(-1000s_1 - 1000s_3 + 500)$
$s_3 = \sigma(1000x - 15000)$
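The claim that $\sigma(-1000s_1 - 1000s_3 + 500)$ computes "not s1 and not s3" can be checked by listing its truth table. A minimal sketch, assuming s1 and s3 take the ideal values 0 or 1; the function names are illustrative.

```python
import math


def sigmoid(t):
    """Logistic function, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def s2(s1, s3):
    """Layer 2 unit from the slide: fires only when both inputs are off."""
    return sigmoid(-1000 * s1 - 1000 * s3 + 500)


for a in (0, 1):
    for b in (0, 1):
        expected = int((not a) and (not b))
        print(a, b, round(s2(a, b)), expected)   # s1, s3, unit output, logical value
```

The bias of 500 makes the unit output 1 when both inputs are 0, and either input being 1 pushes the argument to -500 or lower, which drives the output to 0.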
![Page 191: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/191.jpg)
Multilayer Perceptrons
$y = (4x - 4)s_1 + (0x + 20)s_2 + (0x + 1)s_3$
$s_1 = \sigma(-1000x + 6000)$, $s_2 = \sigma(-1000s_1 - 1000s_3 + 500)$, $s_3 = \sigma(1000x - 15000)$
Evaluating at x = 11 (row n = 4):
$y = (40)s_1 + (20)s_2 + (1)s_3$
$s_1 = \sigma(-5000) \approx 0$
$s_3 = \sigma(-4000) \approx 0$
$s_2 = \sigma(-1000s_1 - 1000s_3 + 500) = \sigma(500) \approx 1$
$y \approx (40)(0) + (20)(1) + (1)(0) = 20$
![Page 198: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/198.jpg)
Multilayer Perceptrons
$y = (4x - 4)s_1 + (0x + 20)s_2 + (0x + 1)s_3$
$s_1 = \sigma(-1000x + 6000)$, $s_2 = \sigma(-1000s_1 - 1000s_3 + 500)$, $s_3 = \sigma(1000x - 15000)$
Evaluating at x = 19 (row n = 6):
$y = (72)s_1 + (20)s_2 + (1)s_3$
$s_1 = \sigma(-13000) \approx 0$
$s_3 = \sigma(4000) \approx 1$
$s_2 = \sigma(-1000s_1 - 1000s_3 + 500) = \sigma(0 - 1000 + 500) = \sigma(-500) \approx 0$
$y \approx (72)(0) + (20)(0) + (1)(1) = 1$
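The whole two-layer model can be evaluated on all seven data points in a few lines. This is a minimal sketch with the hand-set weights from the slides; `forward` and the overflow-safe `sigmoid` are illustrative names and choices, not the author's code.

```python
import math


def sigmoid(t):
    """Logistic function, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def forward(x):
    # Layer 1: two gates that look only at the input x.
    s1 = sigmoid(-1000 * x + 6000)
    s3 = sigmoid(1000 * x - 15000)
    # Layer 2: a gate that looks at the layer 1 outputs (not s1 and not s3).
    s2 = sigmoid(-1000 * s1 - 1000 * s3 + 500)
    # Output: one linear function per gate.
    return (4 * x - 4) * s1 + (0 * x + 20) * s2 + (0 * x + 1) * s3


data = [(1, 0), (5, 16), (6, 20), (9, 20), (11, 20), (15, 1), (19, 1)]
for x, target in data:
    print(x, round(forward(x), 3), target)   # input, model output, target
```

One caveat: x = 15 sits exactly on the s3 threshold, where the sigmoid gives 0.5 rather than 1, so these particular weights output 10.5 there instead of the target 1. Every other row matches its target.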
![Page 205: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/205.jpg)
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
5 15 1
6 19 1
Data
x
y
y = (4x - 4)s1 + (0x + 20)s2 + (0x + 1)s3
Multilayer Perceptrons
![Page 206: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/206.jpg)
y = (w1x + b1)s1 + (w2x+b2)s2 + (w3x+b3)s3
s1 = σ(w4x + b4) (Layer 1 Perceptron)
s2 = σ(w5s1 + w6s3 + b5) (Layer 2 Perceptron)
s3 = σ(w7x + b6) (Layer 1 Perceptron)
Multilayer Perceptrons
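A rough sketch of how this gated sum can reproduce the data table, in plain Python. The weight values are assumptions picked for this example (thresholds near x = 5.5 and x = 13, which fit the table; the slides label the gates x < 6 and x > 15), and `sigmoid` and `predict` are hypothetical helper names, not part of the slides.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function; saturates to 0 or 1 for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(x):
    # Layer 1: steep sigmoids act as (almost) binary features of x.
    s1 = sigmoid(-1000 * x + 5500)   # ~1 when x is small (assumed threshold 5.5)
    s3 = sigmoid(1000 * x - 13000)   # ~1 when x is large (assumed threshold 13)
    # Layer 2: fires only when neither s1 nor s3 is active.
    s2 = sigmoid(-1000 * s1 - 1000 * s3 + 500)
    # Output: each gate switches on one linear function of x.
    return (4 * x - 4) * s1 + (0 * x + 20) * s2 + (0 * x + 1) * s3

data = [(1, 0), (5, 16), (6, 20), (9, 20), (11, 20), (15, 1), (19, 1)]
for x, y in data:
    print(x, y, round(predict(x), 3))
```

With these assumed weights the predictions agree with every row of the table, because exactly one gate is (almost) 1 for each x.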
![Page 207: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/207.jpg)
y = (w1x + b1)s1 + (w2x+b2)s2 + (w3x+b3)s3
s1 = σ(w4x + b4)
s2 = σ(w5s1 + w6s3 + b5)
s3 = σ(w7x + b6)
x
s1
s3
s2
w4x
b4
Multilayer Perceptrons
![Page 208: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/208.jpg)
y = (w1x + b1)s1 + (w2x+b2)s2 + (w3x+b3)s3
s1 = σ(w4x + b4)
s2 = σ(w5s1 + w6s3 + b5)
s3 = σ(w7x + b6)
x
s2
w4x
b4
w7x
b5
s1
s3
Multilayer Perceptrons
![Page 209: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/209.jpg)
y = (w1x + b1)s1 + (w2x+b2)s2 + (w3x+b3)s3
s1 = σ(w4x + b4)
s2 = σ(w5s1 + w6s3 + b5)
s3 = σ(w7x + b6)
x
s2
s1
s3
w6s3
w5s1
b5
Multilayer Perceptrons
![Page 210: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/210.jpg)
y = (w1x + b1)s1 + (w2x+b2)s2 + (w3x+b3)s3
x
s1: x < 6
s3: x > 15
s2: !(x > 15) & !(x < 6)
Multilayer Perceptrons
![Page 211: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/211.jpg)
y = (w1x + b1)s1 + (w2x+b2)s2 + (w3x+b3)s3
x
s1: x < 6
s3: x > 15
s2: x∈[6,15]
Multilayer Perceptrons
![Page 212: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/212.jpg)
x
s1: x < 6
s3: x > 15
s2: x∈[6,15]
s4: x∈]-∞,6[ ∪ ]15,∞[
Multilayer Perceptrons
![Page 213: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/213.jpg)
x
s5
s1
s2x < 6 x > 15
x∈[6,15]
s3 x > 2
s4 x < 3
s7
s6
s7
x∈]-∞,6] & ]15,∞] x∈[2,15] x∈[2,3]
Multilayer Perceptrons
![Page 214: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/214.jpg)
x
s5
s1
s2x < 6 x > 15
x∈[6,15]
s3 x > 2
s4 x < 3
s7
s6
s7
x∈]-∞,6] & ]15,∞] x∈[2,15] x∈[2,3]
Input
Layer 1 (Input Features)
Layer 2 (And and Or Combinations)
Multilayer Perceptrons
![Page 215: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/215.jpg)
x
s5
s1
s2x < 6 x > 15
x∈[6,15]
s3 x > 2
s4 x < 3
s7
s6
s7
x∈]-∞,6] & ]15,∞] x∈[2,15] x∈[2,3]
Input
Layer 1 (Input Features)
Layer 2 (And and Or Combinations)
And(s1,s2) = σ(1000s1 + 1000s2 - 1500)
Or(s1,s2) = σ(1000s1 + 1000s2 - 500)
Multilayer Perceptrons
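The And and Or units on this slide can be checked numerically. The sketch below assumes the squashing function is a logistic sigmoid and takes the two weighted inputs to be the two arguments `a` and `b`; the helper names are hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def and_gate(a, b):
    # Positive pre-activation only when both inputs are 1 (2000 - 1500 = 500).
    return sigmoid(1000 * a + 1000 * b - 1500)

def or_gate(a, b):
    # Positive pre-activation as soon as one input is 1 (1000 - 500 = 500).
    return sigmoid(1000 * a + 1000 * b - 500)

# Print the truth table: a, b, And(a, b), Or(a, b)
for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(and_gate(a, b)), round(or_gate(a, b)))
```

The large weights make the sigmoid behave like a step function, so only the bias decides how many active inputs are needed.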
![Page 216: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/216.jpg)
x
s5
s1
s2
s3
s4
s7
s6
s7
Input
Layer 1 (Input Features)
Layer 2 (And and Or Combinations)
Layer 3 (Xor Combinations)
s8
s9
sa
sb
Multilayer Perceptrons
![Page 217: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/217.jpg)
x
s5
s1
s2
s3
s4
s7
s6
s7
Input
Layer 1 (Input Features)
Layer 2 (And and Or Combinations)
Layer 3 (Xor Combinations)
s8
s9
sa
sb
Xor(s1,s2) = Or(And(s1,!s2), And(!s1,s2))
Multilayer Perceptrons
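A small sketch of the Xor composition, assuming sigmoid units with the same steep weights used for And and Or. The `not_gate` unit and its weights are an assumption added for this example; the slides only write `!s`.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def and_gate(a, b):
    return sigmoid(1000 * a + 1000 * b - 1500)

def or_gate(a, b):
    return sigmoid(1000 * a + 1000 * b - 500)

def not_gate(a):
    # Hypothetical negation unit: active only when the input is 0.
    return sigmoid(-1000 * a + 500)

def xor_gate(a, b):
    # Xor(a, b) = Or(And(a, !b), And(!a, b)): two And units feed one Or unit.
    return or_gate(and_gate(a, not_gate(b)), and_gate(not_gate(a), b))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(xor_gate(a, b)))
```

Xor needs one more level of composition than And or Or, which is why it sits in its own layer.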
![Page 218: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/218.jpg)
x
s5
s1
s2
s3
s4
s7
s6
s7
Input
Layer 1 (Input Features)
Layer 2 (And and Or Combinations)
Layer 3 (Xor Combinations)
s8
s9
sa
sb
Xor(s1,s2) = Or(s5, s6)
Multilayer Perceptrons
![Page 219: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/219.jpg)
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
5 15 1
6 19 1
Data
x
y
Universal approximator
Multilayer Perceptrons
![Page 220: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/220.jpg)
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
5 15 1
6 19 1
Data
x
y
but...
Multilayer Perceptrons
![Page 221: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/221.jpg)
n x y
0 1 0
1 5 16
2 6 20
3 9 20
4 11 20
5 15 1
6 19 1
Data
x
y
No guarantee that the best function will be found
Multilayer Perceptrons
![Page 222: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/222.jpg)
x
s5
s1
s2x > 1 x < 2
x∈]-∞,1]
s3 x < 5
s4 x < 6
s7
s6
x∈[5,6[ x∈[6,∞]
n x y
0 1 0
1 5 16
2 6 20
y
Multilayer Perceptrons
![Page 223: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/223.jpg)
x
s5
s1
s2x > 1 x < 2
x∈]-∞,1]
s3 x < 5
s4 x < 6
s7
s6
x∈[5,6[ x∈[6,∞]
n x y
0 1 0
1 5 16
2 6 20
y = 0s5 + 16s6 + 20s7
y
Multilayer Perceptrons
![Page 224: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/224.jpg)
x
s5
s1
s2x > 1 x < 2
x∈]-∞,1]
s3 x < 5
s4 x < 6
s7
s6
x∈[5,6[ x∈[6,∞]
n x y
0 1 0
1 5 16
2 6 20
y
y = 0s5 + 16s6 + 20s7
Multilayer Perceptrons
![Page 225: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/225.jpg)
x
s5
s1
s2x > 1 x < 2
x∈]-∞,1]
s3 x < 5
s4 x < 6
s7
s6
x∈[5,6[ x∈[6,∞]
n x y
0 1 0
1 5 16
2 6 20
y
y = 0s5 + 16s6 + 20s7
Multilayer Perceptrons
![Page 226: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/226.jpg)
x
s5
s1
s2x > 1 x < 2
x∈]-∞,1]
s3 x < 5
s4 x < 6
s7
s6
x∈[5,6[ x∈[6,∞]
n x y
0 1 0
1 5 16
2 6 20
Overfitting
y = 0s5 + 16s6 + 20s7
Multilayer Perceptrons
y
Model Problem
![Page 227: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/227.jpg)
Task Complexity
Model Complexity
Multilayer Perceptrons
![Page 228: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/228.jpg)
Task Complexity
Model Complexity
Underfitting
Multilayer Perceptrons
![Page 229: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/229.jpg)
Task Complexity
Model Complexity
Overfitting
Underfitting
Multilayer Perceptrons
![Page 230: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/230.jpg)
Task Complexity
Model Complexity
Overfitting
Underfitting
Happy Zone
Multilayer Perceptrons
![Page 231: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/231.jpg)
Task Complexity
Model Complexity
Overfitting
Underfitting
Happy Zone
Linear Regression
MLP 1 Layer
MLP 2 Layer
MLP 3 Layer
Multilayer Perceptrons
![Page 232: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/232.jpg)
Task Complexity
Model Complexity
Overfitting
Underfitting
Happy Zone
Linear Regression
Linear Regression more features
Multilayer Perceptrons
![Page 233: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/233.jpg)
Task Complexity
Model Complexity
Overfitting
Underfitting
Happy Zone
Linear Regression
MLP 1 Layer
MLP 2 Layer
MLP 3 Layer
Multilayer Perceptrons
![Page 234: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/234.jpg)
Task Complexity
Model Complexity
Overfitting
Underfitting
Happy Zone
Linear Regression
MLP 1 Layer
MLP 2 Layer
MLP 3 Layer
Sentiment analysis
Multilayer Perceptrons
![Page 235: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/235.jpg)
Task Complexity
Model Complexity
Overfitting
Underfitting
Happy Zone
Linear Regression
MLP 1 Layer
MLP 2 Layer
MLP 3 Layer
Sentiment analysis
Machine Translation
Multilayer Perceptrons
![Page 236: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/236.jpg)
Task Complexity
Model Complexity
Overfitting
Underfitting
Happy Zone
Data
Multilayer Perceptrons
![Page 237: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/237.jpg)
Task Complexity
Model Complexity
Overfitting
Underfitting
Happy Zone
Data
Multilayer Perceptrons
![Page 238: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/238.jpg)
Task Complexity
Model Complexity
Overfitting
Underfitting
Happy Zone
Data
Multilayer Perceptrons
![Page 239: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/239.jpg)
y
n x y
0 1 0
1 5 16
2 6 20
y y
Multilayer Perceptrons
![Page 240: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/240.jpg)
y
n x y
0 1 0
1 5 16
2 6 20
3 2 4
y y
Multilayer Perceptrons
![Page 241: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/241.jpg)
n x y
0 1 0
1 5 16
2 6 20
3 2 4
y y
Multilayer Perceptrons
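To make the effect of the extra data point concrete, the sketch below compares the memorizing model y = 0s5 + 16s6 + 20s7 with a plain line on the new point (x = 2, y = 4). The gates are idealized here as hard indicator functions (an assumption standing in for very steep sigmoids), and the line y = 4x - 4 is used because it passes through the three training points.

```python
def indicator(condition):
    # Idealized gate: 1.0 inside the region, 0.0 outside.
    return 1.0 if condition else 0.0

def memorizing_model(x):
    s5 = indicator(x <= 1)        # x in ]-inf, 1]
    s6 = indicator(5 <= x < 6)    # x in [5, 6[
    s7 = indicator(x >= 6)        # x in [6, inf[
    return 0 * s5 + 16 * s6 + 20 * s7

def linear_model(x):
    return 4 * x - 4

train = [(1, 0), (5, 16), (6, 20)]
new_point = (2, 4)

for x, y in train + [new_point]:
    print(x, y, memorizing_model(x), linear_model(x))
```

Both models fit the three training rows, but no gate of the memorizing model is active at x = 2, so it outputs 0 there while the line outputs 4.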
![Page 242: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/242.jpg)
Task Complexity
Model Complexity
Overfitting
Underfitting
Happy Zone
Model Bias
Multilayer Perceptrons
![Page 243: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/243.jpg)
Task Complexity
Model Complexity
Overfitting
Underfitting
Happy Zone
Model Bias
L1 & L2 Regularization
Stochastic Dropout (Srivastava et al, 2014)
Model Structure (CNN, RNNs)
Multilayer Perceptrons
![Page 244: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/244.jpg)
Regularization
$C(w,b) = \sum_{n \in \{0,1,2\}} (y_n - \hat{y}_n)^2 + (w + b)\beta$
$\beta$ = Regularization constant
Multilayer Perceptrons
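A minimal sketch of a regularized cost on the three data points n∈{0,1,2}. Two things are assumptions made for this example: the model is a single linear unit ŷ = wx + b, and the penalty is the common L2 form, the regularization constant times (w² + b²), whereas the slide writes the penalty as (w+b) times the constant. The function name `cost` is hypothetical.

```python
def cost(w, b, beta, data):
    # Data term: sum of squared errors of the linear prediction w*x + b.
    squared_error = sum((y - (w * x + b)) ** 2 for x, y in data)
    # Penalty term (assumed L2 form): grows with the size of the parameters.
    penalty = beta * (w ** 2 + b ** 2)
    return squared_error + penalty

data = [(1, 0), (5, 16), (6, 20)]

print(cost(4, -4, 0.0, data))   # perfect fit, no penalty
print(cost(4, -4, 0.1, data))   # same fit, now charged for large w and b
print(cost(0, 0, 0.1, data))    # no penalty, but a poor fit
```

The constant trades off the two terms: a larger value pushes the optimizer toward smaller parameters even if the fit gets slightly worse.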
![Page 245: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/245.jpg)
x
s5
s1
s2x > 1 x < 2
x∈]-∞,1]
s3 x < 5
s4 x < 6
s7
s6
x∈[5,6[ x∈[6,∞]
y
Regularization
Multilayer Perceptrons
![Page 246: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/246.jpg)
x
s5
s1
s2x > 1 nothing
x∈]-∞,1]
s3 nothing
s4 x < 6
s7
s6
nothing x∈[6,∞]
y
Regularization
Multilayer Perceptrons
![Page 247: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/247.jpg)
x
s5
s1
s2x > 1 nothing
x∈]-∞,1]
s3 nothing
s4 x < 6
s7
s6
nothing x∈[6,∞]
y
Regularization
Find solutions that require less effort
Multilayer Perceptrons
![Page 248: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/248.jpg)
x
s5
s1
s2x > 1 x < 2
x∈]-∞,1]
s3 x < 5
s4 x < 6
s7
s6
x∈[5,6[ x∈[6,∞]
y
Stochastic Dropout (Srivastava et al, 2014)
Multilayer Perceptrons
![Page 249: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/249.jpg)
Stochastic Dropout (Srivastava et al, 2014)
x
s5
s1
s2x > 1 0
x∈]-∞,1]
s3 x < 5
s4 x < 6
s7
s6
0 0
y
Multilayer Perceptrons
![Page 250: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/250.jpg)
Stochastic Dropout (Srivastava et al, 2014)
x
s5
s1
s2x > 1 0
x∈]-∞,1]
s3 x < 5
s4 x < 6
s7
s6
0 0
y
Find robust models
Multilayer Perceptrons
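A small sketch of the masking step shown on these slides: during training, each hidden unit is set to 0 with some probability. The function name, the drop probability, and the example activations are assumptions for illustration. Practical implementations also rescale activations or weights so that expected values match between training and testing; that step is left out here.

```python
import random

def dropout(activations, drop_prob, rng):
    # Zero each unit independently with probability drop_prob.
    return [0.0 if rng.random() < drop_prob else a for a in activations]

rng = random.Random(0)            # fixed seed so runs are repeatable
hidden = [0.9, 0.1, 0.8, 0.7]     # hypothetical hidden-layer activations

for step in range(3):
    # A different random subset of units is silenced at every training step.
    print(step, dropout(hidden, 0.5, rng))
```

Because a different subset disappears at each step, the output cannot rely on any single unit, which is the sense in which the model becomes more robust.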
![Page 251: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/251.jpg)
Model Structure
Weighted sum of linear functions VS MLP
y = (w1x + b1)s1 + (w2x+b2)s2 + (w3x+b3)s3
Multilayer Perceptrons
![Page 252: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/252.jpg)
Model Structure
Weighted sum of linear functions VS MLP
y = (w1x + b1)s1 + (w2x+b2)s2 + (w3x+b3)s3
Convolutional Vs RNNs
Multilayer Perceptrons
![Page 253: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/253.jpg)
s1 = σ(w4x + b4)
s2 = σ(w5s1 + w6s3 + b5)
s3 = σ(w7x + b6)
x
s2
s1
s3
w6s3
w5s1
b5
Representation
Multilayer Perceptrons
![Page 254: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/254.jpg)
s1 = σ(W3x + b3)
s2 = σ(W4s1 + b4)
Representation
s1
s2
2
1
1xx
s2
s1
s3
Multilayer Perceptrons
![Page 255: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/255.jpg)
Representation
s1
s2
1000
1000
100x
s1 = σ(Ws2 + b)
Multilayer Perceptrons
![Page 256: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/256.jpg)
Representation
s1
s2
1000
1000
100x
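s1 = σ(Ws2 + b)
TensorFlow Code
import tensorflow as tf
# Layer 1: affine map of the input x, then an element-wise sigmoid
s1 = tf.matmul(x, W1) + b1
s1 = tf.nn.sigmoid(s1)
# Layer 2: the same pattern, applied to the layer-1 output
s2 = tf.matmul(s1, W2) + b2
s2 = tf.nn.sigmoid(s2)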
Multilayer Perceptrons
![Page 257: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/257.jpg)
Using Discrete Variables
Data
0
1
16
5
20
6
?
3
![Page 258: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/258.jpg)
Using Discrete Variables
Data
0
1
16
5
20
6
?
3
![Page 259: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/259.jpg)
Using Discrete Variables
Data
0
1
16
5
20
6
?
3
?
![Page 260: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/260.jpg)
Using Discrete Variables
x
s5
s1
s2
s3
s4
s7
s6
y
Number of fruit to offer
Number of fruit received
![Page 261: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/261.jpg)
Using Discrete Variables
x: Number of fruit to offer
y: Number of fruit received
s1
s2
![Page 262: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/262.jpg)
Using Discrete Variables
x: Number of fruit to offer
u: Type of fruit to offer
y: Number of fruit received
v: Type of fruit received
s1
s2
![Page 263: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/263.jpg)
Using Discrete Variables
x: Number of fruit to offer
u: Type of fruit to offer
y: Number of fruit received
v: Type of fruit received
s1
s2
u∈{Apple, Banana, Coconut}
v∈{Apple, Banana, Coconut}
![Page 264: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/264.jpg)
Using Discrete Variables
Lookup Tables
e1 e2 e3 e4
Apple 0.1 -0.4 0.2 0.5
Banana 0.4 1.4 -1.0 0.1
Coconut 1.1 0.9 1.1 0.5
u
V = 3
![Page 265: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/265.jpg)
Using Discrete Variables
Lookup Tables
e1 e2 e3 e4
Apple 0.1 -0.4 0.2 0.5
Banana 0.4 1.4 -1.0 0.1
Coconut 1.1 0.9 1.1 0.5
u
V = 3
![Page 266: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/266.jpg)
Using Discrete Variables
Lookup Tables
e1 e2 e3 e4
Apple 0.1 -0.4 0.2 0.5
Banana 0.4 1.4 -1.0 0.1
Coconut 1.1 0.9 1.1 0.5
u
Embedding for u
Size = 4
V = 3
![Page 267: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/267.jpg)
Using Discrete Variables
Lookup Tables
e1 e2 e3 e4
Apple 0.1 -0.4 0.2 0.5
Banana 0.4 1.4 -1.0 0.1
Coconut 1.1 0.9 1.1 0.5
u
Embedding for u
Banana
Size = 4
V = 3
![Page 268: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/268.jpg)
Using Discrete Variables
Lookup Tables
e1 e2 e3 e4
0 0.1 -0.4 0.2 0.5
1 0.4 1.4 -1.0 0.1
2 1.1 0.9 1.1 0.5
u
Embedding for u
1
Size = 4
V = 3
![Page 269: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/269.jpg)
Using Discrete Variables
Lookup Tables
u
Embedding for u
1
Lookup
Size = 4
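The lookup step can be sketched in a few lines of Python, using the table values from the slides. The names `vocab`, `table`, and `lookup` are hypothetical; in a trained model the table entries would be parameters learned together with the rest of the network.

```python
# Map each discrete value to a row index (V = 3 fruit types).
vocab = {"Apple": 0, "Banana": 1, "Coconut": 2}

# One row per fruit, one column per embedding dimension e1..e4 (Size = 4).
table = [
    [0.1, -0.4, 0.2, 0.5],   # Apple
    [0.4, 1.4, -1.0, 0.1],   # Banana
    [1.1, 0.9, 1.1, 0.5],    # Coconut
]

def lookup(fruit):
    # The embedding for u is simply the row of the table selected by its index.
    return table[vocab[fruit]]

print(lookup("Banana"))  # [0.4, 1.4, -1.0, 0.1]
```

The resulting vector can then be fed into the network like any other real-valued input.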
![Page 270: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/270.jpg)
Using Discrete Variables
x: Number of fruit to offer
u: Type of fruit to offer
y: Number of fruit received
v: Type of fruit received
s1
s2
u∈{Apple, Banana, Coconut}
v∈{Apple, Banana, Coconut}
eu
Lookup
![Page 271: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/271.jpg)
Using Discrete Variables
Softmax
V = 3
Apple Banana Coconut
w1 0.1 -0.4 0.2
w2 0.4 1.4 -1.0
w3 1.1 0.9 1.1
w4 1.3 0.1 0.4
![Page 272: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/272.jpg)
Using Discrete Variables
Softmax
Input vector Size = 4
V = 3
Apple Banana Coconut
w1 0.1 -0.4 0.2
w2 0.4 1.4 -1.0
w3 1.1 0.9 1.1
w4 1.3 0.1 0.4
![Page 273: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/273.jpg)
Using Discrete Variables
Softmax
Input vector Size = 4
logits Size = V
V = 3
Apple Banana Coconut
w1 0.1 -0.4 0.2
w2 0.4 1.4 -1.0
w3 1.1 0.9 1.1
w4 1.3 0.1 0.4
![Page 274: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/274.jpg)
Using Discrete Variables
Softmax
Input Vector
Logits
V = 3
Apple Banana Coconut
w1 0.1 -0.4 0.2
w2 0.4 1.4 -1.0
w3 1.1 0.9 1.1
w4 1.3 0.1 0.4
s1
s2
s3
s4
d1
d2
d3
1 -1 -2
![Page 275: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/275.jpg)
Using Discrete Variables
Softmax
Input Vector
Logits
V = 3
Apple Banana Coconut
w1 0.1 -0.4 0.2
w2 0.4 1.4 -1.0
w3 1.1 0.9 1.1
w4 1.3 0.1 0.4
s1
s2
s3
s4
d1
d2
d3
1 -1 -2
p1
p2
p3
0.84 0.11 0.05
![Page 276: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/276.jpg)
Using Discrete Variables
Softmax
Input Vector
Logits
V = 3
Apple Banana Coconut
w1 0.1 -0.4 0.2
w2 0.4 1.4 -1.0
w3 1.1 0.9 1.1
w4 1.3 0.1 0.4
s1
s2
s3
s4
d1
d2
d3
1 -1 -2
p1
p2
p3
0.84 0.11 0.05
Apple
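The step from logits to probabilities to a predicted fruit can be sketched as follows, using the logits 1, -1, -2 from the slide. The helper name `softmax` and the max-subtraction trick for numerical stability are additions for this example. With these logits the first two probabilities round to the slide's 0.84 and 0.11, and the third comes out near 0.04 rather than 0.05.

```python
import math

def softmax(logits):
    # Subtract the maximum first so the exponentials cannot overflow.
    m = max(logits)
    exps = [math.exp(d - m) for d in logits]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["Apple", "Banana", "Coconut"]
logits = [1, -1, -2]              # d1, d2, d3 from the slide

probs = softmax(logits)           # p1, p2, p3: non-negative and summing to 1
best = labels[probs.index(max(probs))]

print([round(p, 2) for p in probs], best)
```

The prediction is the label with the highest probability, which here is Apple.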
![Page 277: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/277.jpg)
Using Discrete Variables
x: Number of fruit to offer
u: Type of fruit to offer
y: Number of fruit received
v: Type of fruit received
s1
s2
u∈{Apple, Banana, Coconut}
v∈{Apple, Banana, Coconut}
eu
Softmax
Lookup
![Page 278: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/278.jpg)
Using Discrete Variables
x: Number of fruit to offer
u: Type of fruit to offer
y: Number of fruit received
v: Type of fruit received
s1
s2
u∈{Apple, Banana, Coconut}
v∈{Apple, Banana, Coconut}
eu
Softmax
Lookup
![Page 279: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/279.jpg)
Example Applications
Window-based Tagging (Collobert et al, 2011)
Abby likes to eat apples and bananas
NNP VBZ TO VB NNS CC NNS
![Page 282: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/282.jpg)
Example Applications
Window-based Tagging (Collobert et al, 2011)
Abby likes to eat apples and bananas
Word Embeddings: e-2 e-1 e0 e1 e2
Non-Linear Layer 1: s1
Non-Linear Layer 2: s2
Softmax: VB
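A rough sketch of the window-based tagger's forward pass, assuming NumPy: the embeddings of the five words around the target are concatenated, passed through two non-linear layers, and a softmax gives a distribution over tags. The sizes, the `<pad>` token, the tanh nonlinearity, and the random (untrained) weights are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
sentence = "Abby likes to eat apples and bananas".split()
tags = ["NNP", "VBZ", "TO", "VB", "NNS", "CC"]        # tags seen in the example
vocab = {w: i for i, w in enumerate(["<pad>"] + sentence)}

emb_dim, hidden, half = 8, 16, 2                      # hypothetical sizes; window = 2*half + 1
E = rng.normal(scale=0.1, size=(len(vocab), emb_dim))
W1 = rng.normal(scale=0.1, size=((2 * half + 1) * emb_dim, hidden))
W2 = rng.normal(scale=0.1, size=(hidden, hidden))
W_out = rng.normal(scale=0.1, size=(hidden, len(tags)))

def softmax(d):
    e = np.exp(d - d.max())
    return e / e.sum()

def tag_distribution(words, i):
    """Distribution over tags for words[i], from a 5-word window centred on it."""
    padded = ["<pad>"] * half + words + ["<pad>"] * half
    window = padded[i : i + 2 * half + 1]             # e-2 .. e2
    x = np.concatenate([E[vocab[w]] for w in window]) # word embeddings, concatenated
    s1 = np.tanh(x @ W1)                              # non-linear layer 1
    s2 = np.tanh(s1 @ W2)                             # non-linear layer 2
    return softmax(s2 @ W_out)

p = tag_distribution(sentence, sentence.index("eat"))
```

With trained weights, the largest entry of `p` for "eat" would be expected at the VB position; with the random weights above the distribution is close to uniform.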
![Page 286: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/286.jpg)
Example Applications
Translation Rescoring (Devlin et al, 2014)
Abby likes to eat apples and bananas
Context Predict
![Page 287: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/287.jpg)
Example Applications
Translation Rescoring (Devlin et al, 2014)
Abby likes to eat apples and bananas
e-4 e-3 e-2 e-1
s1
s2
Softmax
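The network on this slide can be sketched as a feed-forward language model: the embeddings of the four previous words (e-4 to e-1) are concatenated, passed through two hidden layers, and a softmax over the vocabulary gives the probability of the next word. This is an illustrative NumPy sketch; the vocabulary, sizes, tanh nonlinearity, and random weights are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["<s>", "Abby", "likes", "to", "eat", "apples", "and", "bananas"]
idx = {w: i for i, w in enumerate(vocab)}
n_ctx, emb_dim, hidden = 4, 8, 16                     # hypothetical sizes

E = rng.normal(scale=0.1, size=(len(vocab), emb_dim))
W1 = rng.normal(scale=0.1, size=(n_ctx * emb_dim, hidden))
W2 = rng.normal(scale=0.1, size=(hidden, hidden))
W_out = rng.normal(scale=0.1, size=(hidden, len(vocab)))

def softmax(d):
    e = np.exp(d - d.max())
    return e / e.sum()

def next_word_distribution(context):
    """P(next word | last four words); short contexts are padded with <s>."""
    ctx = (["<s>"] * n_ctx + context)[-n_ctx:]
    x = np.concatenate([E[idx[w]] for w in ctx])      # e-4 .. e-1
    s1 = np.tanh(x @ W1)
    s2 = np.tanh(s1 @ W2)
    return softmax(s2 @ W_out)

p = next_word_distribution(["Abby", "likes", "to", "eat"])
p_apples = p[idx["apples"]]                           # probability assigned to "apples"
```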
![Page 288: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/288.jpg)
Example Applications
Translation Rescoring (Devlin et al, 2014)
Abby likes to eat apples and bananas
0.2 <s>
![Page 291: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/291.jpg)
Example Applications
Translation Rescoring (Devlin et al, 2014)
Abby likes to eat apples and bananas
0.1 0.2 0.3 0.5 0.7 0.4 0.2
Sentence probability: 0.000378
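A sentence score of this kind is usually the product of the per-word probabilities, computed as a sum of logs to avoid underflow. The sketch below uses the per-word values that survive in the slide text; as extracted they multiply to about 0.000168 rather than the 0.000378 shown on the slide, so treat them as illustrative only.

```python
import math

# Per-word probabilities as they appear in the extracted slide text (illustrative).
word_probs = [0.1, 0.2, 0.3, 0.5, 0.7, 0.4, 0.2]

def sentence_log_prob(probs):
    """Sum of log-probabilities, i.e. the log of the product of the per-word values."""
    return sum(math.log(p) for p in probs)

log_p = sentence_log_prob(word_probs)
prob = math.exp(log_p)            # back to a plain probability
```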
![Page 292: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/292.jpg)
Example Applications
Translation Rescoring (Devlin et al, 2014)
Abby likes to eat apples and bananas 0.000378
Abby dislikes to drink apples and bananas 0.00012
John does to eat coconuts and bananas 0.00003
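Rescoring then amounts to ranking the candidate translations by their score. A minimal sketch using the three candidates and scores from the slide follows; picking the single highest score is a simplification, since a real system might combine this score with others.

```python
# Candidate translations and their model scores, as listed on the slide.
candidates = {
    "Abby likes to eat apples and bananas": 0.000378,
    "Abby dislikes to drink apples and bananas": 0.00012,
    "John does to eat coconuts and bananas": 0.00003,
}

# Sort from highest to lowest score and keep the top candidate.
ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
best_sentence, best_score = ranked[0]
```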
![Page 294: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/294.jpg)
Example Applications
Translation Rescoring (Devlin et al, 2014)
Translation: Abby likes to eat apples and bananas
Context Predict
Source: Abby gosta de comer macas e bananas
![Page 296: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/296.jpg)
Example Applications
Translation Rescoring (Devlin et al, 2014)
Abby likes to eat apples and bananas
Translation
macas
e-4 e-3 e-2 e-1
s1
s2
f-1
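A hedged sketch of how a source word could be added to the language model input, assuming that f-1 on the slide denotes the embedding of the source word ("macas") and that it is simply concatenated with the four target-side embeddings. The vocabularies, sizes, tanh nonlinearity, and random weights are assumptions, not details from the slides or the cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)
tgt_vocab = ["<s>", "Abby", "likes", "to", "eat", "apples", "and", "bananas"]
src_vocab = ["Abby", "gosta", "de", "comer", "macas", "e", "bananas"]
t_idx = {w: i for i, w in enumerate(tgt_vocab)}
s_idx = {w: i for i, w in enumerate(src_vocab)}
emb_dim, hidden = 8, 16                               # hypothetical sizes

E_tgt = rng.normal(scale=0.1, size=(len(tgt_vocab), emb_dim))
E_src = rng.normal(scale=0.1, size=(len(src_vocab), emb_dim))
W1 = rng.normal(scale=0.1, size=(5 * emb_dim, hidden))   # 4 target words + 1 source word
W2 = rng.normal(scale=0.1, size=(hidden, hidden))
W_out = rng.normal(scale=0.1, size=(hidden, len(tgt_vocab)))

def softmax(d):
    e = np.exp(d - d.max())
    return e / e.sum()

def next_word_distribution(tgt_context, src_word):
    e = [E_tgt[t_idx[w]] for w in tgt_context]        # e-4 .. e-1
    f = E_src[s_idx[src_word]]                        # source-side embedding (assumed f-1)
    x = np.concatenate(e + [f])
    s1 = np.tanh(x @ W1)
    s2 = np.tanh(s1 @ W2)
    return softmax(s2 @ W_out)

p = next_word_distribution(["Abby", "likes", "to", "eat"], "macas")
p_apples = p[t_idx["apples"]]
```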
![Page 297: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/297.jpg)
Example Applications
Translation Rescoring (Devlin et al, 2014)
Translation Score (BLEU)    Arabic - English    Chinese - English
Best Rescored System        52.8                34.7
1st OpenMT12                49.5                32.6
Hierarchical                43.4                30.1
![Page 298: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/298.jpg)
![Page 299: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/299.jpg)
Deep Neural Networks are our friends?
Convolutional Neural Network
![Page 302: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/302.jpg)
Deep Neural Networks are our friends?
Convolutional Neural Network
x1 x2 x3 x4
x5 x6 x7 x8
x9 x10 x11 x12
x13 x14 x15 x16
4x4 image
z1
Inputs to z1: x1, x2, ..., x11
Weights: w1, ..., w9
![Page 303: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/303.jpg)
Deep Neural Networks are our friends?
Convolutional Neural Network
x1 x2 x3 x4
x5 x6 x7 x8
x9 x10 x11 x12
x13 x14 x15 x16
4x4 image
z1 z2
Inputs to z2: x2, x3, ..., x12
Weights: w1, ..., w9
![Page 305: Deep Neural Networks Are Our Friendslxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf · 2016. 7. 26. · Gradients are our friends Computation Graphs are our friends Outline](https://reader034.vdocument.in/reader034/viewer/2022051603/5ff36ec78c35bc368117ce69/html5/thumbnails/305.jpg)
Deep Neural Networks are our friends?
Convolutional Neural Network
x1 x2 x3 x4
x5 x6 x7 x8
x9 x10 x11 x12
x13 x14 x15 x16
4x4 image
z1 z2
z3 z4
Inputs to y: z1, z2, z3, z4
y: Is this a cat?
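The convolution steps can be sketched in NumPy: one 3x3 filter with shared weights w1 to w9 slides over the 4x4 image to give the 2x2 map z1 to z4, which then feeds a single output y. The pixel values, the tanh and sigmoid nonlinearities, the output weights, and the function name are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
image = np.arange(1, 17, dtype=float).reshape(4, 4)   # stand-in values for x1..x16
w = rng.normal(scale=0.1, size=(3, 3))                # shared filter weights w1..w9
v = rng.normal(scale=0.1, size=4)                     # hypothetical output weights

def conv2d_valid(img, kernel):
    """Slide the kernel over the image with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h, out_w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The same weights are applied to every 3x3 patch.
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

z = np.tanh(conv2d_valid(image, w))                   # z1..z4 as a 2x2 map
y = 1.0 / (1.0 + np.exp(-(z.flatten() @ v)))          # sigmoid score for "Is this a cat?"
```

The top-left patch covers x1, x2, x3, x5, x6, x7, x9, x10, x11, which matches the inputs shown for z1; shifting one column to the right gives the patch for z2.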