basic structure: fully connected feedforward...
TRANSCRIPT
Source: Hung-yi Lee, "Basic Structure" (v8), MLDS 2017 — speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2017/Lecture/Basic Structure (v8).pdf
Network Structure
Hung-yi Lee (李宏毅)
Three Steps for Deep Learning
Step 1: Neural Network. A neural network is a function composed of simple functions (neurons). Usually we design the network structure and let the machine find the parameters from data.
Step 2: Cost Function. The cost function evaluates how good a set of parameters is; we design it based on the task.
Step 3: Optimization. Find the best function (e.g., by gradient descent).
Outline
• Basic structure (3/03)
• Fully Connected Layer
• Recurrent Structure
• Convolutional/Pooling Layer
• Special Structure (3/17)
• Spatial Transformation Layer
• Highway Network / Grid LSTM
• Recursive Structure
• Batch Normalization
• Sequence-to-sequence / Attention (3/24)
Prerequisite
• Brief Introduction of Deep Learning
• https://youtu.be/Dr-WRlEFefw?list=PLJV_el3uVTsPy9oCRY30oBPNLCo89yu49
• Convolutional Neural Network
• https://youtu.be/FrKWiRv254g?list=PLJV_el3uVTsPy9oCRY30oBPNLCo89yu49
• Recurrent Neural Network (Part I)
• https://youtu.be/xCGidAeyS4M?list=PLJV_el3uVTsPy9oCRY30oBPNLCo89yu49
• Recurrent Neural Network (Part II)
• https://www.youtube.com/watch?v=rTqmWlnwz_0&list=PLJV_el3uVTsPy9oCRY30oBPNLCo89yu49&index=25
Basic Structure: Fully Connected Layer
Fully Connected Layer

Layer $l-1$ has $N_{l-1}$ nodes; layer $l$ has $N_l$ nodes.

Output of a neuron: $a_i^l$ denotes the output of neuron $i$ at layer $l$ (its inputs are the layer-$(l-1)$ outputs $a_1^{l-1}, a_2^{l-1}, \dots, a_j^{l-1}, \dots$).

Output of one layer: $a^l$ is the vector collecting $a_1^l, a_2^l, \dots, a_i^l, \dots$
Fully Connected Layer

$w_{ij}^l$: the weight from layer $l-1$ to layer $l$, connecting neuron $j$ (in layer $l-1$) to neuron $i$ (in layer $l$). Stacking all the weights between the two layers gives an $N_l \times N_{l-1}$ matrix:

$$W^l = \begin{bmatrix} w_{11}^l & w_{12}^l & \cdots \\ w_{21}^l & w_{22}^l & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}$$
Fully Connected Layer

$b_i^l$: the bias for neuron $i$ at layer $l$. The biases for all neurons in layer $l$ form the vector

$$b^l = \begin{bmatrix} b_1^l \\ b_2^l \\ \vdots \\ b_i^l \\ \vdots \end{bmatrix}$$
Fully Connected Layer

$z_i^l$: the input of the activation function for neuron $i$ at layer $l$:

$$z_i^l = \sum_{j=1}^{N_{l-1}} w_{ij}^l a_j^{l-1} + b_i^l = w_{i1}^l a_1^{l-1} + w_{i2}^l a_2^{l-1} + \cdots + b_i^l$$

$z^l$: the vector collecting the activation-function inputs of all the neurons in layer $l$.
Relations between Layer Outputs

The output of layer $l-1$, the vector $a^{l-1}$, determines the activation inputs $z^l$, which in turn determine the output of layer $l$, the vector $a^l$.
Relations between Layer Outputs

$$z_1^l = w_{11}^l a_1^{l-1} + w_{12}^l a_2^{l-1} + \cdots + b_1^l$$
$$z_2^l = w_{21}^l a_1^{l-1} + w_{22}^l a_2^{l-1} + \cdots + b_2^l$$
$$\vdots$$
$$z_i^l = w_{i1}^l a_1^{l-1} + w_{i2}^l a_2^{l-1} + \cdots + b_i^l$$

Collecting these into vectors:

$$\begin{bmatrix} z_1^l \\ z_2^l \\ \vdots \\ z_i^l \\ \vdots \end{bmatrix} = \begin{bmatrix} w_{11}^l & w_{12}^l & \cdots \\ w_{21}^l & w_{22}^l & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix} \begin{bmatrix} a_1^{l-1} \\ a_2^{l-1} \\ \vdots \end{bmatrix} + \begin{bmatrix} b_1^l \\ b_2^l \\ \vdots \end{bmatrix} \qquad\Longrightarrow\qquad z^l = W^l a^{l-1} + b^l$$
Relations between Layer Outputs

Applying the activation function $\sigma$ elementwise, $a_i^l = \sigma(z_i^l)$:

$$\begin{bmatrix} a_1^l \\ a_2^l \\ \vdots \\ a_i^l \\ \vdots \end{bmatrix} = \sigma\!\left(\begin{bmatrix} z_1^l \\ z_2^l \\ \vdots \\ z_i^l \\ \vdots \end{bmatrix}\right) \qquad\Longrightarrow\qquad a^l = \sigma(z^l)$$
Relations between Layer Outputs

Putting the two relations together:

$$z^l = W^l a^{l-1} + b^l, \qquad a^l = \sigma(z^l) \qquad\Longrightarrow\qquad a^l = \sigma(W^l a^{l-1} + b^l)$$
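The layer relation $a^l = \sigma(W^l a^{l-1} + b^l)$ can be sketched in a few lines of plain Python; the sigmoid used for $\sigma$ and the tiny layer sizes are illustrative assumptions, not part of the slides:

```python
import math

def sigma(z):
    # elementwise sigmoid, standing in for the activation function
    return [1.0 / (1.0 + math.exp(-v)) for v in z]

def fc_layer(a_prev, W, b):
    # z_i^l = sum_j w_ij^l * a_j^{l-1} + b_i^l, then a^l = sigma(z^l)
    z = [sum(w_ij * a_j for w_ij, a_j in zip(row, a_prev)) + b_i
         for row, b_i in zip(W, b)]
    return sigma(z)

# layer l-1 has 3 nodes, layer l has 2 nodes, so W is 2 x 3
a_prev = [1.0, 0.5, -1.0]
W = [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
b = [0.0, 0.0]
print(fc_layer(a_prev, W, b))  # [0.5, 0.5], since sigma(0) = 0.5
```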
Basic Structure: Recurrent Structure

Simplify the network by using the same function again and again.
Reference
- A. Graves, preprint: https://www.cs.toronto.edu/~graves/preprint.pdf
- K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, J. Schmidhuber, "LSTM: A Search Space Odyssey," IEEE Transactions on Neural Networks and Learning Systems, 2016
- Rafal Józefowicz, Wojciech Zaremba, Ilya Sutskever, "An Empirical Exploration of Recurrent Network Architectures," ICML, 2015
Recurrent Neural Network
• Given function f: $h', y = f(h, x)$

The same f is applied at every time step:

$(h_1, y_1) = f(h_0, x_1) \qquad (h_2, y_2) = f(h_1, x_2) \qquad (h_3, y_3) = f(h_2, x_3) \qquad \dots$

No matter how long the input/output sequence is, we only need one function f. h and h' are vectors with the same dimension.
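As a sketch, the recurrence is just a loop that reuses one function; the particular update inside rnn_step below is a toy stand-in for a learned f, chosen only to make the loop concrete:

```python
def rnn_step(h, x):
    # a stand-in for the slide's f: (h', y) = f(h, x)
    h_new = [0.5 * hv + 0.5 * xv for hv, xv in zip(h, x)]
    y = sum(h_new)
    return h_new, y

# the SAME function f is reused at every time step, however long the sequence
h = [0.0, 0.0]                                   # h_0
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]        # x_1, x_2, x_3
ys = []
for x in xs:
    h, y = rnn_step(h, x)                        # (h_t, y_t) = f(h_{t-1}, x_t)
    ys.append(y)
print(ys)  # [0.5, 0.75, 1.375]
```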
Deep RNN

Stack recurrent functions: $f_1$ consumes the input sequence and $f_2$ consumes $f_1$'s outputs:

$$h', y = f_1(h, x) \qquad b', c = f_2(b, y) \qquad \dots$$

At each step: $(h_t, y_t) = f_1(h_{t-1}, x_t)$ and $(b_t, c_t) = f_2(b_{t-1}, y_t)$.
Bidirectional RNN

$f_1$ runs forward over $x_1, x_2, x_3$ producing $a_1, a_2, a_3$; $f_2$ runs in the reverse direction producing $c_1, c_2, c_3$; $f_3$ combines the two directions at each step into $y_1, y_2, y_3$:

$$h', a = f_1(h, x) \qquad b', c = f_2(b, x) \qquad y = f_3(a, c)$$
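A minimal sketch of the three functions, with scalar states and toy additive updates standing in for the learned $f_1$, $f_2$, $f_3$ (all of that is an assumption for illustration):

```python
def f1(h, x):   # forward-direction recurrence (toy update)
    h2 = h + x
    return h2, h2            # (h', a) = f1(h, x)

def f2(b, x):   # reverse-direction recurrence (toy update)
    b2 = b + x
    return b2, b2            # (b', c) = f2(b, x)

def f3(a, c):   # combine the two directions
    return a + c             # y = f3(a, c)

xs = [1.0, 2.0, 3.0]
h, As = 0.0, []
for x in xs:                 # forward pass over x_1, x_2, x_3
    h, a = f1(h, x); As.append(a)
b, Cs = 0.0, []
for x in reversed(xs):       # backward pass over x_3, x_2, x_1
    b, c = f2(b, x); Cs.append(c)
Cs.reverse()
# each y_t sees the whole sequence: the past via a_t, the future via c_t
ys = [f3(a, c) for a, c in zip(As, Cs)]
print(ys)  # [7.0, 8.0, 9.0]
```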
Pyramidal RNN
• Reducing the number of time steps

W. Chan, N. Jaitly, Q. Le and O. Vinyals, "Listen, attend and spell: A neural network for large vocabulary conversational speech recognition," ICASSP, 2016
Naïve RNN
• Given function f: $h', y = f(h, x)$ (bias ignored here)

$$h' = \sigma(W^h h + W^i x)$$
$$y = \mathrm{softmax}(W^o h')$$
LSTM

c changes slowly: $c_t$ is $c_{t-1}$ with something added.
h changes faster: $h_t$ and $h_{t-1}$ can be very different.

Naive RNN: $h_{t-1}, x_t \rightarrow h_t, y_t$
LSTM: $c_{t-1}, h_{t-1}, x_t \rightarrow c_t, h_t, y_t$
From $x_t$ and $h_{t-1}$, four pre-activations are computed (writing $[x_t; h_{t-1}]$ for their concatenation):

$$z = \tanh(W [x_t; h_{t-1}]) \qquad z^i = \sigma(W^i [x_t; h_{t-1}]) \qquad z^f = \sigma(W^f [x_t; h_{t-1}]) \qquad z^o = \sigma(W^o [x_t; h_{t-1}])$$
"Peephole" variant: $c_{t-1}$ is appended to the input as well, $z = \tanh(W [x_t; h_{t-1}; c_{t-1}])$, where the weight block acting on $c_{t-1}$ is usually diagonal; $z^i$, $z^f$, $z^o$ are obtained in the same way.
The cell, hidden state, and output updates:

$$c_t = z^f \odot c_{t-1} + z^i \odot z$$
$$h_t = z^o \odot \tanh(c_t)$$
$$y_t = \sigma(W' h_t)$$
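The gate equations and the three updates above can be sketched with scalars; the unit weight values and the additive stand-in for the concatenation $[x_t; h_{t-1}]$ are assumptions for illustration only:

```python
import math

def sig(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(c_prev, h_prev, x, W):
    """One scalar LSTM step following the slide's equations."""
    u = x + h_prev                  # stand-in for W[x_t; h_{t-1}]
    z  = math.tanh(W['z']  * u)     # candidate values
    zi = sig(W['zi'] * u)           # input gate
    zf = sig(W['zf'] * u)           # forget gate
    zo = sig(W['zo'] * u)           # output gate
    c = zf * c_prev + zi * z        # c_t = z^f . c_{t-1} + z^i . z
    h = zo * math.tanh(c)           # h_t = z^o . tanh(c_t)
    y = sig(W['out'] * h)           # y_t = sigma(W' h_t)
    return c, h, y

W = {'z': 1.0, 'zi': 1.0, 'zf': 1.0, 'zo': 1.0, 'out': 1.0}
c, h, y = lstm_step(0.0, 0.0, 1.0, W)
```

Note how $c_t$ is $c_{t-1}$ plus something (scaled by the forget gate), while $h_t$ is recomputed from scratch each step, matching the slow/fast distinction above.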
LSTM

The same cell is unrolled over time: $(c_{t-1}, h_{t-1}, x_t)$ produce $(c_t, h_t, y_t)$, and $(c_t, h_t)$ feed the next step together with $x_{t+1}$, and so on.
GRU

Two gates are computed from $x_t$ and $h_{t-1}$: a reset gate $r$ and an update gate $z$. The candidate state $h'$ is computed from $x_t$ and $r \odot h_{t-1}$; the new state interpolates between the old state and the candidate:

$$h_t = z \odot h_{t-1} + (1 - z) \odot h'$$
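A scalar sketch of one GRU step; as in the LSTM sketch, the weight values and the additive stand-in for concatenation are illustrative assumptions:

```python
import math

def sig(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(h_prev, x, W):
    """One scalar GRU step following the slide's equations."""
    r = sig(W['r'] * (x + h_prev))                 # reset gate
    z = sig(W['z'] * (x + h_prev))                 # update gate
    h_cand = math.tanh(W['h'] * (x + r * h_prev))  # candidate h'
    # h_t = z . h_{t-1} + (1 - z) . h'
    return z * h_prev + (1.0 - z) * h_cand

W = {'r': 1.0, 'z': 1.0, 'h': 1.0}
h = gru_step(0.0, 1.0, W)
```

Unlike the LSTM, the GRU keeps a single state h and uses one gate z both to retain the old state and (via 1 - z) to admit the candidate.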
Example Task
• (Simplified) Speech Recognition: frame classification on TIMIT

Each utterance is a sequence of acoustic frames $x_1, x_2, x_3, x_4, \dots$, and the network outputs one label $y_1, y_2, y_3, y_4, \dots$ per frame, e.g.:
Utterance 1: TSI TSI TSI I I N N N …
Utterance 2: S S @ @ @ @ …
Target Delay
• Only for unidirectional RNN

True labels:   TSI TSI TSI I I N N N
Delay 3 steps: x x x TSI TSI TSI I I N N N

Each target is emitted 3 steps late ("x" marks outputs with no target), so a unidirectional RNN can see a few future frames before committing to a label.
LSTM > RNN > feedforward
Bi-direction > uni-direction
Forward direction
Reverse direction
Training LSTM is faster than RNN
LSTM: A Search Space Odyssey
• Standard LSTM works well
• Simplified LSTM: coupling the input and forget gates, removing the peephole
• The forget gate is critical for performance
• The output gate activation function is critical
An Empirical Exploration of Recurrent Network Architectures
• Importance: forget > input > output
• A large bias for the forget gate is helpful
• LSTM-f/i/o: removing the forget/input/output gate
• LSTM-b: large bias
Stack RNN

Armand Joulin, Tomas Mikolov, "Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets," arXiv preprint, 2015

At each step, f reads $x_t$ and the top of the stack, emits $y_t$, and produces a distribution over stack operations (e.g. Push 0.7, Pop 0.2, Nothing 0.1) together with the information to store; the new stack is the weighted combination of the three operations' results ($\times 0.7$, $\times 0.2$, $\times 0.1$).
Basic Structure: Convolutional / Pooling Layer

Simplify the neural network (based on prior knowledge of the task).
Convolutional Layer

Sparse connectivity: each neuron only connects to part of the output of the previous layer.

Receptive field: different neurons have different, but overlapping, receptive fields.
Convolutional Layer

Sparse connectivity: each neuron only connects to part of the output of the previous layer.

Parameter sharing: the neurons with different receptive fields can use the same set of parameters. This gives fewer parameters than a fully connected layer.
Convolutional Layer

Neurons that share parameters form a filter: e.g., consider neurons 1 and 3 as "filter 1" (kernel 1) and neurons 2 and 4 as "filter 2" (kernel 2). The filter (kernel) size is the size of the receptive field of a neuron; here the stride is 2. Kernel size, number of filters, and stride are all designed by the developers.
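Sparse connectivity, parameter sharing, kernel size, and stride can all be seen in a few lines of plain Python (no padding and no bias, which is an assumption to keep the sketch short):

```python
def conv1d(xs, kernel, stride=1):
    """1D convolution: every output neuron sees only len(kernel)
    inputs (sparse connectivity), and all outputs reuse the same
    kernel weights (parameter sharing)."""
    k = len(kernel)
    return [sum(w * x for w, x in zip(kernel, xs[i:i + k]))
            for i in range(0, len(xs) - k + 1, stride)]

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(conv1d(xs, [1.0, -1.0], stride=1))  # [-1.0, -1.0, -1.0, -1.0]
print(conv1d(xs, [1.0, 1.0],  stride=2))  # [3.0, 7.0]
```

Increasing the stride skips positions, so the output gets shorter, which is also how the receptive fields of the slide's neurons 1 and 3 end up adjacent but non-overlapping groups of inputs apart.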
Example – 1D Signal + Single Channel

Input: a 1D signal $x_1, x_2, x_3, x_4, x_5, \dots$ (audio signal, stock values, …), e.g. for classification or predicting the future.
Example – 1D Signal + Multiple Channels

A document: each word is a vector, e.g. "I like this movie very much …" gives $x_1, x_2, \dots, x_7$. Does this kind of receptive field make sense?
Example – 2D Signal + Single Channel

A 6 × 6 black & white image:

1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Pixels are indexed 1–36 row by row; a neuron connects to pixels 1, 2, 3, 7, 8, 9, 13, 14, 15 (the next one, with stride 1, to 2, 3, 4, 8, 9, 10, 14, 15, 16, and so on). Only 1 filter is shown here; the size of the receptive field is 3×3, the stride is 1.
Example – 2D Signal + Multiple Channels

A 6 × 6 colorful image: three 6 × 6 channels. Each neuron now connects to a 3×3 patch in every channel, so the size of the receptive field is 3×3×3 (stride 1); only 1 filter is shown here.
Zero Padding

Padding the border of the input with zeros keeps the output the same size; without zero padding the output shrinks.
Pooling Layer

Layer $l-1$ has $N$ nodes and layer $l$ has $N/k$ nodes: every $k$ outputs in layer $l-1$ are grouped together, and each output in layer $l$ "summarizes" its $k$ inputs.

Average pooling: $a_1^l = \dfrac{1}{k} \sum_{j=1}^{k} a_j^{l-1}$

Max pooling: $a_1^l = \max\left(a_1^{l-1}, a_2^{l-1}, \dots, a_k^{l-1}\right)$

L2 pooling: $a_1^l = \sqrt{\dfrac{1}{k} \sum_{j=1}^{k} \left(a_j^{l-1}\right)^2}$
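The three pooling formulas can be sketched directly; grouping every $k$ consecutive outputs is an assumption matching the slide's simplest grouping:

```python
import math

def pool(outputs, k, mode="max"):
    """Group every k consecutive layer-(l-1) outputs and summarize
    each group with average, max, or L2 pooling."""
    groups = [outputs[i:i + k] for i in range(0, len(outputs), k)]
    if mode == "average":
        return [sum(g) / k for g in groups]
    if mode == "max":
        return [max(g) for g in groups]
    if mode == "l2":
        return [math.sqrt(sum(v * v for v in g) / k) for g in groups]
    raise ValueError(mode)

a = [1.0, 3.0, 2.0, 6.0]
print(pool(a, 2, "max"))      # [3.0, 6.0]
print(pool(a, 2, "average"))  # [2.0, 4.0]
```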
Pooling Layer: Which outputs should be grouped together?

Group the neurons corresponding to the same filter with nearby receptive fields (subsampling, as after a convolutional layer).
Pooling Layer: Which outputs should be grouped together?

Group the neurons with the same receptive field (this is the Maxout Network). How can you know whether the neurons detect the same pattern?
Combination of Different Basic Layers

Tara N. Sainath, Ron J. Weiss, Andrew Senior, Kevin W. Wilson, Oriol Vinyals, "Learning the Speech Front-end With Raw Waveform CLDNNs," INTERSPEECH 2015

The CLDNN combines the basic layers above: convolutional layers on the raw waveform, followed by LSTM layers (the slide notes "3 layers"), followed by fully connected layers.
Next Time
• 3/10: TAs will teach TensorFlow
• TensorFlow for regression
• TensorFlow for word vector
• word vector: https://www.youtube.com/watch?v=X7PH3NuYW0Q
• TensorFlow for CNN
• If you want to learn Theano
• http://speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015_2/Lecture/Theano%20DNN.ecm.mp4/index.html
• http://speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015_2/Lecture/Theano%20RNN.ecm.mp4/index.html