applied deep learning 11/03 convolutional neural networks
TRANSCRIPT
SlidecreditfromMarkChang 1
ConvolutionalNeuralNetworks• Weneedacoursetotalkaboutthistopic◦ http://cs231n.stanford.edu/syllabus.html
• However, we only have a lecture
2
Outline• CNN(ConvolutionalNeuralNetworks)Introduction
• EvolutionofCNN
• VisualizingtheFeatures
• CNNasArtist
• SentimentAnalysisbyCNN
3
Outline• CNN(ConvolutionalNeuralNetworks)Introduction
• EvolutionofCNN
• VisualizingtheFeatures
• CNNasArtist
• SentimentAnalysisbyCNN
4
ImageRecognition
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
5
ImageRecognition
6
LocalConnectivity
Neuronsconnecttoasmallregion
7
Parameter Sharing• Thesamefeatureindifferentpositions
Neuronssharethesameweights
8
Parameter Sharing• Differentfeaturesinthesameposition
Neuronshavedifferentweights
9
Convolutional Layers
depth
widthwidthdepth
weights weights
height
sharedweight
10
Convolutional Layers
c1
c2
b1
b2
a1
a2
a3
wb1
wb2
b1 =wb1a1+wb2a2
wb1
wb2
b2 =wb1a2+wb2a3
wc1
wc2
c1 =wc1a1+wc2a2
wc2
wc1
c2 =wc1a2+wc2a3
depth=2depth=1
11
Convolutional Layers
c1
b1
b2
a1
a2
d1
b3
a3
c2
d2
depth=2 depth=2
wc1
wc2
wc3
wc4
c1 = a1wc1 + b1wc2
+ a2wc3 + b2wc4
wc1
wc2
wc3
wc4
c2 = a2wc1 + b2wc2
+ a3wc3 + b3wc4
12
Convolutional Layers
c1
b1
b2
a1
a2
d1
b3
a3
c2
d2
c1 = a1wc1 + b1wc2
+ a2wc3 + b2wc4
c2 = a2wc1 + b2wc2
+ a3wc3 + b3wc4
wd1
wd2
wd3
wd4 d1 = a1wd1 + b1wd2
+ a2wd3 + b2wd4
wd1
wd2
wd3
wd4 d2 = a2wd1 + b2wd2
+ a3wd3 + b3wd4
depth=2 depth=2
13
Convolutional LayersA B C
A B C D
14
Hyper-parametersofCNN• Stride • Padding
0 0
Stride=1
Stride=2
Padding=0
Padding=1
15
Example
OutputVolume(3x3x2)
InputVolume(7x7x3)
Stride=2
Padding=1
http://cs231n.github.io/convolutional-networks/
Filter(3x3x3)
16
Convolutional Layers
http://cs231n.github.io/convolutional-networks/
17
Convolutional Layers
http://cs231n.github.io/convolutional-networks/
18
Convolutional Layers
http://cs231n.github.io/convolutional-networks/
19
RelationshipwithConvolution
y[n] =X
k
x[k]w[n� k]
x[n]
w[n]
n
n
y[n]
x[k]
k
k
w[0� k]
n
y[0] = x[�2]w[2] + x[�1]w[1] + x[0]w[0]
20
RelationshipwithConvolution
y[n] =X
k
x[k]w[n� k]
x[n]
w[n]
n
n
y[n]
x[k]
k
k
n
w[1� k]
y[1] = x[�1]w[2] + x[0]w[1] + x[2]w[0]
21
RelationshipwithConvolution
y[n] =X
k
x[k]w[n� k]
x[n]
w[n]
n
n
y[n]
x[k]
k
k
n
y[2] = x[0]w[2] + x[1]w[1] + x[2]w[0]
w[2� k]
22
RelationshipwithConvolution
y[n] =X
k
x[k]w[n� k]
x[n]
w[n]
n
n
y[n]
x[k]
k
k
n
w[4� k]
y[4] = x[2]w[2] + x[3]w[1] + x[4]w[0]
23
Nonlinearity• RectifiedLinear(ReLU)
nout
=
⇢nin
if nin
> 0
0 otherwise
nin n
2
664
14�31
3
775
2
664
1401
3
775ReLU
24
WhyReLU?• Easytotrain
• Avoidgradientvanishingproblem
Sigmoidsaturated
gradient≈0 ReLU notsaturated
25
WhyReLU?• Biologicalreason
strong stimulationReLU
weak stimulation
neuron t
v
strong stimulation
neuron t
v
weak stimulation
26
PoolingLayer
1 3 2 4
5 7 6 8
0 0 3 3
5 5 0 0
4 5
5 3
7 8
5 3
MaximumPooling
AveragePooling
Max(1,3,5,7)=7 Avg(1,3,5,7)=4
nooverlap
noweights
depth=1
Max(0,0,5,5)=5
27
Why“Deep” Learning?
28
VisualPerceptionofHuman
http://www.nature.com/neuro/journal/v8/n8/images/nn0805-975-F1.jpg
29
VisualPerceptionofComputer
ConvolutionalLayer
ConvolutionalLayer Pooling
Layer
PoolingLayer
ReceptiveFieldsReceptiveFields
InputLayer
30
VisualPerceptionofComputer
Input Layer
ConvolutionalLayerwith
ReceptiveFields:
Max-poolingLayerwith
Width=3,Height=3
FilterResponses
FilterResponses
Input Image31
Fully-Connected Layer• Fully-ConnectedLayers:Globalfeatureextraction
• Softmax Layer:Classifier
ConvolutionalLayer Convolutional
LayerPoolingLayer
PoolingLayer
InputLayer
InputImage
Fully-ConnectedLayer Softmax
Layer
5
7
ClassLabel
32
VisualPerceptionofComputer• Alexnet
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png
33
Training• ForwardPropagation
n2 n1
n1(out)n2(out) n2(in) w21
34
n2(in) = w21n1(out)
n2(out) = g(n2(in)), g is activation function
Training• Updateweights
n2 n1JCost
function:
@J
@w21=
@J
@n2(out)
@n2(out)
@n2(in)
@n2(in)
@w21
w21 w21 � ⌘@J
@w21
) w21 w21 � ⌘@J
@n2(out)
@n2(out)
@n2(in)
@n2(in)
@w21
n2(out) n2(in) w21 n1(out)
35
@J
@w21=
@J
@n2(out)
@n2(out)
@n2(in)
@n2(in)
@w21
w21 w21 � ⌘@J
@w21
) w21 w21 � ⌘@J
@n2(out)
@n2(out)
@n2(in)
@n2(in)
@w21
Training• Updateweights
n2 n1JCost
function:
n2(out) n2(in) w21 n1(out)
w21 w21 � ⌘@J
@n2(out)
@n2(out)
@n2(in)
@n2(in)
@w21
) w21 w21 � ⌘@J
@n2(out)g0(n2(in))n1(out)
n2(out) = g(n2(in)), n2(in) = w21n1(out)
)@n2(out)
@n2(in)= g0(n2(in)),
@n2(in)
@w21= n1(out)
36
Training• Propagatetothepreviouslayer
n2 n1JCost
function:
@J
@n1(in)=
@J
@n2(out)
@n2(out)
@n2(in)
@n2(in)
@n1(out)
@n1(out)
@n1(in)
n2(out) n2(in) n1(out) n1(in)
37
TrainingConvolutional Layers• example:
a3
a2
a1
b2
b1wb1
wb1
wb2
wb2
output input
ConvolutionalLayer
To simplify the notations, in the following slides, we make:
b1 means b1(in), a1 means a1(out), and so on.
38
TrainingConvolutional Layers• Forwardpropagation
a3
a2
a1
b2
b1input
ConvolutionalLayer
b1 = wb1a1 + wb2a2
b2 = wb1a2 + wb2a3
wb1
wb1
wb2
wb2
39
TrainingConvolutional Layers• Updateweights
JCost
function:
a3
a2
a1
b2
b1
@J
@b1
@J
@b2
wb1
wb1
@b1@wb1
@b2@wb1
wb1 wb1 � ⌘(@J
@b1
@b1@wb1
+@J
@b2
@b2@wb1
)
40
TrainingConvolutional Layers• Updateweights
a3
a2
a1
b2
b1 wb1
wb1
b1 = wb1a1 + wb2a2
b2 = wb1a2 + wb2a3
@b1@wb1
= a1
@b2@wb1
= a2
wb1 wb1 � ⌘(@J
@b1a1 +
@J
@b2a2)
@J
@b1
@J
@b2
JCost
function:
41
TrainingConvolutional Layers• Updateweights
a3
a2
a1
b2
b1
wb2 wb2 � ⌘(@J
@b1
@b1@wb2
+@J
@b2
@b2@wb2
)
@b1@wb2
@b2@wb2
wb2
wb2
@J
@b1
@J
@b2
JCost
function:
42
TrainingConvolutional Layers• Updateweights
a3
a2
a1
b2
b1wb2
wb2
@b1@wb2
= a2
@b2@wb2
= a3
wb2 wb2 � ⌘(@J
@b1a2 +
@J
@b2a3)
b1 = wb1a1 + wb2a2
b2 = wb1a2 + wb2a3
@J
@b1
@J
@b2
JCost
function:
43
TrainingConvolutional Layers• Propagatetothepreviouslayer
JCost
function:
a3
a2
a1
b2
b1
@J
@b1
@J
@b2
@b1@a1
@b1@a2
@b2@a2
@b2@a3
@J
@b1
@b1@a1
@J
@b2
@b2@a3
@J
@b1
@b1@a2
+@J
@b2
@b2@a2
44
TrainingConvolutional Layers• Propagatetothepreviouslayer
JCost
function:
a3
a2
a1
b2
b1
@J
@b1
@J
@b2
b1 = wb1a1 + wb2a2
b2 = wb1a2 + wb2a3
@b1@a1
= wb1@b1@a2
= wb2
@b2@a2
= wb1
@b2@a3
= wb2
@J
@b1wb1
@J
@b1wb1 +
@J
@b2wb2
@J
@b2wb2
45
Max-Pooling LayersduringTraining• Poolinglayershavenoweights
• Noneedtoupdateweights
a3
a2
a1
b2
b1 a1 > a2
a2 > a3b2 = max(a2, a3)
b1 = max(a1, a2)
Max-pooling
46
=
⇢a2 if a2 � a3a3 otherwise @b2
@a2=
⇢1 if a2 � a30 otherwise
Max-Pooling LayersduringTraining• Propagatetothepreviouslayer
a3
a2
a1
b2
b1
@J
@b1
@J
@b2
@b1@a1
= 1
a2 > a3
@b2@a2
= 1
a1 > a2
@J
@b1@J
@b2
@b1@a2
= 0
@b2@a3
= 0
J
Costfunction:
47
Max-Pooling LayersduringTraining• ifa1 =a2 ??◦ Choosethenodewithsmallerindex
a3
a2
a1
b2
b1
@J
@b1
@J
@b2
@J
@b1@J
@b2J
Costfunction:
a1 = a2 = a3
48
Avg-Pooling LayersduringTraining• Poolinglayershavenoweights
• Noneedtoupdateweights
a3
a2
a1
b2
b1b1 =
1
2(a1 + a2)
b2 =1
2(a2 + a3)
@b2@a2
=1
2
@b2@a3
=1
2
Avg-pooling
49
Avg-Pooling LayersduringTraining• Propagatetothepreviouslayer
JCost
function:
a3
a2
a1
b2
b1
@J
@b1
@J
@b2
@b1@a1
=@b1@a2
=1
2
@b2@a2
=@b2@a3
=1
2
1
2
@J
@b1
1
2(@J
@b1+
@J
@b2)
1
2
@J
@b2
50
ReLUduringTraining
nout
=
⇢nin
if nin
> 0
0 otherwise
nin n
@nout
@nin
=
⇢1 if n
in
> 1
0 otherwise
51
Training CNN
52
Outline• CNN(ConvolutionalNeuralNetworks)Introduction
• EvolutionofCNN
• VisualizingtheFeatures
• CNNasArtist
• SentimentAnalysisbyCNN
53
LeNet◦ Paper:http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf
YannLeCun http://yann.lecun.com/exdb/lenet/
54
ImageNetChallenge• ImageNetLargeScaleVisualRecognitionChallenge◦ http://image-net.org/challenges/LSVRC/
• Dataset:◦ 1000categories◦ Training:1,200,000◦ Validation:50,000◦ Testing:100,000
http://vision.stanford.edu/Datasets/collage_s.png
55
ImageNetChallenge
http://www.qingpingshan.com/uploads/allimg/160818/1J22QI5-0.png
56
AlexNet (2012)• Paper:
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
• TheresurgenceofDeepLearning
GeoffreyHintonAlexKrizhevsky57
VGGNet (2014)• Paper:https://arxiv.org/abs/1409.1556
D:VGG16E:VGG19Allfiltersare3x3
58
VGGNet• Morelayers&smallerfilters(3x3)isbetter
• Morenon-linearity,fewerparameters
One5x5filter• Parameters:5x5=25
• Non-linear:1
Two3x3filters• Parameters:3x3x2=18
• Non-linear:2
59
VGG19
depth=643x3convconv1_1conv1_2
maxpool
depth=1283x3convconv2_1conv2_2
maxpooldepth=2563x3convconv3_1conv3_2conv3_3conv3_4
depth=5123x3convconv4_1conv4_2conv4_3conv4_4
depth=5123x3convconv5_1conv5_2conv5_3conv5_4
maxpool maxpool maxpool
size=4096FC1FC2
size=1000softmax
60
GoogLeNet (2014)• Paper:
http://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf
22layersdeepnetwork
InceptionModule
61
Inception Module• Bestsize?◦ 3x3?5x5?
• Usethemall,andcombine
62
Inception Module
1x1convolution
3x3convolution
5x5convolution
3x3max-pooling
Previouslayer
FilterConcatenate
63
Inception ModulewithDimensionReduction
• Use1x1filterstoreducedimension
64
Inception ModulewithDimensionReduction
Previouslayer
1x1convolution(1x1x256x128)
Reduceddimension
Inputsize1x1x256
128256
Outputsize1x1x128
65
ResNet (2015)• Paper:https://arxiv.org/abs/1512.03385
• ResidualNetworks
• 152layers
66
ResNet• Residuallearning:abuildingblock
Residualfunction
67
Residual LearningwithDimensionReduction
• using1x1filters
68
PretrainedModelDownload• http://www.vlfeat.org/matconvnet/pretrained/◦ Alexnet:◦ http://www.vlfeat.org/matconvnet/models/imagenet-matconvnet-alex.mat
◦ VGG19:◦ http://www.vlfeat.org/matconvnet/models/imagenet-vgg-verydeep-19.mat
◦ GoogLeNet:◦ http://www.vlfeat.org/matconvnet/models/imagenet-googlenet-dag.mat
◦ ResNet◦ http://www.vlfeat.org/matconvnet/models/imagenet-resnet-152-dag.mat
69
UsingPretrainedModel• Lowerlayers:edge,blob,texture(moregeneral)
• Higherlayers:objectpart(morespecific)
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png
70
Transfer Learning• ThePretrained Modelis
trainedonImageNetdataset
• IfyourdataissimilartotheImageNet data
◦ FixallCNNLayers◦ TrainFClayer
Conv layer
FClayer FClayer
LabeleddataLabeleddataLabeleddataLabeleddataLabeleddataImageNetdata
Yourdata
…
Conv layer
Conv layer…
Conv layer
Yourdata
… …
71
Transfer Learning• ThePretrainedModelis
trainedonImageNetdataset
• IfyourdataisfardifferentfromtheImageNet data
◦ FixlowerCNNLayers◦ TrainhigherCNNandFClayers
Conv layer
FClayer FClayer
LabeleddataLabeleddataLabeleddataLabeleddataLabeleddataImageNetdata
Yourdata
…
Conv layer
Conv layer…
Conv layer
Yourdata
… …
72
Tensorflow Transfer LearningExample• https://www.tensorflow.org/versions/r0.11/how_tos/styl
e_guide.html
daisy634
photos
dandelion899
photos
roses642
photos
tulips800
photos
sunflowers700
photoshttp://download.tensorflow.org/example_images/flower_photos.tgz
Tensorflow Transfer LearningExampleFixtheselayers Trainthislayer
Outline• CNN(ConvolutionalNeuralNetworks)Introduction
• EvolutionofCNN
• VisualizingtheFeatures
• CNNasArtist
• SentimentAnalysisbyCNN
75
Visualizing CNN
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png
76
Visualizing CNN
CNN
CNN
flower
randomnoise
filterresponse
filterresponse
77
filterresponse:
GradientAscent• Magnifythefilterresponse
randomnoise:
x f
score: F =X
i,j
fi,j
fi,j
lowerscore
higherscore
x
F
gradient:@F
@x
78
filterresponse:
GradientAscent• Magnifythefilterresponse
randomnoise:
x f
x
gradient:
fi,j
lowerscore
higherscore
F
@F
@x
update x
learningrate
x x+ ⌘@F
@x
79
GradientAscent
80
Different LayersofVisualization
CNN
81
Multiscale ImageGeneration
visualize resize visualize
resize
visualize
82
Multiscale ImageGeneration
83
DeepDream• https://research.googleblog.com/2015/06/inceptionism-
going-deeper-into-neural.html
• Sourcecode:https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/deepdream/deepdream.ipynb
http://download.tensorflow.org/example_images/flower_photos.tgz
84
DeepDream
85
DeepDream
86
Outline• CNN(ConvolutionalNeuralNetworks)Introduction
• EvolutionofCNN
• VisualizingtheFeatures
• CNNasArtist
• SentimentAnalysisbyCNN
87
NeuralArt
• Paper:https://arxiv.org/abs/1508.06576
• Sourcecode:https://github.com/ckmarkoh/neuralart_tensorflow
content
style
artwork
http://www.taipei-101.com.tw/upload/news/201502/2015021711505431705145.JPG
https://github.com/andersbll/neural_artistic_style/blob/master/images/starry_night.jpg?raw=true
88
TheMechanismofPainting
BrainArtist
Scene Style ArtWork
Computer NeuralNetworks
89
Misconception
90
ContentGenerationBrainArtistContent
CanvasMinimize
thedifference
NeuralStimulation
Draw
91
ContentGeneration
92
FilterResponsesVGG19
Updatethecolorofthepixels
Content
Canvas
Result
Width*HeightDepth
Minimizethe
difference
ContentGeneration
Layerl’sFilterlResponses:
Layerl’sFilterResponses:
InputPhoto: Input
Canvas:
Width*Height (j)
Depth(i)
Width*Height (j)
Depth(i)
93
ContentGeneration• BackwardPropagation
Layerl’sFilterlResponses:
InputCanvas:
VGG19
UpdateCanvas
LearningRate
94
ContentGeneration
95
ContentGeneration
VGG19
conv1_2 conv2_2 conv3_4 conv4_4 conv5_2conv5_1
96
StyleGeneration
VGG19Artwork
G
G
FilterResponses GramMatrix
Width*Height
Depth
Depth
Depth
Position-dependent
Position-independent
97
StyleGeneration
1. .5
.5
.5
1.
1. .5 .25 1.
.5 .25 .5
.25 .25
1. .5 1.
Width*Height
Depth
k1 k2
k1
k2
Depth
Depth
Layerl’sFilterResponsesGramMatrix
G
98
StyleGeneration
Layerl’sFilterResponses
Layerl’sGramMatrix
Layerl’sGramMatrix
InputArtwork:
InputCanvas:
99
StyleGeneration
VGG19Filter
ResponsesGramMatrix
Minimizethe
difference
G
G
Style
Canvas
UpdatethecolorofthepixelsResult
100
StyleGeneration
101
StyleGenerationVGG19
Conv1_1 Conv1_1Conv2_1
Conv1_1Conv2_1Conv3_1
Conv1_1Conv2_1Conv3_1Conv4_1
Conv1_1Conv2_1Conv3_1Conv4_1Conv5_1
102
ArtworkGenerationFilterResponsesVGG19
GramMatrix
103
ArtworkGeneration
VGG19 VGG19
Conv1_1Conv2_1Conv3_1Conv4_1Conv5_1
Conv4_2
104
ArtworkGeneration
105
Contentv.s.Style
0.15 0.05
0.02 0.007
106
NeuralDoodle• Paper:https://arxiv.org/abs/1603.01768
• Sourcecode:https://github.com/alexjc/neural-doodle
style
content resultsemanticmaps
107
NeuralDoodle• Imageanalogy
108
NeuralDoodle• Imageanalogy
恐怖連結,慎入!https://raw.githubusercontent.com/awentzonline/image-analogies/master/examples/images/trump-image-analogy.jpg
109
Real-timeTextureSynthesis• Paper:https://arxiv.org/pdf/1604.04382v1.pdf◦ GAN:https://arxiv.org/pdf/1406.2661v1.pdf◦ VAE:https://arxiv.org/pdf/1312.6114v10.pdf
• SourceCode:https://github.com/chuanli11/MGANs
110
Outline• CNN(ConvolutionalNeuralNetworks)Introduction
• EvolutionofCNN
• VisualizingtheFeatures
• CNNasArtist
• SentimentAnalysisbyCNN
111
AConvolutional NeuralNetworkforModellingSentences
• Paper:https://arxiv.org/abs/1404.2188
• Sourcecode:https://github.com/FredericGodin/DynamicCNN
112
Drawbacks ofRecursiveNeuralNetworks(RvNN)
• Needhuman-labeledsyntaxtreeduringtraining
This is a dog
TrainRvNN
Wordvector This is a dog
RvNN
RvNN
RvNN
113
Drawbacks ofRecursiveNeuralNetworks(RvNN)
• Ambiguityinnaturallanguage
http://3rd.mafengwo.cn/travels/info_weibo.php?id=2861280
114
http://www.appledaily.com.tw/realtimenews/article/new/20151006/705309/
Element-wise 1Doperationsonwordvectors
• 1DConvolutionor1DPooling
This is a
operationoperation
This is a
Representedby
115
FromRvNN toCNN• RvNN • CNN
This is a dog
conv3
conv2
conv1conv1conv1
conv2
SameRvNN
Differentconv layers
This is a dog
RvNN
RvNN
RvNN
116
CNNwithMax-Pooling Layers• Similartosyntaxtree
• Buthuman-labeledsyntaxtreeisnotneeded
This is a dog
conv2
pool1
conv1conv1conv1
pool1
This is a dog
conv2
pool1
conv1conv1
pool1
MaxPooling
117
SentimentAnalysisbyCNN• Use softmax layer to classify the sentiments
positive
This movie is awesome
conv2
pool1
conv1conv1conv1
pool1
softmax
negative
This movie is awful
conv2
pool1
conv1conv1conv1
pool1
softmax
118
SentimentAnalysisbyCNN• Buildthe “correct syntax tree” by training
negative
This movie is awesome
conv2
pool1
conv1conv1conv1
pool1
softmax
negative
This movie is awesome
conv2
pool1
conv1conv1conv1
pool1
softmax
Backwardpropagation
error
119
SentimentAnalysisbyCNN• Buildthe “correct syntax tree” by training
negative
This movie is awesome
conv2
pool1
conv1conv1conv1
pool1
softmax
positive
This movie is awesome
conv2
pool1
conv1conv1conv1
pool1
softmax
Updatetheweights
120
MultipleFilters• RicherfeaturesthanRNN
This is
filter11 Filter13filter12
a
filter11 Filter13filter12
filter21 Filter23filter22
121
Sentencecan’t be easily resized• Imagecanbeeasilyresized • Sentence can’t be easily
resized
全台灣最高樓在台北
resize
resize
全台灣最高的高樓在台北市
全台灣最高樓在台北市
台灣最高樓在台北
122
Various InputSize• Convolutionallayersandpoolinglayers◦ canhandleinputwithvarioussize
This is a dog
pool1
conv1conv1conv1
pool1
the dog run
pool1
conv1conv1
123
Various InputSize• Fully-connectedlayerandsoftmax layer◦ needfixed-sizeinput
The dog run
fc
softmax
This is a
fc
softmax
dog
124
k-maxPooling• choosethek-maxvalues
• preservetheorderofinputvalues
• variable-sizeinput,fixed-sizeoutput
3-maxpooling
13 4 1 7 812 5 21 15 7 4 9
3-maxpooling
12 21 15 13 7 8
125
WideConvolution• Ensuresthatallweightsreachtheentiresentence
conv conv conv conv conv convconv conv
Wide convolution Narrow convolution
126
Dynamick-maxPooling
wideconvolution&k-maxpooling
wideconvolution&k-maxpooling
kl
ktop
L
s
ktop
and L are constants
l : index of current layer
kl
: k of current layer
ktop
: k of top layer
L : total number of layers
s : length of input sentence
k
l
= max(ktop
, dL� l
L
se)
127
Dynamick-maxPooling
s = 10
L = 2
k1 = max(3, d2� 1
2⇥ 10e) = 5
ktop
= 3
k
l
= max(ktop
, dL� l
L
se)
conv &pooling
conv &pooling
128
Dynamick-maxPooling
conv &pooling
conv &pooling
L = 2
ktop
= 3
k
l
= max(ktop
, dL� l
L
se)
k1 = max(3, d2� 1
2⇥ 14e) = 7
s = 14
129
Dynamick-maxPooling
conv &pooling
conv &pooling
L = 2
ktop
= 3
k
l
= max(ktop
, dL� l
L
se)
s = 8
k1 = max(3, d2� 1
2⇥ 8e) = 4
130
Dynamick-maxPooling
Wideconvolution&Dynamick-maxpooling
131
Convolutional NeuralNetworksforSentenceClassification
• Paper:http://www.aclweb.org/anthology/D14-1181
• Sourcee code:https://github.com/yoonkim/CNN_sentence
132
Static&Non-StaticChannel• Pretrained byword2vec
• Static:fixthevaluesduringtraining
• Non-Static:updatethevaluesduringtraining
133
AbouttheLecturer
MarkChang
• Email:ckmarkoh at gmail dot com• Blog: http://cpmarkchang.logdown.com• Github:https://github.com/ckmarkoh• Slideshare:http://www.slideshare.net/ckmarkohchang• Youtube:https://www.youtube.com/channel/UCckNPGDL21aznRhl3EijRQw
134
HTCResearch&HealthcareDeepLearningAlgorithmsResearchEngineer