CLASSIFICATION OF HUMAN FACIAL EXPRESSION USING ARTIFICIAL INTELLIGENCE TECHNIQUES

Thesis by Miss Sasicha Boonkao, Graduate School, Kasetsart University, Academic Year 2019



Sasicha Boonkao: Classification of Human Facial Expression using Artificial Intelligence Techniques. Master of Engineering (Mechanical Engineering), Major Field: Mechanical Engineering, Department of Mechanical Engineering. Thesis Advisor: Associate Professor Withit Chartlatanagulchai, Ph.D. Academic Year 2019
Machines are now used for an ever-wider range of work in society, including very complex tasks. For a machine to perceive its surroundings, it must understand both its environment and the people it interacts with. This thesis studies machine learning for recognizing the emotions expressed on human faces. The approach combines emotion classification with deep learning, using pretrained artificial neural networks adapted by transfer learning to identify the seven major human emotions: anger, disgust, fear, happiness, sadness, surprise, and neutral. The resulting tool set and transferred deep-learning models can be applied, for example, to recognizing the facial expressions that elderly people show when they need help. The experimental results show that the method detects human emotions effectively and with good precision, and that it could be developed into a commercial product.
_________________ Student's signature    _________________ Thesis Advisor's signature    ____ / ____ / ____
 


CONTENTS

5. Backpropagation
6. Convolutional neural networks: LeNet-5 (input layer, subsampling, output layer)
7. CNN layers
   7.1 Input layers: image, image 3D, sequence, region of interest
   7.2 Convolution and fully connected layers: convolution, inception module, fire module
   7.3 Activation layers: sigmoid function, tanh function
   7.4 Normalization, dropout, cropping layers: batch normalization, cross channel normalization, dropout, cropping
   7.5 Pooling layers: average pooling, max pooling, max unpooling, global average pooling, global max pooling
   7.6 Output layers: softmax, classification, regression
Appendix: MATLAB 2018a scripts to convert the dataset image files, to transfer-learn the pretrained CNNs, and to run the pretrained CNNs

LIST OF TABLES

Table 1: OKER (088) BLACK webcam specifications
Table 2: computer specifications
Table 3: pretrained CNN layer comparison
Table 4: comparison of the four pretrained CNNs

LIST OF FIGURES

Figures 1-27: machine-learning and neural-network fundamentals (neural networks, backpropagation, CNN layers, inception and fire modules, activation functions, pooling, losses, MATLAB)
Figures 28-39: the Kaggle dataset; the GoogLeNet, VGG16, ResNet50, and SqueezeNet architectures; transfer learning; the OKER (088) BLACK webcam
Figures 40-55: dataset samples, label counts, and layer graphs of the four transfer-learned networks
Figures 56-87: training progress and webcam classification results of the pretrained GoogLeNet, VGG16, ResNet50, and SqueezeNet
1. Machine Learning

Deep learning is a subfield of machine learning (source: https://www.thaiprogrammer.org/2018/12/whatisai/). Machine learning is commonly divided into three types:

1.2.1 Supervised learning: the model is given training data in which each input is paired with a label (the desired output). From these input-output pairs it learns the relation between input data and output data, so that it can predict the output for new input data. Typical supervised-learning tasks are classification and regression.

1.2.2 Unsupervised learning: the data carry no labels, so their structure is unknown. The model receives only input data, with no output data, and must discover the structure on its own. Typical unsupervised-learning tasks are dimension reduction and clustering.

1.2.3 Reinforcement learning: the model chooses an action and receives a reward for it; over time it learns to prefer the actions that lead to the highest reward.

For machine learning, the data are split into: 1) a training dataset; 2) a validation dataset; and 3) a test dataset.

2. Neural Networks

Artificial neural networks are modeled on biological neural networks (Martin Hagan, Howard Demuth, and Mark Beale, 1996). Figure 2 shows two neurons connected through a synapse. A biological neuron consists of:
1) dendrites, which carry incoming signals to the cell body;
2) the cell body, which combines the signals received through the dendrites;
3) the axon, which carries the signal from the cell body to other neurons.

Figure 2: two biological neurons connected through a synapse (source: https://codeinsane.wordpress.com/2018/09/29/neural-network/)

An artificial neuron (Figure 3) has five components:
1) inputs p1, p2, …, pn, forming the vector p;
2) weights w1, w2, …, wn;
3) a summation function combining the inputs (pi), the weights (wi), and a bias;
4) a transfer function (activation function) f;
5) an output y (a sketch of one neuron is given below).

Figure 3: model of an artificial neuron (source: http://www.mut.ac.th/research-detail-92)

A neural network is organized into:
1) an input layer, the layer that receives the input data;
2) hidden layers; a network with many hidden layers is a deep neural network (DEEP NN);
3) an output layer, the layer that produces the result from the last hidden layer.
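To make the five components of the artificial neuron concrete, a minimal MATLAB sketch (all values are illustrative, not from the thesis):

% One artificial neuron: weighted sum of the inputs plus a bias,
% passed through a transfer (activation) function f.
p = [0.5; -1.2; 2.0];      % inputs p1, p2, p3
w = [0.4; 0.1; -0.3];      % weights w1, w2, w3
b = 0.2;                   % bias
n = w.'*p + b;             % summation function
y = 1/(1 + exp(-n));       % sigmoid transfer function gives the output y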

3. Feed-Forward Networks

Figure 5: a feed-forward network with two layers of weights (source: Bishop, Pattern Recognition and Machine Learning, Springer 2006)
Following [3], Figure 5 shows a network with two layers of weights. The input layer has D inputs x_i, i = 1, …, D, plus an additional input fixed at x_0 = 1, whose weight plays the role of a bias. The first layer forms M linear combinations (activations)

a_j = \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)} \qquad (1)

where the superscript (1) marks the first layer and w_{j0}^{(1)} is the bias of hidden unit j. Each activation a_j passes through a differentiable activation function h(.):

z_j = h(a_j) \qquad (2)

with z_0 = 1 for the bias. Typical choices for h(.) are sigmoidal functions such as the logistic sigmoid and tanh functions. The output layer then forms K linear combinations of the hidden outputs:

a_k = \sum_{j=0}^{M} w_{kj}^{(2)} z_j \qquad (3)

For multiclass problems the output activation function is the softmax function, y_k = \sigma(a_k). Putting the stages together, the whole network computes

y_k(x, w) = \sigma\!\left( \sum_{j=0}^{M} w_{kj}^{(2)} \, h\!\left( \sum_{i=0}^{D} w_{ji}^{(1)} x_i \right) \right) \qquad (4)

A network of this form is a universal approximator.

4. Error Functions

For regression, training minimizes the sum-of-squares error

E(w) = \frac{1}{2} \sum_{n=1}^{N} \left\| y(x_n, w) - t_n \right\|^2 \qquad (5)

where y(x_n, w) is the network output (4), {x_n} are the input data, n = 1, …, N, and {t_n} are the targets. For binary classification, with t = 1 denoting class C1 and t = 0 denoting class C2, the single output unit uses the logistic sigmoid, so that

p(t \mid x, w) = y(x, w)^{t} \{ 1 - y(x, w) \}^{1-t}

and the error function is the cross-entropy

E(w) = -\sum_{n=1}^{N} \{ t_n \ln y_n + (1 - t_n) \ln(1 - y_n) \}

where y_n = y(x_n, w). For K separate binary classifications, the network has K outputs with targets t_k \in \{0, 1\}, k = 1, …, K, and

p(t \mid x, w) = \prod_{k=1}^{K} y_k(x, w)^{t_k} \{ 1 - y_k(x, w) \}^{1-t_k}

For multiclass classification, where each input belongs to exactly one of K classes, the outputs are the softmax activations a_k(x, w) and the error function is

E(w) = -\sum_{n=1}^{N} \sum_{k=1}^{K} t_{nk} \ln y_k(x_n, w)
5. Backpropagation

Following [3], the weights w are chosen to minimize E(w) by the gradient descent method; with on-line gradient descent the update is

w^{(\tau+1)} = w^{(\tau)} - \eta \nabla E_n\!\left( w^{(\tau)} \right) \qquad (6)

where \eta > 0 is the step size and \nabla E_n(w) is the gradient of the error of one data point. Writing \delta_j \equiv \partial E_n / \partial a_j, the derivative of the error with respect to a weight is

\frac{\partial E_n}{\partial w_{ji}} = \delta_j z_i \qquad (7)

For the square error (5), the output units have

\delta_k = y_k - t_k \qquad (8)

and the hidden units have

\delta_j = \frac{\partial E_n}{\partial a_j} = \sum_k \frac{\partial E_n}{\partial a_k} \frac{\partial a_k}{\partial a_j} = h'(a_j) \sum_k w_{kj} \delta_k \qquad (9)

where the sum in (9) runs over all units k to which hidden unit j sends connections, as in Figure 6.

Figure 6: evaluation of \delta_j for hidden unit j by backpropagation of the \delta's from the units k to which unit j sends connections (source: Bishop, Pattern Recognition and Machine Learning, Springer 2006)

The backpropagation procedure (a sketch in MATLAB follows this list):
1) Apply an input vector x_i to the network and forward propagate to obtain the activations of all hidden and output units.
2) Evaluate \delta_k for all output units using (8).
3) Backpropagate the \delta's using (9) to obtain \delta_j for every hidden unit j.
4) Evaluate the required derivatives using (7).
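As a concrete illustration of steps 1) to 4), a minimal MATLAB sketch of one on-line gradient-descent update for a two-layer network with tanh hidden units and linear output units (all names and values are illustrative, not from the thesis):

% One backpropagation step for a two-layer network.
D = 4; M = 6; K = 3; eta = 0.1;              % sizes and step size
W1 = 0.1*randn(M,D); W2 = 0.1*randn(K,M);    % layer weights (biases omitted)
x = randn(D,1); t = randn(K,1);              % one input and its target
a  = W1*x;                       % hidden activations a_j, eq. (1)
z  = tanh(a);                    % hidden outputs z_j = h(a_j), eq. (2)
y  = W2*z;                       % linear output units
dk = y - t;                      % output deltas, eq. (8)
dj = (1 - z.^2) .* (W2.'*dk);    % hidden deltas, eq. (9); h'(a) = 1 - tanh(a).^2
W2 = W2 - eta*(dk*z.');          % weight updates from eqs. (6)-(7)
W1 = W1 - eta*(dj*x.');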
6. Convolutional Neural Networks (CNN)

LeCun et al. (1990) introduced convolutional networks, which extract feature maps from the input image, for recognizing handwritten digits.

Figure 7: early convolutional network (source: https://medium.com/olarik/)

In 1998, LeCun et al. proposed LeNet-5, whose final "full connection" stage is a fully-connected layer acting as an MLP (input-hidden-output); Figure 8 shows the network, which classifies the digits 0-9 [4].

Figure 8: the LeNet-5 architecture

LeNet-5 consists of the following layers:

6.1 Input layer: the input data is a 32×32×1 pixel matrix (grayscale).

6.2 Convolutional layer (C1): the first convolutional layer has 6 filters of size 5×5 with stride 1 unit; its output is 6 feature maps, turning the 32×32×1 input into 28×28×6 pixels.

6.3 Subsampling layer (S2): the second layer is a sub-sampling layer with 6 filters of size 2×2 and stride 2 units; its output is 6 feature maps, turning 28×28×6 into 14×14×6 pixels.

6.4 Convolution layer (C3): the third layer has 16 filters of size 5×5 with stride 1 unit; its output is 16 feature maps, turning 14×14×6 into 10×10×16 pixels.

6.5 Subsampling layer (S4): the fourth layer is a sub-sampling layer with 16 filters of size 2×2 and stride 2 units; its output is 16 feature maps, turning 10×10×16 into 5×5×16 pixels.

6.6 Convolution layer (C5): the fifth layer produces 120 feature maps.

6.8 Fully connected layer (F6): 84 units, fully connected to C5.

6.9 Output layer: connected to the 84 units of layer F6; it consists of Euclidean radial basis function (RBF) units, one per class, so the output layer has 10 units (the digits 0-9). The output of each RBF unit is

y_j = \sum_i \left( x_i - w_{ji} \right)^2

where j indexes the output, i indexes the input, and w_{ji} is the weight of the j-th output unit.
7. Layers of a CNN

A convolutional neural network (CNN) is composed of layers of the following kinds.

7.1 Input layers

7.1.1 Image input layer: the first layer of a CNN whose input data is an image. The image is a matrix of 3 dimensions, [pixel × pixel × channel], where channel is 1 for a grayscale image and 3 for an RGB image; each matrix element is the pixel value of one channel, an 8-bit number from 0 to 255.

7.1.2 Image 3D input layer: the first layer of a CNN whose input is a 3-D image; the data is a matrix of 4 dimensions, [pixel × pixel × pixel × channel], with channels as in the image input layer.

7.1.3 Sequence input layer: the first layer of a CNN whose input data is a sequence; each element of the input sequence is a vector.

7.1.4 Region of interest input layer (ROI input layer): the first layer of a CNN whose input data are regions of interest, as used in Fast R-CNN.

7.2 Convolution and Fully Connected layers: the layers of a CNN that hold weights and biases.
7.2.1 Convolution layer: performs convolution (dot products) between a filter matrix and the input matrix to extract features from the input, as in Figure 9.

Figure 9: convolution layer (source: https://hackernoon.com/visualizing-parts-of-convolutional-neural-networks-using-keras-and-cats-5cc01b214e59)

7.2.2 Transposed convolution layer: where a convolution layer down-samples, a transposed convolution layer up-samples; it is used, for example, in the decoder of a semantic-segmentation network to classify every pixel.

Figure 10: transposed convolution with a 2×2 input matrix, a 2×2 kernel matrix, stride 1 unit, and zero padding, giving a 3×3 output matrix
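A minimal MATLAB sketch of the dot-product view of a convolution layer (values illustrative, not from the thesis; note that conv2 flips the kernel, so the filter is pre-rotated to obtain the cross-correlation a CNN actually computes):

% 2-D convolution of a 4x4 input with a 2x2 filter, stride 1, no padding
input  = magic(4);                    % 4x4 input matrix
filter = [1 0; 0 -1];                 % 2x2 filter
featureMap = conv2(input, rot90(filter,2), 'valid');
% 'valid' keeps only positions where the filter fully overlaps the input,
% so a 4x4 input and a 2x2 filter give a 3x3 feature map.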
7.2.3 Fully connected layer: multiplies its input by a weight matrix and adds a bias vector, like an ordinary neural network.

7.2.4 Inception module: Szegedy et al. (2015) observed that the accuracy of a CNN grows as the CNN gets deeper, but a deeper CNN also overfits more easily and costs more computation. The inception layer approximates a local sparse structure in the CNN with dense components: as in Figure 11, an inception layer runs convolution layers of several filter sizes and a max pooling layer in parallel [5].

Figure 11: inception module (source: Szegedy et al. (2015))

In later versions, the expensive 5×5 filters of the inception layer are factorized into stacks of smaller convolution layers, as shown in Figures 12 to 14 [6].

Figure 12: inception module with the 3×3 and 5×5 filters (source: Szegedy et al. (2016))

Figure 13: inception module with factorized filters (source: Szegedy et al. (2016))

Figure 14: inception module with expanded convolution layers (source: Szegedy et al. (2016))

7.2.5 Fire module: Forrest N. Iandola et al. (2016) proposed this module for SqueezeNet. Like the inception module, it lets a CNN grow deeper while limiting overfitting and computation. A fire layer consists of a squeeze layer of 1×1 filters feeding an expand layer that mixes 1×1 and 3×3 filters; because only the squeeze outputs reach the expand layer's 3×3 filters, the parameter count of that layer is (number of input channels) × (number of filters) × (3×3). A worked count follows Figure 15.

Figure 15: fire module (source: Forrest N. Iandola et al. (2016))
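A small worked count of the parameters in one fire module, under assumed channel sizes typical of SqueezeNet (bias terms omitted; values illustrative):

% Fire module parameter count: squeeze 1x1 filters, then expand 1x1 and 3x3
Cin = 96; s1 = 16; e1 = 64; e3 = 64;        % input channels; squeeze/expand filter counts
squeezeParams = Cin*s1*(1*1);               % 1x1 squeeze filters see all Cin channels
expandParams  = s1*e1*(1*1) + s1*e3*(3*3);  % expand filters see only the s1 squeeze outputs
total = squeezeParams + expandParams        % 11,776, versus 96*64*9 = 55,296 without squeezing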
7.3 Activation layers

An activation function determines the output of each unit. The activation functions commonly used in a CNN are the sigmoid function, the tanh function, and the rectified linear unit (ReLU function).

7.3.1 Sigmoid function: an S-shaped curve whose output lies between 0 and 1, so the output can be read as a probability (1 = yes, 0 = no). The sigmoid function is

\sigma(x) = \frac{1}{1 + e^{-x}}

and its derivative is \sigma'(x) = \sigma(x)\left(1 - \sigma(x)\right).

Figure 16: sigmoid function

Advantages of the sigmoid function: 1) the output lies in 0-1, suitable for probabilities and Boolean outputs; 2) the curve is smooth; 3) it is differentiable. Disadvantages: 1) for inputs outside roughly -5 to 5 the slope approaches 0, so the gradient shrinks during training (the vanishing-gradient problem); 2) the output is not balanced around a mean of 0, which slows optimization; 3) convergence is slow.

7.3.2 Tanh function: the hyperbolic tangent activation function, S-shaped like the sigmoid:

\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}

Tanh can be written as 2 × sigmoid(2x) − 1: doubling the sigmoid stretches its 0-1 range to 0-2, and subtracting 1 shifts it to −1 to 1. Like the sigmoid, the tanh function is differentiable.

Figure 17: hyperbolic tangent function (source: http://cs231n.github.io/)

Advantages of the tanh function: 1) the output is balanced around a mean of 0, which helps optimization; 2) the output lies in −1 to 1, convenient for two-class classification and for keeping values normalized within one unit; 3) the curve is smooth; 4) it is differentiable; 5) near input 0 its derivative (slope) is steeper than the sigmoid's, giving stronger gradients. Disadvantages: 1) the output is not a probability; 2) for inputs outside roughly −3 to 3 the slope approaches 0, so the gradient vanishes during training; 3) convergence is slow.

7.3.3 ReLU function: an activation function whose slope is 1 for positive inputs, so the gradient does not shrink (no vanishing gradient):

f(x) = x \ \text{for} \ x \ge 0, \qquad f(x) = 0 \ \text{for} \ x < 0

Figure 18: ReLU function

Leaky ReLU applies a small scale to negative inputs instead of zeroing them:

f(x) = x \ \text{for} \ x \ge 0, \qquad f(x) = \text{scale} \cdot x \ \text{for} \ x < 0

Figure 19: Leaky ReLU

Clipped ReLU applies the ReLU but saturates inputs above a ceiling:

f(x) = 0 \ \text{for} \ x < 0; \quad f(x) = x \ \text{for} \ 0 \le x < \text{ceiling}; \quad f(x) = \text{ceiling} \ \text{for} \ x \ge \text{ceiling}

Figure 20: Clipped ReLU

Advantages of the ReLU function: 1) the slope of 1 keeps the gradient from vanishing; 2) the derivative is trivial, 0 or 1 depending on the input; 3) convergence is fast. Disadvantages: 1) the output is not balanced around a mean of 0, which slows optimization; 2) the output is unbounded, from 0 to infinity; 3) negative inputs always give output 0.
7.4 Normalization, Dropout, Cropping Layers

7.4.1 Batch normalization layer: speeds up the training of a CNN or neural network; a batch normalization layer is typically inserted between a convolution layer and an activation layer such as a ReLU layer [7]. The layer first normalizes each activation x_i with the mean \mu_B and variance \sigma_B^2 of the mini-batch B:

\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}

where \hat{x}_i is the normalized activation of input x_i. Because normalization alone could change what the layer can represent, the optimal normalized activation is then scaled and shifted:

y_i = \gamma \hat{x}_i + \beta

where the scale \gamma and shift \beta are learnable weights and y_i is the output of the batch normalization layer. Batch normalization also counteracts overfitting, reducing the need for dropout, and can raise accuracy.
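A tiny numeric check of the two batch-normalization equations above (all values illustrative, not from the thesis):

% Batch normalization of one activation over a mini-batch of 4
xB      = [2 4 6 8];                  % activations in the mini-batch B
muB     = mean(xB);                   % mini-batch mean, 5
sigma2B = mean((xB - muB).^2);        % mini-batch variance, 5
epsilon = 1e-5;                       % small constant for numerical stability
xhat    = (xB - muB) ./ sqrt(sigma2B + epsilon);  % normalized activations
gamma   = 1.5; beta = 0.5;            % learnable scale and shift (assumed values)
yB      = gamma*xhat + beta           % layer output: mean beta, spread set by gamma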
7.4.2 Cross channel normalization layer: normalizes each element using the elements of neighboring channels (channel-wise local response normalization):

x' = \frac{x}{\left( K + \frac{\alpha \cdot ss}{\text{windowChannelSize}} \right)^{\beta}}

where x is the input element, x' the normalized output, ss the sum of squares of the elements in the normalization window, windowChannelSize the number of channels in the window, and \alpha, \beta, K the hyperparameters of the normalization.

7.4.3 Dropout layer: proposed by Geoffrey Hinton in 2012. During training it randomly sets node inputs to zero, which counteracts over-fitting of the training data and improves generalization to unseen data.

Figure 21: dropout layer (source: http://laid.delanover.com/dropout-explained-and-implementation-in-tensorflow)

7.4.4 Cropping layer: crops its input to a smaller size.

Figure 22: cropping layer (source: …layers-with-keras/#why-cropping)
7.5 Pooling layers

7.5.1 Average pooling layer: down-samples by taking the average over each region.

7.5.2 Max pooling layer: down-samples by taking the maximum over each region. Figure 23 compares the average pooling layer and the max pooling layer.

Figure 23: average pooling and max pooling (source: https://www.researchgate.net/publication/335609766_Predictive_Neural_Network_Applications_for_Insurance_Processes)

7.5.3 Max unpooling layer: un-pools the input feature maps, producing output feature maps from previously pooled ones; Figure 24 shows max pooling followed by unpooling, where switch variables record the positions of the maxima in the feature maps [8].

Figure 24: max pooling and unpooling (source: Omid E. David and Nathan S. Netanyahu (2016))

7.5.4 Global average pooling: down-samples by averaging over the entire region, one value per feature map.

7.5.5 Global max pooling layer: down-samples by taking the maximum over the entire region of the 3D feature map. Figure 25 shows global average pooling and global max pooling.

Figure 25: global average pooling (left) and global max pooling (right) (source: https://peltarion.com)
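A minimal MATLAB sketch of average and max pooling over 2×2 regions with stride 2 (blockproc is from the Image Processing Toolbox; values illustrative, not from the thesis):

A = [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16];      % 4x4 feature map
avgPooled = blockproc(A, [2 2], @(b) mean(b.data(:)))  % 2x2 output of averages
maxPooled = blockproc(A, [2 2], @(b) max(b.data(:)))   % 2x2 output of maxima
% Global pooling reduces the whole map to one value per channel:
globalAvg = mean(A(:)); globalMax = max(A(:));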
7.6 Output layers

7.6.1 Softmax layer: applies the softmax (normalized exponential) function to the input vector of logits, normalizing it into probabilities between 0 and 1 that sum to 1; it is used for classification. The softmax function is

\sigma(a)_k = \frac{e^{a_k}}{\sum_{j} e^{a_j}}

7.6.2 Classification layer: computes the loss function between the network output and the target; training minimizes this loss by backpropagating it through the trainable parameters of the network. Common loss functions are:

1) Multiclass SVM (hinge) loss:

L_i = \sum_{j \neq y_i} \max\left(0, s_j - s_{y_i} + 1\right)

where y_i is the index of the correct class of sample i, j indexes the other classes, and s are the class scores.

2) Cross-entropy loss:

H(p, q) = -\sum_{x} p(x) \log q(x)

where p is the target distribution and q the predicted distribution from the output of the softmax layer. Figure 26 compares the SVM loss and the cross-entropy loss; a worked example follows.

Figure 26: SVM loss and cross-entropy loss
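A small worked example of the softmax layer and the cross-entropy loss for one sample with 7 classes (scores illustrative, not from the thesis):

s = [2.0 0.5 0.1 3.0 1.0 0.2 0.7];   % raw class scores (logits) for one image
p = exp(s) ./ sum(exp(s));           % softmax: probabilities in (0,1), sum(p) == 1
t = [0 0 0 1 0 0 0];                 % one-hot target: the 4th class is correct
L = -sum(t .* log(p))                % cross-entropy loss, here -log(p(4))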
7.6.3 Regression layer: computes the loss for regression tasks. Common loss functions are:

1) Mean absolute error (MAE):

\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|

2) Mean squared error (MSE):

\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2

where y_i are the targets and \hat{y}_i the network predictions; the chosen loss is minimized during training.
1.1 MATLAB

MATLAB is used in many fields, such as signal processing and communication, image and video processing, control systems, instrumentation and control, economics, and biology. Among its strengths:
3) MATLAB provides many built-in functions, and users can write functions of their own;
4) MATLAB can interface with other languages such as FORTRAN and C;
5) MATLAB produces 2-D and 3-D graphics;
6) MATLAB supports dynamic links to Windows programs such as Word and Excel;
7) MATLAB provides toolboxes of functions for specialized fields.

1.2 The Kaggle dataset

The dataset comes from the Kaggle Facial Expression Recognition Challenge [9]:
1) it contains 35,887 images;
2) each image is 48×48 pixels (8-bit grayscale);
4) the images are labeled with 7 emotions, as in Figure 28.

Figure 28: the 7 emotions of the Kaggle dataset (source: Alexandru Savoiu et al. (2017))

Figure 29 shows sample images.

Figure 29: sample images of the Kaggle dataset (source: Alexandru Savoiu et al. (2017))
1.3 Pretrained CNN: GoogLeNet

Figure 30: the GoogLeNet architecture (source: Szegedy et al. (2015))

Figure 31: GoogLeNet in MATLAB

1.4 Pretrained CNN: VGG16

Figure 32: the VGG16 architecture (source: https://engmrk.com/tag/vgg16-implementation-using-keras/feed/)

Figure 33: VGG16 in MATLAB

1.5 Pretrained CNN: ResNet50

The core idea of the CNN ResNet50 is the skip connection. As a neural network gets deeper it becomes harder to optimize the desired mapping H(x) directly, so the network instead learns the residual function F(x) = H(x) − x, as shown in Figure 34; ResNet50 is otherwise a feed-forward network.

Figure 34: the residual function of ResNet50 (source: He et al. (2015))

ResNet50 (He et al., 2016) won ILSVRC 2015 with a top-5 test error rate of 7.8% on ImageNet. It has about 23 M learnable parameters and is pretrained on 1,000 classes; Figure 35 shows the network, which appears as 177 layers in MATLAB [11].
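A minimal MATLAB sketch of the residual idea (weights and sizes assumed for illustration, not from the thesis):

% Instead of fitting H(x) directly, a residual block fits F(x) = H(x) - x
% and the skip connection adds x back to the block output.
W1 = randn(8,8); W2 = randn(8,8);    % weights of the block's two layers
x  = randn(8,1);                     % block input
F  = max(0, W2*max(0, W1*x));        % residual function F(x) with ReLU activations
Hx = F + x;                          % block output H(x) = F(x) + x (skip connection)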
Figure 35: ResNet50 in MATLAB

1.6 Pretrained CNN: SqueezeNet

Figure 36: the SqueezeNet architecture (source: Iandola et al. (2016))

Figure 37: SqueezeNet in MATLAB

Transfer learning takes a network trained on one task and retrains it for a new task, reusing the learned features; Figure 38 illustrates the idea.

Figure 38: transfer learning (source: https://ch.mathworks.com/?requestedDomain=en)
2. Hardware

2.1 OKER (088) BLACK: one webcam (Figure 39) is used to capture the facial images of the 7 emotions; Table 1 lists its specifications.

Figure 39: the OKER (088) BLACK webcam

Table 1: OKER (088) BLACK webcam specifications

WEBCAM: OKER (088) BLACK
  Brand: oker
  Resolution: 10 M pixels
  Frame rate: 30 fps
  Focus range: 30 mm to infinity
  Interface: USB
  System requirements: Windows ME, 2000, XP, Vista, 10
  Info noise rate: 48 dB

2.2 Computer: Table 2 lists the specifications of the computer used.

Table 2: computer specifications

  Operating system: Windows 10 Pro 64-bit
  Processor: Intel® Core™ i7-3440 CPU @ 3.40GHz
  Memory: 16384 MB RAM
  Display device: NVIDIA GeForce GTX 1070
  Approx. total memory: 16256 MB
  Display memory (VRAM): 8088 MB
  Memory interface: GDDR5
  Memory interface width: 256-bit
1. Converting the dataset

The Kaggle dataset is converted into image files sized to match each network's input layer; two conversions are performed, described in sections 1.1 and 1.2.

1.1 Converting the Kaggle dataset to 224×224 image files
close all
clear
clc
Clears any open figures, the workspace, and the command window before the run.
lab = dlmread('Kaggle dataset\Kaggle_column1.csv',' ',0,0); dat = dlmread('Kaggle dataset\Kaggle_column2.csv',' ',0,0);
Reads the Kaggle dataset CSV files into MATLAB 2018a: lab holds the emotion labels and dat the pixel data.
for n = 1:35887
    n
    for i = 1:48
        for j = 1:48
            for k = 1:3
                pic(i,j,k) = dat(n,(i-1)*48+j);
            end
        end
    end
    pic = pic/255;

Runs over all 35,887 records of the dataset (printing n shows progress); each record's 48×48 pixel values are copied into the three channels of the 48×48×3 array pic, which is then scaled from 0-255 to 0-1 by pic = pic/255.
    if lab(n) == 0
        moo = 'Angry';
    elseif lab(n) == 1
        moo = 'Disgust';
    elseif lab(n) == 2
        moo = 'Fear';
    elseif lab(n) == 3
        moo = 'Happy';
    elseif lab(n) == 4
        moo = 'Sad';
    elseif lab(n) == 5
        moo = 'Surprise';
    else
        moo = 'Neutral';
    end

Maps the numeric label lab(n) of the Kaggle dataset to the emotion name moo.
    fln = strcat('Kaggle dataset\','images\',moo,'\',moo,num2str(n),'.jpg');
    imwrite(pic,fln);

Builds the file name fln and saves the image pic into the folder named after its label moo.
    %resize
    A = imread(fln);
    B = imresize(A,[224, 224]);
    imwrite(B,fln);
end

Re-reads each saved 48×48 file, resizes it to 224×224, and saves it back.
1.2 Converting the Kaggle dataset to 227×227 image files

The script is identical to that of section 1.1, except that the resize step uses the 227×227 target:

    %resize
    A = imread(fln);
    B = imresize(A,[227, 227]);
    imwrite(B,fln);
end

Re-reads each saved 48×48 file, resizes it to 227×227, and saves it back.

2. Transfer learning the pretrained CNNs

Transfer learning is performed in MATLAB 2018a on 4 pretrained CNNs: GoogLeNet, VGG16, ResNet50, and SqueezeNet.
2.1 Transfer learning the pretrained GoogLeNet
close all
clear
clc
Clears any open figures, the workspace, and the command window before the run.
imds = imageDatastore(fullfile('Kaggle
figure;
perm = randperm(35887,20);
for i = 1:20
    subplot(4,5,i);
    imshow(imds.Files{perm(i)});
end
Randomly selects 20 of the 35,887 images in the Kaggle dataset datastore and plots them, as in Figure 40.

imds.countEachLabel

Counts the images under each label; the 7 labels together cover the 35,887 images, as in Figure 41.

Figure 41: image count for each of the 7 labels
[trainingImages,validationImages] = splitEachLabel(imds,0.8,'randomize');

Randomly splits the images of each label into 80% training images and 20% validation images.

net = googlenet;
net.Layers

Loads the pretrained GoogLeNet CNN and lists its layers; GoogLeNet has 144 layers.
if isa(net,'SeriesNetwork')
    lgraph = layerGraph(net.Layers);
else
    lgraph = layerGraph(net);
end
figure('Units','normalized','Position',[0.1 0.1 0.8 0.8]);
plot(lgraph)
Plots the layer graph of GoogLeNet, as in Figure 42.

Figure 42: layer graph of GoogLeNet
lgraph = removeLayers(lgraph, {'loss3-classifier','prob','output'});

For transfer learning, removes the last 3 layers of the pretrained GoogLeNet: loss3-classifier, prob, and output.
numClasses = numel(categories(trainingImages.Labels));
newLayers =
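A minimal sketch of what newLayers plausibly contained, following MATLAB's standard transfer-learning pattern for GoogLeNet (the layer names, learn-rate factors, and attachment point here are assumptions, not taken from the thesis):

% Sketch: three replacement layers for the 7 emotion classes (values assumed)
newLayers = [
    fullyConnectedLayer(numClasses,'Name','fc', ...
        'WeightLearnRateFactor',10,'BiasLearnRateFactor',10)
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classoutput')];
lgraph = addLayers(lgraph,newLayers);
% Reconnect the graph where the removed layers were attached (name assumed)
lgraph = connectLayers(lgraph,'pool5-drop_7x7_s1','fc');

The truncated newLayers listings for VGG16, ResNet50, and SqueezeNet in the following sections plausibly followed the same pattern.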
Connects the 3 new layers to the pretrained GoogLeNet and plots the layer graph, as in Figure 43.

Figure 43: layer graph after connecting the new fully connected, softmax, and classification layers
options = trainingOptions('sgdm',...
    'MiniBatchSize',10,...
    'MaxEpochs',10,...
    'InitialLearnRate',0.0001,...
    'VerboseFrequency',50,...
    'ValidationData',validationImages,...
    'ValidationFrequency',10000,...
    'ExecutionEnvironment','auto',...
    'Plots','training-progress');
Sets the design parameters for training the CNN by transfer learning:
1) The training algorithm is backpropagation with stochastic gradient descent with momentum (sgdm).
2) MiniBatchSize is the size of the mini-batch used in each training iteration. There are 0.8 × 35,887 = 28,710 training images, so with MiniBatchSize 10 one epoch takes 2,871 iterations.
3) MaxEpochs is the maximum number of epochs; with 10 epochs, the 28,710 training images are passed through the network 10 times.
4) ValidationFrequency is how often, in iterations, the network is validated and its validation accuracy and loss are plotted; with ValidationFrequency 10,000, validation runs every 10,000 iterations.
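A quick check of the bookkeeping in items 2) and 3) (splitEachLabel rounds per label, so the counts are approximate):

numTrain   = round(0.8 * 35887)     % about 28,710 training images
itersEpoch = floor(numTrain / 10)   % 2,871 iterations per epoch at MiniBatchSize 10
totalIters = itersEpoch * 10        % about 28,710 iterations over 10 epochs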
net = trainNetwork(trainingImages,lgraph,options);
Trains the CNN by transfer learning; the trained network is returned as the object net in the MATLAB 2018a workspace.
2.2 Transfer learning the pretrained VGG16
close all
clear
clc
Clears any open figures, the workspace, and the command window before the run.
imds = imageDatastore(fullfile('Kaggle
figure;
perm = randperm(35887,20);
for i = 1:20
    subplot(4,5,i);
    imshow(imds.Files{perm(i)});
end
Randomly selects 20 of the 35,887 images in the Kaggle dataset datastore and plots them, as in Figure 44.

imds.countEachLabel

Counts the images under each label; the 7 labels together cover the 35,887 images, as in Figure 45.

Figure 45: image count for each of the 7 labels
[trainingImages,validationImages] = splitEachLabel(imds,0.8,'randomize');

Randomly splits the images of each label into 80% training images and 20% validation images.
net = vgg16;
net.Layers

Loads the pretrained VGG16 CNN and lists its layers; VGG16 has 41 layers.
if isa(net,'SeriesNetwork')
    lgraph = layerGraph(net.Layers);
else
    lgraph = layerGraph(net);
end
figure('Units','normalized','Position',[0.1 0.1 0.8 0.8]);
plot(lgraph)
Plots the layer graph of VGG16, as in Figure 46.

Figure 46: layer graph of VGG16
lgraph = removeLayers(lgraph, {'fc8','prob','output'});

For transfer learning, removes the last 3 layers of the pretrained VGG16: fc8, prob, and output.
numClasses = numel(categories(trainingImages.Labels));
newLayers =
lgraph = connectLayers(lgraph,'drop7','fc');
figure('Units','normalized','Position',[0.3 0.3 0.4 0.4]);
plot(lgraph)
ylim([0,10])
Connects the 3 new layers to the pretrained VGG16 and plots the layer graph, as in Figure 47.

Figure 47: layer graph after connecting the new fully connected, softmax, and classification layers
options = trainingOptions('sgdm',...
    'MiniBatchSize',10,...
    'MaxEpochs',10,...
    'InitialLearnRate',0.0001,...
    'VerboseFrequency',50,...
    'ValidationData',validationImages,...
    'ValidationFrequency',10000,...
    'ExecutionEnvironment','auto',...
    'Plots','training-progress');
The training options are identical to those described in section 2.1.
net = trainNetwork(trainingImages,lgraph,options);
Trains the CNN; the trained network is returned as the object net in the MATLAB 2018a workspace.

2.3 Transfer learning the pretrained ResNet50

clc
clear
close all

Clears the command window, the workspace, and any open figures before the run.
imds = imageDatastore(fullfile('Kaggle
figure;
perm = randperm(35887,20);
for i = 1:20
    subplot(4,5,i);
    imshow(imds.Files{perm(i)});
end
Randomly selects 20 of the 35,887 images in the Kaggle dataset datastore and plots them, as in Figure 48.

imds.countEachLabel

Counts the images under each label; the 7 labels together cover the 35,887 images, as in Figure 49.

Figure 49: image count for each of the 7 labels
[trainingImages,validationImages] = splitEachLabel(imds,0.8,'randomize');

Randomly splits the images of each label into 80% training images and 20% validation images.
net = resnet50;
net.Layers

Loads the pretrained ResNet50 CNN and lists its layers; ResNet50 has 177 layers.
if isa(net,'SeriesNetwork')
    lgraph = layerGraph(net.Layers);
else
    lgraph = layerGraph(net);
end
figure('Units','normalized','Position',[0.1 0.1 0.8 0.8]);
plot(lgraph)
Plots the layer graph of ResNet50, as in Figure 50.

Figure 50: layer graph of ResNet50
lgraph = removeLayers(lgraph, {'fc1000','fc1000_softmax','ClassificationLayer_fc1000'});

For transfer learning, removes the last 3 layers of the pretrained ResNet50: fc1000, fc1000_softmax, and ClassificationLayer_fc1000.
numClasses = numel(categories(trainingImages.Labels));
newLayers =
Connects the 3 new layers to the pretrained ResNet50 and plots the layer graph, as in Figure 51.

Figure 51: layer graph after connecting the new fully connected, softmax, and classification layers
options = trainingOptions('sgdm',...
    'MiniBatchSize',10,...
    'MaxEpochs',10,...
    'InitialLearnRate',0.0001,...
    'VerboseFrequency',50,...
    'ValidationData',validationImages,...
    'ValidationFrequency',10000,...
    'ExecutionEnvironment','auto',...
    'Plots','training-progress');
The training options are identical to those described in section 2.1.
net = trainNetwork(trainingImages,lgraph,options);
Trains the CNN; the trained network is returned as the object net in the MATLAB 2018a workspace.
2.4 Transfer learning the pretrained SqueezeNet
clc
clear
close all
Clears the command window, the workspace, and any open figures before the run.
imds = imageDatastore(fullfile('Kaggle
figure;
perm = randperm(35887,20);
for i = 1:20
    subplot(4,5,i);
    imshow(imds.Files{perm(i)});
end
Randomly selects 20 of the 35,887 images in the Kaggle dataset datastore and plots them, as in Figure 52.

imds.countEachLabel

Counts the images under each label; the 7 labels together cover the 35,887 images, as in Figure 53.

Figure 53: image count for each of the 7 labels
[trainingImages,validationImages] = splitEachLabel(imds,0.8,'randomize');

Randomly splits the images of each label into 80% training images and 20% validation images.
net = squeezenet;
net.Layers

Loads the pretrained SqueezeNet CNN and lists its layers; SqueezeNet has 68 layers.
if isa(net,'SeriesNetwork')
    lgraph = layerGraph(net.Layers);
else
    lgraph = layerGraph(net);
end
figure('Units','normalized','Position',[0.1 0.1 0.8 0.8]);
plot(lgraph)
Plots the layer graph of SqueezeNet, as in Figure 54.

Figure 54: layer graph of SqueezeNet
lgraph = removeLayers(lgraph, {'pool10','prob','ClassificationLayer_predictions'});

For transfer learning, removes the last 3 layers of the pretrained SqueezeNet: pool10, prob, and ClassificationLayer_predictions.
numClasses = numel(categories(trainingImages.Labels));
newLayers =
lgraph = connectLayers(lgraph,'relu_conv10','fc');
figure('Units','normalized','Position',[0.3 0.3 0.4 0.4]);
plot(lgraph)
ylim([0,10])
Connects the 3 new layers to the pretrained SqueezeNet and plots the layer graph, as in Figure 55.

Figure 55: layer graph after connecting the new fully connected, softmax, and classification layers
options = trainingOptions('sgdm',...
    'MiniBatchSize',10,...
    'MaxEpochs',10,...
    'InitialLearnRate',0.0001,...
    'VerboseFrequency',50,...
    'ValidationData',validationImages,...
    'ValidationFrequency',10000,...
    'ExecutionEnvironment','auto',...
    'Plots','training-progress');
The training options are identical to those described in section 2.1.
net = trainNetwork(trainingImages,lgraph,options);
Trains the CNN; the trained network is returned as the object net in the MATLAB 2018a workspace.

3. Running the pretrained CNNs after transfer learning

The pretrained CNNs transfer-learned on the Kaggle dataset are run in MATLAB 2018a to classify live webcam images.

3.1 Running the pretrained GoogLeNet
clc
clear
close all
Clears the command window, the workspace, and any open figures before the run.
camera = webcam;

Connects to the webcam.

load ('GoogLeNetValidition0.0001Epoch10Feq10000.mat');

Loads the transfer-learned GoogLeNet network saved after training.
inputSize = net.Layers(1).InputSize(1:2);

Reads the image input size of the network.

h = figure;
h.Position(3) = 2*h.Position(3);
ax1 = subplot(1,2,1);
ax2 = subplot(1,2,2);

Formats a figure window with two axes side by side.

keepRolling = true;
set(gcf,'CloseRequestFcn','keepRolling = false; closereq');

Keeps the loop below running until the figure window is closed.

while keepRolling

Loops over the webcam frames for as long as keepRolling is true; the loop rate is limited by the webcam frame rate.

im = snapshot(camera);
image(ax1,im)

Captures one frame from the webcam and displays it on the first axes.

im = imresize(im,inputSize);

Resizes the frame to the network's input size.

[label,score] = classify(net,im);

Classifies the frame with the trained CNN, returning the predicted label and the scores of all classes.

title(ax1,{char(label),num2str(max(score),2)});

Shows the predicted label and its highest score above the camera image.

classNames = net.Layers(end).ClassNames;
barh(ax2,score)
xlabel(ax2,'Probability')
xlim(ax2,[0 1])
yticklabels(ax2,classNames)
ax2.YAxisLocation = 'right';

Plots a horizontal bar chart of the class probabilities of the 7 emotions on the second axes.

drawnow
Updates the figure after each frame.

3.2 Running the pretrained VGG16

The script is the same as in section 3.1, except that it loads the transfer-learned VGG16 network:

load ('VGG16Validation0.0001Epoch10Feq10000.mat');
3.3 Running the pretrained ResNet50

The script is the same as in section 3.1, except that it loads the transfer-learned ResNet50 network:

load ('ResNet50Validation0.0001Epoch10Feq10000.mat');

For each frame, the CNN returns the predicted emotion (label) and its confidence (score), shows them above the camera image, plots the class-probability bar chart, and updates the figure.
3.4 Running the pretrained SqueezeNet

The script is the same as in section 3.1, except that it loads the transfer-learned SqueezeNet network:

load ('SqueezeNetValidation0.001Epoch10Feq10000.mat');
1. Transfer learning the pretrained GoogLeNet on the Kaggle dataset

Figure 56 shows the training progress of transfer learning the pretrained GoogLeNet on the Kaggle dataset.
Figure 56: training progress of the pretrained GoogLeNet

The training-progress plot of the GoogLeNet CNN shows:
1) Training accuracy: the classification accuracy on each individual mini-batch (noisy).
2) Smoothed training accuracy: the training accuracy after smoothing, with much of the noise removed.
3) Validation accuracy: the classification accuracy on the whole validation set (as specified in trainingOptions).
4) Training loss, smoothed training loss, and validation loss: the loss on each mini-batch, its smoothed version, and the loss on the validation set; as this is a classification network, the loss function is the cross-entropy loss.

After training, the network was tested through the webcam on the 7 emotions with the pretrained GoogLeNet:
Figures 57-63: webcam classification results with the pretrained GoogLeNet

2. Transfer learning the pretrained VGG16 on the Kaggle dataset
Figure 64 shows the training progress of transfer learning the pretrained VGG16 on the Kaggle dataset.
Figure 64: training progress of the pretrained VGG16

The training-progress plot shows the same quantities as described above for GoogLeNet. After training, the network was tested through the webcam on the 7 emotions with the pretrained VGG16:
1. The pretrained VGG16 recognized the first test expression with a score of 97%, as in Figure 65.

Figure 65: webcam classification result with the pretrained VGG16

2. The second test expression scored 96%, as in Figure 66.

Figure 66: webcam classification result with the pretrained VGG16
Figures 67-71: further webcam classification results with the pretrained VGG16

3. Transfer learning the pretrained ResNet50 on the Kaggle dataset

Figure 72 shows the training progress of transfer learning the pretrained ResNet50 on the Kaggle dataset.
Figure 72: training progress of the pretrained ResNet50

The training-progress plot shows the same quantities as described above for GoogLeNet. After training, the network was tested through the webcam on the 7 emotions with the pretrained ResNet50:
2. The second test expression scored 43%, as in Figure 74.
Figures 73-79: webcam classification results with the pretrained ResNet50

4. Transfer learning the pretrained SqueezeNet on the Kaggle dataset

Figure 80 shows the training progress of transfer learning the pretrained SqueezeNet on the Kaggle dataset.
Figure 80: training progress of the pretrained SqueezeNet

The training-progress plot shows the same quantities as described above for GoogLeNet. After training, the network was tested through the webcam on the 7 emotions with the pretrained SqueezeNet:
Figures 81-87: webcam classification results with the pretrained SqueezeNet

Table 3: comparison of the pretrained CNN layers

Summary of transfer learning the pretrained CNNs on the Kaggle dataset:
1. The pretrained GoogLeNet achieved an accuracy of 61.67% on the 35,887-image Kaggle dataset (547 …).

Table 4: comparison of the four pretrained CNNs

Network      Year   Key idea                             Parameters   ImageNet top-5 error (%)   Depth (layers)
GoogLeNet    2015   Inception modules                    4 M          6.7                        22
VGG16        2014   AlexNet-like, with 3×3 kernels       138 M        7.3                        16
ResNet50     2016   Residual idea (skip connections)     23 M         7.8                        50
SqueezeNet   2017   Fire modules                         1.2 M        19.7                       18
Strengths and weaknesses of the four pretrained CNNs:

GoogLeNet
Advantages: 1. multiscale filters; 2. split, transform, and merge blocks; 3. auxiliary classifiers that improve the convergence rate; 4. bottleneck layers and a global average-pooling layer that give sparse connections.
Disadvantages: 1. …; 2. the bottleneck ….

VGG16
Advantages: 1. a large effective receptive field; 2. ….
Disadvantages: 1. the fully connected layers ….

ResNet50
Advantages: 1. residual learning (skip connections); 2. counters the vanishing gradient.
Disadvantages: 1. …; 2. … feature-map …; 3. … feature maps ….

SqueezeNet
Advantages: 1. a very small CNN; 2. …; 3. AlexNet-level accuracy with 50× fewer parameters than AlexNet on ImageNet.
Disadvantages: 1. complex by-pass versus simple by-pass ….

2. Convert the MATLAB code into C or Python code so that it can run on a Raspberry Pi.
3. …
REFERENCES

1. Shalev-Shwartz, S. and Ben-David, S., Understanding Machine Learning: From Theory to Algorithms. 2014: Cambridge University Press.
2. McCulloch, W.S. and Pitts, W., A Logical Calculus of the Ideas Immanent in Nervous Activity. 1943.
3. Bishop, C.M., Pattern Recognition and Machine Learning. 2006: Springer.
4. LeCun, Y., et al., Gradient-Based Learning Applied to Document Recognition. 1998.
5. Szegedy, C., et al., Going Deeper with Convolutions. 2015.
6. Szegedy, C., et al., Rethinking the Inception Architecture for Computer Vision. 2016.
7. Ioffe, S. and Szegedy, C., Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. 2015.
8. David, O.E. and Netanyahu, N.S., DeepPainter: Painter Classification Using Deep Convolutional Autoencoders. 2017: Bar-Ilan University, Ramat-Gan, Israel.
9. Savoiu, A. and Wong, J., Recognizing Facial Expressions Using Deep Learning. 2017: Stanford University.
10. Simonyan, K. and Zisserman, A., Very Deep Convolutional Networks for Large-Scale Image Recognition. 2014: University of Oxford.
11. He, K., et al., Deep Residual Learning for Image Recognition. 2015.
12. Iandola, F.N., et al., SqueezeNet: AlexNet-Level Accuracy with 50x Fewer Parameters and <0.5MB Model Size. 2016: Stanford University.
13. How transferable are features in convolutional neural network acoustic models across languages? 2014: University of Montreal, Canada.
APPENDIX

MATLAB 2018a scripts to convert the dataset into image files
MATLAB 2018a script to convert the Kaggle dataset into 224×224 image files:

close all
clear
clc
%Kaggle dataset
lab = dlmread('Kaggle dataset\Kaggle_column1.csv',' ',0,0); dat = dlmread('Kaggle dataset\Kaggle_column2.csv',' ',0,0);
for n = 1:35887
    n
    for i = 1:48
        for j = 1:48
            for k = 1:3
                pic(i,j,k) = dat(n,(i-1)*48+j);
            end
        end
    end
    pic = pic/255;

    if lab(n) == 0
        moo = 'Angry';
    elseif lab(n) == 1
        moo = 'Disgust';
    elseif lab(n) == 2
        moo = 'Fear';
    elseif lab(n) == 3
        moo = 'Happy';
    elseif lab(n) == 4
        moo = 'Sad';
    elseif lab(n) == 5
        moo = 'Surprise';
    else
        moo = 'Neutral';
    end
fln = strcat('Kaggle
end
MATLAB 2018a script to convert the Kaggle dataset into 227×227 image files:

close all
clear
clc
%Kaggle dataset
lab = dlmread('Kaggle dataset\Kaggle_column1.csv',' ',0,0); dat = dlmread('Kaggle dataset\Kaggle_column2.csv',' ',0,0);
for n = 1:35887
    n
    for i = 1:48
        for j = 1:48
            for k = 1:3
                pic(i,j,k) = dat(n,(i-1)*48+j);
            end
        end
    end
    pic = pic/255;

    if lab(n) == 0
        moo = 'Angry';
    elseif lab(n) == 1
        moo = 'Disgust';
    elseif lab(n) == 2
        moo = 'Fear';
    elseif lab(n) == 3
        moo = 'Happy';
    elseif lab(n) == 4
        moo = 'Sad';
    elseif lab(n) == 5
        moo = 'Surprise';
    else
        moo = 'Neutral';
    end
fln = strcat('Kaggle
end
MATLAB 2018a scripts for transfer learning the pretrained CNNs
imds = imageDatastore(fullfile('Kaggle
imds.countEachLabel
[trainingImages,validationImages] = splitEachLabel(imds,0.8,'randomize');
lgraph = removeLayers(lgraph, {'loss3-classifier','prob','output'});
numClasses = numel(categories(trainingImages.Labels));
newLayers =
options = trainingOptions('sgdm',...
    'MiniBatchSize',10,...
    'MaxEpochs',10,...
    'InitialLearnRate',0.0001,...
    'VerboseFrequency',50,...
imds = imageDatastore(fullfile('Kaggle
imds.countEachLabel
[trainingImages,validationImages] = splitEachLabel(imds,0.8,'randomize');
lgraph = removeLayers(lgraph, {'fc8','prob','output'});
numClasses = numel(categories(trainingImages.Labels));
newLayers =
imds = imageDatastore(fullfile('Kaggle
imds.countEachLabel
[trainingImages,validationImages] = splitEachLabel(imds,0.8,'randomize');
lgraph = removeLayers(lgraph, {'fc1000','fc1000_softmax','ClassificationLayer_fc1000'});
numClasses = numel(categories(trainingImages.Labels));
options = trainingOptions('sgdm',...
    'MiniBatchSize',10,...
    'MaxEpochs',10,...
    'InitialLearnRate',0.0001,...
    'VerboseFrequency',50,...
    'ValidationData',validationImages,...
    'ValidationFrequency',10000,...
    'ExecutionEnvironment','auto',...
    'Plots','training-progress');
imds = imageDatastore(fullfile('Kaggle
imds.countEachLabel
[trainingImages,validationImages] = splitEachLabel(imds,0.8,'randomize');
lgraph = removeLayers(lgraph, {'pool10','prob','ClassificationLayer_predictions'});
numClasses = numel(categories(trainingImages.Labels));
newLayers =
options = trainingOptions('sgdm',...
    'MiniBatchSize',10,...
    'MaxEpochs',10,...
    'InitialLearnRate',0.0001,...
    'VerboseFrequency',50,...
    'ValidationData',validationImages,...
    'ValidationFrequency',10000,...
    'ExecutionEnvironment','auto',...
    'Plots','training-progress');

MATLAB 2018a scripts to run the pretrained CNNs
camera = webcam;
load ('GoogLeNetValidition0.0001Epoch10Feq10000.mat');
inputSize = net.Layers(1).InputSize(1:2);

h = figure;
h.Position(3) = 2*h.Position(3);
ax1 = subplot(1,2,1);
ax2 = subplot(1,2,2);
ax2.ActivePositionProperty = 'position';

keepRolling = true;
set(gcf,'CloseRequestFcn','keepRolling = false; closereq');

while keepRolling
    % Display and classify the image
    im = snapshot(camera);
    image(ax1,im)
    im = imresize(im,inputSize);
    [label,score] = classify(net,im);
    title(ax1,{char(label),num2str(max(score),2)});

    % Plot the histogram
    classNames = net.Layers(end).ClassNames;
    barh(ax2,score)
    xlabel(ax2,'Probability')
    xlim(ax2,[0 1])
    yticklabels(ax2,classNames)
    ax2.YAxisLocation = 'right';

    drawnow
end
camera = webcam;
load ('VGG16Validation0.0001Epoch10Feq10000.mat');
inputSize = net.Layers(1).InputSize(1:2);

h = figure;
h.Position(3) = 2*h.Position(3);
ax1 = subplot(1,2,1);
ax2 = subplot(1,2,2);
ax2.ActivePositionProperty = 'position';

keepRolling = true;
set(gcf,'CloseRequestFcn','keepRolling = false; closereq');

while keepRolling
    % Display and classify the image
    im = snapshot(camera);
    image(ax1,im)
    im = imresize(im,inputSize);
    [label,score] = classify(net,im);
    title(ax1,{char(label),num2str(max(score),2)});

    % Plot the histogram
    classNames = net.Layers(end).ClassNames;
    barh(ax2,score)
    xlabel(ax2,'Probability')
    xlim(ax2,[0 1])
    yticklabels(ax2,classNames)
    ax2.YAxisLocation = 'right';

    drawnow
end
camera = webcam;
load ('ResNet50Validation0.0001Epoch10Feq10000.mat');
inputSize = net.Layers(1).InputSize(1:2);

h = figure;
h.Position(3) = 2*h.Position(3);
ax1 = subplot(1,2,1);
ax2 = subplot(1,2,2);
ax2.ActivePositionProperty = 'position';

keepRolling = true;
set(gcf,'CloseRequestFcn','keepRolling = false; closereq');

while keepRolling
    % Display and classify the image
    im = snapshot(camera);
    image(ax1,im)
    im = imresize(im,inputSize);
    [label,score] = classify(net,im);
    title(ax1,{char(label),num2str(max(score),2)});

    % Plot the histogram
    classNames = net.Layers(end).ClassNames;
    barh(ax2,score)
    xlabel(ax2,'Probability')
    xlim(ax2,[0 1])
    yticklabels(ax2,classNames)
    ax2.YAxisLocation = 'right';

    drawnow
end
camera = webcam;
load ('SqueezeNetValidation0.001Epoch10Feq10000.mat');
inputSize = net.Layers(1).InputSize(1:2);

h = figure;
h.Position(3) = 2*h.Position(3);
ax1 = subplot(1,2,1);
ax2 = subplot(1,2,2);
ax2.ActivePositionProperty = 'position';

keepRolling = true;
set(gcf,'CloseRequestFcn','keepRolling = false; closereq');

while keepRolling
    % Display and classify the image
    im = snapshot(camera);
    image(ax1,im)
    im = imresize(im,inputSize);
    [label,score] = classify(net,im);
    title(ax1,{char(label),num2str(max(score),2)});

    % Plot the histogram
    classNames = net.Layers(end).ClassNames;
    barh(ax2,score)
    xlabel(ax2,'Probability')
    xlim(ax2,[0 1])
    yticklabels(ax2,classNames)
    ax2.YAxisLocation = 'right';

    drawnow
end