Su-A Kim
12th August 2014
Convolutional Neural Networks
Table of contents
Introduction to Convolutional Neural Networks
Application paper: “DeepFace: Closing the Gap to Human-Level Performance in Face Verification”, CVPR 2014
Su-A Kim, 12th August 2014 @CVLAB
History
Yann LeCun
In 1995, Yann LeCun and Yoshua Bengio introduced the concept of convolutional neural networks.
Yoshua Bengio
Recap of ConvNet
Neural network with specialized connectivity structure
Feed-forward:
- Convolve input
- Non-linearity (rectified linear)
- Pooling (local max)
Supervised
Train convolutional filters by back-propagating classification error
[Figure: Input image → Convolution (learned) → Non-linearity → Pooling → Feature maps]
Slide: R. Fergus
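The feed-forward steps in the recap (convolve, rectify, pool) can be sketched in plain Python. This is a minimal illustration, not the slide author's code: the function names, the 5x5 image, and the hand-picked edge filter are all assumptions, and in a real ConvNet the filter weights would be learned by back-propagation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1).

    As is common in ConvNets, the kernel is not flipped (cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Rectified linear non-linearity, applied element-wise."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping local max pooling over size x size windows."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]


# Hypothetical 5x5 image with a vertical edge: dark left (0), bright right (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked vertical-edge filter (assumption; normally learned).
kernel = [[-1, 1], [-1, 1]]

feature_map = max_pool(relu(conv2d_valid(image, kernel)))
print(feature_map)  # [[2, 0], [2, 0]]
```

The pooled feature map is non-zero only in the column block that contains the edge.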
Connectivity and weight sharing depend on the layer
A convolution layer has far fewer parameters because of local connectivity and weight sharing
[Figure: three connectivity diagrams labeled "All different weights", "All different weights", and "Shared weights"]
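To make the parameter savings concrete, here is a rough count for a one-dimensional layer with a single output feature map and no biases. The sizes (1000 inputs, window of 10) are hypothetical and chosen only for illustration.

```python
def fully_connected_params(in_size, out_size):
    """Every output unit has its own weight for every input unit."""
    return in_size * out_size


def locally_connected_params(out_size, window):
    """Every output unit sees only `window` inputs, but owns its weights."""
    return out_size * window


def convolution_params(window):
    """All output units share one set of `window` weights."""
    return window


in_size, window = 1000, 10        # hypothetical sizes
out_size = in_size - window + 1   # one output per valid window position

print(fully_connected_params(in_size, out_size))   # 991000
print(locally_connected_params(out_size, window))  # 9910
print(convolution_params(window))                  # 10
```

Local connectivity removes most of the weights, and weight sharing removes the dependence on the number of output positions.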
Convolution layer
Detects the same feature at different positions in the input image
[Figure labels: Input, Filter (kernel), Feature map, features]
Slide: R. Fergus
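A small one-dimensional sketch of why a shared filter detects the same feature wherever it occurs: the response peak simply moves with the pattern. The filter and the two signals are made-up examples.

```python
def correlate1d(signal, kernel):
    """Apply the same kernel at every valid position of the signal."""
    k = len(kernel)
    return [
        sum(signal[p + i] * kernel[i] for i in range(k))
        for p in range(len(signal) - k + 1)
    ]


kernel = [1, -1, 1]                    # hypothetical feature detector
signal_a = [0, 1, -1, 1, 0, 0, 0, 0]   # pattern starts at position 1
signal_b = [0, 0, 0, 0, 1, -1, 1, 0]   # same pattern, starts at position 4

resp_a = correlate1d(signal_a, kernel)
resp_b = correlate1d(signal_b, kernel)

# The strongest response has the same value but sits where the pattern is.
print(resp_a.index(max(resp_a)), max(resp_a))  # 1 3
print(resp_b.index(max(resp_b)), max(resp_b))  # 4 3
```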
Non-linearity
Tanh
Sigmoid: 1/(1+exp(-x))
Rectified linear (ReLU): max(0, x)
- Simplifies backprop
- Makes learning faster
- Makes features sparse
→ Preferred option
Slide: R. Fergus
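The three non-linearities can be written directly from their formulas. The sketch below also shows two of the ReLU properties listed above: its gradient is just 0 or 1, and it zeroes out non-positive inputs, which makes the output sparse. The sample inputs are arbitrary.

```python
import math


def tanh(x):
    return math.tanh(x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def relu_grad(x):
    """Derivative used in backprop: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]   # arbitrary sample activations
relu_out = [relu(x) for x in inputs]
sparsity = relu_out.count(0.0) / len(relu_out)

print(relu_out)   # [0.0, 0.0, 0.0, 0.5, 2.0]
print(sparsity)   # 0.6
```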
Sub-sampling layer
Spatial pooling
- Average or max
- Boureau et al., ICML’10, for theoretical analysis → a study finding that max pooling works better
Role of pooling
- Invariance to small transformations
- Reduces the effect of noise, shift, and distortion
[Figure: pooling examples labeled "Max" and "Sum"]
Slide: R. Fergus
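The invariance to small transformations can be seen in a tiny one-dimensional example: if an activation shifts by one position but stays inside its pooling window, the pooled output does not change. The signals and the window size of 4 are assumptions for illustration.

```python
def pool1d(values, size, reduce):
    """Non-overlapping pooling: apply `reduce` to each window of `size`."""
    return [
        reduce(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


def average(window):
    return sum(window) / len(window)


response = [0, 9, 0, 0, 0, 0, 3, 0]
shifted = [9, 0, 0, 0, 0, 0, 0, 3]   # each activation moved by one position

print(pool1d(response, 4, max), pool1d(shifted, 4, max))          # [9, 3] [9, 3]
print(pool1d(response, 4, average), pool1d(shifted, 4, average))  # [2.25, 0.75] [2.25, 0.75]
```

A shift that crosses a window boundary would change the pooled output, so the invariance only holds for small displacements.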
Normalization
Contrast normalization (between/across feature maps)
- Equalizes the feature maps → picks up features that are less detailed
[Figure: feature maps; feature maps after contrast normalization]
Slide: R. Fergus
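The slide does not give a formula for contrast normalization, so the following is only one simple way to read "across feature maps": at each spatial position, subtract the mean over all maps and divide by the standard deviation. The function name, the `eps` constant, and the toy values are assumptions.

```python
import math


def normalize_across_maps(feature_maps, eps=1e-8):
    """Normalize each position using statistics taken across feature maps.

    feature_maps: list of maps, each a flat list of activations, where
    index p refers to the same spatial position in every map.
    """
    n_maps = len(feature_maps)
    n_pos = len(feature_maps[0])
    out = [[0.0] * n_pos for _ in range(n_maps)]
    for p in range(n_pos):
        column = [fm[p] for fm in feature_maps]
        mean = sum(column) / n_maps
        std = math.sqrt(sum((v - mean) ** 2 for v in column) / n_maps)
        for m in range(n_maps):
            out[m][p] = (feature_maps[m][p] - mean) / (std + eps)
    return out


# Position 0 has weak responses, position 1 has responses 10x stronger.
maps = [[1.0, 10.0], [3.0, 30.0]]
normalized = normalize_across_maps(maps)
print(normalized)  # both positions end up close to -1 and +1
```

After normalization the weak and the strong position have the same scale, which is the equalizing effect the slide describes.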
LeNet 5
- C1, C3, C5: convolutional layers (5 × 5 convolution matrix)
- S2, S4: subsampling layers (by a factor of 2)
- F6: fully connected layer
About 187,000 connections and about 14,000 trainable weights.
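The effect of 5 × 5 convolutions and factor-2 subsampling on the feature-map size can be traced with simple arithmetic. The 32 × 32 input size is an assumption taken from the original LeNet-5 description (it is not stated on the slide), and the convolutions are assumed to use no padding and stride 1.

```python
def conv_out(size, kernel):
    """Output width of a convolution with no padding and stride 1."""
    return size - kernel + 1


def subsample_out(size, factor):
    """Output width after subsampling by an integer factor."""
    return size // factor


size = 32  # assumed input width/height
layers = [
    ("C1", "conv", 5),
    ("S2", "sub", 2),
    ("C3", "conv", 5),
    ("S4", "sub", 2),
    ("C5", "conv", 5),
]

sizes = {}
for name, kind, k in layers:
    size = conv_out(size, k) if kind == "conv" else subsample_out(size, k)
    sizes[name] = size

print(sizes)  # {'C1': 28, 'S2': 14, 'C3': 10, 'S4': 5, 'C5': 1}
```

Under these assumptions C5 reduces each map to a single value, which is then fed to the fully connected layer F6.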
LeNet 5
Robust even to noise
About CNNs
A special kind of multi-layer neural network.
Implicitly extracts relevant features.
A feed-forward network that can extract topological properties from an image.
Like almost every other neural network, CNNs are trained with a version of the back-propagation algorithm.
Yaniv Taigman, Ming Yang, Marc’Aurelio Ranzato, Lior Wolf
Facebook AI Research, Tel Aviv University
DeepFace: Closing the Gap to Human-Level Performance in Face Verification
Reaches an accuracy of 97.35% on the Labeled Faces in the Wild (LFW) dataset
Architecture
Face Alignment
Representation (CNN)
Face Alignment
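(1) 2D alignment
- After detecting the face region, extract 6 fiducial points
- Fiducial point extraction: LBP histograms are used as the descriptor, and the points are predicted by a pre-trained SVR (Support Vector Regressor)
(2) 3D alignment
- 67 landmarks, landmark mapping
- 2D-3D alignment → Frontalization → 2D projection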
Representation
C1-M2-C3
Extracts low-level features (simple edges and texture)
Max-pooling is applied only to the first convolution layer. Why?
Input: 152x152
Representation
L4-L5-L6 (Locally connected)
Input: 152x152
Why use locally connected layers? Each region has different local statistics.
[Figure: three connectivity diagrams labeled "All different weights", "All different weights", and "Shared weights"]
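The difference between a convolution and a locally connected layer can be sketched in one dimension: both look at a small window, but the locally connected layer keeps a separate filter for every output position, so different regions can be treated differently. The signal and filter values below are made up for illustration and have nothing to do with the actual DeepFace weights.

```python
def conv1d(signal, kernel):
    """Convolution-style layer: one kernel shared by all positions."""
    k = len(kernel)
    return [
        sum(signal[p + i] * kernel[i] for i in range(k))
        for p in range(len(signal) - k + 1)
    ]


def locally_connected1d(signal, kernels):
    """Locally connected layer: kernels[p] is used only at position p."""
    k = len(kernels[0])
    assert len(kernels) == len(signal) - k + 1, "one kernel per output position"
    return [
        sum(signal[p + i] * kernels[p][i] for i in range(k))
        for p in range(len(kernels))
    ]


signal = [1, 2, 3, 4]
shared = [1, -1]                           # 2 parameters in total
per_position = [[1, -1], [0, 1], [2, 0]]   # 3 positions x 2 = 6 parameters

print(conv1d(signal, shared))                     # [-1, -1, -1]
print(locally_connected1d(signal, per_position))  # [-1, 3, 6]
```

The price of position-specific filters is a parameter count that grows with the number of output positions.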
Representation
C1-M2-C3
- Extracts low-level features (simple edges and texture)
L4-L5-L6 (Locally connected)
F7-F8 (Fully connected)
- Can capture correlations between features extracted from distant parts of the face
- Output of F7: raw face representation feature vector
- Output of F8: used to compute the probability distribution over class labels
Input: 152x152
Training
- The goal is to maximize the probability of the correct class
- The loss is minimized by back-propagating its gradient to the parameters, and the parameters are updated with stochastic gradient descent (SGD)
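Maximizing the probability of the correct class is equivalent to minimizing the cross-entropy loss -log p(correct class). The sketch below applies this to a single linear layer followed by softmax, standing in for the last layer only. The feature vector, the three classes, and the learning rate are hypothetical, and the real network back-propagates through all layers.

```python
import math


def softmax(scores):
    """Turn raw class scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Loss is small when the correct class has high probability."""
    return -math.log(probs[label])


def sgd_step(weights, features, label, lr):
    """One in-place SGD update; returns the loss measured before the update.

    weights[k][j] connects feature j to class k.
    """
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    probs = softmax(scores)
    for k, row in enumerate(weights):
        # Gradient of the loss w.r.t. score k is p_k - 1[k == label].
        grad_score = probs[k] - (1.0 if k == label else 0.0)
        for j in range(len(row)):
            row[j] -= lr * grad_score * features[j]
    return cross_entropy(probs, label)


features = [1.0, 2.0]                 # hypothetical feature vector
weights = [[0.0, 0.0] for _ in range(3)]  # 3 hypothetical classes
label = 1                             # index of the correct class

losses = [sgd_step(weights, features, label, lr=0.1) for _ in range(20)]
print(losses[0], losses[-1])  # the loss shrinks as the correct class gains probability
```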
Result
Reduces the error of the previous best methods by more than 50%
The YouTube data contains about 100 mislabeled items; taking those into account, the accuracy is about 92.5%