details of lazy deep learning for images recognition in zz photo app
![Page 1: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/1.jpg)
Details of Lazy Deep Learning for Images
Recognition in ZZ Photo app
Artem Chernodub, George Pashchenko
IMMSP NASU
Kharkov AI Club, 20 June 2015.
ZZ Photo
![Page 2: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/2.jpg)
$$p(x \mid y) = \frac{p(y \mid x)\, p(x)}{p(y)}$$
Biological-inspired models
Neuroscience
Machine Learning
2 / 62
![Page 3: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/3.jpg)
Biological Neural Networks
3 / 62
![Page 4: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/4.jpg)
Artificial Neural Networks
Traditional (Shallow) Neural Networks
Deep Neural Networks
Deep Feedforward Neural Networks
Recurrent Neural Networks
4 / 62
![Page 5: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/5.jpg)
Conventional Methods vs Deep Learning
5 / 62
![Page 6: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/6.jpg)
Deep Learning = Learning of Representations (Features)
The traditional model of pattern recognition (since the late 50's): fixed/engineered features + trainable classifier
Hand-crafted Feature Extractor → Trainable Classifier
End-to-end learning / Feature learning / Deep learning: trainable features + trainable classifier
Trainable Feature Extractor → Trainable Classifier
6 / 62
![Page 7: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/7.jpg)
ImageNet
Le et al. “Building high-level features using large-scale unsupervised learning” ICML 2012.
| Model | # of parameters | Accuracy, % |
|---|---|---|
| Deep Net | 10M | 15.8 |
| best state-of-the-art | N/A | 9.3 |

Training data: 16M images, 20K categories
7 / 62
![Page 8: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/8.jpg)
Deep Face (Facebook)
Y. Taigman, M. Yang, M.A. Ranzato, L. Wolf. DeepFace: Closing the Gap to Human-Level Performance in Face Verification // CVPR 2014.
| Model | # of parameters | Accuracy, % |
|---|---|---|
| Deep Face Net | 128M | 97.35 |
| Human level | N/A | 97.5 |

Training data: 4M facial images
8 / 62
![Page 9: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/9.jpg)
TIMIT Phoneme Recognition
Graves, A., Mohamed, A.-R., and Hinton, G. E. (2013). Speech recognition with deep recurrent neural networks // IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 6645–6649. IEEE.
Mohamed, A. and Hinton, G. E. (2010). Phone recognition using restricted Boltzmann machines // IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4354–4357.
| Model | # of parameters | Error |
|---|---|---|
| Hidden Markov Model, HMM | N/A | 27.3% |
| Deep Belief Network, DBN | ~4M | 26.7% |
| Deep RNN | 4.3M | 17.7% |

Training data: 462 speakers train / 24 speakers test, 3.16 / 0.14 hrs.
9 / 62
![Page 10: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/10.jpg)
Google Large Vocabulary Speech Recognition
H. Sak, A. Senior, F. Beaufays. Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling // INTERSPEECH’2014.
K. Vesely, A. Ghoshal, L. Burget, D. Povey. Sequence-discriminative training of deep neural networks // INTERSPEECH’2014.
| Model | # of parameters | Cross-entropy |
|---|---|---|
| ReLU DNN | 85M | 11.3 |
| Deep Projection LSTM RNN | 13M | 10.7 |

Training data: 3M utterances (1900 hrs).
10 / 62
![Page 11: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/11.jpg)
Classic Feedforward Neural Networks (before 2006).
• Single hidden layer (Kolmogorov-Cybenko Universal Approximation Theorem as the main hope).
• Vanishing gradients effect prevents using more layers.
• Less than 10K free parameters.
• Feature preprocessing stage is often critical.
11 / 62
![Page 12: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/12.jpg)
Training the traditional (shallow) Neural Network: derivative + optimization
12 / 62
![Page 13: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/13.jpg)
1) forward propagation pass
$$z_j = f\Big(\sum_i w^{(1)}_{ji} x_i\Big),$$
$$\tilde{y}(k+1) = g\Big(\sum_j w^{(2)}_j z_j\Big),$$
where $z_j$ is the postsynaptic value for the $j$-th hidden neuron, $w^{(1)}$ are the hidden layer’s weights, $f()$ are the hidden layer’s activation functions, $w^{(2)}$ are the output layer’s weights, and $g()$ are the output layer’s activation functions.
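A minimal sketch of this forward pass in plain Python. The choice of tanh for f, an identity output g, the absence of bias terms, and all numeric values are assumptions made for illustration; the slide does not fix them.

```python
import math

def forward(x, w1, w2, f=math.tanh, g=lambda a: a):
    """z_j = f(sum_i w1[j][i] * x[i]);  y = g(sum_j w2[j] * z[j])."""
    z = [f(sum(w_ji * x_i for w_ji, x_i in zip(row, x))) for row in w1]
    y = g(sum(w_j * z_j for w_j, z_j in zip(w2, z)))
    return z, y

# Hypothetical toy network: 2 inputs, 2 hidden neurons, 1 output.
x = [1.0, -2.0]
w1 = [[0.5, 0.25], [-1.0, 0.0]]   # hidden layer weights w(1)
w2 = [2.0, -1.0]                  # output layer weights w(2)
z, y = forward(x, w1, w2)
print(z, y)
```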
13 / 62
![Page 14: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/14.jpg)
2) backpropagation pass
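Local gradients calculation:
$$\delta^{OUT} = t(k+1) - \tilde{y}(k+1),$$
$$\delta^{HID}_j = f'(z_j)\, w^{(2)}_j\, \delta^{OUT}.$$
Derivatives calculation:
$$\frac{\partial E(k)}{\partial w^{(2)}_j} = \delta^{OUT} z_j,$$
$$\frac{\partial E(k)}{\partial w^{(1)}_{ji}} = \delta^{HID}_j x_i.$$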
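The backpropagation pass can be checked numerically. The sketch below assumes a squared error E = ½(t − ỹ)², tanh hidden units and a linear output (none of which the slide states), computes the local gradients and the products δ·z and δ·x, and compares one of them with a central finite difference. Under these assumptions the products equal the negative of the error derivative, i.e. they point in the descent direction.

```python
import math

def forward(x, w1, w2):
    # Hidden layer: z_j = tanh(sum_i w1[j][i] * x[i]); output layer is linear.
    z = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    y = sum(w * zj for w, zj in zip(w2, z))
    return z, y

def local_gradient_products(x, t, w1, w2):
    z, y = forward(x, w1, w2)
    delta_out = t - y
    # tanh'(a) = 1 - tanh(a)^2 = 1 - z_j^2
    delta_hid = [(1.0 - zj * zj) * w2j * delta_out for zj, w2j in zip(z, w2)]
    prod_w2 = [delta_out * zj for zj in z]                 # delta_OUT * z_j
    prod_w1 = [[dj * xi for xi in x] for dj in delta_hid]  # delta_HID_j * x_i
    return prod_w1, prod_w2

def error(x, t, w1, w2):
    return 0.5 * (t - forward(x, w1, w2)[1]) ** 2

# Hypothetical toy values.
x, t = [1.0, -2.0], 0.3
w1 = [[0.5, 0.25], [-1.0, 0.1]]
w2 = [2.0, -1.0]
prod_w1, prod_w2 = local_gradient_products(x, t, w1, w2)

# Central finite difference of E with respect to w1[1][0].
eps = 1e-6
w1_plus = [row[:] for row in w1]
w1_minus = [row[:] for row in w1]
w1_plus[1][0] += eps
w1_minus[1][0] -= eps
numeric = (error(x, t, w1_plus, w2) - error(x, t, w1_minus, w2)) / (2 * eps)
print(numeric, -prod_w1[1][0])  # the two values should agree closely
```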
14 / 62
![Page 15: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/15.jpg)
Bad effect of vanishing (exploding) gradients: a problem
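$$\frac{\partial E(k)}{\partial w^{(m)}_{ji}} = \delta^{(m)}_j z^{(m-1)}_i,$$
$$\delta^{(m)}_j = {f'}^{(m)}_j \sum_i w^{(m+1)}_{ij} \delta^{(m+1)}_i,$$
$$\Rightarrow \frac{\partial E(k)}{\partial w^{(m)}_{ji}} \to 0 \quad \text{for } m \to 1.$$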
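The shrinking of the local gradients towards the first layers can be illustrated on a scalar chain of layers. This is only a sketch: one sigmoid neuron per layer, all weights equal to 1.0, and a top-level error signal of 1.0 are assumptions chosen so that the effect is easy to see (the sigmoid derivative never exceeds 0.25).

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def local_gradients(x, weights, delta_top=1.0):
    """Scalar chain z(m) = sigmoid(w(m) * z(m-1)); returns |delta(m)| for m = 1..M."""
    z = [x]
    for w in weights:
        z.append(sigmoid(w * z[-1]))
    n_layers = len(weights)
    deltas = [0.0] * n_layers
    # Top layer: delta(M) = f'(a(M)) * delta_top, with sigmoid' = z * (1 - z).
    deltas[-1] = z[n_layers] * (1.0 - z[n_layers]) * delta_top
    # Lower layers: delta(m) = f'(a(m)) * w(m+1) * delta(m+1).
    for m in range(n_layers - 2, -1, -1):
        f_prime = z[m + 1] * (1.0 - z[m + 1])
        deltas[m] = f_prime * weights[m + 1] * deltas[m + 1]
    return [abs(d) for d in deltas]

deltas = local_gradients(0.5, [1.0] * 10)
for m, d in enumerate(deltas, start=1):
    print(f"layer {m}: |delta| = {d:.3e}")
```

With ten layers the first layer's local gradient is several orders of magnitude smaller than the last layer's, so its weights barely move under plain gradient descent.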
15 / 62
![Page 16: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/16.jpg)
Bad effect of vanishing (exploding) gradients: two hypotheses
1) increased frequency and severity of bad local minima
2) pathological curvature, like the type seen in the well-known Rosenbrock function:
$$f(x, y) = (1 - x)^2 + 100\,(y - x^2)^2$$
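One way to make "pathological curvature" concrete is to look at the Hessian of the Rosenbrock function at its minimum (1, 1). The sketch below derives the gradient and Hessian by hand and computes the ratio of the largest to the smallest eigenvalue; a large ratio means that a single step size cannot suit both directions, which is what slows plain gradient descent. The helper names are illustrative only.

```python
import math

def rosenbrock(x, y):
    return (1.0 - x) ** 2 + 100.0 * (y - x * x) ** 2

def gradient(x, y):
    dfdx = -2.0 * (1.0 - x) - 400.0 * x * (y - x * x)
    dfdy = 200.0 * (y - x * x)
    return dfdx, dfdy

def hessian(x, y):
    hxx = 2.0 - 400.0 * y + 1200.0 * x * x
    hxy = -400.0 * x
    hyy = 200.0
    return hxx, hxy, hyy

def condition_number(x, y):
    hxx, hxy, hyy = hessian(x, y)
    trace, det = hxx + hyy, hxx * hyy - hxy * hxy
    root = math.sqrt(trace * trace - 4.0 * det)
    lam_max, lam_min = (trace + root) / 2.0, (trace - root) / 2.0
    return lam_max / lam_min

cond = condition_number(1.0, 1.0)
print(rosenbrock(1.0, 1.0), gradient(1.0, 1.0), cond)
```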
16 / 62
![Page 17: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/17.jpg)
Deep Feedforward Neural Networks
• 2-stage training process: i) unsupervised pre-training; ii) fine tuning (vanishing gradients problem is beaten!).
• Number of hidden layers > 1 (usually 6-9).
• 100K – 100M free parameters.
• No (or less) feature preprocessing stage.
17 / 62
![Page 18: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/18.jpg)
Sparse Autoencoders
18 / 62
![Page 19: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/19.jpg)
Dimensionality reduction
• Use a stacked RBM as deep auto-encoder
1. Train RBM with images as input & output
2. Limit one layer to few dimensions
Information has to pass through middle layer
G. E. Hinton and R. R. Salakhutdinov. Reducing the Dimensionality of Data with Neural Networks // Science 313 (2006), p. 504 – 507.
19 / 62
![Page 20: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/20.jpg)
Dimensionality reduction
Olivetti face data, 25x25 pixel images reconstructed from 30 dimensions (625 → 30)
Rows of the figure: Original, Deep RBM, PCA
G. E. Hinton and R. R. Salakhutdinov. Reducing the Dimensionality of Data with Neural Networks // Science 313 (2006), p. 504 – 507.
20 / 62
![Page 21: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/21.jpg)
How to use unsupervised pre-training stage / 1
21 / 62
![Page 22: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/22.jpg)
How to use unsupervised pre-training stage / 2
22 / 62
![Page 23: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/23.jpg)
How to use unsupervised pre-training stage / 3
23 / 62
![Page 24: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/24.jpg)
How to use unsupervised pre-training stage / 4
24 / 62
![Page 25: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/25.jpg)
Unlabeled data
Unlabeled data is readily available
Example: Images from the web
1. Download 10’000’000 images
2. Train a 9-layer DNN
3. Concepts are formed by DNN
G. E. Hinton and R. R. Salakhutdinov. Reducing the Dimensionality of Data with Neural Networks // Science 313 (2006), p. 504 – 507.
25 / 62
![Page 26: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/26.jpg)
Dimensionality reduction
PCA vs. Deep RBM
804’414 Reuters news stories, reduction to 2 dimensions
G. E. Hinton and R. R. Salakhutdinov. Reducing the Dimensionality of Data with Neural Networks // Science 313 (2006), p. 504 – 507.
26 / 62
![Page 27: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/27.jpg)
Hierarchy of trained representations
Low-level feature → Middle-level feature → Top-level feature
Feature visualization of convolutional net trained on ImageNet from [Zeiler & Fergus 2013]
27 / 62
![Page 28: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/28.jpg)
Hessian-Free optimization: Deep Learning with no pre-training stage
J. Martens. Deep Learning via Hessian-free Optimization // Proceedings of the 27th International Conference on Machine Learning (ICML), 2010.
28 / 62
![Page 29: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/29.jpg)
FLOPS comparison
https://ru.wikipedia.org/wiki/FLOPS
| Type | Name | Flops | Cost |
|---|---|---|---|
| Mobile | Raspberry Pi 1st Gen, 700 MHz | 0.04 Gflops | $35 |
| Mobile | Apple A8 | 1.4 Gflops | $700 (in iPhone 6) |
| CPU | Intel Core i7-4930K (Ivy Bridge), 3.7 GHz | 140 Gflops | $700 |
| CPU | Intel Core i7-5960X (Haswell), 3.0 GHz | 350 Gflops | $1300 |
| GPU | NVidia GTX 980 | 4612 Gflops (single precision), 144 Gflops (double precision) | $600 + cost of PC (~$1000) |
| GPU | NVidia Tesla K80 | 8740 Gflops (single precision), 2910 Gflops (double precision) | $4500 + cost of PC (~$1500) |
29 / 62
![Page 30: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/30.jpg)
Deep Networks Training time using GPU
• Pretraining – from 2-3 weeks to 2-3 months.
• Fine-tuning (final supervised training) – from 1 day to 1 week.
30 / 62
![Page 31: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/31.jpg)
Tools for training Deep Neural Networks
D. Kruchinin, E. Dolotov, K. Kornyakov, V. Kustikova, P. Druzhkov. The Comparison of Deep Learning Libraries on the Problem of Handwritten Digit Classification // Analysis of Images, Social Networks and Texts (AIST), 2015, April, 9-11th, Yekaterinburg.
31 / 62
![Page 32: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/32.jpg)
Convolutional Neural Networks: Return of Jedi
Andrej Karpathy and Fei-Fei Li. CS231n: Convolutional Neural Networks for Visual Recognition http://cs231n.github.io/convolutional-networks
Yoshua Bengio, Ian Goodfellow and Aaron Courville. Deep Learning // An MIT Press book in preparation http://www-labs.iro.umontreal.ca/~bengioy/DLbook
32 / 62
![Page 33: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/33.jpg)
AlexNet, CNN-Mega-HiT, results on ILSVRC-2012
A. Krizhevsky, I. Sutskever, G.E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks // Advances in Neural Information Processing Systems 25 (NIPS 2012).
33 / 62
![Page 34: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/34.jpg)
Lazy Deep Learning: idea
A. S. Razavian, H. Azizpour, J. Sullivan, S. Carlsson. CNN Features off-the-shelf: an Astounding Baseline for Recognition // 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 23-28 June 2014, Columbus, USA, p. 512 – 519.
34 / 62
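The Lazy Deep Learning idea (take features from an already trained network, keep them fixed, and train only a light classifier on top) can be sketched on a toy problem. Everything in the sketch is a stand-in: `frozen_features` plays the role of a pretrained CNN such as AlexNet with a pair of fixed ReLU units, a perceptron replaces the SVM used in the talk, and the 1-D data set is invented so that it is not linearly separable in the raw input but becomes separable in the feature space.

```python
def frozen_features(x):
    """Stand-in for a pretrained network: fixed weights, never updated."""
    return [max(0.0, x), max(0.0, -x), 1.0]  # two ReLU units plus a bias input

def score(w, x):
    return sum(wi * pi for wi, pi in zip(w, frozen_features(x)))

def predict(w, x):
    return 1 if score(w, x) > 0 else -1

def train_perceptron(samples, labels, epochs=1000):
    """Only the classifier weights w are trained; the features stay frozen."""
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        mistakes = 0
        for x, label in zip(samples, labels):
            if predict(w, x) != label:
                phi = frozen_features(x)
                w = [wi + label * pi for wi, pi in zip(w, phi)]
                mistakes += 1
        if mistakes == 0:  # every training sample is classified correctly
            break
    return w

# Hypothetical data: class +1 lies far from zero on both sides, class -1 near zero.
samples = [-3.0, -2.5, -2.0, 2.0, 2.5, 3.0, -0.5, 0.0, 0.5]
labels = [1, 1, 1, 1, 1, 1, -1, -1, -1]
w = train_perceptron(samples, labels)
accuracy = sum(predict(w, x) == y for x, y in zip(samples, labels)) / len(samples)
print(w, accuracy)
```

In a real setting the feature function would be the activations of one of the upper layers of the pretrained network, computed once per image and cached.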
![Page 35: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/35.jpg)
Lazy Deep Learning: benchmark results
A. S. Razavian, H. Azizpour, J. Sullivan, S. Carlsson. CNN Features off-the-shelf: an Astounding Baseline for Recognition // 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 23-28 June 2014, Columbus, USA, p. 512 – 519.
35 / 62
![Page 36: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/36.jpg)
MIT-8 toy problem: formulation
• 8 classes
• 2688 images in total
• TRAIN: 2000 images (250 per class)
• TEST: 688 images, 86 per class
S. Banerji, A. Verma, C. Liu. Novel Color LBP Descriptors for Scene and Image Texture Classification // Cross Disciplinary Biometric Systems, 2012, 15th International Conference on Image Processing, Computer Vision, and Pattern Recognition, Las Vegas, Nevada, pp. 205-225.
36 / 62
![Page 37: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/37.jpg)
MIT-8 toy problem: results
| # | Method | Acc. TRAIN | Acc. TEST |
|---|---|---|---|
| 1 | LBP + SVM with RBF kernel | 27.2% | 19.0% |
| 2 | LPQ + SVM with RBF kernel | 38.4% | 30.5% |
| 3 | LBP + SVM with χ2 kernel | 94.2% | 74.0% |
| 4 | LPQ + SVM with χ2 kernel | 99.1% | 82.2% |
| 5 | Deep CNN (AlexNet) + SVM with RBF kernel (LAZY DL) | 95.1% | 91.8% |
| 6 | Deep CNN (AlexNet) + SVM with χ2 kernel (LAZY DL) | 100.0% | 93.2% |
| 7 | Deep CNN (AlexNet) + MLP (LAZY DL) | 100.0% | 92.3% |

Original results, to be published.
37 / 62
![Page 38: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/38.jpg)
ZZ Photo – photo organizer
Trial version is available at http://zzphoto.me
38 / 62
![Page 39: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/39.jpg)
Viola-Jones Object Detector
• Very popular for Human Face Detection.
• May be trained for Cat and Dog Face detection.
• Available free in OpenCV library (http://opencv.org).
O. Parkhi, A. Vedaldi, C. V. Jawahar, and A. Zisserman. The Truth about Cats and Dogs // Proceedings of the International Conference on Computer Vision (ICCV), 2011.
J. Liu, A. Kanazawa, D. Jacobs, P. Belhumeur. Dog Breed Classification Using Part Localization // Lecture Notes in Computer Science Volume 7572, 2012, pp 172-185.
39 / 62
![Page 40: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/40.jpg)
Images pyramid for Viola-Jones
40 / 62
![Page 41: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/41.jpg)
Viola-Jones Object Detector Classifier Structure
P. Viola, M. Jones. Rapid object detection using a boosted cascade of simple features // Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2001.
41 / 62
![Page 42: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/42.jpg)
AlexNet design
A. Krizhevsky, I. Sutskever, G.E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks // Advances in Neural Information Processing Systems 25 (NIPS 2012).
42 / 62
![Page 43: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/43.jpg)
Pets detection problem (Kaggle Dataset + random Other images)
• Kaggle Dataset + random “other” images;
• 2 classes (cats & dogs VS other);
• TRAIN: 5,000 samples;
• TEST: 12,000 samples.
43 / 62
![Page 44: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/44.jpg)
Pets detection results: FAR vs FRR graphs
Original results, to be published.
44 / 62
![Page 45: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/45.jpg)
Pets detection results: ROC curve
Original results, to be published.
45 / 62
![Page 46: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/46.jpg)
Pets detection results, FAR error is fixed to 0.5%
| # | Method | FRR Error |
|---|---|---|
| 1 | Viola-Jones Face Detector for Cats & Dogs + LBP + SVM | 79.73% |
| 2 | AlexNet, argmax (STANDARD DL, ImageNet-2012, 1000) | 32.05% |
| 3 | AlexNet, sum (STANDARD DL, ImageNet-2012, 1000) | 26.11% |
| 4 | AlexNet + SVM linear (LAZY DL) | 4.35% |
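Reporting FRR at a fixed FAR means choosing the decision threshold from the scores of the negative samples and then counting how many positives fall below it. A minimal sketch of that procedure, assuming higher scores mean "pet" and a sample is accepted when its score is at or above the threshold; the function name and the toy scores are illustrative and are not the data behind the table.

```python
import math

def frr_at_far(positive_scores, negative_scores, target_far):
    """Return (threshold, far, frr) for the lowest threshold with FAR <= target_far."""
    candidates = sorted(set(positive_scores) | set(negative_scores))
    candidates.append(math.inf)  # guarantees FAR = 0 is reachable
    for threshold in candidates:
        # FAR: share of negatives that are (wrongly) accepted.
        far = sum(s >= threshold for s in negative_scores) / len(negative_scores)
        if far <= target_far:
            # FRR: share of positives that are (wrongly) rejected.
            frr = sum(s < threshold for s in positive_scores) / len(positive_scores)
            return threshold, far, frr

# Hypothetical classifier scores.
negatives = [0.1, 0.2, 0.3, 0.4, 0.9]
positives = [0.35, 0.6, 0.8, 0.95]
result = frr_at_far(positives, negatives, target_far=0.2)
print(result)  # (0.6, 0.2, 0.25)
```

For the setting on this slide the call would use `target_far=0.005`, which needs at least a few hundred negative samples to be meaningful.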
Original results, to be published.
46 / 62
![Page 47: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/47.jpg)
Development of AlexNet on OpenCV
VGG MatConvNet: CNNs for MATLAB http://www.vlfeat.org/matconvnet/
mexopencv: MATLAB-OpenCV interface http://kyamagu.github.io/mexopencv/matlab
[Diagram: MatConvNet, MATLAB + CUDA; OpenCV app, C++; file formats: YAML; YAML, BIN]
47 / 62
![Page 48: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/48.jpg)
Convolution Layer
Andrej Karpathy and Fei-Fei Li. CS231n: Convolutional Neural Networks for Visual Recognition http://cs231n.github.io/convolutional-networks
Yoshua Bengio, Ian Goodfellow and Aaron Courville. Deep Learning // An MIT Press book in preparation http://www-labs.iro.umontreal.ca/~bengioy/DLbook
48 / 62
![Page 49: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/49.jpg)
Pooling layer
Andrej Karpathy and Fei-Fei Li. CS231n: Convolutional Neural Networks for Visual Recognition http://cs231n.github.io/convolutional-networks
Yoshua Bengio, Ian Goodfellow and Aaron Courville. Deep Learning // An MIT Press book in preparation http://www-labs.iro.umontreal.ca/~bengioy/DLbook
49 / 62
![Page 50: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/50.jpg)
Activation functions
Andrej Karpathy and Fei-Fei Li. CS231n: Convolutional Neural Networks for Visual Recognition http://cs231n.github.io/convolutional-networks
Yoshua Bengio, Ian Goodfellow and Aaron Courville. Deep Learning // An MIT Press book in preparation http://www-labs.iro.umontreal.ca/~bengioy/DLbook
$$f(x) = \max(0, x)$$
$$f'(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$
ReLU activation function
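A direct transcription of the two formulas into Python, as a small sketch. The derivative at x = 0 follows the convention on the slide (value 1); other implementations may pick 0 there.

```python
def relu(x):
    """f(x) = max(0, x)."""
    return max(0.0, x)

def relu_grad(x):
    """f'(x) = 1 for x >= 0, 0 otherwise (slide convention at x = 0)."""
    return 1.0 if x >= 0 else 0.0

values = [-2.0, 0.0, 3.0]
outputs = [relu(v) for v in values]
grads = [relu_grad(v) for v in values]
print(outputs, grads)
```

Unlike a sigmoid, whose derivative never exceeds 0.25, the derivative here is exactly 1 on the active side, so it does not shrink the backpropagated signal by itself.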
50 / 62
![Page 51: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/51.jpg)
Implementation tricks: im2col
K. Chellapilla, S. Puri, P. Simard. High Performance Convolutional Neural Networks for Document Processing // International Workshop on Frontiers in Handwriting Recognition, 2006.
51 / 62
![Page 52: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/52.jpg)
Implementation tricks: im2col for convolution
K. Chellapilla, S. Puri, P. Simard. High Performance Convolutional Neural Networks for Document Processing // International Workshop on Frontiers in Handwriting Recognition, 2006.
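The im2col trick unrolls every image patch into a row of a matrix, so that the convolution layer reduces to one matrix product, which an optimized BLAS routine can then execute. The sketch below is a simplified single-channel version with stride 1 and no padding, written in plain Python for readability; a real implementation would handle several channels and filters and pass the unrolled matrix to a BLAS call. As is usual for CNN layers, the kernel is not flipped (cross-correlation).

```python
def im2col(image, k):
    """Unroll every k x k patch (stride 1, no padding) into one row."""
    height, width = len(image), len(image[0])
    rows = []
    for r in range(height - k + 1):
        for c in range(width - k + 1):
            rows.append([image[r + i][c + j] for i in range(k) for j in range(k)])
    return rows

def conv_via_im2col(image, kernel):
    """Convolution as (patch matrix) x (flattened kernel)."""
    k = len(kernel)
    flat_kernel = [v for row in kernel for v in row]
    patches = im2col(image, k)
    flat_out = [sum(a * b for a, b in zip(patch, flat_kernel)) for patch in patches]
    out_width = len(image[0]) - k + 1
    return [flat_out[i:i + out_width] for i in range(0, len(flat_out), out_width)]

# Hypothetical 3x3 image and 2x2 kernel.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
patches = im2col(image, 2)
output = conv_via_im2col(image, kernel)
print(patches)
print(output)  # [[-4, -4], [-4, -4]]
```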
52 / 62
![Page 53: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/53.jpg)
Matrix multiplication

| Matrices’ size | C | OpenCV | C++ (use STL vector class) | OpenBLAS | Matlab |
|---|---|---|---|---|---|
| 1000×1000 | 1.45 | 1.76 | 1.47 | 0.062 | 0.062 |
| 2000×2000 | 11.64 | 14.2 | 11.23 | 0.99 | 0.54 |
| 3000×3000 | 38.11 | 47.2 | 37.99 | 1.75 | 1.7 |
| 4000×4000 | 90.84 | 110.37 | 90.2 | 7.91 | 4.2 |
| 5000×5000 | 180.74 | 213.4 | 181.02 | 10.8 | 7.3 |
| 6000×6000 | 315.46 | 376.46 | 316.3 | 25.33 | 12.74 |
https://4fire.wordpress.com/2012/04/29/matrices-multiplication-on-windows-matlab-is-the-champion-again/
53 / 62
![Page 54: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/54.jpg)
OpenBLAS
• OpenBLAS is an open source implementation of the BLAS (Basic Linear Algebra Subprograms) API with many hand-crafted optimizations for specific processor types.
http://www.openblas.net/
54 / 62
![Page 55: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/55.jpg)
Sizes of layers
| Layers | 1-4 | 5-8 | 9-10 | 11-12 | 13-15 | 16-17 | 18-19 | 20-21 |
|---|---|---|---|---|---|---|---|---|
| Size, MB | 0.09 | 1.56 | 2.25 | 2.25 | 2.25 | 144.02 | 64.02 | 15.63 |

Group totals on the chart: ~8.5 MB, ~223 MB
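The sizes on this slide can be reproduced with simple arithmetic: number of weights plus biases, times 4 bytes per single-precision value, divided by 2^20. The filter shapes below are not given on the slide. They are an assumption (a CNN-F style variant of AlexNet with 64 filters of 11×11×3 in the first layer), chosen because they reproduce the listed figures to two decimals.

```python
def size_mb(shape, bytes_per_value=4):
    """Size of one layer's weights and biases in MB (2**20 bytes)."""
    n_weights = 1
    for dim in shape:
        n_weights *= dim
    n_biases = shape[-1]  # one bias per output filter
    return (n_weights + n_biases) * bytes_per_value / 2 ** 20

# Assumed filter shapes: (height, width, input channels, output filters).
assumed_shapes = {
    "layers 1-4": (11, 11, 3, 64),
    "layers 5-8": (5, 5, 64, 256),
    "layers 9-10": (3, 3, 256, 256),
    "layers 16-17": (6, 6, 256, 4096),
    "layers 18-19": (1, 1, 4096, 4096),
    "layers 20-21": (1, 1, 4096, 1000),
}
sizes = {name: round(size_mb(shape), 2) for name, shape in assumed_shapes.items()}
for name, mb in sizes.items():
    print(f"{name}: {mb} MB")
```

The three fully connected groups account for almost all of the model size, which is why cutting the network at an earlier layer saves so much memory.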
55 / 62
![Page 56: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/56.jpg)
Pets test #2: data
1 mini-set:
- 500 cats
- 500 dogs
- 1000 negatives
56 / 62
![Page 57: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/57.jpg)
Pets test #2: results
[Plot: FRR, % (0 to 35) versus train size (100, 200, 500, 1000, 2000, 5000, 10000, 18000) for features taken from layers 15, 17 and 19]
57 / 62
![Page 58: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/58.jpg)
Pets test #2: results - FRR, % (FAR is fixed to 0,5%)
| Train size | Layer 15 | Layer 16 | Layer 19 |
|---|---|---|---|
| 100 | 30.08 | 12.61 | 12.94 |
| 500 | 17.91 | 10.41 | 10.72 |
| 1000 | 11.59 | 7.52 | 6.80 |
| 5000 | 7.41 | 3.88 | 4.13 |
| 10000 | 6.29 | 3.66 | 2.71 |
| 18000 | 5.16 | 2.64 | 2.54 |
58 / 62
![Page 59: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/59.jpg)
Calculation speed
[Bar chart: calculation time, ms (0 to 45), by layer # (1 to 20); annotations on the chart: ~73 ms and ~60 ms]
59 / 62
![Page 60: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/60.jpg)
Labeled Faces in the Wild (LFW) Dataset
G. B. Huang, M. Ramesh, T. Berg, E. Learned-Miller. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments // University of Massachusetts, Amherst, Technical Report 07-49, October, 2007
• more than 13,000 images of faces collected from the web.
• Pairs comparison, restricted mode.
• test: 10-fold cross-validation, 6000 face pairs.
60 / 62
![Page 61: Details of Lazy Deep Learning for Images Recognition in ZZ Photo app](https://reader035.vdocument.in/reader035/viewer/2022062308/55c9b09fbb61eb95328b488a/html5/thumbnails/61.jpg)
Face Recognition on LFW, results
Y. Taigman, M. Yang, M. Ranzato, L. Wolf. DeepFace: Closing the Gap to Human-Level Performance in Face Verification, 2014, CVPR.
| # | Method | Accuracy, % |
|---|---|---|
| 1 | Principal Component Analysis (EigenFaces) | 60.2 |
| 2 | Local Binary Pattern Histograms (LBP) | 72.4 |
| 3 | Deep CNN (AlexNet) + Euclid (LAZY DL) | 71.0 |
| 4 | DeepFace by Facebook (STANDARD DL) | 97.25 |
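The "AlexNet + Euclid" row compares two faces by the Euclidean distance between their feature vectors. A minimal sketch of such a pair comparison, assuming the features are already extracted and that a pair is declared "same person" when the distance falls below a threshold tuned on the training folds; the vectors and the threshold value here are made up.

```python
import math

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def same_person(features_a, features_b, threshold):
    """Declare a match when the feature vectors are closer than the threshold."""
    return euclidean_distance(features_a, features_b) < threshold

# Hypothetical 2-D "features"; real CNN features have thousands of dimensions.
face_1 = [0.0, 3.0]
face_2 = [4.0, 0.0]
face_3 = [0.5, 3.0]
distance = euclidean_distance(face_1, face_2)
print(distance, same_person(face_1, face_2, 1.0), same_person(face_1, face_3, 1.0))
```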
61 / 62