![Page 1: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/1.jpg)
Lecture 3: Neural Network Basics & Architecture Design
Xiangyu Zhang
Face++ Researcher
![Page 2: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/2.jpg)
Visual Recognition
A fundamental task in computer vision
• Classification
• Object Detection
• Semantic Segmentation
• Instance Segmentation
• Keypoint Detection
• VQA
…
![Page 3: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/3.jpg)
Why Is Recognition Difficult?
• Multiple Objects
• Pose
• Occlusion
• Inter-class Similarity
![Page 4: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/4.jpg)
Any Silver Bullet?
• Deep Neural Networks
![Page 5: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/5.jpg)
Outline
• Neural Network Basics
• Architecture Design
![Page 6: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/6.jpg)
PART 1: Neural Network Basics
• Motivation
• Deep neural networks
• Convolutional Neural Networks (CNNs)
** Special thanks to Marc'Aurelio Ranzato for the tutorial “Large-Scale Visual Recognition With Deep Learning” at CVPR 2013. All pictures are owned by the authors.
![Page 7: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/7.jpg)
![Page 8: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/8.jpg)
Features for Recognition
![Page 9: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/9.jpg)
Nonlinear Features vs. Linear Classifiers
Feature extractor should be nonlinear!
![Page 10: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/10.jpg)
Learning Non-Linear Features
• Q: Which class of non-linear functions should we consider?
![Page 11: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/11.jpg)
Shallow or Deep
Shallow
Deep
![Page 12: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/12.jpg)
Linear Combination
Drawbacks: Exponential number of templates required!
• Kernel learning
• Boosting
• …
![Page 13: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/13.jpg)
Composition
Main Idea of Deep Learning
![Page 14: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/14.jpg)
Concepts Reuse in Deep Learning
Zeiler M D, Fergus R. Visualizing and understanding convolutional networks
![Page 15: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/15.jpg)
Concepts Reuse in Deep Learning (cont’d)
Zeiler M D, Fergus R. Visualizing and understanding convolutional networks
![Page 16: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/16.jpg)
Concepts Reuse in Deep Learning (cont’d)
Efficiency: intermediate concepts can be re-used
![Page 17: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/17.jpg)
Deep Learning Framework
A problem: Optimization is difficult (non-convex, non-linear system)
![Page 18: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/18.jpg)
Deep Learning Framework (cont’d)
![Page 19: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/19.jpg)
Deep Learning Framework (cont’d)
![Page 20: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/20.jpg)
Summary: Key Ideas of Deep Learning
❖ We need a nonlinear system
❖ We need to learn it from data
❖ Build feature hierarchies (function composition)
❖ End-to-end learning
![Page 21: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/21.jpg)
PART 1: Neural Network Basics
• Motivation
• Deep neural networks
• Convolutional Neural Networks (CNNs)
![Page 22: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/22.jpg)
How to Build a Deep Network?
“Neuron” or “Layer” Design
![Page 23: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/23.jpg)
Shallow Cases
• Linear Case:
• SVM
![Page 24: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/24.jpg)
Shallow Cases (cont’d)
• Linear Case:
• Logistic Regression
Linear transformation + nonlinear activation
![Page 25: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/25.jpg)
Neuron Design
Single Neuron:
Linear Projection + Nonlinear Activation
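The single-neuron design above can be sketched as follows. This is a minimal illustration, assuming a sigmoid activation (as in logistic regression); the names `sigmoid`, `neuron`, `weights`, and `bias` are hypothetical and do not come from the slides.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, one common choice of nonlinear activation."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, weights, bias):
    """Single neuron: linear projection w.x + b, then a nonlinearity."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

# Here w.x + b = 0.5 - 0.5 + 0 = 0, so the sigmoid sits at its midpoint.
print(neuron([1.0, 2.0], [0.5, -0.25], 0.0))  # 0.5
```

Swapping `sigmoid` for another activation (for example a rectifier) changes the nonlinearity but not the overall structure.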
![Page 26: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/26.jpg)
Deep Neural Network
![Page 27: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/27.jpg)
Deep Neural Network (cont’d)
![Page 28: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/28.jpg)
Gradient-based Training
• For each iteration:
1. Forward Propagation
2. Backward Propagation
3. Update Parameters (Optimization)
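The three steps above can be sketched on a toy problem. This is only an illustration, assuming a one-parameter model `y = w * x` with a squared-error loss and plain (full-batch) gradient descent; the function `train` and its arguments are hypothetical names.

```python
def train(data, lr=0.1, steps=100):
    """Fit y ~ w * x by repeating forward pass, backward pass, update."""
    w = 0.0
    for _ in range(steps):
        grad = 0.0
        for x, y in data:
            pred = w * x                   # 1. forward propagation
            grad += 2.0 * (pred - y) * x   # 2. backward propagation: d/dw (pred - y)^2
        w -= lr * grad / len(data)         # 3. parameter update
    return w

# The data follow y = 2x, so w should approach 2.
w = train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
print(round(w, 3))
```

A deep network follows the same loop; only the forward and backward computations become longer chains.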
![Page 29: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/29.jpg)
Forward Propagation (FPROP)
![Page 30: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/30.jpg)
Forward Propagation (FPROP)
This is the typical processing at test time. At training time, we need to compute an error measure and tune the parameters to decrease the error.
![Page 31: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/31.jpg)
Loss Function
![Page 32: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/32.jpg)
Loss Function
Q: How to tune the parameters to decrease the loss?
A: If the loss is (a.e.) differentiable, we can compute gradients. We can use the chain rule, a.k.a. back-propagation, to compute the gradients w.r.t. the parameters at the lower layers.
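As a small worked example of the chain rule, the sketch below back-propagates through a two-layer scalar network and checks the result against a finite-difference estimate. The network (`tanh` hidden unit, squared-error loss) and all names are assumptions chosen for illustration, not taken from the slides.

```python
import math

def forward(w1, w2, x, t):
    """Two-layer scalar network: h = tanh(w1*x), y = w2*h, squared-error loss."""
    h = math.tanh(w1 * x)        # lower layer
    y = w2 * h                   # upper layer
    loss = 0.5 * (y - t) ** 2
    return h, y, loss

def backward(w1, w2, x, t):
    """Gradients of the loss w.r.t. w1 and w2 via the chain rule."""
    h, y, _ = forward(w1, w2, x, t)
    dy = y - t                    # dL/dy
    dw2 = dy * h                  # dL/dw2
    dh = dy * w2                  # dL/dh, passed down to the lower layer
    dw1 = dh * (1.0 - h * h) * x  # tanh'(z) = 1 - tanh(z)^2
    return dw1, dw2

def numeric_grad_w1(w1, w2, x, t, eps=1e-6):
    """Central finite difference, used only to sanity-check backward()."""
    loss_plus = forward(w1 + eps, w2, x, t)[2]
    loss_minus = forward(w1 - eps, w2, x, t)[2]
    return (loss_plus - loss_minus) / (2 * eps)

analytic = backward(0.3, -0.7, 1.5, 0.2)[0]
numeric = numeric_grad_w1(0.3, -0.7, 1.5, 0.2)
print(abs(analytic - numeric) < 1e-6)  # True
```

The gradient for the lower layer (`dw1`) reuses `dh`, the quantity handed down from the upper layer, which is what makes back-propagation efficient.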
![Page 33: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/33.jpg)
Backward Propagation (BPROP)
![Page 34: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/34.jpg)
Backward Propagation (BPROP) (cont’d)
![Page 35: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/35.jpg)
Backward Propagation (BPROP) (cont’d)
![Page 36: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/36.jpg)
Optimization
• Stochastic Gradient Descent (on mini-batches):
• Stochastic Gradient Descent with Momentum:
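The two update rules can be sketched as below. The slide formulas did not survive extraction, so this assumes one common formulation of momentum (velocity `v` updated as `mu * v - lr * grad`, then added to the weights); the function names and the default `mu=0.9` are illustrative assumptions.

```python
def sgd_step(w, grad, lr):
    """Plain SGD: move each weight against its mini-batch gradient."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]

def momentum_step(w, v, grad, lr, mu=0.9):
    """SGD with momentum: accumulate a velocity, then move along it."""
    v = [mu * vi - lr * gi for vi, gi in zip(v, grad)]
    w = [wi + vi for wi, vi in zip(w, v)]
    return w, v

w, v = [1.0], [0.0]
w, v = momentum_step(w, v, grad=[2.0], lr=0.1)  # first step matches plain SGD
w, v = momentum_step(w, v, grad=[2.0], lr=0.1)  # second step is larger
print(w, v)
```

With a constant gradient the velocity grows from step to step, which is the intended accelerating effect along consistent descent directions.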
![Page 37: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/37.jpg)
Summary: Key Ideas of Deep Neural Networks
• Neural Net = stack of feature detectors
• F-Prop / B-Prop
• Learning by SGD
![Page 38: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/38.jpg)
PART 1: Neural Network Basics
• Motivation
• Deep neural networks
• Convolutional Neural Networks (CNNs)
![Page 39: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/39.jpg)
Deep Neural Networks on Images
• How to apply a neural network on 2D or 3D inputs?
![Page 40: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/40.jpg)
Fully-connected Net
![Page 41: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/41.jpg)
Locally-connected Net
STATIONARITY? Statistics are similar at different locations (translation invariance)
![Page 42: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/42.jpg)
Convolutional Net
![Page 43: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/43.jpg)
Convolutional Net (cont’d)
![Page 44: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/44.jpg)
Convolutional Net (cont’d)
![Page 45: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/45.jpg)
Convolutional Net (cont’d)
![Page 46: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/46.jpg)
Convolutional Layer
![Page 47: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/47.jpg)
Convolutional Layer (cont’d)
![Page 48: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/48.jpg)
Summary: Key Ideas of Convolutional Nets
• A standard neural net applied to images:
  • scales quadratically with the size of the input
  • does not leverage stationarity
• Solution:
  • connect each hidden unit to a small patch of the input
  • share the weights across hidden units
• This is called a convolutional network.
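To make local connectivity and weight sharing concrete, here is a minimal sketch of a single-channel convolution (implemented as cross-correlation, without padding or stride) together with a weight count comparison. The function `conv2d_valid`, the toy inputs, and the 32×32 example size are assumptions for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
print(conv2d_valid(image, kernel))  # [[-4, -4], [-4, -4]]

# Weight counts for a 32x32 input with one hidden unit per pixel:
h = w = 32
fully_connected_weights = (h * w) * (h * w)  # every unit sees every pixel
shared_kernel_weights = 2 * 2                # one 2x2 kernel reused everywhere
print(fully_connected_weights, shared_kernel_weights)  # 1048576 4
```

Each output value depends only on a small input patch, and the same four weights are reused at every location.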
![Page 49: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/49.jpg)
Other Layers
• Over the years, some new modules have proven to be very effective when plugged into conv-nets:
![Page 50: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/50.jpg)
Pooling Layer
![Page 51: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/51.jpg)
Pooling Layer
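A pooling layer can be sketched as follows, assuming non-overlapping max pooling (one common variant; the slides do not state which variant is meant). The function name `max_pool2d` and the toy feature map are illustrative.

```python
def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; assumes dimensions divisible by size."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]), size)
        ]
        for i in range(0, len(fmap), size)
    ]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 0, 5, 6],
        [1, 2, 7, 8]]
print(max_pool2d(fmap))  # [[4, 2], [2, 8]]
```

The output is a quarter of the input size, and small shifts of a strong response inside a 2×2 window leave the output unchanged.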
![Page 52: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/52.jpg)
Local Contrast Normalization Layer
![Page 53: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/53.jpg)
Typical Architecture
Q: Where is the nonlinearity?
![Page 54: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/54.jpg)
Typical Architecture (cont’d)
![Page 55: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/55.jpg)
Conv Architecture Example (AlexNet)
Krizhevsky et al. “ImageNet Classification with deep CNNs” NIPS 2012
![Page 56: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/56.jpg)
Convolutional Nets: Training
• All layers are differentiable (a.e.). We can use standard back-propagation.
• Algorithm:
Given a small mini-batch:
1. F-PROP
2. B-PROP
3. PARAMETER UPDATE
![Page 57: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/57.jpg)
Summary: Key Ideas of Conv Nets
• Conv. Nets have special layers like:
  – pooling, and
  – local contrast normalization
• Back-propagation can still be applied.
• These layers are useful to:
  – reduce computational burden
  – increase invariance
  – ease the optimization
![Page 58: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/58.jpg)
PART 2: Architecture Design
• Overview
• Structure design
• Layer design
• Architecture for special tasks
![Page 59: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/59.jpg)
PART 2: Architecture Design
• Overview
• Structure design
• Layer design
• Architecture for special tasks
![Page 60: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/60.jpg)
Architecture Design
• What?
  • Network topology
  • Layer functions
  • Hyper-parameters
  • Optimization algorithms
  …
• Why?
  • Difficult to determine the optimal structures
  • Requirements of different applications, datasets or limitations
![Page 61: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/61.jpg)
Architecture Design (cont’d)
• How?
  • Manually
  • Automatically
• Objective
  • Representation capability
  • Robustness, anti-overfitting
  • Computation or parameter efficiency
  • Ease of optimization
  …
More accuracy, less complexity
![Page 62: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/62.jpg)
PART 2: Architecture Design
• Overview
• Structure design
• Layer design
• Architecture for special tasks
![Page 63: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/63.jpg)
Benchmark: ImageNet Dataset
• 1K classes (for ILSVRC competition)
• 1.2M+ training images, 50K validation images, 100K test images
• ILSVRC competition
Difficulty
• Fine-grained classes
• Large variation
• Costly training
![Page 64: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/64.jpg)
Benchmark: ImageNet Dataset
• 1K classes (for ILSVRC competition)
• 1.2M+ training images, 50K validation images, 100K test images
• ILSVRC competition
Difficulty
• Fine-grained classes
• Large variation
• Costly training
Walker hound
Beagle
English foxhound?
![Page 65: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/65.jpg)
![Page 66: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/66.jpg)
![Page 67: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/67.jpg)
Recent Nets – ImageNet Classification Scores
[Figure: chart of ImageNet classification scores for recent networks; entries labeled by depth: 8, 8, 19, 22, and 152 layers]
![Page 68: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/68.jpg)
AlexNet
Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks
![Page 69: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/69.jpg)
VGGNet
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition
![Page 70: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/70.jpg)
GoogLeNet
Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions
![Page 71: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/71.jpg)
Deep Residual Network
• Easy to optimize
• Enable very deep structures
-- Over 100 layers for ImageNet model
He K, Zhang X, Ren S, et al. Deep residual learning for image recognition
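The core idea of a residual connection, computing `F(x) + x` instead of `F(x)`, can be sketched as below. This is a simplified illustration on plain Python lists: `residual_fn` stands in for the stacked layers `F`, and the activation that usually follows the addition is omitted. All names are hypothetical.

```python
def residual_block(x, residual_fn):
    """Return F(x) + x, where residual_fn plays the role of F."""
    fx = residual_fn(x)
    return [a + b for a, b in zip(fx, x)]

def zero_fn(x):
    """A residual branch that outputs all zeros."""
    return [0.0 for _ in x]

# If F outputs zeros, the block reduces to the identity mapping.
print(residual_block([1.0, -2.0, 3.0], zero_fn))  # [1.0, -2.0, 3.0]
```

Because the identity mapping is recovered when the residual branch outputs zeros, adding more blocks does not have to make the mapping harder to represent, which is one intuition for why such networks are easier to optimize.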
![Page 72: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/72.jpg)
Deep Residual Network (cont’d)
• “Bottleneck” design
• Increasing depth, less complexity
He K, Zhang X, Ren S, et al. Deep residual learning for image recognition
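The "increasing depth, less complexity" point can be checked with a quick weight count. The sketch below compares a two-layer block of 3×3 convolutions with a three-layer bottleneck (1×1, 3×3, 1×1). The channel widths (64 and 256) are assumed for illustration, and biases and normalization parameters are ignored.

```python
def conv_params(k, c_in, c_out):
    """Weights in a k x k convolution from c_in to c_out channels (no bias)."""
    return k * k * c_in * c_out

# Two 3x3 convolutions on 64 channels.
basic = conv_params(3, 64, 64) + conv_params(3, 64, 64)

# Bottleneck on 256 channels: 1x1 reduce, 3x3 at low width, 1x1 restore.
bottleneck = (conv_params(1, 256, 64)
              + conv_params(3, 64, 64)
              + conv_params(1, 64, 256))

print(basic, bottleneck)  # 73728 69632
```

Under these assumptions the bottleneck block has one more layer and four times wider input and output, yet slightly fewer weights.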
![Page 73: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/73.jpg)
Xception
Chollet F. Xception: Deep Learning with Depthwise Separable Convolutions
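A depthwise separable convolution factors a standard convolution into a per-channel spatial convolution followed by a 1×1 (pointwise) convolution. The weight count sketch below illustrates the saving; the 3×3 kernel and 128-channel widths are assumed example values, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel, then a 1x1 pointwise conv."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

print(standard_conv_params(3, 128, 128))        # 147456
print(depthwise_separable_params(3, 128, 128))  # 17536
```

In this example the separable form uses roughly 8 times fewer weights than the standard convolution.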
![Page 74: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/74.jpg)
ResNeXt
Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks
![Page 75: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/75.jpg)
ShuffleNet
Zhang X, Zhou X, Lin M, et al. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
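The channel shuffle operation that gives ShuffleNet its name can be sketched on a list of channel indices: view the channels as a (groups, channels-per-group) grid, transpose it, and flatten. The function name and the six-channel example are illustrative assumptions; a real implementation would reorder feature-map tensors rather than a Python list.

```python
def channel_shuffle(channels, groups):
    """Reorder channels so each group receives channels from every group."""
    n = len(channels) // groups  # channels per group; assumes exact division
    # View as a (groups, n) grid, transpose to (n, groups), then flatten.
    return [channels[g * n + i] for i in range(n) for g in range(groups)]

# Group 0 holds channels 0-2 and group 1 holds channels 3-5 before shuffling.
print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, each consecutive block of channels mixes inputs from both original groups, so a following group convolution can exchange information across groups.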
![Page 76: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/76.jpg)
Densely Connected Convolutional Networks
Huang G, Liu Z, Weinberger K Q, et al. Densely connected convolutional networks
![Page 77: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/77.jpg)
Squeeze-and-Excitation Networks
Hu J, Shen L, Sun G. Squeeze-and-Excitation Networks
![Page 78: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/78.jpg)
Summary: Ideas of Structure Design
• Deeper and wider
• Ease of optimization
• Multi-path design
• Residual path
• Sparse connection
![Page 79: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/79.jpg)
PART 2: Architecture Design
• Overview
• Structure design
• Layer design
• Architecture for special tasks
![Page 80: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/80.jpg)
Spatial Pyramid Pooling
He K, Zhang X, Ren S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition
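The key property of spatial pyramid pooling, a fixed-length output for any input size, can be sketched as follows. This is a simplified single-channel illustration, assuming max pooling over an n×n grid of bins at each pyramid level; the bin-boundary rule, the `levels` default, and the function name `spp` are assumptions made for this sketch.

```python
import math

def spp(fmap, levels=(1, 2)):
    """Max-pool a 2D map into n x n bins per level and concatenate.

    Assumes the map is at least as large as the finest grid in `levels`.
    """
    h, w = len(fmap), len(fmap[0])
    out = []
    for n in levels:
        for i in range(n):
            r0, r1 = (i * h) // n, math.ceil((i + 1) * h / n)
            for j in range(n):
                c0, c1 = (j * w) // n, math.ceil((j + 1) * w / n)
                out.append(max(fmap[r][c]
                               for r in range(r0, r1)
                               for c in range(c0, c1)))
    return out

small = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
large = [[r * 4 + c for c in range(4)] for r in range(5)]
print(len(spp(small)), len(spp(large)))  # 5 5
```

Both inputs yield 1 + 4 = 5 values, so a fully-connected layer placed after this step can accept feature maps of varying size.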
![Page 81: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/81.jpg)
Batch Normalization
Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift
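The training-time computation of batch normalization for a single feature can be sketched as below: normalize with the mini-batch mean and variance, then apply a learned scale and shift. The names `gamma`, `beta`, and `eps` follow common usage and are fixed constants in this sketch; at inference time, running statistics are typically used instead of batch statistics, which is not shown here.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

out = batch_norm([1.0, 2.0, 3.0, 4.0])
print([round(v, 3) for v in out])  # approximately zero mean, unit variance
```

With `gamma=1.0` and `beta=0.0` the output has mean close to 0 and variance close to 1; learning `gamma` and `beta` lets the network restore a different scale when that helps.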
![Page 82: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/82.jpg)
Parametric Rectifiers
He K, Zhang X, Ren S, et al. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification
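A parametric rectifier (PReLU) passes positive inputs unchanged and scales negative inputs by a slope `a` that is learned during training. The sketch below treats `a` as a fixed argument for illustration; the default of 0.25 is used here as a typical initial value.

```python
def prelu(x, a=0.25):
    """Parametric ReLU: identity for x > 0, slope a for x <= 0."""
    return x if x > 0 else a * x

print([prelu(v) for v in [-2.0, 0.0, 3.0]])  # [-0.5, 0.0, 3.0]
```

Setting `a=0.0` recovers the ordinary ReLU, and a small fixed `a` gives a leaky ReLU.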
![Page 83: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/83.jpg)
Bilinear CNNs
Lin T Y, RoyChowdhury A, Maji S. Bilinear CNN models for fine-grained visual recognition
![Page 84: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/84.jpg)
PART 2: Architecture Design
• Overview
• Structure design
• Layer design
• Architecture for special tasks
![Page 85: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/85.jpg)
DeepFace
Taigman Y, Yang M, Ranzato M A, et al. DeepFace: Closing the gap to human-level performance in face verification
![Page 86: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/86.jpg)
Global Convolutional Networks
Peng C, Zhang X, Yu G, et al. Large Kernel Matters--Improve Semantic Segmentation by Global Convolutional Network
![Page 87: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/87.jpg)
Hourglass Networks
Newell A, Yang K, Deng J. Stacked hourglass networks for human pose estimation
![Page 88: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/88.jpg)
Summary: Trends on Architecture Design
• Effectiveness and efficiency
• Task & data specific
• ML & optimization perspective
• Insight & motivation driven
![Page 89: Lecture 3: Neural Network Basics & Architecture Design · Lecture 3: Neural Network Basics & Architecture Design Xiangyu Zhang Face++ Researcher zhangxiangyu@megvii.com](https://reader030.vdocument.in/reader030/viewer/2022040306/5ecaa14a264e2d1550151579/html5/thumbnails/89.jpg)
Thanks