l13 modern cnns training - cc.gatech.edu · case studies •there are several generations of...
TRANSCRIPT
![Page 1: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/1.jpg)
CS 4803 / 7643: Deep Learning
Zsolt KiraGeorgia Tech
Topics: – (Finish) Modern CNNs– Detection & Segmentation architectures
![Page 2: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/2.jpg)
Administrivia• PS2/HW2 out today!
– Due: 03/03, 11:55pm– Less involved than PS1/HW1
• Please send project proposal!– Get feedback before next week
(C) Dhruv Batra and Zsolt Kira 2
![Page 3: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/3.jpg)
Last Time
![Page 4: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/4.jpg)
Handwriting Recognition Example
![Page 5: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/5.jpg)
Translation Invariance
![Page 6: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/6.jpg)
Some Rotation Invariance
![Page 7: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/7.jpg)
Some Scale Invariance
![Page 8: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/8.jpg)
Case Studies• There are several generations of ConvNets
– 2012 – 2014: AlexNet, ZNet, VGGNet• Conv-Relu, Pooling, Fully connected, Softmax• Deeper ones (VGGNet) tend to do better
– 2014• Fully-convolutional networks for semantic segmentation• Matrix outputs rather than just one probability distribution
– 2014-2016• Fully-convolutional networks for classification• Less parameters, faster than comparable Gen1 networks• GoogleNet, ResNet
– 2014-2016• Detection layers (proposals)• Caption generation (combine with RNNs for language)
![Page 9: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/9.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 10: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/10.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 11: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/11.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 12: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/12.jpg)
AlexNet: 60M params
ZNet: 75M
VGG: 138M
GoogleNet: 5M
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 13: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/13.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 14: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/14.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 15: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/15.jpg)
Importance of Depth
• After a while, adding depth decreases performance
• At first, vanishing/exploding gradients
• normalized initialization• Batch normalization• 2nd order methods
• Then, optimization limitation– Deeper network should
be able to mimic shallow ones
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 16: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/16.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 17: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/17.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 18: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/18.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 19: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/19.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 20: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/20.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 21: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/21.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 22: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/22.jpg)
Localization and Detection
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 23: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/23.jpg)
Class ScoresCat: 0.9Dog: 0.05Car: 0.01...
So far: Image Classification
This image is CC0 public domain Vector:4096
Fully-Connected:4096 to 1000
Figure copyright Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, 2012. Reproduced with permission.
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 24: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/24.jpg)
Computer Vision Tasks
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 25: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/25.jpg)
Computer Vision Tasks
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 26: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/26.jpg)
Classification + Localization
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 27: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/27.jpg)
CLS - ImageNet
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 28: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/28.jpg)
Idea 1: Localization as Regression
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 29: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/29.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 30: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/30.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 31: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/31.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 32: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/32.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 33: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/33.jpg)
Per-Class vs. Class Agnostic
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 34: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/34.jpg)
Where to attach?
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 35: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/35.jpg)
Multiple Objects
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 36: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/36.jpg)
Human Pose Estimation
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 37: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/37.jpg)
Sliding Window: Overfeat
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 38: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/38.jpg)
Sliding Window: Overfeat
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 39: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/39.jpg)
Sliding Window: Overfeat
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 40: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/40.jpg)
Sliding Window: Overfeat
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 41: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/41.jpg)
Sliding Window: Overfeat
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 42: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/42.jpg)
Sliding Window: Overfeat
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 43: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/43.jpg)
Sliding Window: Overfeat
Why aren’t boxes across grid?Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 44: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/44.jpg)
Sliding Window: Overfeat
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 45: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/45.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 46: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/46.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 47: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/47.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 48: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/48.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 49: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/49.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 50: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/50.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 51: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/51.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 52: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/52.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 53: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/53.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 54: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/54.jpg)
Detection as Classification
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 55: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/55.jpg)
Detection as Classification
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 56: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/56.jpg)
Detection as Classification
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 57: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/57.jpg)
Detection as Classification
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 58: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/58.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 59: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/59.jpg)
Detection as Classification
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 60: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/60.jpg)
R-CNN
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 61: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/61.jpg)
R-CNN Training
![Page 62: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/62.jpg)
R-CNN Training
![Page 63: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/63.jpg)
R-CNN Training
![Page 64: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/64.jpg)
R-CNN Training
![Page 65: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/65.jpg)
R-CNN Training
![Page 66: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/66.jpg)
R-CNN Training
![Page 67: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/67.jpg)
Datasets
![Page 68: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/68.jpg)
Evaluation Metric
![Page 69: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/69.jpg)
R-CNN Results
![Page 70: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/70.jpg)
R-CNN Results
![Page 71: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/71.jpg)
R-CNN Results
![Page 72: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/72.jpg)
R-CNN Results
![Page 73: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/73.jpg)
R-CNN Results
![Page 74: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/74.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 75: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/75.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 76: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/76.jpg)
Region of Interest (ROI) Pooling
![Page 77: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/77.jpg)
Region of Interest (ROI) Pooling
![Page 78: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/78.jpg)
Region of Interest (ROI) Pooling
![Page 79: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/79.jpg)
Region of Interest (ROI) Pooling
![Page 80: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/80.jpg)
Region of Interest (ROI) Pooling
![Page 81: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/81.jpg)
Fast R-CNN Results
![Page 82: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/82.jpg)
Fast R-CNN Results
![Page 83: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/83.jpg)
Fast R-CNN Results
![Page 84: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/84.jpg)
Fast R-CNN Results
![Page 85: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/85.jpg)
Fast R-CNN Results
![Page 86: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/86.jpg)
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 87: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/87.jpg)
Slide Credit: Yacov Hel-Or
![Page 88: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/88.jpg)
Slide Credit: Yacov Hel-Or
![Page 89: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/89.jpg)
https://www.youtube.com/watch?v=KYNDzlcQMWA
Joint Coordinates
Also used for pose estimation
![Page 90: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/90.jpg)
![Page 91: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/91.jpg)
Computer Vision Tasks
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 92: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/92.jpg)
Semantic Segmentation
Cow
Grass
Sky
Label each pixel in the image with a category label
Don’t differentiate instances, only care about pixels
This image is CC0 public domain
Grass
Cat
Sky
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 93: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/93.jpg)
Semantic Segmentation Idea: Sliding Window
Full image
Extract patchClassify center pixel with CNN
Cow
Cow
Grass
Farabet et al, “Learning Hierarchical Features for Scene Labeling,” TPAMI 2013Pinheiro and Collobert, “Recurrent Convolutional Neural Networks for Scene Labeling”, ICML 2014
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 94: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/94.jpg)
Semantic Segmentation Idea: Sliding Window
Full image
Extract patchClassify center pixel with CNN
Cow
Cow
GrassProblem: Very inefficient! Not reusing shared features between overlapping patches Farabet et al, “Learning Hierarchical Features for Scene Labeling,” TPAMI 2013
Pinheiro and Collobert, “Recurrent Convolutional Neural Networks for Scene Labeling”, ICML 2014
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 95: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/95.jpg)
Semantic Segmentation Idea: Fully Convolutional
Input:3 x H x W
Convolutions:D x H x W
Conv Conv Conv Conv
Scores:C x H x W
argmax
Predictions:H x W
Design a network as a bunch of convolutional layers to make predictions for pixels all at once!
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 96: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/96.jpg)
Semantic Segmentation Idea: Fully Convolutional
Input:3 x H x W
Convolutions:D x H x W
Conv Conv Conv Conv
Scores:C x H x W
argmax
Predictions:H x W
Design a network as a bunch of convolutional layers to make predictions for pixels all at once!
Problem: convolutions at original image resolution will be very expensive ...
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 97: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/97.jpg)
Dilated Convolutions
(C) Dhruv Batra 98Figure Credit: Dumoulin and Visin, https://arxiv.org/pdf/1603.07285.pdf
![Page 98: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/98.jpg)
Dilated Convolutions (Examples)
(C) Dhruv Batra 99
![Page 99: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/99.jpg)
(C) Dhruv Batra 100Figure Credit: Yu and Koltun, ICLR16
![Page 100: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/100.jpg)
Different Idea: Still Classify, Change Test Time
• Can run on an image of any size!
(C) Dhruv Batra 101Figure Credit: [Long, Shelhamer, Darrell CVPR15]
![Page 101: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/101.jpg)
Better Semantic Segmentation Idea: Fully Convolutional with Bottleneck
Input:3 x H x W Predictions:
H x W
Design network as a bunch of convolutional layers, with downsampling and upsampling inside the network!
High-res:D1 x H/2 x W/2
High-res:D1 x H/2 x W/2
Med-res:D2 x H/4 x W/4
Med-res:D2 x H/4 x W/4
Low-res:D3 x H/4 x W/4
Long, Shelhamer, and Darrell, “Fully Convolutional Networks for Semantic Segmentation”, CVPR 2015Noh et al, “Learning Deconvolution Network for Semantic Segmentation”, ICCV 2015
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 102: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/102.jpg)
Semantic Segmentation Idea: Fully Convolutional
Input:3 x H x W Predictions:
H x W
Design network as a bunch of convolutional layers, with downsampling and upsampling inside the network!
High-res:D1 x H/2 x W/2
High-res:D1 x H/2 x W/2
Med-res:D2 x H/4 x W/4
Med-res:D2 x H/4 x W/4
Low-res:D3 x H/4 x W/4
Long, Shelhamer, and Darrell, “Fully Convolutional Networks for Semantic Segmentation”, CVPR 2015Noh et al, “Learning Deconvolution Network for Semantic Segmentation”, ICCV 2015
Downsampling:Pooling, strided convolution
Upsampling:???
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 103: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/103.jpg)
In-Network upsampling: “Unpooling”
1 2
3 4
Input: 2 x 2 Output: 4 x 4
1 1 2 2
1 1 2 2
3 3 4 4
3 3 4 4
Nearest Neighbor
1 2
3 4
Input: 2 x 2 Output: 4 x 4
1 0 2 0
0 0 0 0
3 0 4 0
0 0 0 0
“Bed of Nails”
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 104: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/104.jpg)
In-Network upsampling: “Max Unpooling”
Input: 4 x 4
1 2 6 3
3 5 2 1
1 2 2 1
7 3 4 8
1 2
3 4
Input: 2 x 2 Output: 4 x 4
0 0 2 0
0 1 0 0
0 0 0 0
3 0 0 4
Max UnpoolingUse positions from pooling layer
5 6
7 8
Max PoolingRemember which element was max!
… Rest of the network
Output: 2 x 2
Corresponding pairs of downsampling and upsampling layers
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 105: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/105.jpg)
Learnable Upsampling: Transpose Convolution
Recall:Typical 3 x 3 convolution, stride 1 pad 1
Input: 4 x 4 Output: 4 x 4
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 106: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/106.jpg)
Learnable Upsampling: Transpose Convolution
Recall: Normal 3 x 3 convolution, stride 1 pad 1
Input: 4 x 4 Output: 4 x 4
Dot product between filter and input
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 107: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/107.jpg)
Learnable Upsampling: Transpose Convolution
Input: 4 x 4 Output: 4 x 4
Dot product between filter and input
Recall: Normal 3 x 3 convolution, stride 1 pad 1
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 108: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/108.jpg)
Input: 4 x 4 Output: 2 x 2
Learnable Upsampling: Transpose Convolution
Recall: Normal 3 x 3 convolution, stride 2 pad 1
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 109: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/109.jpg)
Input: 4 x 4 Output: 2 x 2
Dot product between filter and input
Learnable Upsampling: Transpose Convolution
Recall: Normal 3 x 3 convolution, stride 2 pad 1
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 110: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/110.jpg)
Learnable Upsampling: Transpose Convolution
Input: 4 x 4 Output: 2 x 2
Dot product between filter and input
Filter moves 2 pixels in the input for every one pixel in the output
Stride gives ratio between movement in input and output
Recall: Normal 3 x 3 convolution, stride 2 pad 1
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 111: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/111.jpg)
Learnable Upsampling: Transpose Convolution
3 x 3 transpose convolution, stride 2 pad 1
Input: 2 x 2 Output: 4 x 4
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 112: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/112.jpg)
Input: 2 x 2 Output: 4 x 4
Input gives weight for filter
Learnable Upsampling: Transpose Convolution
3 x 3 transpose convolution, stride 2 pad 1
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 113: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/113.jpg)
Input: 2 x 2 Output: 4 x 4
Input gives weight for filter
Sum where output overlaps
Learnable Upsampling: Transpose Convolution
3 x 3 transpose convolution, stride 2 pad 1
Filter moves 2 pixels in the output for every one pixel in the input
Stride gives ratio between movement in output and input
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 114: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/114.jpg)
Input: 2 x 2 Output: 4 x 4
Input gives weight for filter
Sum where output overlaps
Learnable Upsampling: Transpose Convolution
3 x 3 transpose convolution, stride 2 pad 1
Filter moves 2 pixels in the output for every one pixel in the input
Stride gives ratio between movement in output and input
Other names:-Deconvolution (bad)-Upconvolution-Fractionally stridedconvolution-Backward stridedconvolution
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 115: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/115.jpg)
Transpose Convolution: 1D Example
a
b
x
y
z
ax
ay
az + bx
by
bz
Input FilterOutput
Output contains copies of the filter weighted by the input, summing at where at overlaps in the output
Need to crop one pixel from output to make output exactly 2x input
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 116: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/116.jpg)
Transposed Convolution• https://distill.pub/2016/deconv-checkerboard/
(C) Dhruv Batra 117
![Page 117: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/117.jpg)
Semantic Segmentation Idea: Fully Convolutional
Input:3 x H x W Predictions:
H x W
Design network as a bunch of convolutional layers, with downsampling and upsampling inside the network!
High-res:D1 x H/2 x W/2
High-res:D1 x H/2 x W/2
Med-res:D2 x H/4 x W/4
Med-res:D2 x H/4 x W/4
Low-res:D3 x H/4 x W/4
Long, Shelhamer, and Darrell, “Fully Convolutional Networks for Semantic Segmentation”, CVPR 2015Noh et al, “Learning Deconvolution Network for Semantic Segmentation”, ICCV 2015
Downsampling:Pooling, strided convolution
Upsampling:Unpooling or strided transpose convolution
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
![Page 118: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/118.jpg)
skip layers
interp + sum
interp + sum
dense output119
end-to-end, joint learningof semantics and location
Slide credit: Jonathan Long
![Page 119: L13 modern cnns training - cc.gatech.edu · Case Studies •There are several generations of ConvNets –2012 –2014: AlexNet, ZNet, VGGNet •Conv-Relu, Pooling, Fully connected,](https://reader034.vdocument.in/reader034/viewer/2022050717/5e160ff4ff94bd64f512736a/html5/thumbnails/119.jpg)
Next Time• Guest lecture by Zhile Ren (Dhruv Batra post-doc) on 3D
120