fast r-cnnweb.cs.ucdavis.edu/~yjlee/teaching/ecs289g-winter2018/fast_rcnn.… · confidence produce...
TRANSCRIPT
Fast R-CNNRoss Girshick
Facebook AI Research (FAIR)Work done at Microsoft Research
Presented by: Nick Joodi ⎸ Doug Sherman
Fast Region-based ConvNets (R-CNNs)
Fast Sorry about the black BG, Girshick’s slides were all black. 2
The Pascal Visual Object Classes Challenge● Overview
○ Classification, Detection, Segmentation○ For each image:
■ Does it contain the class? → classification■ Where is it? → detection via bounding box
● Evaluation○ Mean Average Precision (mAP)
■ Participants submitted results in the form of confidence
■ Produce Precision Recall curves● Average precision for each class - ● Take mean to get mAP
3
Object detection renaissance (2013-Present)
Adapted from Fast R-CNN [R. Girshick (2015)]4
Object detection renaissance (2013-Present)
Adapted from Fast R-CNN [R. Girshick (2015)]5
Object detection renaissance (2013-Present)
Adapted from Fast R-CNN [R. Girshick (2015)]6
1. Pre-existing Modelsa. “Slow” R-CNNb. SPP-net
2. Ways to improvea. SGD Mini-Batchb. New Loss Function
3. Fast R-CNNa. Architectureb. Results & Future Work
Agenda
7
● R-CNN (aka “slow R-CNN”) [Girshick et al. CVPR14]● SPP-net [He et al. ECCV14]
Region-based convnets (R-CNNs)
8
Adapted from Fast R-CNN [R. Girshick (2015)]Adapted from Fast R-CNN [R. Girshick (2015)]9
Adapted from Fast R-CNN [R. Girshick (2015)]10
Adapted from Fast R-CNN [R. Girshick (2015)]11
Adapted from Fast R-CNN [R. Girshick (2015)]12
Adapted from Fast R-CNN [R. Girshick (2015)]13
Adapted from Fast R-CNN [R. Girshick (2015)]14
What’s wrong with slow R-CNN?● Ad hoc training objectives
○ Fine-tune network with softmax classifier (log loss)○ Train post-hoc linear SVMs (hinge loss)○ Train post-hoc bounding-box regressors (L2 loss)
15
What’s wrong with slow R-CNN?● Ad hoc training objectives
○ Fine-tune network with softmax classifier (log loss)○ Train post-hoc linear SVMs (hinge loss)○ Train post-hoc bounding-box regressors (L2 loss)
● Training is slow (84h), takes a lot of disk space
16
What’s wrong with slow R-CNN?● Ad hoc training objectives
○ Fine-tune network with softmax classifier (log loss)○ Train post-hoc linear SVMs (hinge loss)○ Train post-hoc bounding-box regressors (L2 loss)
● Training is slow (84h), takes a lot of disk space● Inference (detection) is slow
○ 47s / image with VGG16 [Simonyan & Zisserman. ICLR15]○ Fixed by SPP-net [He et al. ECCV14]
17
1. Pre-existing Modelsa. “Slow” R-CNNb. SPP-net
2. Ways to improvea. SGD Mini-Batchb. New Loss Function
3. Fast R-CNNa. Architectureb. Results & Future Work
Agenda
18
Adapted from Fast R-CNN [R. Girshick (2015)]19
Adapted from Fast R-CNN [R. Girshick (2015)]20
Adapted from Fast R-CNN [R. Girshick (2015)]21
Adapted from Fast R-CNN [R. Girshick (2015)]22
Adapted from Fast R-CNN [R. Girshick (2015)]23
Adapted from Fast R-CNN [R. Girshick (2015)]24
Pyramid Pooling Layer
25
1 2
3 4
RegionOutput of Pooling Concatenated
(w/2 x h/2)(4 x 2)
(w/4 x h/4)(2 x 1)
(w/1 x h/1)(8 x 2)
To FC
Stride/Window Size
8
4
1
2
3
4
Adapted from Fast R-CNN [R. Girshick (2015)]26
What’s wrong with SPP-net?● Inherits the rest of R-CNN’s problems
○ Ad hoc training objective○ Training is slow (25h), takes a lot of disk space
● Introduces a new problem: cannot update parameters below SPP layer during training
27
Adapted from Fast R-CNN [R. Girshick (2015)]28
Agenda1. Pre-existing Models
a. “Slow” R-CNNb. SPP-net
2. Ways to improvea. SGD Mini-Batchb. New Loss Function
3. Fast R-CNNa. Architectureb. Results & Future Work
29
SGD Mini-Batch Method for RoIs
Adapted from Fast R-CNN [R. Girshick (2015)]30
SGD Mini-Batch Method for RoIs
Adapted from Fast R-CNN [R. Girshick (2015)]31
SGD Mini-Batch Method for RoIs
Adapted from Fast R-CNN [R. Girshick (2015)]32
Input size for SPP-net
SGD Mini-Batch Method for RoIs
Adapted from Fast R-CNN [R. Girshick (2015)]33
SGD Mini-Batch Method for RoIs
Adapted from Fast R-CNN [R. Girshick (2015)]34
SGD Mini-Batch Method for RoIs
Adapted from Fast R-CNN [R. Girshick (2015)]35
SGD Mini-Batch Method for RoIs
Adapted from Fast R-CNN [R. Girshick (2015)]36
SGD Mini-Batch Method for RoIs
Adapted from Fast R-CNN [R. Girshick (2015)]37
Agenda1. Pre-existing Models
a. “Slow” R-CNNb. SPP-net
2. Ways to improvea. SGD Mini-Batchb. New Loss Function
3. Fast R-CNNa. Architectureb. Resultsc. Future Work
38
Revised loss function For the classification For the bounding box
p: Predicted RoI Classificationu: True RoI Classification
tu = (tx,ty,tw,th): Predicted Bounding Boxv = (vx,vy,vw,vh): True Bounding Box
ƛ : Controls the balance between the two losses
39
Revised loss function
40
Revised loss function
Smooth: Continuously Differentiable41
1. Pre-existing Modelsa. “Slow” R-CNNb. SPP-net
2. Ways to improvea. SGD Mini-Batchb. New Loss Function
3. Fast R-CNNa. Architectureb. Results & Future Work
Agenda
42
Fast R-CNN● Fast test-time, like SPP-net● One network, trained in one stage● Higher mean average precision than slow R-CNN and SPP-net
43
Adapted from Fast R-CNN [R. Girshick (2015)]44
Adapted from Fast R-CNN [R. Girshick (2015)]45
Adapted from Fast R-CNN [R. Girshick (2015)]46
Adapted from Fast R-CNN [R. Girshick (2015)]47
Adapted from Fast R-CNN [R. Girshick (2015)]48
Adapted from Fast R-CNN [R. Girshick (2015)]49
Adapted from Fast R-CNN [R. Girshick (2015)]50
Agenda1. Pre-existing Models
a. “Slow” R-CNNb. SPP-net
2. Ways to improvea. SGD Mini-Batchb. New Loss Function
3. Fast R-CNNa. Architectureb. Results & Future Work
51
Adapted from Fast R-CNN [R. Girshick (2015)]52
Adapted from Fast R-CNN [R. Girshick (2015)]53
Adapted from Fast R-CNN [R. Girshick (2015)]54
● Out-of-network region proposals○ Selective search: 2s / img; EdgeBoxes: 0.2s / img
● Fortunately, this has already been solved
S. Ren, K. He, R. Girshick & J. Sun. “Faster R'CNN: Towards Real'Time Object Detection with Region Proposal Networks.” NIPS (2015).
What’s still wrong?
55
Fast R-CNN take-aways● End-to-end training of deep ConvNets for object detection● Fast training times● Open source for easy experimentation● A large number of ImageNet detection and COCO detection methods are
built on Fast R-CNN
56