the viola/jones face detector a “paradigmatic” method for real-time object detection training is...
TRANSCRIPT
![Page 1: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/1.jpg)
The Viola/Jones Face Detector
• A “paradigmatic” method for real-time object detection
• Training is slow, but detection is very fast• Key ideas
• Integral images for fast feature evaluation• Boosting for feature selection• Attentional cascade for fast rejection of non-face windows
P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.
Slides by Robert Fergus
![Page 2: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/2.jpg)
Image Features
“Rectangle filters”
Value =
∑ (pixels in white area) – ∑ (pixels in black area)
![Page 3: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/3.jpg)
Example
Source
Result
![Page 4: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/4.jpg)
Fast computation with integral images
• The integral image computes a value at each pixel (x,y) that is the sum of the pixel values above and to the left of (x,y), inclusive
• This can quickly be computed in one pass through the image
(x,y)
![Page 5: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/5.jpg)
Computing sum within a rectangle
• Let A,B,C,D be the values of the integral image at the corners of a rectangle
• Then the sum of original image values within the rectangle can be computed as: sum = A – B – C + D
• Only 3 additions are required for any size of rectangle!• This is now used in many areas
of computer vision
D B
C A
![Page 6: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/6.jpg)
Example
-1 +1+2
-1
-2
+1
Integral Image
(x,y)(x,y)
![Page 7: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/7.jpg)
Feature selection
• For a 24x24 detection region, the number of possible rectangle features is ~180,000!
![Page 8: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/8.jpg)
Feature selection
• For a 24x24 detection region, the number of possible rectangle features is ~180,000!
• At test time, it is impractical to evaluate the entire feature set
• Can we create a good classifier using just a small subset of all possible features?
• How to select such a subset?
![Page 9: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/9.jpg)
Boosting
• Boosting is a classification scheme that works by combining weak learners into a more accurate ensemble classifier
• Weak learner: classifier with accuracy that need be only better than chance
• We can define weak learners based on rectangle features:
Y. Freund and R. Schapire, A short introduction to boosting, Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September, 1999.
![Page 10: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/10.jpg)
AdaBoost
• Given a set of weak classifiers
– None much better than random
• Iteratively combine classifiers– Form a linear combination
– Training error converges to 0 quickly– Test error is related to training margin
}1,1{)( :originally xjh
t
t bxhxC )()(
Y. Freund and R. Schapire, A short introduction to boosting, Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September, 1999.
![Page 11: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/11.jpg)
60,000 features to choose from60,000 features to choose from
Boosted Face Detection: Image Features
“Rectangle filters”
Similar to Haar wavelets Papageorgiou, et al.
otherwise
)( if )(
t
tittit
xfxh
t
t bxhxC )()(
![Page 12: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/12.jpg)
Boosting outline
• Initially, give equal weight to each training example
• Iterative training procedure• Find best weak learner for current weighted training set• Raise the weights of training examples misclassified by current
weak learner
• Compute final classifier as linear combination of all weak learners (weight of each learner is related to its accuracy)
Y. Freund and R. Schapire, A short introduction to boosting, Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September, 1999.
![Page 13: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/13.jpg)
Boosting
Weak Classifier 1
![Page 14: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/14.jpg)
Boosting
WeightsIncreased
![Page 15: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/15.jpg)
Boosting
Weak Classifier 2
![Page 16: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/16.jpg)
Boosting
WeightsIncreased
![Page 17: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/17.jpg)
Boosting
Weak Classifier 3
![Page 18: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/18.jpg)
Boosting
Final classifier is linear combination of weak classifiers
![Page 19: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/19.jpg)
• For each round of boosting:• Evaluate each rectangle filter on each example• Select best threshold for each filter • Select best filter/threshold combination• Reweight examples
• Computational complexity of learning: O(MNT)• M filters, N examples, T thresholds
Boosting for face detection
![Page 20: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/20.jpg)
First two features selected by boosting
![Page 21: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/21.jpg)
Cascading classifiers
• We start with simple classifiers which reject many of the negative sub-windows while detecting almost all positive sub-windows
• Positive results from the first classifier triggers the evaluation of a second (more complex) classifier, and so on
• A negative outcome at any point leads to the immediate rejection of the sub-window
FACEIMAGESUB-WINDOW
Classifier 1T
Classifier 3T
F
NON-FACE
TClassifier 2
T
F
NON-FACE
F
NON-FACE
![Page 22: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/22.jpg)
Cascading classifiers
• Chain classifiers that are progressively more complex and have lower false positive rates: vs false neg determined by
% False Pos
% D
etec
tion
0 50
50
100
FACEIMAGESUB-WINDOW
Classifier 1T
Classifier 3T
F
NON-FACE
TClassifier 2
T
F
NON-FACE
F
NON-FACE
Receiver operating characteristic
![Page 23: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/23.jpg)
Training the cascade
• Adjust weak learner threshold to minimize false negatives (as opposed to total classification error)
• Each classifier trained on false positives of previous stages• A single-feature classifier achieves 100% detection rate and
about 50% false positive rate• A five-feature classifier achieves 100% detection rate and
40% false positive rate (20% cumulative)• A 20-feature classifier achieve 100% detection rate with 10%
false positive rate (2% cumulative)
1 Feature 5 Features
F
50%20 Features
20% 2%
FACE
NON-FACE
F
NON-FACE
F
NON-FACE
IMAGESUB-WINDOW
![Page 24: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/24.jpg)
The implemented system
• Training Data• 5000 faces
– All frontal, rescaled to 24x24 pixels
• 300 million non-faces– 9500 non-face images
• Faces are normalized– Scale, translation
• Many variations• Across individuals• Illumination• Pose
(Most slides from Paul Viola)
![Page 25: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/25.jpg)
System performance
• Training time: “weeks” on 466 MHz Sun workstation
• 38 layers, total of 6061 features• Average of 10 features evaluated per window
on test set• “On a 700 Mhz Pentium III processor, the
face detector can process a 384 by 288 pixel image in about .067 seconds” • 15 Hz• 15 times faster than previous detector of comparable
accuracy (Rowley et al., 1998)
![Page 26: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/26.jpg)
Output of Face Detector on Test Images
![Page 27: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/27.jpg)
Other detection tasks
Facial Feature Localization
Male vs. female
Profile Detection
![Page 28: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/28.jpg)
Profile Detection
![Page 29: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/29.jpg)
Profile Features
![Page 30: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/30.jpg)
Summary: Viola/Jones detector
• Rectangle features• Integral images for fast computation• Boosting for feature selection• Attentional cascade for fast rejection of
negative windows
![Page 31: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/31.jpg)
Overview
Face RecognitionBrief review of EigenfacesActive Appearance modelsFace DetectionViola & Jones real-time face detectorConvolutional Neural NetworksSpecific Object RecognitionSIFT based recognition
![Page 32: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/32.jpg)
• Application of Convolutional Neural Networks to Face Detection
Osadchy, Miller, LeCun. Face Detection and Pose Estimation, 2004
![Page 33: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/33.jpg)
• Non-linear dimensionality reduction
Osadchy, Miller, LeCun. Face Detection and Pose Estimation, 2004
![Page 34: The Viola/Jones Face Detector A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images](https://reader031.vdocument.in/reader031/viewer/2022013004/5697bf901a28abf838c8df89/html5/thumbnails/34.jpg)
Osadchy, Miller, LeCun. Face Detection and Pose Estimation, 2004