object recognition methods for classification and image representation
TRANSCRIPT
![Page 1: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/1.jpg)
Object recognition
Methods for classification and image representation
![Page 2: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/2.jpg)
Credits
• Paul Viola, Michael Jones, Robust Real-time Object Detection, IJCV 04
• Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection, CVPR05
• Kristen Grauman, Gregory Shakhnarovich, and Trevor Darrell, Virtual Visual Hulls: Example-Based 3D Shape Inference from Silhouettes
• S. Lazebnik, C. Schmid, and J. Ponce. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories.
• Yoav Freund Robert E. Schapire, A Short Introduction to Boosting
![Page 3: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/3.jpg)
Object recognition
• What is it?– Instance– Category– Something with a tail
• Where is it?– Localization– Segmentation
• How many are there?
![Page 4: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/4.jpg)
Object recognition
• What is it?– Instance– Category– Something with a tail
• Where is it?– Localization– Segmentation
• How many are there?
![Page 5: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/5.jpg)
Face detection
features? classify+1 face
-1 not face
• We slide a window over the image• Extract features for each window• Classify each window into face/non-face
x F(x) y
? ?
![Page 6: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/6.jpg)
What is a face?
• Eyes are dark (eyebrows+shadows)• Cheeks and forehead are bright.• Nose is bright
Paul Viola, Michael Jones, Robust Real-time Object Detection, IJCV 04
![Page 7: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/7.jpg)
Basic feature extraction
• Information type:– intensity
• Sum over:– gray and white rectangles
• Output: gray-white• Separate output value for
– Each type– Each scale– Each position in the window
• FEX(im)=x=[x1,x2,…….,xn]
Paul Viola, Michael Jones, Robust Real-time Object Detection, IJCV 04
x120x357
x629 x834
![Page 8: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/8.jpg)
Face detection
features? classify+1 face
-1 not face
• We slide a window over the image• Extract features for each window• Classify each window into face/non-face
x F(x) y
![Page 9: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/9.jpg)
Classification
• Examples are points in Rn
• Positives are separated from negatives by the hyperplane w
• y=sign(wTx-b)
+ +++
+
++
+-
-
--
-
--
-
w
![Page 10: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/10.jpg)
Classification
• x Rn - data points• P(x) - distribution of the data• y(x) - true value of y for each x• F - decision function:
y=F(x, ) - parameters of F,
e.g. =(w,b)• We want F that makes few
mistakes
+ +++
+
++
+-
-
--
-
--
-
w
![Page 11: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/11.jpg)
Loss function
• Our decision may have severe implications
• L(y(x),F(x, )) - loss functionHow much we pay for predicting F(x,), when the true value is y(x)
• Classification error:
• Hinge loss
+ +++
+
++
+-
-
--
-
--
-
w
POSSIBLE CANCER
ABSOLUTELY NORISK OF CANCER
![Page 12: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/12.jpg)
Learning
• Total loss shows how good a function (F, ) is:
• Learning is to find a function to minimize the loss:
• How can we see all possible x?
![Page 13: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/13.jpg)
Datasets
• Dataset is a finite sample {xi} from P(x)• Dataset has labels {(xi,yi)}• Datasets today are big to ensure the sampling is
fair
#images #classes #instances
Caltech 256 30608 256 30608
Pascal VOC 4340 20 10363
LabelMe 176975 ??? 414687
![Page 14: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/14.jpg)
Overfitting
• A simple dataset.
• Two models
+ +
+
+
+
+
+
+-
-
-
-
-
--
-
+
+ +
+
++
+
+-
-
-
-
-
--
-
+
+Linear Non-linear
![Page 15: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/15.jpg)
Overfitting
+ +
+
+
+
+
+
+-
-
-
-
-
--
-
+
+
++ +-
--
-
--
--
--
++
+
+
+ +
+
+
+
+
+
+-
-
-
-
-
--
-
+
+
++ +-
--
-
--
--
--
++
+
+
• Let’s get more data.• Simple model has better generalization.
![Page 16: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/16.jpg)
Overfitting
• As complexity increases, the model overfits the data
• Training loss decreases
• Real loss increases• We need to penalize
model complexity = to regularize
Model complexity
Training loss
Real loss
Loss
![Page 17: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/17.jpg)
Overfitting
• Split the dataset– Training set– Validation set– Test set
• Use training set to optimize model parameters
• Use validation test to choose the best model
• Use test set only to measure the expected loss
Model complexity
Training set loss
Test set loss
Loss
Validation set loss
Stopping point
![Page 18: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/18.jpg)
Classification methods
• K Nearest Neighbors
• Decision Trees
• Linear SVMs
• Kernel SVMs
• Boosted classifiers
![Page 19: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/19.jpg)
K Nearest Neighbors
• Memorize all training data
• Find K closest points to the query
• The neighbors vote for the label:
Vote(+)=2
Vote(–)=1
+ +
+
+
+
+
+
+-
-
-
-
-
-
-
+
+o
-- - -
-
-
![Page 20: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/20.jpg)
K-Nearest Neighbors
Kristen Grauman, Gregory Shakhnarovich, and Trevor Darrell,Virtual Visual Hulls: Example-Based 3D Shape Inference from Silhouettes
Nearest Neighbors (silhouettes)
![Page 21: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/21.jpg)
K-Nearest Neighbors
Kristen Grauman, Gregory Shakhnarovich, and Trevor Darrell,Virtual Visual Hulls: Example-Based 3D Shape Inference from Silhouettes
Silhouettes from other views3D Visual hull
![Page 22: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/22.jpg)
Decision tree
o
X1>2
X2>1
V(+)=8V(-)=2
V(+)=2V(-)=8
V(+)=0V(-)=4
YesNo
YesNo+
V(+)=8
V(-)=4
V(-)=8
![Page 23: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/23.jpg)
Decision Tree Training
• Partition data into pure chunks
• Find a good rule• Split the training data
– Build left tree– Build right tree
• Count the examples in the leaves to get the votes: V(+), V(-)
• Stop when– Purity is high– Data size is small– At fixed level
+ +
+
+
+
+
+
+-
-
-
-
-
--
-
+
--
-
-
V(-)=57%V(-)=80% V(+)=64%V(+)=80%
V(-)=100%
![Page 24: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/24.jpg)
Decision trees
• Stump: – 1 root– 2 leaves
• If xi > a
then positive
else negative
• Very simple• “Weak classifier”
Paul Viola, Michael Jones, Robust Real-time Object Detection, IJCV 04
x120x357
x629 x834
![Page 25: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/25.jpg)
Support vector machines
• Simple decision• Good classification• Good generalization
+ +
+
+
+
+
+
+-
-
-
-
-
--
-
+
--
-
-
wmargin
![Page 26: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/26.jpg)
Support vector machines
+ +
+
+
+
+
+
+-
-
-
-
-
--
-
+
--
-
-
w
Support vectors:
![Page 27: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/27.jpg)
How do I solve the problem?
• It’s a convex optimization problem– Can solve in Matlab (don’t)
• Download from the web– SMO: Sequential Minimal Optimization– SVM-Light http://svmlight.joachims.org/– LibSVM http://www.csie.ntu.edu.tw/~cjlin/libsvm/– LibLinear http://www.csie.ntu.edu.tw/~cjlin/liblinear/
– SVM-Perf http://svmlight.joachims.org/– Pegasos http://ttic.uchicago.edu/~shai/
![Page 28: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/28.jpg)
Linear SVM for pedestrian detection
![Page 29: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/29.jpg)
Slides by Pete Barnum Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection, CVPR05
![Page 30: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/30.jpg)
uncentered
centered
cubic-corrected
diagonal
Sobel
Slides by Pete Barnum Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection, CVPR05
![Page 31: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/31.jpg)
• Histogram of gradient orientations-Orientation
Slides by Pete Barnum Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection, CVPR05
![Page 32: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/32.jpg)
X=
15x7 cells
8 orientations
Slides by Pete Barnum Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection, CVPR05
![Page 33: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/33.jpg)
pedestrian
Slides by Pete Barnum Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection, CVPR05
![Page 34: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/34.jpg)
Kernel SVMDecision function is a linear combination of support vectors:
Prediction is a dot product:
Kernel is a function that computes the dot product of data points in some unknown space:
We can compute the decision without knowing the space:
![Page 35: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/35.jpg)
Useful kernels
• Linear!
• RBF
• Histogram intersection
• Pyramid match
![Page 36: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/36.jpg)
Histogram intersection
Assign to texture cluster
+1
Count
S. Lazebnik, C. Schmid, and J. Ponce. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories.
![Page 37: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/37.jpg)
(Spatial) Pyramid Match
S. Lazebnik, C. Schmid, and J. Ponce. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories.
![Page 38: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/38.jpg)
Boosting
• Weak classifierClassifier that is slightly better than random guessing
• Weak learner builds weak classifiers
![Page 39: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/39.jpg)
Boosting
• Start with uniform distribution
• Iterate:1. Get a weak classifier fk
2. Compute it’s 0-1 error
3. Take
4. Update distribution
• Output the final “strong” classifier
Yoav Freund Robert E. Schapire, A Short Introduction to Boosting
![Page 40: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/40.jpg)
Face detection
features? classify+1 face
-1 not face
• We slide a window over the image• Extract features for each window• Classify each window into face/non-face
x F(x) y
![Page 41: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/41.jpg)
Face detection
• Use haar-like features
• Use decision stumps as week classifiers
• Use boosting to build a strong classifier
• Use sliding window to detect the face
x120x357
x629 x834
X234>1.3
-1Non-face
+1 Face
YesNo
![Page 42: Object recognition Methods for classification and image representation](https://reader035.vdocument.in/reader035/viewer/2022081519/56649f585503460f94c7ccff/html5/thumbnails/42.jpg)