machine learning & category recognition
DESCRIPTION
Machine learning & category recognition. Cordelia Schmid Jakob Verbeek. Content of the course. Visual object recognition Robust image description Machine learning. Visual recognition - Objectives. Particular objects and scenes, large databases. …. Visual recognition - Objectives. - PowerPoint PPT PresentationTRANSCRIPT
Machine learning &
category recognition
Cordelia Schmid
Jakob Verbeek
Content of the course
• Visual object recognition
• Robust image description
• Machine learning
Visual recognition - Objectives
• Particular objects and scenes, large databases
…
Visual recognition - Objectives
• Object classes and categories (intra-class variability)
Visual object recognition
glass
candle
person
drinking
indoors
car
car
car
person
kidnappinghouse
street
outdoors
person
car
street
outdoors car enter
person
car
roadfield
countryside
car crash
Visual object recognition
exit through a doorbuilding
car
people
outdoors
Visual recognition - Objectives
• Human motion and actions
Difficulties: within object variations
Variability: Camera position, Illumination,Internal parameters
Within-object variations
Difficulties: within-class variations
Visual recognition
• Robust image description – Appropriate descriptors for objects and categories
• Statistical modeling and machine learning for vision– Selection and adaptation of existing techniques
Robust image description
• Invariant detectors and descriptors• Scale and affine-invariant keypoint detectors
Matching of descriptors
Significant viewpoint change
Basis:contour segment network
edgel-chains partitioned into straight contour segments
segments connected at edgel-chains’ endpoints and junctions
Ferrari et al. ECCV 2006
Contour features
[Ferrari, Fevrier, Jurie & Schmid, Pami’07]
Localization of “shape” categories
Window descriptor + SVM Horse localization
Why machine learning?
• Early approaches: simple features + handcrafted models• Can handle only few images, simples tasks
L. G. Roberts, Machine Perception of Three Dimensional Solids,
Ph.D. thesis, MIT Department of Electrical Engineering, 1963.
Why machine learning?
• Early approaches: manual programming of rules• Tedious, limited and does not take into accout the data
Y. Ohta, T. Kanade, and T. Sakai, “An Analysis System for Scenes Containing objects with Substructures,” International Joint Conference on Pattern Recognition, 1978.
Why machine learning?
• Today lots of data, complex tasks
Internet images, personal photo albums
Movies, news, sports
Why machine learning?
• Today lots of data, complex tasks
Surveillance and security Medical and scientific images
Why machine learning?
• Today: Lots of data, complex tasks
• Instead of trying to encode rules directly, learn them from examples of inputs and desired outputs
Types of learning problems
• Supervised– Classification– Regression
• Unsupervised• Semi-supervised• Reinforcement learning• Active learning• ….
Supervised learning
• Given training examples of inputs and corresponding outputs, produce the “correct” outputs for new inputs
• Two main scenarios:
– Classification: outputs are discrete variables (category labels). Learn a decision boundary that separates one class from the other
– Regression: also known as “curve fitting” or “function approximation.” Learn a continuous input-output mapping from examples (possibly noisy)
Unsupervised Learning
• Given only unlabeled data as input, learn some sort of structure
• The objective is often more vague or subjective than in supervised learning. This is more of an exploratory/descriptive data analysis
Unsupervised Learning
• Clustering– Discover groups of “similar” data points
Unsupervised Learning
• Quantization– Map a continuous input to a discrete (more compact) output
1
2
3
Unsupervised Learning
• Dimensionality reduction, manifold learning– Discover a lower-dimensional surface on which the data lives
Unsupervised Learning
• Density estimation– Find a function that approximates the probability density of the
data (i.e., value of the function is high for “typical” points and low for “atypical” points)
– Can be used for anomaly detection
Other types of learning
• Semi-supervised learning: lots of data is available, but only small portion is labeled (e.g. since labeling is expensive)
Other types of learning
• Semi-supervised learning: lots of data is available, but only small portion is labeled (e.g. since labeling is expensive)– Why is learning from labeled and unlabeled data better than
learning from labeled data alone?
?
Other types of learning
• Active learning: the learning algorithm can choose its own training examples, or ask a “teacher” for an answer on selected inputs
Other types of learning
• Reinforcement learning: an agent takes inputs from the environment, and takes actions that affect the environment. Occasionally, the agent gets a scalar reward or punishment. The goal is to learn to produce action sequences that maximize the expected reward (e.g. driving a robot without bumping into obstacles)
• Image classification: assigning label to the image
Visual object recognition - tasks
Car: presentCow: presentBike: not presentHorse: not present…
• Image classification: assigning label to the image
Tasks
Car: presentCow: presentBike: not presentHorse: not present…• Object localization: define the location and the category
Car CowLocatio
n
Category
Visual object recognition - tasks
Bag-of-features for image classification
• Excellent results in the presence of background clutter
bikes books building cars people phones trees
Bag-of-features for image classification
Classification
SVM
Extract regions Compute descriptors
Find clusters and frequencies
Compute distance matrix
[Nowak,Jurie&Triggs,ECCV’06], [Zhang,Marszalek,Lazebnik&Schmid,IJCV’07]
Spatial pyramid matching
Perform matching in 2D image space
[Lazebnik, Schmid & Ponce, CVPR’06]
Retrieval examples
Query
Localization of object categories
Localization approach
Histogram of oriented image gradients as image descriptor
SVM as classifier, importance weighted descriptors
Unsupervised learning using Markov field aspect models[Verbeek & Triggs, CVPR’07]
• Goal: automatic interpretation of natural scenes– assign pixels in images to visual categories– learn models from image-wide labeling, without localization
• Per training image a list of present categories
• Approach: capture local and image-wide correlations– Markov fields capture local label contiguity– Aspect models capture image-wide label correlation– Interleave:
• Region-to-category assignments using Loopy Belief Propagation and labeling• Category model estimation
Example scene interpretation of training image
Localization based on shape
[Ferrari, Jurie & Schmid, CVPR’07] [Marzsalek & Schmid, CVPR’07]
Master Internships
• Internships are available in the LEAR group– Object localization (C. Schmid)– Video recognition (C. Schmid)– Semi-supervised / text-based learning (J. Verbeek)
• If you are interested send an email to us