Unsupervised Category Modeling, Recognition and Segmentation

Sinisa Todorovic and Narendra Ahuja

What is Common in a Set of Images?

Images possibly contain an object of interest

Which objects appear frequently in the set?

What properties are shared by similar objects in the set?

Where are the objects?

Objective: Car Example

occlusion, no car, multiple cars

learn car model

unseen image

segment all cars

RESULT

occlusion

Training

Problem Definition

GIVEN

Images possibly containing frequent occurrences of similar objects

DETERMINE

If similar objects are present

AND IF YES LEARN

The model of similar objects

GIVEN

An unseen image

DETECT, RECOGNIZE AND SEGMENT

All occurrences of the learned category

Testing

Prior Work Dominated By:

• Statistical modeling of local features: patches or curve fragments

• Trend: Object detection = Image classification

• Trend: Object segmentation = Object localization

• Trend: Object segmentation = Binary thresholding of a probabilistic map

• Hypothesize the number of objects and their parts

• Hypothesize the topology of object parts

• Each training image must contain a category of interest

• Modeling background

• Require typically hundreds of training images

• Explicit modeling of recursive embedding of object subparts

• Regions vs. local features open questions:

• More informative?

• More stable and robust to noise?

• Regions allow:

• simultaneous object detection and segmentation

• explicit representation of the recursive embedding property

Category Modeling is Very Difficult

Our Approach

SIMILAR OBJECTS PRESENT IN THE SET

MANY SUBIMAGES WITH SIMILAR REGION PROPERTIES

ABUNDANT DATA

ROBUST LEARNING IS FEASIBLE

find ?

do ?

image matching

structural learning

Region Properties

• Geometric

• Region area

• Boundary shape

• Photometric

• Gray-level contrast with the surround

• Topology

• Recursive containment of regions

• Layout - relative region locations

Feature Extraction = Image Segmentation

Image

Homogeneous regions at ALL contrasts and sizes

segmentation

[N. Ahuja TPAMI ’96, Tabb & Ahuja TIP ’97, Arora & Ahuja ICPR ’06]

Example segmentations for several contrasts

Multiscale Segmentation to Segmentation Tree

Sample cutsets

Segmentation tree

Contrast level ≠ Tree level

Example segmentations

Image = Tree and Object = Subtree
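The idea that an image becomes a tree of nested regions can be illustrated with a minimal sketch. It assumes each region is given as a set of pixel coordinates and that regions from different contrast levels are either nested or disjoint. The names `RegionNode` and `build_segmentation_tree` are hypothetical and are not taken from the authors' segmentation algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class RegionNode:
    pixels: frozenset                      # (row, col) coordinates covered by the region
    children: list = field(default_factory=list)


def build_segmentation_tree(image_pixels, regions):
    """Nest regions by containment; the root node is the whole image."""
    root = RegionNode(frozenset(image_pixels))
    nodes = [root]
    unique = {frozenset(r) for r in regions}
    # Insert large regions first so every parent exists before its children.
    for pixels in sorted(unique, key=len, reverse=True):
        if pixels == root.pixels:
            continue
        # Parent = smallest already-inserted region that strictly contains this one.
        parent = min((n for n in nodes if pixels < n.pixels),
                     key=lambda n: len(n.pixels))
        node = RegionNode(pixels)
        parent.children.append(node)
        nodes.append(node)
    return root


def count_nodes(node):
    """Number of regions in the subtree rooted at node."""
    return 1 + sum(count_nodes(c) for c in node.children)
```

In this picture an object corresponds to one subtree: a region together with all the regions nested inside it.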

Outline of Our Approach

Images = Trees

Category present = Many similar subtrees

Extracting similar subtrees = Tree matching

Category model = Union of similar subtrees

Simultaneous detection, recognition and segmentation of ALL category instances by matching the model with an image

Tree Matching: Structural Noise

Edit-distance tree matching

[Pelillo et al. PAMI ’99, Torsello & Hancock ECCV ’02, PRL ’03]

Matching Algorithm

Input trees

Matched subtrees

which MAXIMIZES their similarity measure

Matching Algorithm

GIVEN two trees:

node saliency

cost of node matching

FIND bijection

while PRESERVING ancestor-descendant relationships

Matching Algorithm: Recursive Solution

Maximum clique over all descendant pairs

descendants

SOLUTION

Select all pairs with similarity > threshold.
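The recursive flavor of the matching can be illustrated with a simplified sketch. It is not the maximum-clique formulation over all descendant pairs named on the slide. As stated assumptions, it only pairs children with children (so parent-child order is trivially preserved), searches child assignments by brute force, and uses a hypothetical `node_gain` that rewards shared saliency and charges the saliency difference as the matching cost.

```python
from itertools import permutations


class Node:
    def __init__(self, saliency, children=()):
        self.saliency = saliency
        self.children = list(children)


def node_gain(v, w):
    # Hypothetical score: shared saliency minus the cost of the mismatch.
    return min(v.saliency, w.saliency) - abs(v.saliency - w.saliency)


def match(v, w):
    """Return (similarity, matched pairs) for the best match rooted at (v, w)."""
    small, large = v.children, w.children
    swapped = len(small) > len(large)
    if swapped:
        small, large = large, small
    best_score, best_pairs = 0.0, []
    # Try every one-to-one assignment of the shorter child list to the longer one.
    for assignment in permutations(large, len(small)):
        score, pairs = 0.0, []
        for a, b in zip(small, assignment):
            s, p = match(b, a) if swapped else match(a, b)
            if s > 0:  # leave a child pair unmatched when it does not help
                score += s
                pairs += p
        if score > best_score:
            best_score, best_pairs = score, pairs
    return node_gain(v, w) + best_score, [(v, w)] + best_pairs
```

The brute-force search is exponential in the number of children, so this only serves to show the recursion: the similarity of two nodes is their own gain plus the best consistent matching of their descendants.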

Outline

LEARNING

Category Model = Tree Union

Learning algorithm estimates:

1) Model structure

2) Model parameters

Tree intersection:

Tree union:

Simultaneous Detection and Segmentation

MATCHING

Performance Evaluation Criteria

Ground Truth (GT)

Matched Subtrees (MST)

DETECTION ERROR

False positive: “intersection of MST and GT” < 0.5 “union of MST and GT”

SEGMENTATION ERROR

Matched Subtrees (MST)

Ground Truth (GT)

“XOR of MST and GT”
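The two criteria can be written down as a minimal sketch, assuming MST and GT are each represented as a set of pixel coordinates. The function names are hypothetical, and the segmentation error is returned as a raw pixel count because the slide does not state a normalization.

```python
def is_false_positive(mst, gt):
    """Detection error: true when |MST ∩ GT| < 0.5 * |MST ∪ GT|."""
    return len(mst & gt) < 0.5 * len(mst | gt)


def segmentation_error(mst, gt):
    """Segmentation error: number of pixels in exactly one of MST and GT (XOR)."""
    return len(mst ^ gt)


def rectangle(rows, cols, col_offset=0):
    """Helper for the example: a rows x cols block of pixels shifted right."""
    return {(r, c + col_offset) for r in range(rows) for c in range(cols)}
```

For example, a 2 x 4 ground-truth block and a detection shifted by one column overlap in 6 of 10 union pixels, so the detection is accepted and its segmentation error is 4 pixels.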

10 positive out of 20 training images

5 positive out of 10 training images

Results on test images:

Results: UIUC Cars Side View

6 positive out of 12 training images


Results on test images:

Results: Faces -- Caltech 101 Database

Results: Caltech Cars Rear View


Recall-Precision

Training from a small-size dataset

Varying tradeoff recall vs. precision
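A recall-precision tradeoff of this kind is typically traced by sweeping an acceptance threshold over scored detections. The sketch below assumes each detection carries a score and a correct/incorrect label (for instance from the detection-error criterion) and that the number of ground-truth objects is positive. The function name `pr_curve` and the input layout are hypothetical.

```python
def pr_curve(scored_detections, num_objects):
    """Return (precision, recall) after accepting the top-k detections, k = 1..n.

    scored_detections: list of (score, is_correct) pairs.
    num_objects: total number of ground-truth objects (assumed > 0).
    """
    ranked = sorted(scored_detections, key=lambda d: d[0], reverse=True)
    curve = []
    true_positives = 0
    for k, (_, is_correct) in enumerate(ranked, start=1):
        if is_correct:
            true_positives += 1
        precision = true_positives / k          # correct among accepted detections
        recall = true_positives / num_objects   # found among all objects
        curve.append((precision, recall))
    return curve
```

Lowering the threshold accepts more detections, which can only raise recall but may lower precision.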

Complexity and Runtime on a 2.4 GHz, 2 GB RAM PC

Training on 20 images of UIUC CARS: < 2 hours

# of tree nodes

Extracting similar subtrees:

per image pair

# of subtree nodes

Learning on 32 subtrees extracted for UIUC CARS: < 1 hour

Learning:

# of model nodes

Processing time for UIUC CARS: < 10 sec, regardless of the total number of target objects

Detection, recognition and segmentation:

Summary and Conclusion

• Unsupervised learning of an unknown category frequently occurring in a given set of images

• Region-based, structural approach

• Simultaneous detection, recognition, and segmentation of all category instances in unseen images

• NO multiple detections on the same object

• NO hypotheses on the number of objects and their parts

• NO hypotheses on the topology of object parts

• Small number of training images

• Complexity comparable with standard methods

Acknowledgment

THANK YOU!

{sintod, n-ahuja}@uiuc.edu

http://vision.ai.uiuc.edu

