supervised learning of edges and object boundaries piotr dollár zhuowen tu serge belongie

Supervised Learning ofEdges

and Object Boundaries

Piotr DollárZhuowen Tu

Serge Belongie

The problem

Outline

• I. Motivation• II. Problem formulation• III. Learning architecture (BEL)• IV. Results

Outline

• I. Motivation– Why edges?– Why not edges?– Why learning?

• II. Problem formulation• III. Learning architecture (BEL)• IV. Results

Why edges?

• Reduce dimensionality of data

• Preserve content information

• Useful in applications such as:– object detection– structure from motion– tracking

Why not edges?

But, not that useful, why?

Difficulties:1. Modeling assumptions

2. Parameters

3. Multiple sources of information (brightness, color, texture, …)

4. Real world conditions

Is edge detection even well defined?

Canny edge detection

1. smooth

2. gradient

3. thresh, suppress, link

Canny is optimal w.r.t. some model.

Canny edge detection

1. smooth

2. gradient

3. thresh, suppress, link

And yet…

1. Modeling assumptionsStep edges, junctions, etc.

2. ParametersScales, threshold, etc.

3. Multiple sources of informationOnly handles brightness

4. Real world conditionsGaussian iid noise? Texture…

Canny difficulties

1. Modeling assumptionsComplex models, computationally prohibitive

2. ParametersMany, may use learning to help tune

3. Multiple sources of informationTypically brightness, color, and texture cues

4. Real world conditionsAimed at real images

Modern methods

Modern methods (Pb)

Pb – Martin et al. PAMI04

1. Modeling assumptionsminimal

2. Parametersnone

3. Multiple sources of informationAutomatically incorporated

4. Real world conditionstraining data

Why learning?

Outline

Problem formulation (general)

scene interpretation that can include spatial location and extent of objects, regions, object boundaries, curves, etc.

0/1 function that encodes spatial extent of a component of W

Obtaining optimal or likely W or SW can be difficult. Let:

We seek to learn this distribution directly from image data. To further reduce complexity, we can discard the absolute coordinates of S:

where N(c) is the neighborhood of I centered at c.

Problem formulation (edges)

image segmentation

1 on boundaries of segments, 0 elsewhere

Discriminative framework

Sample positive and negative patches according to above:

Given an image I and n interpretations W obtained by manual annotation, we can compute:

Goal is to learn from human labeled images

Finally train a classifier!

Edge point present in center?

NO YES

Discriminative framework

Outline

Learning architecture

• Large training set O(108) – but correlated– very variable data

• Want generic, efficient features– applicability to any domain– fast computation essential

• Boosting a natural choice

AdaBoost

Taken from tutorial by Jiri Matas and Jan Sochman

Decision Stumps•Weak learners:

(where f is some feature of x)

AdaBoost (decision stumps)

Cascaded classifiers

• Minimize computation during testing

• Especially useful for skewed prior

• Viola-Jones face/pedestrian detection

Cascade (AdaBoost)

Probabilistic boosting trees

• Expected amount of computation decreases significantly

• Once a mistake is made, it cannot be undone

• Cascade also made problem easier! Ideally, splitting data creates two sub-problems each much easier than original…

Probabilistic boosting trees

• Retain efficiency of cascades

• Add power when necessary

• Prone to overfitting

• Tree was necessary to obtain good results.

……

Haar features:

• Feature response:(image response to green squares) –

(image response to red squares)

• Applied to many ‘views’ of the data– grayscale, color, Gabor filter outputs, etc.– at many orientations, locations, etc

• Fast computation using integral images• Hundreds of thousands of candidate features

Outline

– Gestalt laws– Natural images– Road detection– Object Boundaries

Results

• Boosted edge learning (BEL)

• Compare to method with best known performance (Pb), and also to Canny

• Comparison not quite fair…

Pb – Martin et al. PAMI04

Gestalt laws

• Gestalt laws of perceptual organization– Symmetry, closure, parallelism, etc.– Govern how component parts are organized into overall

patterns

• The “hard” part of edge detection

• What can and cannot be achieved in our framework?

Analogies

A:B :: C : ?

Gestalt laws: parallelism

Gestalt laws: modal completion

Gestalt laws: alternate interpretation

Outline

Natural Images

• Berkeley Segmentation Dataset and Benchmark– Standard dataset for edge detection with 300 manually

annotated images– Modern benchmark for comparing edge detection algorithms

• Notes:– Edge detection in natural images is hard– Possibly ill-defined problem– Evil but necessary comparison

Natural Images: results

image human BEL Pb

Natural Images: probabilities

Outline

Road detection

location of roads in scene

1 if pixel is on the road, 0 elsewhere

•Road detection is not edge detection

•But same learning architecture

•Ground truth obtained from map data

Road detection (training)(the 2 training images)

Road detection (testing)

(the testing image)

(`Winchester Dr.’ was not detected)

Outline

Object boundaries

location and extent of object of interest

1 on boundaries of object, 0 elsewhere

•Must tune to specific ‘type’ of edge

•Algorithms that model edges not applicable

•Potentially most useful application

Object boundaries (context)

Object boundaries (training)

Object boundaries (ground truth)

Object boundaries (Canny)

F-score = .10

Object boundaries (Pb)

F-score = .13

Object boundaries (BEL)

F-score = .79

Algorithm roundup

Accurate Adaptable Fast

Summary

• Define edges only in terms of labeled data, minimal modeling assumptions

• Minimize human effort in adapting algorithm to particular domain

• Fast, affordable edge detection for the masses!

Thank you!

supervised learning of edges and object boundaries piotr dollár zhuowen tu serge belongie

learning architecture

problem formulationiii

problem easier

supervised learning

resultswhy edges

motivationwhy edges

computation essentialboosting

object detectionstructure

Documents

carolina galleguillos and serge belongie department of...

the fastest pedestrian detector in the...

boris babenko 1, ming-hsuan yang 2, serge belongie 1 1....

1 2 arxiv:1603.07810v3 [cs.cv] 10 apr 2017 ·...

counting crowded moving objects vincent rabaud and serge...

deep learning for videoxgwang/video.pdf · 2014. 7. 12. ·...

a new correspondence algorithm jitendra malik computer...

boris babenko, steve branson, serge belongie university of...

computer vision group university of california berkeley...

shape matching and registration by data-driven...

fundamentals and applications of vacuum...

microsoft coco: common objects in...

submission to ieee transactions on...

antón r. escobedo cse 252c behavior recognition via sparse...

computer vision group university of california berkeley...

machine learning for brain image...

objects in context - carnegie mellon school of … ·...

carolina galleguillos and serge belongie department of...

spectral grouping using the nystrom method - pattern...

package ‘snftool’ - r · 2018. 4. 24. · author bo...