multiple component learning for object detection · pedestrian detection multiple component...

1
Pedestrian Detection Multiple Component Learning (MCL) Piotr Dollár Piotr Dollár 1,2 1,2 Boris Babenko Boris Babenko 2 Pietro Perona Pietro Perona 1 Serge Belongie Serge Belongie 1,2 1,2 Zhuowen Tu Zhuowen Tu 3 Multiple Component Learning for Object Detection Multiple Component Learning for Object Detection 2 Computer Science and Engineering University of California, San Diego {bbabenko,sjb}@cs.ucsd.edu IGERT 1 Electrical Engineering, California Institute of Technology [email protected] 3 Lab of Neuro Imaging University of California, Los Angeles [email protected] Goals Derivation Low-level features 1. Learn part-based classifier with weak supervision (object labels provided, but no part labels) 2. Part models are classifiers from rich hypothesis class (rather than Gaussian distributions, templates, etc.) 3. No complex inference since model is discriminative Boosting Input: N training examples with and Combines T weak classifiers to learn strong classifier: Excellent generalization and strong theoretical foundation No assumptions about input space Only requires weak classifiers for arbitrary For example, can use (sets) in place of Incorporating Spatial Relations C T C 1 C 2 = + Other Applications Overview Person 1 (2) Learning diverse parts D Results on INRIA Data Algorithm Writer Identification (1) Learning a single part (3) Combining part detectors Weakly supervised learning Object location in positive images unknown Developed for learning objects, use for parts We use Multiple Instance Learning (MIL) What prevents learning same part repeatedly? Different weighting of data Not all parts expressed in all images Boosting offers way of combining multiple diverse classifiers Train one weak (part) classifier using MIL Re-weight samples according to current error Features for overall classifier Densely compute component responses C i Final classifier retrained with Haars over C i For example, can use (sets) in place of Can therefore use MIL to train weak classifier MIL learns a function Specifically, learns f in Only need to adjust MIL to take weights Features for component classifiers Haars over multiple channels: gray (1), grad (1), quantized grad (6) learned representation MCL part-based = + Single Scale Detection Results We achieve state of the art results. Set Learning Person 2 Person 2 Person 1 Speaker Identification Text independent handwriting identification 2 people, 2 pages of text each Haar features on 25x25 patches Content independent speaker identification John Kerry vs. George W. Bush, 2004 Standard MFCC features Robustness to Occlusion Artificial 45x45 occ: Artificial 30x30 occ: Re-weight samples according to current error Repeat until training error sufficiently low Training input Goal Standard MIL MCL , , , , , , , , , , , , , , MCL degrades gracefully w occ. Role of Alignment Aligned data higher performance Can’t simultaneously align articulated object without part based model Summary Advantages: 1. General notion of parts (components) 2. Component learning weakly supervised 3. Principled, general algorithm 4. State of the art results with simple features Disadvantages: 1. Large amount of data needed 2. Evaluating all components slow (currently working on improving speed) Learned components (first 5)

Upload: others

Post on 08-Aug-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Multiple Component Learning for Object Detection · Pedestrian Detection Multiple Component Learning (MCL) Piotr Dollár11,,22 Boris Babenko 22 Pietro Perona 11 Serge Belongie 11,,22

Pedestrian Detection Multiple Component Learning (MCL)

Piotr DollárPiotr Dollár1,21,2 Boris BabenkoBoris Babenko22 Pietro PeronaPietro Perona11 Serge BelongieSerge Belongie1,21,2 Zhuowen TuZhuowen Tu33

Multiple Component Learningfor Object Detection

Multiple Component Learningfor Object Detection

2 Computer Science and Engineering

University of California, San Diego

{bbabenko,sjb}@cs.ucsd.edu

IGERT

1 Electrical Engineering,

California Institute of Technology

[email protected]

3 Lab of Neuro Imaging

University of California, Los Angeles

[email protected]

Goals

De

riv

ati

on

Low-level features1. Learn part-based classifier with weak supervision

(object labels provided, but no part labels)

2. Part models are classifiers from rich hypothesis class

(rather than Gaussian distributions, templates, etc.)

3. No complex inference since model is discriminative

• Boosting

• Input: N training examples with and

• Combines T weak classifiers to learn strong classifier:

• Excellent generalization and strong theoretical foundation

• No assumptions about input space

• Only requires weak classifiers for arbitrary

• For example, can use (sets) in place of

Incorporating Spatial Relations

CTC1 C2 …

= +

Other Applications

Overview

Pe

rso

n 1

(2) Learning diverse parts

De

riv

ati

on

Results on INRIA Data

Alg

ori

thm

Writer Identification

(1) Learning a single part

(3) Combining part detectors

Weakly supervised learning

• Object location in positive images unknown

• Developed for learning objects, use for parts

• We use Multiple Instance Learning (MIL)

What prevents learning same part repeatedly?

• Different weighting of data

• Not all parts expressed in all images

Boosting offers way of combining multiple diverse classifiers

• Train one weak (part) classifier using MIL

• Re-weight samples according to current error

• Features for overall classifier

• Densely compute component responses Ci

• Final classifier retrained with Haars over Ci

• For example, can use (sets) in place of

• Can therefore use MIL to train weak classifier

• MIL learns a function

• Specifically, learns f in

• Only need to adjust MIL to take weights

• Features for component classifiers

• Haars over multiple channels:

gray (1), grad (1), quantized grad (6)learned

representationMCL part-based

= +

Single Scale Detection Results

We achieve state of the art results.

Set Learning

Pe

rso

n 2

Pe

rso

n 2

Pe

rso

n 1

Speaker Identification

• Text independent handwriting identification

• 2 people, 2 pages of text each

• Haar features on 25x25 patches

• Content independent speaker identification

• John Kerry vs. George W. Bush, 2004

• Standard MFCC features

Robustness to Occlusion

Artificial 45x45 occ:

Artificial 30x30 occ:

• Re-weight samples according to current error

• Repeat until training error sufficiently low

Training

input

Goal

Standard MIL MCL

, , , , , ,

, , , ,

, , , ,

MCL degrades gracefully w occ.

Role of Alignment

• Aligned data � higher performance

• Can’t simultaneously align articulated

object without part based model

SummaryAdvantages:

1. General notion of parts (components)

2. Component learning weakly supervised

3. Principled, general algorithm

4. State of the art results with simple features

Disadvantages:

1. Large amount of data needed

2. Evaluating all components slow

(currently working on improving speed)

Learned components (first 5)