Learning to Segment from Diverse Data

M. Pawan Kumar

Daphne Koller, Haithem Turki, Dan Preston

Aim

Learn accurate parameters for a segmentation model

- Segmentation without generic foreground or background classes

- Train using both strongly and weakly supervised data

Data in Vision

“Strong” Supervision

“Car”

“Weak” Supervision

“One hand tied behind the back…”

Data for Vision

“Car”

“Strong” Supervision “Weak” Supervision

Types of Data

Specific foreground classes, generic background class

PASCAL VOC Segmentation Datasets

Types of Data

Specific background classes, generic foreground class

Stanford Background Dataset

Types of Data

Bounding boxes for objects

PASCAL VOC Detection Datasets

Thousands of freely available images

Current methods only use small, controlled datasets

Types of Data

Image-level labels

ImageNet, Caltech …

“Car”

Types of Data

Noisy data from web search

Google Images, Flickr, Picasa …

Millions of freely available images

Outline

• Region-based Segmentation Model

• Problem Formulation

• Inference

• Results

Region-based Segmentation Model

Object Models

Pixels

Regions

• Inference

• Results

Problem Formulation

Treat missing information as latent variables

Image x, Annotation y, Complete Annotation (y,h)

Joint Feature Vector $\Psi(x,y,h)$: region features, detection features, pairwise contrast, pairwise context

Problem Formulation

Treat missing information as latent variables

Image x, Annotation y, Complete Annotation (y,h)

$(y^*, h^*) = \arg\max_{y,h} w^\top \Psi(x, y, h)$

Latent Structural SVM

Trained by minimizing overlap loss ∆
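To make the prediction rule and the overlap loss concrete, here is a minimal sketch in Python. It assumes the candidate pairs (y, h) can be enumerated explicitly, which only works for toy problems. The feature map `toy_psi`, the weights, and the three-pixel "image" are hypothetical and are not the features used in the talk. The loss is written as one minus the mean intersection-over-union per class, which is one common reading of "overlap loss".

```python
import itertools
import numpy as np

def overlap_loss(pred, truth, num_classes):
    """1 - mean intersection-over-union, over classes present in pred or truth."""
    scores = []
    for c in range(num_classes):
        p, t = pred == c, truth == c
        union = np.logical_or(p, t).sum()
        if union > 0:
            scores.append(np.logical_and(p, t).sum() / union)
    return 1.0 - float(np.mean(scores))

def toy_psi(x, y, h):
    """Hypothetical joint feature vector Psi(x, y, h) for a 2-class toy problem."""
    y = np.asarray(y)
    return np.array([x[y == 1].sum(), (1.0 - x[y == 0]).sum(), float(h)])

def predict(w, x, candidates, psi):
    """Return the (y, h) pair maximizing w^T Psi(x, y, h) over an explicit list."""
    return max(candidates, key=lambda yh: float(w @ psi(x, *yh)))

# Toy data: three "pixels" with a foreground score each (assumed values).
x = np.array([0.9, 0.8, 0.1])
w = np.array([1.0, 1.0, 0.5])
candidates = [(y, h) for y in itertools.product((0, 1), repeat=3) for h in (0, 1)]

y_star, h_star = predict(w, x, candidates, toy_psi)
loss = overlap_loss(np.array(y_star), np.array([1, 0, 0]), num_classes=2)
```

In a real model the argmax is computed by a dedicated inference routine rather than by enumeration; the sketch only shows what is being maximized and how the loss compares a prediction with a ground-truth labeling.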

Self-Paced Learning

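Start with an initial estimate $w_0$

Update $w_{t+1}$ by solving a biconvex problem

$\min \|w\|^2 + C \sum_i v_i \xi_i - K \sum_i v_i$

s.t. $w^\top \Psi(x_i, y_i, h_i) - w^\top \Psi(x_i, y, h) \ge \Delta(y_i, y, h) - \xi_i$

Update $h_i = \arg\max_{h \in H} w_t^\top \Psi(x_i, y_i, h)$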

Kumar, Packer and Koller, 2010
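The sample-selection half of the biconvex problem has a closed form, which the following Python sketch illustrates. With w (and hence the slacks) held fixed, the objective as written on the slide is linear in each binary v_i, so v_i = 1 exactly when C ξ_i < K. The slack values, the starting K, and the growth factor are assumed toy numbers. The slacks are kept fixed across rounds purely for illustration, whereas the actual procedure re-solves for w and re-imputes h_i between rounds.

```python
import numpy as np

def select_easy(slacks, C, K):
    """For fixed w, minimizing C*v_i*xi_i - K*v_i over v_i in {0, 1}
    gives v_i = 1 iff C*xi_i < K (the sample is 'easy')."""
    return (C * np.asarray(slacks, dtype=float) < K).astype(int)

def self_paced_schedule(slacks, C, K0, growth, rounds):
    """Selection vectors as K grows; slacks are held fixed here (an assumption)."""
    K, history = K0, []
    for _ in range(rounds):
        history.append(select_easy(slacks, C, K).tolist())
        K *= growth  # hypothetical annealing: admit harder samples over time
    return history

# Toy slacks for three training images (assumed values).
history = self_paced_schedule([0.1, 0.5, 2.0], C=1.0, K0=0.3, growth=2.0, rounds=4)
```

In this toy run the easiest image is selected first and the hardest one only enters in the last round, which is the behavior the self-paced schedule is meant to produce.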

Annotation Consistent Inference

Loss Augmented Inference

• Inference

• Results

Generic Classes

DICTIONARY OF REGIONS D

MERGE AND INTERSECT WITH SEGMENTS TO FORM PUTATIVE REGIONS

SELECT REGIONS

ITERATE UNTIL CONVERGENCE

Current Regions, Over-Segmentations

$\min_y \theta^\top y$ s.t. $y \in \text{SELECT}(D)$

Kumar and Koller, 2010
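One way to picture the "merge and intersect" step is sketched below in Python. It assumes regions and segments are both represented as sets of super-pixel ids, and that a putative region is formed whenever a current region overlaps a segment. Both the representation and the function name `putative_regions` are assumptions made for illustration, not the construction used by the authors.

```python
def putative_regions(current, segments):
    """Build a dictionary D from current regions and an over-segmentation.

    Regions and segments are frozensets of super-pixel ids (assumed encoding).
    """
    dictionary = set(current)
    for r in current:
        for s in segments:
            common = r & s
            if common:
                dictionary.add(frozenset(common))  # intersect region with segment
                dictionary.add(frozenset(r | s))   # merge region with segment
    return dictionary

# Toy example with four super-pixels (ids 0..3).
current = [frozenset({0, 1}), frozenset({2, 3})]
segments = [frozenset({1, 2}), frozenset({3})]
D = putative_regions(current, segments)
```

The selection step then picks a subset of D that covers the image, and the loop repeats with the selected regions as the new current regions.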

Generic Classes

Binary $y_r(0) = 1$ iff r is not selected; binary $y_r(1) = 1$ iff r is selected

$\min_y \sum \theta_r(i) y_r(i) + \sum \theta_{rs}(i,j) y_{rs}(i,j)$ (Minimize the energy)

s.t. $y_r(0) + y_r(1) = 1$ (Assign one label to r from L)

$y_{rs}(i,0) + y_{rs}(i,1) = y_r(i)$, $y_{rs}(0,j) + y_{rs}(1,j) = y_s(j)$ (Ensure $y_{rs}(i,j) = y_r(i) y_s(j)$)

$\sum_{r \text{ “covers” } u} y_r(1) = 1$ (Each super-pixel is covered by exactly one selected region)

$y_r(i), y_{rs}(i,j) \in \{0,1\}$ (Binary variables)
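For a very small dictionary, the integer program above can be checked by brute force, as in this Python sketch. It enumerates every binary selection vector, discards those that do not cover each super-pixel exactly once, and keeps the lowest-energy one. The potentials are assumed toy values and the exhaustive search stands in for a real solver; it is exponential in the dictionary size.

```python
from itertools import product

def select_regions(regions, num_superpixels, unary, pairwise):
    """Brute-force the selection problem.

    regions:  list of sets of super-pixel ids
    unary:    unary[r][i], cost of region r taking label i (0 = not selected)
    pairwise: {(r, s): theta} with theta[i][j] the cost of labels (i, j)
    """
    best_energy, best_y = float("inf"), None
    for y in product((0, 1), repeat=len(regions)):
        cover = [0] * num_superpixels
        for r, selected in enumerate(y):
            if selected:
                for u in regions[r]:
                    cover[u] += 1
        if any(c != 1 for c in cover):
            continue  # some super-pixel is uncovered or covered twice
        energy = sum(unary[r][y[r]] for r in range(len(regions)))
        energy += sum(theta[y[r]][y[s]] for (r, s), theta in pairwise.items())
        if energy < best_energy:
            best_energy, best_y = energy, y
    return best_y, best_energy

# Toy dictionary over three super-pixels (all potentials are assumed values).
regions = [{0}, {1}, {2}, {0, 1}, {0, 1, 2}]
unary = [(0.0, 1.0), (0.0, 1.0), (0.0, 0.5), (0.0, 1.2), (0.0, 2.0)]
pairwise = {(3, 2): [[0.0, 0.0], [0.0, -0.3]]}  # reward selecting regions 3 and 2 together
best_y, best_energy = select_regions(regions, 3, unary, pairwise)
```

Here the three feasible covers are {0},{1},{2} (energy 2.5), {0,1},{2} (energy 1.4), and {0,1,2} (energy 2.0), so the search returns the second.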

Generic Classes

$\min_y \theta^\top y$ s.t. $y \in \text{SELECT}(D)$, $\Delta_{new} \le \Delta_{prev}$

Kumar and Koller, 2010

Simultaneous region selection and labeling

Examples

[Figure: segmentations at Iteration 1, Iteration 3, Iteration 6]

Bounding Boxes

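$\min_{y,z} \theta^\top y + \sum_a K_a (1 - z_a)$

s.t. $y \in \text{SELECT}(D)$, $\Delta_{new} \le \Delta_{prev}$

$z_a \in \{0,1\}$, $z_a \le \sum_{r \text{ “covers” } a} y_r(c)$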

Each row and each column of bounding box is covered
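The row-and-column condition can be evaluated directly on a pixel mask, as this Python sketch shows. It assumes the labeling has already been rasterized into a boolean mask for the box's class c, that the box is given as (top, left, bottom, right) with exclusive bottom and right, and that every K_a equals the same constant K. These are simplifications of the region-based constraint above, chosen for illustration.

```python
import numpy as np

def box_penalty(class_mask, box, K=1.0):
    """Penalty K * (number of box rows and columns containing no pixel of class c).

    class_mask: 2D boolean array, True where the pixel is labeled c
    box:        (top, left, bottom, right), bottom/right exclusive (assumed convention)
    """
    top, left, bottom, right = box
    inside = np.asarray(class_mask, dtype=bool)[top:bottom, left:right]
    z_rows = inside.any(axis=1)  # z_a for each row a of the box
    z_cols = inside.any(axis=0)  # z_a for each column a of the box
    uncovered = int((~z_rows).sum() + (~z_cols).sum())
    return K * uncovered

# Toy 4x4 mask with three pixels of class c, and a 3x3 box.
mask = np.zeros((4, 4), dtype=bool)
mask[1, 1] = mask[1, 2] = mask[2, 1] = True
penalty = box_penalty(mask, box=(1, 1, 4, 4), K=0.5)
```

In the toy case the last row and last column of the box contain no class-c pixel, so two of the six z_a variables are zero and the penalty is 2K.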

Examples

[Figure: segmentations at Iteration 1, Iteration 2, Iteration 4]

Image-Level Labels

$\min_{y,z} \theta^\top y + K (1 - z)$

s.t. $y \in \text{SELECT}(D)$, $\Delta_{new} \le \Delta_{prev}$

$z \in \{0,1\}$, $z \le \sum_r y_r(c)$

Image must contain the specified object
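The image-level term reduces to a single check, sketched here in Python. It assumes the selected regions' labels are available as a plain list and that K is a scalar. The function name and the label strings are hypothetical.

```python
def label_penalty(region_labels, c, K=1.0):
    """Return K * (1 - z), where z = 1 iff some selected region carries label c."""
    z = 1 if any(label == c for label in region_labels) else 0
    return K * (1 - z)

# Toy labelings for an image annotated with the image-level label "car".
ok = label_penalty(["sky", "car", "road"], "car")
missing = label_penalty(["sky", "road"], "car", K=2.0)
```

A labeling that contains the annotated class pays nothing, while one that omits it pays the full penalty K.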

• Inference

• Results

Dataset

Stanford Background: generic foreground class, 7 background classes

PASCAL VOC 2009: generic background class, 20 foreground classes

Dataset

Stanford Background: Train - 572 images, Validation - 53 images, Test - 90 images

PASCAL VOC 2009: Test - 750 images

Baseline: Closed-loop learning (CLL), Gould et al., 2009

Results

[Figures: improvement over CLL (one panel labeled PASCAL VOC 2009)]

CLL - 24.7%, LSVM - 26.9%

CLL - 53.1%, LSVM - 54.3%

Dataset

Stanford Background: Test - 90 images

PASCAL VOC 2009 + 2010: Test - 750 images, Bounding Boxes - 1564 images

Results

[Figures: improvement over CLL for LSVM and BOX (one panel labeled PASCAL VOC 2009)]

CLL - 24.7%, LSVM - 26.9%, BOX - 28.3%

CLL - 53.1%, LSVM - 54.3%, BOX - 54.8%

Dataset

Stanford Background: Test - 90 images

PASCAL VOC 2009 + 2010: Test - 750 images, Bounding Boxes - 1564 images

+ 1000 image-level labels (ImageNet)

Results

[Figures: improvement over CLL for LSVM, BOX and LABEL (one panel labeled PASCAL VOC 2009)]

CLL - 24.7%, LSVM - 26.9%, BOX - 28.3%, LABEL - 28.8%

CLL - 53.1%, LSVM - 54.3%, BOX - 54.8%, LABEL - 55.3%

Examples

Failure Modes

Examples

Types of Data

Specific foreground classes, generic background class: PASCAL VOC Segmentation Datasets

Specific background classes, generic foreground class: Stanford Background Dataset

Bounding boxes for objects: PASCAL VOC Detection Datasets

Image-level labels: ImageNet, Caltech … (e.g. “Car”)

Noisy data from web search: Google Images, Flickr, Picasa … (millions of freely available images)

Two Problems

The “Noise” Problem

Self-Paced Learning

The “Size” Problem

Self-Paced Learning

Questions?
