building desert field sky mountain sea sky vert horz...

SuperParsing: Scalable Nonparametric Image Parsing with SuperpixelsJoseph Tighe and Svetlana Lazebnik Dept. of Computer Science, University of North Carolina at Chapel Hill http://www.cs.unc.edu/SuperParsing

Overview

• “Open universe” system: no training required, easy to accommodate an evolv-ing dataset as new classes or new training exemplars are added

• State of the art performance on the SIFT Flow dataset (Liu et al., CVPR 2009)

• New large-scale baseline for image parsing: per-pixel recognition results on asubset of LabelMe consisting of 15k images, 170 labels

Query Image Retrieval set of similar images

Per-class likelihood

Building

Road

Sky

Car

SkyVertical

Horizontal

Superpixels

Building

Car

Road

Sky Semantic Classes Geometric Classes

Image Parsing MethodGiven a query (test) image:

• Find a retrieval set of 200 similar images by taking the minimum per-feature rankof four global image features.

• For each test superpixel si described by multiple local features {fki }, compute alikelihood ratio score for each class c found in the retrieval set:

L(si, c) =P (si|c)P (si|¬c)

=∏k

P (fki |c)P (fki |¬c)

.

The feature likelihood P (fki |c) is given by a nonparametric density estimate:

P (fki | c) =#(retrieval set features of class c within a fixed radius offki )

#(total features of class c in the training set).

• Use MRF inference to solve for the label field c = {ci} over the entire test image:

J(c) =∑

si∈SP

−wi logL(si, ci) + λ∑

(si,sj)∈A

Eedge(ci, cj) ,

where wi is a weight based on the superpixel size, and edge penalty Eedge isbased on the co-occurrence of adjacent labels in the training set:

Eedge(ci, cj) = − log[(P (ci|cj) + P (cj |ci))/2]× δ[ci 6= cj ] .

Road

Sky Sky

Sea Sea

Sand Sand

Tree

Max. Likelihood Ratio Edge Penality MRF LabelingQuery Image

Joint Semantic and Geometric LabelingSimultaneously solve for a field of semantic labels (c) and geometric labels (g) overthe image by optimizing

H(c,g) = J(c) + J(g) + µ∑

si∈SP

ϕ(ci, gi) ,

where ϕ(ci, gi) is a coherence term between the semantic and geometric label of thesame superpixel.

72.0 68.6 77.6

97.2 97.6 94.8

Road

WindowDoor

SidewalkBuildingAwning

SignPerson

HorzVertSky

InitialLabeling

Joint Semanticand Geometric

Query GroundTruth Labels

SemanticMRF

SemanticClasses

GeometricClasses

Results on Large-Scale Datasets

• SIFT Flow dataset (Liu et al., CVPR 2009): 2,488 training images, 200 test images,33 labels

• Barcelona dataset (a new large-scale benchmark): 14,871 training images, 279test images, 170 labels

020406080

# of

Sup

erpi

xels

(x10

00)

264,945 SIFT Flow Dataset Barcelona DatasetLabel Frequency in Dataset

Per-pixel classification rates (with average per-class rates in parentheses):

SIFT Flow BarcelonaSemantic Geometric Semantic Geometric

Liu et al. (CVPR 2009) 74.75 N/A N/A N/ALocal labeling 73.2 (29.1) 89.8 62.5 (8.0) 89.9MRF 76.3 (28.8) 89.9 66.6 (7.6) 90.2Joint semantic/geometric 76.9 (29.4) 90.8 66.9 (7.6) 90.7

0%25%50%75%

100%SIFT Flow Dataset Barcelona DatasetPer-class Performance

Timing

SIFT Flow BarcelonaTraining set size 2,488 14,871Image size 256× 256 640× 480Ave. # superpixels 63.9 307.9Feature extraction ∼ 4 sec ∼ 5 minRetrieval set search 0.04 ± 0.0 0.21 ± 0.0Superpixel search 4.4 ± 2.3 34.2 ± 13.4MRF solver 0.005 ± 0.003 0.03 ± 0.02Total (excluding features) 4.4 ± 2.3 34.4 ± 13.4 0 100 200 300 400 500

0

20

40

60

80

Number of Superpixels

Sec

onds

Full System Times

SIFT Flow Dataset

Barcelona Dataset

Sample Output on SIFT Flow DatasetInitial

LabelingFinal

LabelingGeometricLabeling

Query GroundTruth Labels

EdgePenalties

HorzUnlabeledTreeSkyRoad

98.7 98.6 99.2

CarBridge VertSky

TreeSkySea HorzVertSkyRoadMountainGrass

95.2 97.1 97.1

FieldDesert

TreeSkySea HorzVertSkySunMountain

86.6 88.4 88.5

Sea HorzVertSkyMountain

68.8 94.2 94.2

FieldDesert SkyBuilding

Tree HorzVertSkyRoad

85.4 86.2 94.2

SkySidewalkBuilding Car

Horz

Window VertSkyRoad SkySidewalk

73.2 77.2 93.3

BuildingBalcony Door

TreeSky HorzVertSkyMountain

57.9 73.2 81.3

Building

Small DatasetsResults on two small-scale datasets using trained boosted decision tree classifiers(instead of retrieval set and superpixel search):

Stanford Dataset Geometric Context Dataset715 images, 8 classes 300 images, 7 classes

Semantic Geometric Sub-classes Main classesGould et al. (ICCV 2009) 76.4 91.0 N/A 86.9Hoiem et al. (IJCV 2007) N/A N/A 61.5 88.1Local labeling 76.9 90.5 57.6 87.8MRF 77.5 90.6 61.0 88.2Joint semantic/geometric 77.5 90.6 61.0 88.1

FundingThis research was supported in part by NSF CAREER award IIS-0845629, Microsoft ResearchFaculty Fellowship, and Xerox.

1

building desert field sky mountain sea sky vert horz...

Documents