composite statistical modeling in segmentation

Composite Statistical Modeling in Segmentation

Fuxin Li

Georgia Institute of Technology

http://www.cc.gatech.edu/~fli/

Collaborators: Joao Carreira, Cristian Sminchisescu, Guy Lebanon, Ahmad Humayun, David Tsai, James M. Rehg, Taeyoung Kim

Outline

• Composite statistical modeling in semantic segmentation

– Learning

– Inference

• Composite statistical modeling in video segmentation

– Learning

Recognizing Objects in a Scene

• Given an image, identify the category and spatial extent of all relevant objects

– a.k.a. semantic segmentation (Shotton et al. 2006, 2008, Csurka and Perronnin 2010, Boix et al. 2010, Ladicky et al. 2010, Bourdev and Malik 2009, Bourdev et al. 2010, Xia et al. 2013, Yadollahpour et al. 2013, Z. Li et al. 2013)

[Figure: three panels (Image, Category Label, Object Label) for a Person example]

Multiple Segmentation Hypotheses

- First used in the Bonn entry (Carreira, Li, Sminchisescu) winning the PASCAL VOC Segmentation Challenge 2009

- (CVPR 2014) New algorithm RIGOR that can achieve CPMC accuracy in 2-4 seconds per image (CPU-only)

SVRSEGM: Regression on overlap

• Regress on maximal class-specific overlap

Overlap: intersection over union of a segment A and a ground-truth object GT, |A ∩ GT| / |A ∪ GT|

Overlap with Horse class (maximized over 2 horses): 80.8%, 36.5%, 4.7%
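A minimal sketch of the regression target described above, assuming segments and ground-truth objects are given as sets of pixel coordinates. The function names and the toy masks are illustrative and not taken from the SVRSEGM code.

```python
def overlap(segment, ground_truth):
    """Intersection over union of two pixel sets."""
    segment, ground_truth = set(segment), set(ground_truth)
    union = segment | ground_truth
    if not union:
        return 0.0
    return len(segment & ground_truth) / len(union)


def class_overlap(segment, objects_of_class):
    """Regression target: best overlap with any ground-truth object of one class."""
    return max((overlap(segment, obj) for obj in objects_of_class), default=0.0)


# Toy example (hypothetical data): two "horses" and one segment hypothesis.
horse_a = {(r, c) for r in range(4) for c in range(0, 4)}
horse_b = {(r, c) for r in range(4) for c in range(6, 10)}
segment = {(r, c) for r in range(4) for c in range(2, 6)}

target = class_overlap(segment, [horse_a, horse_b])
print(target)  # overlap with the better-matching horse
```

Taking the maximum over objects means a segment is scored against the single object it covers best, so a segment spanning two horses gets a low target for both.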

Li, Carreira, Sminchisescu, CVPR 2010, IJCV 2012

SVRSEGM

• 1-vs-all class-specific overlap regression on many segment hypotheses

• Heuristic sequential post-processing

Composite Statistical Modeling

• Composite Statistical Learning (training set)

– For each training image, generate segments (bottom-up)

– Extract features on segments, learn models that predict segment statistics (overlap)

• Composite Statistical Inference (testing image)

– Generate segments and predict their statistics (overlap)

– Inference: recover pixel labels from the predictions

Open Inference Problem

• Resolve noisy predictions on noisy segments

• Identify complicated object interactions, especially occluded/disconnected objects

[Figure: category labels and object labels for an example image]

Li, Carreira, Lebanon, Sminchisescu, CVPR 2013

Idea #1: Break and Recombine

• Break the segments apart and recombine them

– Initial enumerations are constrained

• e.g. continuity, boundary adherence

– Interactions among objects

• Create occlusions!

Dissecting Segments

Seg #1: Chair 0.53, Person 0.29

Seg #2: Chair 0.36, Person 0.47

Seg #3: Chair 0.34, Person 0.54

Seg #4: Chair 0.19, Person 0.43

[Figure: the four segments and the superpixels they are dissected into]

Generating the Overlap Statistic

• Parametrize on superpixels:

$\theta_{ij} |S_i|$ = number of category $c_j$ ground-truth pixels in superpixel $S_i$

$$V_j(A_1; \theta) = \frac{|A_1 \cap GT|}{|A_1 \cup GT|} = \frac{\text{all GT pixels in } A_1}{|A_1| + \text{all GT pixels outside } A_1} = \frac{\sum_{S_i \in A_1} \theta_{ij} |S_i|}{\sum_{S_i \in A_1} |S_i| + \sum_{S_i \notin A_1} \theta_{ij} |S_i|}$$

[Figure: superpixels $S_1$ to $S_4$, each annotated with $\theta_{i1}$ (% of chair) and $\theta_{i2}$ (% of person); segment $A_1$: Chair 0.53, Person 0.29]
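The generated overlap statistic can be sketched directly from the formula above. This is an illustration only: superpixels are indexed from 0, and the sizes and θ values are made up.

```python
def generated_overlap(segment, sizes, theta_j):
    """V_j(A; theta): overlap of segment A with category j implied by theta.

    segment -- set of superpixel indices that make up A
    sizes   -- sizes[i] = |S_i|, the pixel count of superpixel i
    theta_j -- theta_j[i] = fraction of S_i that belongs to category j
    """
    inside = sum(theta_j[i] * sizes[i] for i in segment)   # GT pixels in A
    area = sum(sizes[i] for i in segment)                  # |A|
    outside = sum(theta_j[i] * sizes[i]                    # GT pixels outside A
                  for i in range(len(sizes)) if i not in segment)
    denom = area + outside
    return inside / denom if denom > 0 else 0.0


# Hypothetical numbers: four superpixels of 100 pixels each.
sizes = [100, 100, 100, 100]
theta_chair = [0.9, 0.5, 0.0, 0.2]
a1 = {0, 1}  # segment A_1 = S_1 and S_2 (0-based indices)

v = generated_overlap(a1, sizes, theta_chair)
print(v)
```

The denominator counts every pixel of A once, plus the category pixels that A misses, which is exactly the size of the union.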

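Idea #2: Composite Statistical Inference

• MCLE, moment matching:

• $A_i$: segments

• $\hat{V}_j(A_i)$: predicted overlap of $A_i$ with category $c_j$

$$\min_{\theta} \sum_{j} \sum_{i} \left( V_j(A_i; \theta) - \hat{V}_j(A_i) \right)^2$$

Here $V_j(A_i; \theta)$ is the "generated" statistic and $\hat{V}_j(A_i)$ is predicted from the regressor.

Jointly over all categories + all segments!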
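A rough sketch of the moment-matching objective, under simplifying assumptions that are not from the talk: θ is only constrained to the box [0, 1], the gradient is taken by finite differences, and a plain backtracking projected gradient step stands in for whatever optimizer the authors used. All data are synthetic.

```python
def generated_overlap(segment, sizes, theta_j):
    """V_j(A; theta) for one segment A (a non-empty set of superpixel indices)."""
    inside = sum(theta_j[i] * sizes[i] for i in segment)
    area = sum(sizes[i] for i in segment)
    outside = sum(theta_j[i] * sizes[i]
                  for i in range(len(sizes)) if i not in segment)
    return inside / (area + outside)


def objective(theta, segments, sizes, predicted):
    """Sum over categories j and segments i of (V_j(A_i; theta) - predicted[j][i])^2."""
    total = 0.0
    for j, theta_j in enumerate(theta):
        for i, seg in enumerate(segments):
            diff = generated_overlap(seg, sizes, theta_j) - predicted[j][i]
            total += diff * diff
    return total


def fit_theta(segments, sizes, predicted, steps=200, lr=1.0, eps=1e-6):
    """Projected gradient descent with backtracking (illustrative optimizer)."""
    n_cat, n_sp = len(predicted), len(sizes)
    theta = [[0.5] * n_sp for _ in range(n_cat)]
    for _ in range(steps):
        base = objective(theta, segments, sizes, predicted)
        grad = [[0.0] * n_sp for _ in range(n_cat)]
        for j in range(n_cat):
            for k in range(n_sp):
                old = theta[j][k]
                theta[j][k] = old + eps
                grad[j][k] = (objective(theta, segments, sizes, predicted) - base) / eps
                theta[j][k] = old
        step = lr
        while step > 1e-8:
            # Take a step, then clip every entry back into [0, 1].
            trial = [[min(1.0, max(0.0, t - step * g)) for t, g in zip(row, g_row)]
                     for row, g_row in zip(theta, grad)]
            if objective(trial, segments, sizes, predicted) < base:
                theta = trial
                break
            step *= 0.5
        else:
            break  # no improving step found
    return theta


# Synthetic check: "predictions" are generated from a known theta.
sizes = [100, 100, 100, 100]
segments = [{0, 1}, {1, 2}, {2, 3}, {0}, {3}]
true_theta = [[0.9, 0.6, 0.1, 0.0],   # chair
              [0.0, 0.3, 0.8, 0.9]]   # person
predicted = [[generated_overlap(s, sizes, t) for s in segments] for t in true_theta]

start = objective([[0.5] * 4 for _ in range(2)], segments, sizes, predicted)
theta = fit_theta(segments, sizes, predicted)
final = objective(theta, segments, sizes, predicted)
print(start, final)
```

Because every segment contributes a term for every category, overlapping segments constrain the same superpixels from several directions, which is what lets noisy predictions be reconciled.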

Joint optimization

• 𝜃 map after joint optimization on all objects:

[Figure: 𝜃 map for the Person category]

Idea #3: Separating Multiple Objects from the Same Category

• MAP within each category to determine the number of objects

– Geometric prior favors fewer objects

[Figure: 𝜃 maps for each candidate number of horses]

$n_{\text{horse}} = 1$: Posterior -40.133

$n_{\text{horse}} = 2$: Posterior -35.889

$n_{\text{horse}} = 3$: Posterior -47.600

Joint optimization

• 𝜃 map after joint optimization on all objects:

[Figure: 𝜃 maps for Horse and Person, and the final result with four objects (Obj1 to Obj4)]

Results: PASCAL 2012

• CSI does especially well on high-interaction objects such as bike, person, chair, sofa, etc.

With only PASCAL training data, only overlap:

SVRSEGM: 46.8%, JSL: 47.0%, CSI: 47.5%

+ mix of models, more data:

Xia et al. 2013: 48.0%, Yadollahpour et al. 2013: 48.1%, Li et al. 2013: 48.3%

[Figure: example results with labels Person, Plant, Chair, Table, Dog]

PASCAL: noise-free case

• Supply ground truth overlap to different algorithms

– Upper bound performance with perfect regressor, noisy segments

• Recombination is important!

SVRSEGM: 79.0%, CPMC Best: 81.8%, CSI: 90.2%, Superpixel Best: 95.1%

[Figure: example results with labels Person, Motorbike]

Composite Statistical Modeling

• Composite Statistical Learning (training set)

– For each training image, generate segments (bottom-up)

– Extract features on segments, perform regression to learn a model of class-specific overlap

• Composite Statistical Inference (testing image)

– Generate segments and predict their class-specific overlap

– Inference (top-down): break and recombine

Video Segmentation


Li, Kim, Humayun, Tsai, Rehg, ICCV 2013

Approach

• Track all segments from each frame

– Long-term appearance model for each track

– Every segment starts a track (1000+ tracks)

– Training: Use all segments, regress against overlap with each track (0-1 segment per frame)

[Figure: example tracks (Track 1, Track 2, Track 3)]

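Least squares works wonders

$$\min_{\mathbf{W}} \|\mathbf{X}\mathbf{W} - \mathbf{Y}\|_F^2 + \lambda \|\mathbf{W}\|_F^2$$

$$\mathbf{W} = (\mathbf{X}^\top\mathbf{X} + \lambda\mathbf{I})^{-1} \mathbf{X}^\top\mathbf{Y}$$

(one row of $\mathbf{X}$ per segment, one column of $\mathbf{Y}$ and $\mathbf{W}$ per appearance model)

• Store one vector per appearance model plus a global covariance matrix $\mathbf{X}^\top\mathbf{X}$

• Enables learning and optimal online updating of 1000+ appearance models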

How to use that in video?

• Always use the whole segment pool to train

– If we go from the 1st to the 20th frame, our training set is always all the segments in all the frames, for ANY target

– Online update: At each frame, add all segments from the frame to 𝐗⊤𝐗 and 𝐗⊤𝒚
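The online update can be sketched as follows. This is a pure-Python illustration and not the released SegTrack code: the class name, the tiny Gaussian-elimination solver, and the toy features are assumptions. Each track keeps only its own vector X⊤y, while the matrix X⊤X is shared by all tracks.

```python
def solve(a, b):
    """Solve a x = b for a square matrix a (Gaussian elimination, partial pivoting)."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


class TrackRegressors:
    """Ridge regressors for many tracks that share one X^T X."""

    def __init__(self, dim, n_tracks, lam=1e-9):
        self.lam = lam
        self.xtx = [[0.0] * dim for _ in range(dim)]       # shared X^T X
        self.xty = [[0.0] * dim for _ in range(n_tracks)]  # one X^T y per track

    def add_frame(self, features, targets):
        """features[s]: feature vector of segment s in the new frame;
        targets[t][s]: overlap of segment s with track t."""
        for s, x in enumerate(features):
            for a in range(len(x)):
                for b in range(len(x)):
                    self.xtx[a][b] += x[a] * x[b]
            for t, vec in enumerate(self.xty):
                for a in range(len(x)):
                    vec[a] += targets[t][s] * x[a]

    def weights(self, track):
        """w = (X^T X + lam I)^{-1} X^T y for one track."""
        dim = len(self.xtx)
        reg = [[self.xtx[a][b] + (self.lam if a == b else 0.0) for b in range(dim)]
               for a in range(dim)]
        return solve(reg, self.xty[track])


# Toy data: one track whose targets follow w = [0.2, 0.7] exactly.
online = TrackRegressors(dim=2, n_tracks=1)
online.add_frame([[1.0, 0.0], [0.0, 1.0]], [[0.2, 0.7]])  # frame 1
online.add_frame([[1.0, 1.0]], [[0.9]])                   # frame 2

batch = TrackRegressors(dim=2, n_tracks=1)
batch.add_frame([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [[0.2, 0.7, 0.9]])

w_online = online.weights(0)
w_batch = batch.weights(0)
print(w_online, w_batch)
```

Updating frame by frame gives the same solution as training on all frames at once, because both only depend on the accumulated sums.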

Greedy Trimming of Tracks

• Test on the next frame

– Obtain the regression result of every segment against every track

– Choose the best-scoring segment to match
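A small sketch of the matching step, assuming each track has a linear appearance model `w` and each segment in the next frame a feature vector `x`. The `min_score` cutoff that leaves a track unmatched is a hypothetical addition, suggested only by the "0-1 segment per frame" remark, and the numbers are made up.

```python
def match_tracks(weights, features, min_score=0.0):
    """For each track, return the index of its best-scoring segment, or None.

    weights  -- weights[t] is the appearance model of track t
    features -- features[s] is the feature vector of segment s (non-empty list)
    """
    matches = []
    for w in weights:
        scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in features]
        best = max(range(len(scores)), key=scores.__getitem__)
        matches.append(best if scores[best] > min_score else None)
    return matches


# Hypothetical models for three tracks and features for two segments.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
features = [[0.9, 0.1], [0.2, 0.8]]

matches = match_tracks(weights, features, min_score=0.3)
print(matches)  # → [0, 1, None]
```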

Results

• Automatically reduce the number of tracks from 1200 (CPMC) to 60 per sequence

Numbers

• We beat the closest competitor by 14%

• CSI refinement improves results by 3%

• Purely automatic, no user input

Methods compared: SPT, SPT, Kim et al. 2011, Grundmann et al. 2010, Oracle Segment

Mean per object: 62.7, 65.9, 55.4, 45.3, 51.8, 78.6

Mean per sequence: 68.0, 71.2, 58.6, 57.3, 50.8, 81.5

Avg. number of tracks: 60.0, 60.0, 702.8, 10.6, 336.6, 1219.3

Conclusion

• Composite statistical modeling

– Holistic segments, object-scale models

– Training is a breeze (regression)

– Least squares offers additional benefits

– Breaking down and recombining segments for refinement

– Refinement will be needed when we are going from 85% to 90%

• Or inferring higher-level semantics, occlusion, etc.

Thanks!

Code available!

Video Segmentation: http://www.cc.gatech.edu/~fli/SegTrack2/

Semantic Segmentation: http://www.cc.gatech.edu/~fli/CSI_tr2.pdf
