computer vision segmentation marc pollefeys comp 256 some slides and illustrations from d. forsyth,...

Post on 22-Dec-2015

217 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

ComputerVision

Segmentation

Marc PollefeysCOMP 256

Some slides and illustrations from D. Forsyth, T. Darrel,...

ComputerVision

Aug 26/28 - Introduction

Sep 2/4 Cameras Radiometry

Sep 9/11 Sources & Shadows Color

Sep 16/18 Linear filters & edges

(hurricane Isabel)

Sep 23/25 Pyramids & Texture Multi-View Geometry

Sep30/Oct2 Stereo Project proposals

Oct 7/9 Tracking (Welch) Optical flow

Oct 14/16 - -

Oct 21/23 Silhouettes/carving (Fall break)

Oct 28/30 - Structure from motion

Nov 4/6 Project update Proj. SfM

Nov 11/13 Camera calibration Segmentation

Nov 18/20 Fitting Prob. segm.&fit.

Nov 25/27 Matching templates (Thanksgiving)

Dec 2/4 Matching relations Range data

Dec ? Final project

Tentative class schedule

ComputerVision Segmentation and Grouping

• Motivation: not information is evidence

• Obtain a compact representation from an image/motion sequence/set of tokens

• Should support application

• Broad theory is absent at present

• Grouping (or clustering)– collect together

tokens that “belong together”

• Fitting– associate a model

with tokens– issues

• which model?• which token goes

to which element?• how many

elements in the model?

ComputerVision General ideas

• tokens– whatever we need to

group (pixels, points, surface elements, etc., etc.)

• top down segmentation– tokens belong

together because they lie on the same object

• bottom up segmentation– tokens belong

together because they are locally coherent

• These two are not mutually exclusive

ComputerVision

Why do these tokens belong together?

ComputerVision

ComputerVision

Basic ideas of grouping in humans

• Figure-ground discrimination– grouping can be

seen in terms of allocating some elements to a figure, some to ground

– impoverished theory

• Gestalt properties– elements in a

collection of elements can have properties that result from relationships (Muller-Lyer effect)

• gestaltqualitat

– A series of factors affect whether elements should be grouped together

• Gestalt factors

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision

Technique: Shot Boundary Detection

• Find the shots in a sequence of video– shot boundaries

usually result in big differences between succeeding frames

• Strategy:– compute interframe

distances– declare a boundary

where these are big

• Possible distances– frame differences– histogram differences– block comparisons– edge differences

• Applications:– representation for

movies, or video sequences

• find shot boundaries• obtain “most

representative” frame

– supports search

ComputerVision

Technique: Background Subtraction

• If we know what the background looks like, it is easy to identify “interesting bits”

• Applications– Person in an office– Tracking cars on a

road– surveillance

• Approach:– use a moving

average to estimate background image

– subtract from current frame

– large absolute values are interesting pixels

• trick: use morphological operations to clean up pixels

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision Segmentation as clustering

• Cluster together (pixels, tokens, etc.) that belong together

• Agglomerative clustering– attach closest to

cluster it is closest to– repeat

• Divisive clustering– split cluster along best

boundary– repeat

• Point-Cluster distance– single-link clustering– complete-link

clustering– group-average

clustering

• Dendrograms– yield a picture of

output as clustering process continues

ComputerVision Simple clustering algorithms

ComputerVision

ComputerVision K-Means

• Choose a fixed number of clusters• Choose cluster centers and point-cluster allocations

to minimize error

• can’t do this by search, because there are too many possible allocations.

x j i2

jelements of i'th cluster

iclusters

ComputerVision

K-means clustering using intensity alone and color alone

Image Clusters on intensity Clusters on color

ComputerVision

K-means using color alone, 11 segments

Image Clusters on color

ComputerVision

K-means usingcolor alone,11 segments.

ComputerVision

K-means using colour andposition, 20 segments

ComputerVision Graph theoretic clustering

• Represent tokens using a weighted graph.– affinity matrix

• Cut up this graph to get subgraphs with strong interior links

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision

ComputerVision Measuring Affinity

Intensity

Texture

Distance

aff x, y exp 12 i

2

I x I y 2

aff x, y exp 12 d

2

x y

2

aff x, y exp 12 t

2

c x c y 2

ComputerVision Scale affects affinity

ComputerVision Eigenvectors and cuts

• Simplest idea: we want a vector a giving the association between each element and a cluster

• We want elements within this cluster to, on the whole, have strong affinity with one another

• We could maximize

• But need the constraint

• This is an eigenvalue problem - choose the eigenvector of A with largest eigenvalue

aTAa

aTa1

ComputerVision Example eigenvector

points

matrix

eigenvector

ComputerVision

ComputerVision More than two segments

• Two options– Recursively split each side to get a tree,

continuing till the eigenvalues are too small

– Use the other eigenvectors

ComputerVision More than two segments

ComputerVision Normalized cuts

• Current criterion evaluates within cluster similarity, but not across cluster difference

• Instead, we’d like to maximize the within cluster similarity compared to the across cluster difference

• Write graph as V, one cluster as A and the other as B

• Maximize

• i.e. construct A, B such that their within cluster similarity is high compared to their association with the rest of the graph

assoc(A,A)assoc(A,V )

assoc(B,B)assoc(B,V )

ComputerVision

ComputerVision

ComputerVision Normalized cuts

• Write a vector y whose elements are 1 if item is in A, -b if it’s in B

• Write the matrix of the graph as W, and the matrix which has the row sums of W on its diagonal as D, 1 is the vector with all ones.

• Criterion becomes

• and we have a constraint

• This is hard to do, because y’s values are quantized

miny

yT D W yyTDy

yTD1 0

ComputerVision

ComputerVision Normalized cuts

• Instead, solve the generalized eigenvalue problem

• which gives

• Now look for a quantization threshold that maximises the criterion --- i.e all components of y above that threshold go to one, all below go to -b

maxy yT D W y subject to yTDy 1

D W yDy

ComputerVision

Figure from “Image and video segmentation: the normalised cut framework”, by Shi and Malik, copyright IEEE, 1998

ComputerVision

Figure from “Normalized cuts and image segmentation,” Shi and Malik, copyright IEEE, 2000

ComputerVision Image parsing:

Unifying Segmentation, Detection and RecognitionTu, Chen, Yuille & Zhu ICCV’03 (Marr prize!)

ComputerVision Image parsing

• Generative models for each classsome random samples for 5 and H

some random face samples (PCA model)

ComputerVision DDMCMC

Data Driven Markov Chain Monte Carlo

ComputerVision Image parsing

ComputerVision Image parsing

ComputerVision Next class:

Fitting

Reading: Chapter 15

top related