TRANSCRIPT
Perceptual and Sensory Augmented Computing - Computer Vision WS 08/09
Image Segmentation
Luc Van Gool, ETH Zurich With important contributions by Vittorio Ferrari, Un. of Edinburgh Slide credits:
K. Grauman, B. Leibe, S. Lazebnik, S. Seitz, Y. Boykov, W. Freeman, P. Kohli
Topics of This Lecture
• Introduction Ø Gestalt principles Ø Image segmentation
• Segmentation as clustering Ø k-Means Ø Feature spaces
Ø Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with path search
Grouping in Vision
Slide credit: Kristen Grauman
Fast, bottom-up mechanisms to determine regions that belong together… … stepping stone between pixels/retina cell responses and scene interpretation. Psychophysics has listed features that seem to provoke perceptual grouping
Similarity in appearance
http://chicagoist.com/attachments/chicagoist_alicia/GEESE.jpg, http://wwwdelivery.superstock.com/WI/223/1532/PreviewComp/SuperStock_1532R-0831.jpg Slide adapted from Kristen Grauman
Symmetry
http://seedmagazine.com/news/2006/10/beauty_is_in_the_processingtim.php Slide credit: Kristen Grauman
Common Fate
Image credit: Arthus-Bertrand (via F. Durand)
Slide credit: Kristen Grauman
Proximity
http://www.capital.edu/Resources/Images/outside6_035.jpg Slide credit: Kristen Grauman
The Gestalt School
• Grouping is key to visual perception • 'Belonging together' is inferred from relationships
Ø “The whole is greater than the sum of its parts”
Illusory/subjective contours
Occlusion
Familiar configuration
http://en.wikipedia.org/wiki/Gestalt_psychology Slide credit: Svetlana Lazebnik Image source: Steve Lehar
Gestalt Theory
• Gestalt: whole or group Ø Whole is greater than sum of its parts Ø Relationships among parts can yield new properties/features
• Psychologists identified series of factors that predispose set of elements to be grouped (by human visual system)
Untersuchungen zur Lehre von der Gestalt, Psychologische Forschung, Vol. 4, pp. 301-350, 1923 http://psy.ed.asu.edu/~classics/Wertheimer/Forms/forms.htm
“I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of colour. Do I have "327"? No. I have sky, house, and trees.”
Max Wertheimer (1880-1943)
Slide credit: B. Leibe
Gestalt Factors
These factors make intuitive sense, but are very difficult to translate into algorithms.
Image source: Forsyth & Ponce Slide credit: B. Leibe
Figure-Ground Discrimination
Slide credit: B. Leibe
The Ultimate Gestalt test
Slide adapted from B. Leibe
Image Segmentation
• Goal: identify groups of pixels that go together
Slide credit: Steve Seitz, Kristen Grauman
The Goals of Segmentation
• Separate image into objects
Image Human segmentation
Slide credit: Svetlana Lazebnik
Topics of This Lecture
• Introduction Ø Gestalt principles Ø Image segmentation
• Segmentation as clustering Ø k-Means Ø Feature spaces
Ø Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with path search
Image Segmentation: Toy Example
• These intensities define the three groups. • We could label every pixel in the image according to
which of these it is. Ø i.e. segment the image based on the intensity feature.
• What if the image isn’t quite so simple?
[Figure: input image and its intensity histogram with three groups of pixels: black (1), gray (2), and white (3)]
Slide credit: Kristen Grauman
[Figure: two input images and their intensity histograms (pixel count vs. intensity)]
Slide credit: Kristen Grauman
• Now how to determine the three main intensities that
define our groups? • We need to cluster.
[Figure: input image and its intensity histogram (pixel count vs. intensity)]
Slide credit: Kristen Grauman
• Goal: choose three “centers” as the representative intensities, and label every pixel according to which of these centers it is nearest to.
• The best cluster centers are those that minimize the SSD between all points and their nearest cluster center c_i: Σ_{clusters i} Σ_{x ∈ cluster i} (x − c_i)²
Slide credit: Kristen Grauman
[Figure: intensity axis from 0 to 255 with the three cluster centers 1, 2, 3 near 0, 190, and 255]
Clustering
• With this objective, it is a “chicken and egg” problem: Ø If we knew the cluster centers, we could allocate points to
groups by assigning each to its closest center.
Ø If we knew the group memberships, we could get the centers by computing the mean per group.
Slide credit: Kristen Grauman
K-Means Clustering
• Basic idea: randomly initialize the k cluster centers, and iterate between the two steps we just saw. 1. Randomly initialize the cluster centers, c1, ..., cK
2. Given cluster centers, determine points in each cluster – For each point p, find the closest ci. Put p into cluster i
3. Given points in each cluster, solve for ci – Set ci to be the mean of points in cluster i
4. If ci have changed, repeat Step 2
• Properties Ø Will always converge to some solution Ø Can be a “local minimum”
– Does not always find the global minimum of the objective function: Σ_{clusters i} Σ_{points p in cluster i} ‖p − c_i‖²
Slide credit: Steve Seitz
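The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration (my own naming, not reference code from the lecture), and it assumes no cluster ever ends up empty:

```python
import numpy as np

def kmeans(points, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Randomly initialize the k cluster centers from the data points
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # 2. Assign each point to its closest center
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Recompute each center as the mean of its assigned points
        #    (assumes every cluster keeps at least one point)
        new_centers = np.array([points[labels == c].mean(axis=0)
                                for c in range(k)])
        # 4. Stop when the centers no longer change
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

On two well-separated intensity groups this converges to the two group means regardless of which data points are drawn as initial centers.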
Segmentation as Clustering
[Figure: k-means segmentations of the same image with K=2 and K=3]
Slide credit: Kristen Grauman
Feature Space
• Depending on what we choose as the feature space, we can group pixels in different ways.
• Grouping pixels based on intensity similarity
• Feature space: intensity value (1D)
Slide credit: Kristen Grauman
Feature Space
• Depending on what we choose as the feature space, we can group pixels in different ways.
• Grouping pixels based on color similarity
• Feature space: color value (3D)
[Figure: example pixel colors, e.g. R=255 G=200 B=250, R=245 G=220 B=248, R=15 G=189 B=2, R=3 G=12 B=2, plotted in 3D R-G-B space]
Slide credit: Kristen Grauman
Segmentation as Clustering
• Depending on what we choose as the feature space, we can group pixels in different ways.
• Grouping pixels based on texture similarity
• Feature space: filter bank responses (e.g. 24D)
[Figure: filter bank of 24 filters F1, F2, …, F24]
Slide credit: Kristen Grauman
Spatial coherence
• Assign a cluster label per pixel → possible discontinuities
• How can we ensure they are spatially smooth?
[Figure: original image vs. image labeled by the cluster centers' intensities (clusters 1, 2, 3), showing isolated mislabeled pixels]
Slide adapted from Kristen Grauman
Spatial coherence
• Depending on what we choose as the feature space, we can group pixels in different ways.
• Grouping pixels based on intensity+position similarity
Way to encode both similarity and proximity.
Slide adapted from Kristen Grauman
[Figure: pixels plotted in a 3D feature space with axes X, Y, and intensity]
Summary K-Means
• Pros Ø Simple, fast to compute Ø Converges to local minimum
of within-cluster squared error
• Cons/issues Ø Setting k? Ø Sensitive to initial centers Ø Sensitive to outliers Ø Detects spherical clusters only
Slide credit: Kristen Grauman
Probabilistic Clustering
• Basic questions Ø What’s the probability that a point x is in cluster m? Ø What’s the shape of each cluster?
• K-means doesn’t answer these questions.
• Basic idea Ø Instead of treating the data as a bunch of points, assume that
they are all generated by sampling a continuous function. Ø This function is called a generative model.
Ø Defined by a vector of parameters θ Ø No hard assignment (as with K-means), but a soft assignment to the different clusters, each with its own probability
Slide credit: Steve Seitz
Mixture of Gaussians
• One generative model is a mixture of Gaussians (MoG) Ø K Gaussian blobs with means µ_b, covariance matrices V_b, dimension d
– Blob b is defined by: p(x | θ_b) = (1 / √((2π)^d |V_b|)) exp( −½ (x − µ_b)^T V_b^{-1} (x − µ_b) )
Ø Blob b is selected with probability α_b, with Σ_{b=1}^{K} α_b = 1
Ø The likelihood of observing x is a weighted mixture of Gaussians: p(x | θ) = Σ_{b=1}^{K} α_b p(x | θ_b), with θ = (µ_1, V_1, α_1, …, µ_K, V_K, α_K)
Slide adapted from Steve Seitz
Expectation Maximization (EM)
• Goal Ø Find the blob parameters θ that maximize the likelihood function over all datapoints
• Approach: 1. E-step: given current guess of blobs, compute probabilistic ownership
of each point 2. M-step: given ownership probabilities, update blobs to maximize
likelihood function 3. Repeat until convergence
Slide adapted from Steve Seitz
EM Details
• E-step Ø Compute the probability that point x is in blob b, given the current guess of θ:
P(b | x, θ) = α_b p(x | θ_b) / Σ_{j=1}^{K} α_j p(x | θ_j)
• M-step (N data points) Ø Compute the overall probability that blob b is selected: α_b = (1/N) Σ_x P(b | x, θ)
Ø Mean of blob b: µ_b = Σ_x P(b | x, θ) x / Σ_x P(b | x, θ)
Ø Covariance of blob b: V_b = Σ_x P(b | x, θ) (x − µ_b)(x − µ_b)^T / Σ_x P(b | x, θ)
Slide adapted from Steve Seitz
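The E- and M-steps translate almost line-for-line into NumPy. The sketch below is my own minimal 1-D version (scalar variances instead of covariance matrices, made-up initialization), not the lecture's code:

```python
import numpy as np

def gauss(x, mu, var):
    # 1-D Gaussian density N(x; mu, var)
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

def em_mog(x, k=2, iters=50):
    # Initialize blob parameters theta = (alpha, mu, var)
    alpha = np.full(k, 1.0 / k)
    mu = np.linspace(x.min(), x.max(), k)
    var = np.full(k, x.var() / k + 1e-6)
    for _ in range(iters):
        # E-step: ownership P(b | x) proportional to alpha_b * N(x; mu_b, var_b)
        resp = alpha[None, :] * gauss(x[:, None], mu[None, :], var[None, :])
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update alpha, mu, var from the ownership probabilities
        nb = resp.sum(axis=0)            # effective number of points per blob
        alpha = nb / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nb
        var = (resp * (x[:, None] - mu[None, :]) ** 2).sum(axis=0) / nb + 1e-6
    return alpha, mu, var
```

On data drawn around two separated intensities the blob means converge to the two group means, and the mixing weights α still sum to one.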
Segmentation with EM
Image source: Serge Belongie Slide credit: B. Leibe
K = 3
Summary: Mixtures of Gaussians, EM
• Pros Ø Probabilistic interpretation Ø Soft assignments between data points and clusters Ø Generative model, can predict novel data points Ø Relatively compact storage
• Cons Ø Initialization
– often a good idea to start from output of k-means
Ø Local minima Ø Need to know number of components K
– solutions: add a cost for model complexity
Ø Need to choose generative model (math form of a cluster ?)
Slide adapted from B. Leibe
Topics of This Lecture
• Introduction Ø Gestalt principles Ø Image segmentation
• Segmentation as clustering Ø k-Means Ø Feature spaces
Ø Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with GraphCuts
Topics of This Lecture
• Introduction Ø Gestalt principles Ø Image segmentation
• Segmentation as clustering Ø k-Means Ø Feature spaces
Ø Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with path search
Finding Modes in a Histogram
• How many modes are there? Ø Mode = local maximum of a given distribution Ø Easy to see, hard to compute
Slide adapted from Steve Seitz
Mean-Shift Segmentation
• An advanced and versatile technique for clustering-based segmentation
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.
Slide credit: Svetlana Lazebnik
Mean-Shift Algorithm
• Iterative Mode Search 1. Initialize random seed center and window W 2. Calculate the center of gravity (the “mean”) of W: mean(W) = (1/|W|) Σ_{x ∈ W} x (for a flat kernel; more generally a kernel-weighted average) 3. Shift the search window to the mean 4. Repeat steps 2+3 until convergence
Slide adapted from Steve Seitz
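The iterative mode search can be sketched as follows, using a flat (uniform) kernel of radius h. This is my own minimal illustration, not the paper's code:

```python
import numpy as np

def mean_shift_mode(points, seed_point, h, tol=1e-5, max_iter=500):
    """Shift a window of radius h until it converges on a mode."""
    center = np.asarray(seed_point, dtype=float)
    for _ in range(max_iter):
        # Points inside the current window W
        in_window = np.linalg.norm(points - center, axis=1) < h
        if not in_window.any():
            break                        # empty window: give up at this seed
        # Center of gravity (the "mean") of W, flat kernel
        mean = points[in_window].mean(axis=0)
        # Shift the window to the mean; stop when the shift is tiny
        if np.linalg.norm(mean - center) < tol:
            break
        center = mean
    return center
```

Seeded near a tight cluster, the window drifts onto the cluster's center of mass and stops there.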
Mean-Shift
[Animation over several slides: a region of interest is repeatedly shifted along the mean-shift vector toward its center of mass until it converges on a mode]
Slide by Y. Ukrainitz & B. Sarel
Real Modality Analysis
Tessellate the space with windows and run the procedure in parallel.
Slide by Y. Ukrainitz & B. Sarel
Real Modality Analysis
The blue data points were traversed by the windows on their way towards the mode.
Slide by Y. Ukrainitz & B. Sarel
Mean-Shift Clustering
• Cluster: all data points in the attraction basin of a mode • Attraction basin: the region for which all trajectories
lead to the same mode
Slide by Y. Ukrainitz & B. Sarel
Mean-Shift Clustering/Segmentation
• Choose features (color, gradients, texture, etc) • Initialize windows at individual pixel locations • Start mean-shift from each window until convergence • Merge windows that end up near the same “peak” or
mode
Slide adapted from Svetlana Lazebnik
Mean-Shift Segmentation Results
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html Slide credit: Svetlana Lazebnik
Summary Mean-Shift
• Pros Ø General, application-independent tool Ø Model-free, does not assume any prior shape (spherical,
elliptical, etc.) on data clusters Ø Just a single parameter (window size h)
– h has a physical meaning (unlike k in k-means): the scale of the clustering
Ø Finds variable number of modes given the same h Ø Robust to outliers
• Cons Ø Output depends on window size h Ø Window size (bandwidth) selection is not trivial Ø Computationally rather expensive Ø Does not scale well with dimension of feature space
(sparsity problems in high-dimensional spaces…)
Slide adapted from Svetlana Lazebnik
Topics of This Lecture
• Introduction Ø Gestalt principles Ø Image segmentation
• Segmentation as clustering Ø k-Means Ø Feature spaces
Ø Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with path search
Images as Graphs
• Fully-connected graph Ø Node (vertex) for every pixel Ø Edge between every pair of pixels (p,q) Ø Affinity weight wpq for each edge
– wpq measures similarity – Similarity is inversely proportional to difference
(in color, texture, position, …)
[Figure: pixels p and q connected by an edge with affinity weight w_pq]
Slide adapted from Steve Seitz
Measuring Affinity
• Distance: aff(x, y) = exp{ −(1/(2σ_d²)) ‖x − y‖² }
• Intensity: aff(x, y) = exp{ −(1/(2σ_I²)) ‖I(x) − I(y)‖² }
• Color: aff(x, y) = exp{ −(1/(2σ_c²)) dist(c(x), c(y))² } (some suitable color space distance)
• Texture: aff(x, y) = exp{ −(1/(2σ_t²)) ‖f(x) − f(y)‖² } (vectors of filter outputs)
Source: Forsyth & Ponce
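Each affinity is a Gaussian of a feature distance, so a single helper covers all four cases; only the feature vector and the scale σ change. A small sketch (function name and σ values are mine):

```python
import numpy as np

def affinity(f_x, f_y, sigma):
    """aff(x, y) = exp(-||f(x) - f(y)||^2 / (2 sigma^2)).

    f_x, f_y: feature vectors of the two pixels (position, intensity,
    color, or filter-bank responses); sigma: scale of the feature."""
    d2 = np.sum((np.asarray(f_x, float) - np.asarray(f_y, float)) ** 2)
    return np.exp(-d2 / (2.0 * sigma ** 2))
```

Identical features give affinity 1; the affinity decays monotonically toward 0 as the feature distance grows, which is exactly the "similarity inversely proportional to difference" behavior the graph needs.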
Segmentation by Graph Cuts
• Break Graph into Segments Ø Delete edges crossing between segments Ø Easiest to break edges with low similarity (low weight)
– Similar pixels should be in the same segments – Dissimilar pixels should be in different segments
[Figure: deleting low-weight edges breaks the graph into segments A, B, C]
Slide adapted from Steve Seitz
Graph Cut (GC)
• GC = set of edges whose removal partitions a graph in two • Cost of a cut
Ø Sum of the weights of the cut edges: cut(A, B) = Σ_{p ∈ A, q ∈ B} w_{p,q}
• A graph cut gives us a segmentation Ø What is a “good” graph cut and how do we find one?
Slide adapted from Steve Seitz
Minimum Cut
• We can do segmentation by finding the minimum cut in a graph Ø Efficient algorithms exist for doing this
• Drawback: Ø Weight of cut proportional to number of edges in the cut Ø Minimum cut tends to cut off very small, isolated components
Ideal Cut
Cuts with lesser weight than the ideal cut
Slide credit: Khurram Hassan-Shafique
Normalized Cut (NCut)
• Min-cut has a bias toward partitioning out small segments • This can be fixed by normalizing for the size of the segments • The normalized cut cost is:
NCut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)
where assoc(A, V) = sum of the weights from A to all nodes in the graph
• The exact solution is NP-hard, but an approximation can be computed by solving a generalized eigenvalue problem.
J. Shi and J. Malik. Normalized cuts and image segmentation. PAMI 2000
Slide adapted from Svetlana Lazebnik
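The standard approximation solves the generalized eigenvalue problem (D − W)y = λDy and thresholds the second-smallest eigenvector. A dense sketch via the normalized Laplacian (my own code; the median threshold is one simple choice, assumes all degrees are positive):

```python
import numpy as np

def ncut_bipartition(W):
    """Approximate NCut bipartition of an affinity matrix W.

    Solves (D - W) y = lambda D y through the symmetric normalized
    Laplacian D^{-1/2} (D - W) D^{-1/2}, then thresholds the
    2nd-smallest eigenvector."""
    d = W.sum(axis=1)                    # degrees
    d_isqrt = 1.0 / np.sqrt(d)
    L = np.diag(d) - W                   # unnormalized Laplacian
    Ln = d_isqrt[:, None] * L * d_isqrt[None, :]
    vals, vecs = np.linalg.eigh(Ln)      # eigenvalues in ascending order
    y = d_isqrt * vecs[:, 1]             # map back: y = D^{-1/2} z
    return y > np.median(y)              # boolean segment labels
```

On two weakly coupled cliques the second eigenvector separates the cliques by sign, so the threshold recovers the two segments.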
Interpretation as a Dynamical System
• Treat the edges as springs and ‘shake’ the system Ø Elasticity proportional to cost Ø Vibration “modes” correspond to segments
– Can compute these by solving a generalized eigenvector problem
Slide adapted from Steve Seitz
NCuts Example
Image source: Shi & Malik
NCuts segments
Slide credit: B. Leibe
Color Image Segmentation with NCuts
Image Source: Shi & Malik Slide credit: Steve Seitz
Results with Color & Texture
Summary: Normalized Cuts
• Pros: Ø Generic framework, flexible to choice of function that computes
weights (“affinities”) between nodes Ø Does not require any model of the data distribution
• Cons: Ø Time and memory complexity can be high
– Dense, highly connected graphs ⇒ many affinity computations – Solving the eigenvalue problem
Ø Preference for balanced partitions – If a region is uniform, NCuts will find the modes of vibration of the image dimensions
Slide credit: Kristen Grauman
Markov Random Fields
• Allow rich probabilistic models for images • But built in a local, modular way
Ø Learn local effects, get global effects out
Slide credit: William Freeman
Observed evidence
Hidden “true states”
Neighborhood relations
MRF Nodes as Pixels (or Patches)
Image
Image pixels
states (e.g. foreground/background)
Slide adapted from William Freeman
Φ(x_i, y_i): image-state compatibility; Ψ(x_i, x_j): state-state compatibility
Network Joint Probability
P(x, y) = ∏_i Φ(x_i, y_i) · ∏_{i,j} Ψ(x_i, x_j)
Φ(x_i, y_i): image-state compatibility function (local observations)
Ψ(x_i, x_j): state-state compatibility function (neighboring nodes)
Slide adapted from William Freeman
Energy Formulation
• Joint probability: P(x, y) = ∏_i Φ(x_i, y_i) · ∏_{i,j} Ψ(x_i, x_j)
• Maximizing the joint probability is the same as minimizing the -log:
−log P(x, y) = −Σ_i log Φ(x_i, y_i) − Σ_{i,j} log Ψ(x_i, x_j)
E(x, y) = Σ_i φ(x_i, y_i) + Σ_{i,j} ψ(x_i, x_j), with φ = −log Φ and ψ = −log Ψ
• This is similar to free-energy problems in statistical mechanics (spin glass theory). We therefore draw the analogy and call E an energy function.
• φ and ψ are called potentials.
Slide credit: B. Leibe
Energy Formulation
• Energy function: E(x, y) = Σ_i φ(x_i, y_i) + Σ_{i,j} ψ(x_i, x_j)
(the first sum collects the unary potentials, the second the pairwise potentials)
• Unary potentials φ(x_i, y_i) Ø Encode local information about the given pixel/patch Ø How likely is a pixel/patch to be in a certain state (e.g. foreground/background)?
• Pairwise potentials ψ(x_i, x_j) Ø Encode neighborhood information Ø How different is a pixel/patch’s label from that of its neighbor? (e.g. here independent of image data, but later based on intensity/color/texture difference)
Slide adapted from B. Leibe
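For a concrete feel, here is how one might evaluate E(x, y) for a binary labeling on a 4-neighborhood grid. The unary costs and the Potts pairwise term (a constant penalty λ for each disagreeing neighbor pair) are my own illustrative choices, not the lecture's code:

```python
import numpy as np

def mrf_energy(labels, unary, lam=1.0):
    """E(x, y) = sum_i phi(x_i, y_i) + sum_{i,j} psi(x_i, x_j).

    labels: HxW array of {0, 1}; unary: HxWx2 array of costs
    phi(x_i = l, y_i); pairwise: Potts model psi = lam * [x_i != x_j]
    summed over 4-neighbor pairs."""
    h, w = labels.shape
    # Unary term: pick each pixel's cost for its assigned label
    e = unary[np.arange(h)[:, None], np.arange(w)[None, :], labels].sum()
    # Pairwise Potts term over vertical and horizontal neighbor pairs
    e += lam * (labels[1:, :] != labels[:-1, :]).sum()
    e += lam * (labels[:, 1:] != labels[:, :-1]).sum()
    return float(e)
```

A uniform labeling pays only its unary costs; every label discontinuity between neighbors adds λ on top, which is the smoothness preference the pairwise potential encodes.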
Energy Minimization
• Goal: Ø Infer the optimal labeling of the MRF.
• Many inference algorithms are available, e.g. Ø Gibbs sampling, simulated annealing Ø Iterated conditional modes (ICM) Ø Variational methods Ø Belief propagation Ø Graph cuts
• Recently, Graph Cuts have become a popular tool Ø Only suitable for a certain class of energy functions Ø But the solution can be obtained very fast for typical vision
problems (~1MPixel/sec).
Slide credit: B. Leibe
Graph Cuts for Optimal Boundary Detection
• Idea: convert MRF into source-sink graph
[Figure: source-sink graph with terminals s and t, n-links between pixels, hard-constraint t-links, and a cut]
Minimum cost cut can be computed in polynomial time
(max-flow/min-cut algorithms)
[Boykov & Jolly, ICCV’01] Slide adapted from Yuri Boykov
Adding Regional Properties
[Figure: source-sink graph with n-links w_pq between pixels and t-links D_p(s), D_p(t) to the terminals s and t; a cut]
• Regional bias example Ø Suppose we are given “expected” intensities I^s and I^t of object and background:
D_p(s) ∝ exp( −‖I_p − I^s‖² / (2σ²) )
D_p(t) ∝ exp( −‖I_p − I^t‖² / (2σ²) )
[Boykov & Jolly, ICCV’01] Slide credit: Yuri Boykov
Adding Regional Properties
• More generally, regional bias can be based on any intensity models of object and background
D_p(L_p) = −log Pr(I_p | L_p)
given object and background intensity histograms Pr(I_p | s) and Pr(I_p | t)
[Figure: a cut; t-links D_p(s), D_p(t) to terminals s and t; plot of Pr(I_p | s) and Pr(I_p | t) over intensity I_p]
[Boykov & Jolly, ICCV’01] Slide credit: Yuri Boykov
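The histogram-based t-link costs D_p(L_p) = −log Pr(I_p | L_p) could be computed like this. A sketch under my own naming; the histograms stand in for statistics estimated from user scribbles:

```python
import numpy as np

def tlink_costs(intensities, hist_obj, hist_bkg, eps=1e-8):
    """D_p(s) = -log Pr(I_p | obj), D_p(t) = -log Pr(I_p | bkg).

    intensities: integer pixel values in [0, 255]; hist_obj, hist_bkg:
    normalized 256-bin histograms, e.g. from user scribbles on the
    object and the background."""
    p_obj = hist_obj[intensities] + eps   # eps avoids log(0)
    p_bkg = hist_bkg[intensities] + eps
    return -np.log(p_obj), -np.log(p_bkg)
```

A pixel whose intensity is frequent in the object histogram gets a small object cost and a large background cost, so the min-cut prefers to keep it on the object side.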
How Does it Work? The s-t-Mincut Problem
[Figure: example graph with terminals Source and Sink, nodes v1, v2, and edge costs 2, 5, 9, 4, 2, 1]
Graph (V, E, C)
Vertices V = {v1, v2 ... vn}
Edges E = {(v1, v2) ....}
Costs C = {c(1, 2) ....}
Slide credit: Pushmeet Kohli
The s-t-Mincut Problem
[Figure: example graph with terminals Source and Sink, nodes v1, v2, and edge costs 2, 5, 9, 4, 2, 1]
Slide credit: Pushmeet Kohli
What is an st-cut? An st-cut (S,T) divides the nodes between source and sink.
What is the cost of an st-cut? The sum of the costs of all edges going from S to T: here 5 + 2 + 9 = 16.
The s-t-Mincut Problem
[Figure: example graph with terminals Source and Sink, nodes v1, v2, and edge costs 2, 5, 9, 4, 2, 1]
Slide credit: Pushmeet Kohli
What is an st-cut? An st-cut (S,T) divides the nodes between source and sink.
What is the cost of an st-cut? The sum of the costs of all edges going from S to T.
What is the st-mincut? The st-cut with the minimum cost: here 2 + 1 + 4 = 7.
How to Compute the s-t-Mincut?
[Figure: example graph with terminals Source and Sink, nodes v1, v2, and edge costs 2, 5, 9, 4, 2, 1]
Solve the dual maximum flow problem
In every network, the maximum flow equals the cost of the st-mincut
Min-cut/Max-flow Theorem
Compute the maximum flow between Source and Sink
Constraints
Edges: Flow ≤ Capacity
Nodes: Flow in = Flow out
Slide credit: Pushmeet Kohli
Maxflow Algorithms
[Figure: the example graph with all residual capacities intact (2, 5, 9, 4, 2, 1)]
Slide credit: Pushmeet Kohli
Augmenting Path Based Algorithms
1. Find path from source to sink with positive capacity
2. Push maximum possible flow through this path
3. Repeat until no path can be found
Algorithms assume non-negative capacity
Flow = 0
Maxflow Algorithms (worked example, traced over several slides)
Running the augmenting-path procedure on the example graph:
1. Find a first augmenting path with positive capacity and push 2 units through it (Flow = 2).
2. Find a second augmenting path and push 4 units (Flow = 2 + 4 = 6).
3. Find a last augmenting path through the cross edge and push 1 unit (Flow = 6 + 1 = 7).
No augmenting path with positive residual capacity remains, so the maximum flow is 7, equal to the cost of the st-mincut (2 + 1 + 4 = 7).
Slide credit: Pushmeet Kohli
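The augmenting-path procedure traced above can be written as a short Edmonds-Karp sketch (BFS for the shortest augmenting path). The capacity assignment below is my reading of the example graph, chosen to be consistent with the traced cut costs (s→v1 = 2, s→v2 = 9, v1→t = 5, v2→t = 4, v1→v2 = 2, v2→v1 = 1):

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly find a shortest augmenting path (BFS)
    with positive residual capacity and push the bottleneck flow."""
    n = len(cap)
    res = [row[:] for row in cap]        # residual capacities
    flow = 0
    while True:
        # BFS for an s -> t path with positive residual capacity
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and res[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:              # no augmenting path left: done
            return flow
        # Bottleneck capacity along the path
        bottleneck, v = float("inf"), t
        while v != s:
            u = parent[v]
            bottleneck = min(bottleneck, res[u][v])
            v = u
        # Update the residual graph; reverse edges let flow be undone
        v = t
        while v != s:
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck
```

On that graph (nodes 0 = Source, 1 = v1, 2 = v2, 3 = Sink) the algorithm pushes 2, then 4, then 1, and returns 7, matching the st-mincut cost.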
Dealing with Non-Binary Cases
• For image segmentation, the limitation to binary energies is a nuisance: the construction above yields a binary segmentation only.
• We would like to solve multi-label problems as well. Ø NP-hard problem with 3 or more labels
• There exist approximation algorithms that extend graph cuts to the multi-label case Ø E.g. α-expansion
• They are no longer guaranteed to return the globally optimal result. Ø But α-expansion has a guaranteed approximation quality and converges in a few iterations.
Slide credit: B. Leibe
Summary: Graph Cuts Segmentation
• Pros Ø Powerful technique, based on probabilistic model (MRF). Ø Applicable for a wide range of problems. Ø Very efficient algorithms available for vision problems. Ø Becoming a de-facto standard for many segmentation tasks.
• Cons/Issues Ø Graph cuts can only solve a limited class of models
– Submodular energy functions – Can capture only part of the expressiveness of MRFs
Ø Only approximate algorithms available for multi-label case
Slide credit: B. Leibe
Segmentation: Caveats
• We’ve looked at bottom-up ways to segment an image into regions, yet finding meaningful segments is intertwined with the recognition problem.
• Often want to avoid making hard decisions too soon
• Difficult to evaluate; when is a segmentation successful?
Slide credit: Kristen Grauman
Speeding up 1: start from ‘superpixels’
• Start from an over-segmentation in which similar-looking pixels have already been grouped together quickly; this requires object boundaries to be preserved as part of the superpixel edges!
X. Ren and J. Malik. Learning a classification model for segmentation. ICCV 2003.
“superpixels”
Slide credit: Svetlana Lazebnik
Speeding up 2: objectness
Trying to draw bounding boxes around objects, without knowing what they are
[Figure 7 from the objectness paper: best windows proposed by the method (yellow), out of 1000, for the ground-truth annotations (blue). The bottom-right examples show some limitations: the approach fails to merge two parts of the same object that are dissimilar and have a thin common border, or when the superpixel segmentations miss object boundaries.]
yellow: bounding boxes by computer / blue: by human
• Focus on regions that an `objectness’ score indicates as probably containing an object
Topics of This Lecture
• Introduction Ø Gestalt principles Ø Image segmentation
• Segmentation as clustering Ø k-Means Ø Feature spaces
Ø Mixture of Gaussians, EM
• Model-free clustering: Mean-Shift
• Graph theoretic segmentation: Normalized Cuts
• Interactive Segmentation with path search
Dynamic path search: principle
Guided by a user-supplied cost function expressing expectations (e.g. good edges contain pixels with high gradients, edges should be smooth, etc.),
find the optimal path through the image:
1. having the lowest cost
2. satisfying constraints (e.g. given endpoints)
Useful in interactive applications (e.g. medical), or when the environment is constrained
Dynamic path search : nomenclature
A graph consists of nodes (pixels) connected by arcs (steps)
Nodes connected by steps are parents and successors
Identifying a node’s successors is expansion of that node
A tree is a graph in which every node has exactly one parent (except the root, which has none)
(our arcs are undirected)
Dynamic path search : nomenclature cont’d
Often the arcs are assigned a cost
A sequence of nodes n1, n2, …, nk (with ni a successor of ni−1) is a path of length k
Usually path cost = Σ arc costs
Dynamic path search : cost functions
Cost function incorporates problem-specific information
e.g. penalize changes in edge direction e.g. penalize the inclusion of pixels with low intensity gradient
the problem is one of optimization:
1. gradient descent
2. path array methods
3. best-first search
Gradient descent
Always choose the next pixel that adds the smallest cost
Example :
Gradient descent : example
angiogram image :
Gradient descent : example cont’d
Cost: Σᵢ ( ‖∇Iᵢ‖ + dᵢ/3 )^(−1)
Gradient descent : problem
Gradient descent never looks back :
Gradient descent : remarks
Gradient descent used for several purposes
Fast but no guarantee that the optimal path is found
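The greedy strategy can be sketched in a few lines; the cost array below is hypothetical, and the path is assumed to advance one column per step to one of three neighbouring rows:

```python
import numpy as np

def greedy_path(cost, start_row):
    """Greedy ('gradient descent') path tracing: starting in column 0,
    always step to the cheapest of the (up to) three neighbours in the
    next column. Fast, but never looks back, so it can miss the optimum."""
    rows, cols = cost.shape
    path = [start_row]
    r = start_row
    for c in range(1, cols):
        # candidate rows in the next column: up-right, right, down-right
        candidates = [rr for rr in (r - 1, r, r + 1) if 0 <= rr < rows]
        r = min(candidates, key=lambda rr: cost[rr, c])
        path.append(r)
    return path

cost = np.array([[1, 1, 9, 9],
                 [5, 2, 1, 5],
                 [9, 9, 2, 1]])
greedy_path(cost, 0)  # returns [0, 0, 1, 2]
```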
Path array techniques : principle
Illustration of one-pass example : the angiogram
Select the cheapest step to a node
Remember it by putting a pointer back
Path array techniques : principle
Upon reaching the last column…
Select cheapest node in last column
BACKTRACK
Path array techniques : principle
Summary:
From left to right: determine the cheapest path + cost for each pixel; set a pointer back
Determine the pixel with the lowest value in the right column
“backtrack”
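The procedure above maps onto a short dynamic-programming sketch (hypothetical cost array; paths advance one column per step, to one of three neighbouring rows):

```python
import numpy as np

def path_array(cost):
    """One-pass path array: sweep left to right, storing for every pixel
    the cheapest cumulative cost P and a pointer to the best predecessor
    row; then backtrack from the cheapest pixel in the last column."""
    rows, cols = cost.shape
    P = np.full((rows, cols), np.inf)
    back = np.zeros((rows, cols), dtype=int)   # predecessor row
    P[:, 0] = cost[:, 0]
    for c in range(1, cols):
        for r in range(rows):
            preds = [rr for rr in (r - 1, r, r + 1) if 0 <= rr < rows]
            best = min(preds, key=lambda rr: P[rr, c - 1])
            P[r, c] = P[best, c - 1] + cost[r, c]
            back[r, c] = best
    # backtrack from the cheapest node in the last column
    r = int(np.argmin(P[:, -1]))
    total = float(P[r, -1])
    path = [r]
    for c in range(cols - 1, 0, -1):
        r = back[r, c]
        path.append(r)
    return path[::-1], total

cost = np.array([[1, 9, 9, 9],
                 [5, 2, 2, 5],
                 [9, 9, 9, 1]])
path_array(cost)  # returns ([0, 1, 1, 2], 6.0)
```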
Path array techniques : example
Optimal paths are found :
Path array techniques : remarks
guaranteed to find the optimal path
solves many optimisation problems simultaneously: from each point the cheapest path can be backtracked
wasteful in that sense (most of these paths are never needed)
the program is simple
cannot handle meandering paths, so far
Path array techniques : principle of F*
Solution to meandering problem via multi-pass algorithm : F*
Path array is constructed iteratively until no changes
F* finds optimal paths
notation : P( i , j ) = cost up to point ( i , j ) C( i’,j’, i, j ) = cost for step from ( i’, j’) to ( i, j )
Path array techniques : F* algorithm
1. P initialized to ∞, starting node = 0
2. Top to bottom, row by row updating of P:
3. Alternating bottom to top and top to bottom
4. Stops when no changes occur
5. The optimal path is backtracked
first, from left to right:
P(i,j) := min { P(i,j),
C(i−1,j−1, i,j) + P(i−1,j−1),
C(i−1,j, i,j) + P(i−1,j),
C(i−1,j+1, i,j) + P(i−1,j+1),
C(i,j−1, i,j) + P(i,j−1) }
then, from right to left:
P(i,j) := min { P(i,j), C(i,j+1, i,j) + P(i,j+1) }
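A simplified sketch of the idea, assuming node costs C(i,j) only (as in the example that follows) and an 8-neighbourhood; for brevity it relaxes all neighbours in alternating sweeps until convergence, rather than reproducing the exact per-pass update order of F*:

```python
import numpy as np

def f_star(cost, start, goal):
    """Simplified F*-style sketch with node costs only: iterate
    alternating sweeps over the image, relaxing each pixel's
    accumulated cost P from its 8 neighbours, until P stops
    changing. Handles meandering paths, unlike a single pass."""
    rows, cols = cost.shape
    P = np.full((rows, cols), np.inf)
    P[start] = cost[start]
    changed, down = True, True
    while changed:
        changed = False
        row_order = range(rows) if down else range(rows - 1, -1, -1)
        for i in row_order:
            # within each row: a left-to-right then a right-to-left pass
            for col_order in (range(cols), range(cols - 1, -1, -1)):
                for j in col_order:
                    for di in (-1, 0, 1):
                        for dj in (-1, 0, 1):
                            if di == 0 and dj == 0:
                                continue
                            ii, jj = i + di, j + dj
                            if 0 <= ii < rows and 0 <= jj < cols:
                                cand = P[ii, jj] + cost[i, j]
                                if cand < P[i, j]:
                                    P[i, j] = cand
                                    changed = True
        down = not down  # next sweep runs in the opposite direction
    return P[goal]

cost = np.array([[1, 9, 1],
                 [1, 9, 1],
                 [1, 1, 1]])
f_star(cost, (0, 0), (0, 2))  # returns 5.0 (path detours around the 9s)
```

With endpoints given, as in the road example, the optimal path itself is then backtracked from the goal via the cheapest neighbour.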
Path array techniques : F* example
Road detection in aerial image :
Endpoints are given
Path array techniques : F* example
Cost : C(i, j) = (255 − F(i, j))^a
F ( i, j ) is scaled output of matched convolution filter :
Path array techniques : F* example
Result for a = 1 Result for a = 2.4
Path array techniques : F* remarks
Choice of the cost function is crucial, but not trivial!
Note the special structure of the cost function in the example:
C(i′, j′, i, j) = C(i, j)
⇒ backtracking via the cheapest neighbour instead of pointers
F* yields the optimal solution
Invests much effort in finding irrelevant solutions as before
Best - first search : principle
Strike a balance between the two previous ones
1. Uses problem-specific information to guide the process selectively
2. Returns the optimal solution if applied properly
New data structures:
– list OPEN: end nodes of paths
– list CLOSED: nodes already passed through
– furthermore: an evaluation function f(n)
Best - first : the algorithm BF
1. Put start node s on OPEN
2. If OPEN is empty, exit with failure
3. Take the OPEN node n with minimal f, put it on CLOSED
4. Expand n
5. If a successor is a goal node, exit: solved
6. For every successor n′:
– Calculate f(n′)
– If n′ is neither on OPEN nor CLOSED: put n′ on OPEN, set pointer
– If n′ is already on OPEN or CLOSED: replace f(n′) and redirect the pointer if the new f(n′) is lower; if n′ is on CLOSED, move it back to OPEN
7. Go to step 2.
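The bookkeeping with OPEN and CLOSED can be sketched compactly with a priority queue; the graph, costs and evaluation function below are hypothetical, and (as in the BF* variant discussed next) the goal test happens when a node is selected rather than when it is generated:

```python
import heapq

def best_first(graph, s, goal, f):
    """Best-first sketch: OPEN is a priority queue keyed by the
    evaluation function f(g, n); CLOSED records expanded nodes.
    The goal test happens when a node is selected (BF*-style)."""
    OPEN = [(f(0, s), 0, s)]
    best_g = {s: 0}          # cheapest known path cost to each node
    CLOSED = set()
    while OPEN:                          # step 2: OPEN empty -> failure
        fv, g, n = heapq.heappop(OPEN)   # step 3: node with minimal f
        if g > best_g.get(n, float("inf")):
            continue                     # stale entry: a cheaper path exists
        if n == goal:
            return g                     # exit: solved
        CLOSED.add(n)
        for n2, c in graph[n]:           # steps 4 and 6: expand n
            g2 = g + c
            if g2 < best_g.get(n2, float("inf")):
                best_g[n2] = g2          # better path: (re)open n2,
                heapq.heappush(OPEN, (f(g2, n2), g2, n2))  # even if CLOSED
    return None                          # failure

graph = {"s": [("a", 1), ("b", 4)], "a": [("b", 1), ("g", 6)],
         "b": [("g", 1)], "g": []}
best_first(graph, "s", "g", lambda g, n: g)  # returns 3 (path s-a-b-g)
```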
Best - first : BF*
Risk of missing the optimal solution
In step 5 : quits if successor is a goal node
We should include the cost of the last arc
Best - first : BF*
BF* algorithm:
wait until the new arc is part of f(n); exit only when the selected node n is a goal node
Best - first : cost functions
The BF and BF* algorithms advance slowly. For a typical cost function, short paths will have lower costs and need to be developed before the actual solution path can reach a goal node. We now try to do something about that.
Best - first : cost functions
C (n) = cost for portion from n to the END of the path
C(n) is recursive if, for an immediate successor ns,
C(n) = F(E(n), C(ns)), where E(n) is a function of local properties only
If such a rollback function F exists, the cost of a path can be evaluated from the optimal remaining cost and the costs of the steps taken so far
Examples of rollback functions are the maximum or the sum of the arc costs.
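Both examples can be checked in a couple of lines (the arc costs below are hypothetical):

```python
# Hypothetical arc costs along a path n1 -> n2 -> n3 -> n4
arc_costs = [3, 1, 4]

# Sum of arc costs is recursive: C(n) = F(E(n), C(ns)) with F(E, C) = E + C,
# where E(n) is the cost of the arc leaving n (a local property).
def cost_to_end_sum(costs):
    if not costs:
        return 0
    return costs[0] + cost_to_end_sum(costs[1:])

# Maximum arc cost is recursive too, with F(E, C) = max(E, C).
def cost_to_end_max(costs):
    if not costs:
        return 0
    return max(costs[0], cost_to_end_max(costs[1:]))

print(cost_to_end_sum(arc_costs))  # 8
print(cost_to_end_max(arc_costs))  # 4
```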
Best - first : crystal balls…?
BUT we don’t know the path to the goal...
Best - first : crystal balls…?
We introduce an estimate h (n)
Recursiveness of the roll-back function allows us to use such an estimate effectively!
this version of BF is the Z algorithm this version of BF* is the Z* algorithm
Best - first: Z
Examples of non-recursive cost functions:
1. Cnr(n′) = c(n, n′) + Cnr(n), with Cnr(n′) the cost from the start node to n′
2. largest − smallest node cost
Suppose n−1 is a start node and n+2 is a goal node:
C(n) = c(n, n+1) + c(n+1, n+2)
C(n−1) = c(n−1, n) + c(n, n+1) + c(n+1, n+2)
For such cost functions, C(n−1) cannot be retrieved from c(n−1, n) and C(n)
Best - first : Z*
Z* finds the optimal path if
1. h(n) underestimates the remaining cost
2. for n′ a successor of n:
h(n′) ≥ h(n″) ⇒ F(E(n), h(n′)) ≥ F(E(n), h(n″))
3. for parents n1 and n2 of a common successor n:
F(E(n1), h(n)) ≥ F(E(n2), h(n)) ⇒ F(E(n1), h(n′)) ≥ F(E(n2), h(n′))
Sum and maximum cost are acceptable functions
Best - first : A*
A* uses a sum : C(n) = c( n,ns ) + C(ns )
f(n) = g(n) + h(n), with g(n) sum of arc costs up to the current point
Best - first : the A* algorithm
1. Put the start node on OPEN
2. If OPEN is empty, exit with failure
3. Take the n with minimal f from OPEN, put it on CLOSED
4. If n is a goal node, exit: solved
5. Expand n:
– If n′ is not on OPEN or CLOSED: estimate h(n′), calculate f(n′), set pointers
– If n′ is on OPEN or CLOSED: direct the pointer along the path with the lowest g(n′)
– If n′ obtained a new pointer and is on CLOSED, put it back on OPEN
6. Go to step 2.
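A minimal A* sketch on a 4-connected grid, using a Manhattan-distance heuristic, which underestimates the remaining cost as long as every step costs at least 1 (the grid of step costs below is hypothetical):

```python
import heapq

def a_star(grid, start, goal):
    """A* sketch on a 4-connected grid: f(n) = g(n) + h(n), with
    g(n) the sum of step costs so far (cost of entering each cell)
    and h(n) the Manhattan distance to the goal, an admissible
    heuristic when every step costs at least 1."""
    rows, cols = len(grid), len(grid[0])

    def h(n):
        return abs(n[0] - goal[0]) + abs(n[1] - goal[1])

    OPEN = [(h(start), 0, start)]
    best_g = {start: 0}
    while OPEN:
        fv, g, n = heapq.heappop(OPEN)
        if g > best_g.get(n, float("inf")):
            continue                     # stale OPEN entry
        if n == goal:
            return g
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            n2 = (n[0] + di, n[1] + dj)
            if 0 <= n2[0] < rows and 0 <= n2[1] < cols:
                g2 = g + grid[n2[0]][n2[1]]   # arc cost = cost of entering n2
                if g2 < best_g.get(n2, float("inf")):
                    best_g[n2] = g2
                    heapq.heappush(OPEN, (g2 + h(n2), g2, n2))
    return None

grid = [[0, 9, 1],
        [1, 9, 1],
        [1, 1, 1]]
a_star(grid, (0, 0), (0, 2))  # returns 6 (detours around the 9s)
```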
Best-first : consistency assumption
if h(m) − h(n) ≤ c(m, n) for every arc from m to n
then reconsidering CLOSED is unnecessary
explanation :
If the assumption holds, costs always increase over time. Hence returning to a node with lower cost is impossible
remark :
with h(n) = 0 , the assumption normally applies
Best-first : Z*, A* remarks
Increase efficiency by penalizing path incompleteness
Spend less time on needlessly growing inferior paths
Nevertheless, F* for instance can still be superior when the optimum does not stand out (the best-first bookkeeping overhead then dominates), or when the goal can change