By: Yair Weiss and Edward H. Adelson.
Presenting:
Ady Ecker and Max Chvalevsky.
EM for Motion Segmentation
“Perceptually organized EM: A framework that combines information about form and motion”
“A unified mixture framework for motion segmentation: incorporating spatial coherence and estimating the number of models”
Contents
1. Motion segmentation.
2. Expectation Maximization.
3. EM for motion segmentation.
4. EM modifications for motion segmentation.
5. Summary.
Part 1: Motion Segmentation
Motion segmentation problem
Input:
1. Sequence of images.
2. Flow vector field – output of a standard algorithm.
Problem: Find a small number of moving objects in the sequence of images.
[Figure: flow vector v with components vx and vy.]
Segmentation Output
Classification of each pixel in each image to its object.
Full velocity field.
[Figure: flow data and velocity field.]
Segmentation goal
Motion vs. static segmentation
Combination of motion and spatial data.
An object can contain parts with different static parameters (several colors).
The representation of an object in an image can be non-continuous when: There are occlusions. Only parts of the object are captured...
Difficulties
Motion estimation.
Integration versus segmentation dilemma.
Smoothing inside the model while keeping models independent.
Motion estimation - review
Estimation cannot be done from local measurements only. We have to integrate them.
Motion integration
In reality we will not have a clear distinction between corners and lines.
Integration without segmentation
When there are several motions, we might get false intersection points of velocity constraints at T-junctions.
Integration without segmentation
False corners (T-junctions) introduce false dominant directions (upwards).
Contour ownership
Most pixels inside the object don’t supply movement information. They move with the whole object.
Smoothing
We would like to smooth information inside objects, not between objects.
Smoothness in layers
Human segmentation
Humans perform segmentation effortlessly. Segmentation may be illusory. Tendency to prefer (and trade off between):
Small number of models. Slow and smooth motion.
The segmentation depends on factors such as contrast and speed, which affect our confidence in possible motions.
Segmentation illusion – The split herringbone
Segmentation Illusion - plaids
Part 2: Expectation Maximization
Clustering
Clustering Problems
Structure: Vectors in high-dimensional space belong to (disjoint) groups (clusters, classes, populations). Given a vector, find its group (label).
Examples: Medical diagnosis. Vector Quantization. Motion Segmentation.
Clustering by distance to known centers
Finding the centers from known clustering
EM: Unknown clusters and centers
Start with random model parameters.
Expectation step: Classify each vector to the closest center.
Maximization step: Find the center (mean) of each class.
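The loop between the two steps can be sketched in a few lines. This is a minimal illustration, not the presenters' code: the function name `hard_em`, the choice of random data points as initial centers, and the stopping test are all assumptions made for the example.

```python
import numpy as np

def hard_em(points, k, n_iter=100, seed=0):
    """Hard EM (k-means style) on an (n, d) array. Returns (centers, labels)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Start with random model parameters: k distinct data points as centers.
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()

    def classify(c):
        # E step: assign each vector to the closest center.
        dists = np.linalg.norm(points[:, None, :] - c[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = classify(centers)
        # M step: each center becomes the mean of its class.
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # an empty class keeps its old center
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break  # centers stopped moving, so labels are stable
        centers = new_centers
    return centers, classify(centers)
```

Calling `hard_em(points, k=2)` on two well-separated blobs should return one center per blob; the result in general depends on the initial centers.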
Illustration
EM Characteristics
Simple to program. Separates the iterative stage into two independent simple stages. Convergence is guaranteed, to some local minimum.
Speed and quality depend on: Number of clusters. Geometric shape of the real clusters. Initial clustering.
Soft EM
Each point is given a probability (weight) of belonging to each class.
The E step: The probabilities of each point are updated according to the distances to the centers.
The M step: Class centers are computed as a weighted average over all data points.
Soft EM (cont.)
Final E step: Classify each point to the nearest (most probable) center.
As a result: Points near the center of a cluster have high influence on the location of that center. Points near cluster boundaries have a small influence on several centers. Poor local minima are less likely, as each point can softly change its group.
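A minimal sketch of soft EM for cluster centers follows. It assumes isotropic Gaussian weights with a fixed scale `sigma` (a parameter introduced here for illustration); the names `responsibilities` and `soft_em` are hypothetical.

```python
import numpy as np

def responsibilities(points, centers, sigma):
    """E step: weight of each point for each class, from squared distances."""
    sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    logits = -sq_dist / sigma**2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability only
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def soft_em(points, centers, sigma=1.0, n_iter=100):
    """Soft EM for cluster centers. Returns (centers, weights, hard labels)."""
    points = np.asarray(points, dtype=float)
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        w = responsibilities(points, centers, sigma)
        # M step: each center is a weighted average over all data points.
        total = w.sum(axis=0)[:, None] + 1e-12  # guard against empty classes
        centers = (w.T @ points) / total
    # Final E step: classify each point to its most probable center.
    w = responsibilities(points, centers, sigma)
    return centers, w, w.argmax(axis=1)
```

A larger `sigma` makes the assignments softer, so boundary points are shared between centers for longer before the final hard classification.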
Perceptual Organization
Neighboring or similar points are likely to be of the same class.
Account for this in the computation of weights by prior probabilities.
Example: Fitting 2 lines to data points
Input: Data points that were generated by 2 lines with Gaussian noise:
$$y = a_1 x + b_1 + v, \qquad y = a_2 x + b_2 + v, \qquad v \sim N(0, 1)$$
Output: The parameters of the 2 lines. The assignment of each point to its line.
[Figure: data point $(x_i, y_i)$ and its residual $r_i$.]
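The E Step
Compute residuals assuming known lines:
$$r_1(i) = a_1 x_i + b_1 - y_i, \qquad r_2(i) = a_2 x_i + b_2 - y_i$$
Compute soft assignments:
$$w_1(i) = \frac{e^{-r_1^2(i)/\sigma^2}}{e^{-r_1^2(i)/\sigma^2} + e^{-r_2^2(i)/\sigma^2}}, \qquad w_2(i) = \frac{e^{-r_2^2(i)/\sigma^2}}{e^{-r_1^2(i)/\sigma^2} + e^{-r_2^2(i)/\sigma^2}}$$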
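Least-Squares review
In the case of a single line and normal i.i.d. errors, maximum likelihood estimation reduces to least-squares:
$$\min_{a,b} \sum_i (a x_i + b - y_i)^2 = \min_{a,b} \sum_i r_i^2$$
The line parameters (a,b) are solutions to the system:
$$\begin{pmatrix} \sum_i x_i^2 & \sum_i x_i \\ \sum_i x_i & \sum_i 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum_i x_i y_i \\ \sum_i y_i \end{pmatrix}$$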
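The M Step
In the weighted case we find
$$\min_{a_1,b_1,a_2,b_2} \sum_i \left[ w_1(i)\, r_1^2(i) + w_2(i)\, r_2^2(i) \right]$$
$$\begin{pmatrix} \sum_i w_1(i) x_i^2 & \sum_i w_1(i) x_i \\ \sum_i w_1(i) x_i & \sum_i w_1(i) \end{pmatrix} \begin{pmatrix} a_1 \\ b_1 \end{pmatrix} = \begin{pmatrix} \sum_i w_1(i) x_i y_i \\ \sum_i w_1(i) y_i \end{pmatrix}$$
$$\begin{pmatrix} \sum_i w_2(i) x_i^2 & \sum_i w_2(i) x_i \\ \sum_i w_2(i) x_i & \sum_i w_2(i) \end{pmatrix} \begin{pmatrix} a_2 \\ b_2 \end{pmatrix} = \begin{pmatrix} \sum_i w_2(i) x_i y_i \\ \sum_i w_2(i) y_i \end{pmatrix}$$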
Weighted least squares system is solved twice for (a1,b1) and (a2,b2).
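The two-line example can be put together as a short sketch. It assumes soft assignments of the form exp(-r²/σ²), normalized over the two lines, and solves the weighted normal equations once per line. The function name `fit_two_lines`, the default initial guess, and the tiny ridge term that keeps the 2x2 system invertible are assumptions made for this illustration.

```python
import numpy as np

def fit_two_lines(x, y, sigma=1.0, n_iter=50, init=((1.0, 0.0), (-1.0, 0.0))):
    """EM for two lines y = a_k x + b_k. Returns (params, weights).

    params has rows (a_k, b_k); weights has shape (2, n).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    params = np.array(init, dtype=float)
    A = np.column_stack([x, np.ones_like(x)])  # design matrix rows: [x_i, 1]
    for _ in range(n_iter):
        # E step: residuals r_k(i) = a_k x_i + b_k - y_i, then soft assignments.
        r = params[:, :1] * x[None, :] + params[:, 1:] - y[None, :]
        logits = -r**2 / sigma**2
        logits -= logits.max(axis=0, keepdims=True)  # numerical stability only
        w = np.exp(logits)
        w /= w.sum(axis=0, keepdims=True)
        # M step: weighted least squares, solved once per line.
        for k in range(2):
            Aw = A * w[k][:, None]
            lhs = A.T @ Aw + 1e-9 * np.eye(2)  # [[sum w x^2, sum w x], [sum w x, sum w]]
            rhs = Aw.T @ y                     # [sum w x y, sum w y]
            params[k] = np.linalg.solve(lhs, rhs)
    return params, w
```

As with any EM run, the result depends on the initial guess `init`; a poor guess can leave both lines explaining a mixture of the two populations.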
Illustrations
Illustration
Estimating the number of models
In the weighted scenario, additional models will not necessarily reduce the total error.
The optimal number of models is a function of the σ parameter – how well we expect the model to fit the data.
Algorithm: Start with many models. Redundant models will collapse.
Illustration
[Figure: l = log(likelihood).]
Part 3: EM for Motion Segmentation
Segmentation of image motion: Input
Products of image sequence: Local flow – output of standard algorithm. Pixel intensities and color. Pixel coordinates.
Static segmentation: Based on the same local data. Problematic as explained before.
Segmentation output
[Figure: segmentation and models: ‘blue’ model, ‘red’ model.]
Notations
r - pixel.
O(r) - flow vector at pixel r.
k - model id.
θk - parameters of model k.
vk(r) - velocity predicted by model k at location r.
Dk(r) = D(r, θk) - distance measure.
σ² - expected noise variance.
gk(r) - probability that pixel ‘r’ is a member of model ‘k’.
Segmentation output
[Figure: segmented O and model parameters (blue, red); labels r, O(r), Vblue(r), Vred(r).]
The E Step
Purpose: determine the statistical classification of every pixel to models.
$$g_k(r) = \frac{\pi_k(r) \exp\left(-D_k^2(r)/\sigma^2\right)}{\sum_j \pi_j(r) \exp\left(-D_j^2(r)/\sigma^2\right)}$$
πk(r) - prior probability granted to model ‘k’.
For classical EM, πk(r) are equal for all ‘k’.
The E Step (cont)
Alternative representation:
$$g(r) = \mathrm{softmin}\left( \frac{D_1^2(r)}{\sigma^2}, \frac{D_2^2(r)}{\sigma^2}, \ldots \right)$$
Soft decisions allow gradual convergence toward a better minimum instead of settling early into a poor local minimum.
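A sketch of this E step on a pixel grid, assuming the ownership of model k at pixel r is proportional to prior × exp(-D_k²(r)/σ²) and normalized over models (that is, a softmin over the scaled squared distances). The function name `e_step` and the array layout `(K, H, W)` are choices made for the example.

```python
import numpy as np

def e_step(dist_sq, sigma, prior=None):
    """Soft ownership g_k(r) from squared distances D_k^2(r).

    dist_sq: array of shape (K, H, W), one distance map per model.
    prior:   optional array of the same shape; equal priors if omitted
             (the classical EM case).
    """
    dist_sq = np.asarray(dist_sq, dtype=float)
    if prior is None:
        prior = np.full_like(dist_sq, 1.0 / dist_sq.shape[0])
    # Work with logarithms: log prior - D^2 / sigma^2.
    logits = np.log(np.maximum(prior, 1e-300)) - dist_sq / sigma**2
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability only
    g = np.exp(logits)
    return g / g.sum(axis=0, keepdims=True)
```

With equal priors, a pixel that is equally far from two models receives ownership 0.5 for each, which is the "no decision" case shown in gray in ownership plots.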
Distance measure functionality
Correct physical interpretation of motion data.
If possible – enable analytic solution.
Distance measures (1)
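Optic flow constraint:
$$D_k^2(r) = \sum_{s \in \Omega_r} \left( \frac{\partial I}{\partial r} \cdot v_k(r) + \frac{\partial I}{\partial t} \right)^2$$
Ωr – window centered at ‘r’ (the derivatives are evaluated at each s in the window). vk(r) – velocity of ‘k’ at location ‘r’.
Quadratic. Provides a closed-form MLE solution for the M-step.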
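A sketch of this distance for a purely translational model, assuming the image derivatives `Ix`, `Iy`, `It` have already been computed and that the window is a square of side `2 * half_window + 1` with edge replication at the borders. The function name and both of these choices are assumptions for illustration.

```python
import numpy as np

def flow_constraint_dist_sq(Ix, Iy, It, v, half_window=1):
    """Squared optic-flow-constraint residual, summed over a square window.

    Ix, Iy, It: (H, W) spatial and temporal derivatives of the image.
    v: (vx, vy) velocity predicted by one model (constant over the image here).
    """
    Ix, Iy, It = (np.asarray(a, dtype=float) for a in (Ix, Iy, It))
    vx, vy = v
    resid_sq = (Ix * vx + Iy * vy + It) ** 2
    # Sum the residuals over the window around each pixel.
    padded = np.pad(resid_sq, half_window, mode="edge")
    H, W = resid_sq.shape
    out = np.zeros_like(resid_sq)
    size = 2 * half_window + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + H, dx:dx + W]
    return out
```

Evaluating this once per model gives the distance maps that the E step turns into ownership probabilities.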
Distance measures (2)
Deviation from constant intensity:
$$D_k^2(r) = \sum_{s \in \Omega_r} \left( I(s + v_k(r)\,\Delta t,\; t + \Delta t) - I(s, t) \right)^2$$
Ωr – window centered at ‘r’. Good for high speed motion. Resolved by successive linearizations.
The M step
Purpose: layer optimization (according to the soft classification of pixels).
$$\theta_k = \arg\min_{\theta_k} \left[ J(\theta_k) + \sum_r g_k(r)\, D^2(r, \theta_k) \right]$$
Produces a weighted ‘average’ of the model. The ‘average’ depends on the definition of D. Constrained by J (slow & smooth motion).
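For the simplest case this weighted ‘average’ has a closed form. The sketch below assumes a purely translational layer, the optic flow constraint as the distance measure with a one-pixel window, and no cost term (J = 0); all of these are simplifications chosen for the example, as is the function name `m_step_translation`.

```python
import numpy as np

def m_step_translation(Ix, Iy, It, g):
    """Weighted least-squares translation (vx, vy) for one layer.

    Minimizes sum_r g(r) * (Ix*vx + Iy*vy + It)^2 over (vx, vy), where g is
    the layer's ownership map from the E step.
    """
    A = np.array([[np.sum(g * Ix * Ix), np.sum(g * Ix * Iy)],
                  [np.sum(g * Ix * Iy), np.sum(g * Iy * Iy)]])
    b = -np.array([np.sum(g * Ix * It), np.sum(g * Iy * It)])
    # Tiny ridge keeps the system solvable when the layer owns few pixels.
    return np.linalg.solve(A + 1e-9 * np.eye(2), b)
```

Pixels with ownership near zero contribute almost nothing, so each layer is fitted only to the pixels it currently explains.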
J (cost) definition
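For a loosely constrained θ (typical for image segmentation):
[Equation: J(θ) as a sum over x, y and over n ≥ 0 with coefficients a_n; the full expression is not recoverable.]
For a highly constrained θ (#degrees of freedom < #owned pixels): J(θ) = 0.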
EM: Unknown clusters and centers
Start with random model parameters.
Expectation step: Classify each vector to the closest center.
Maximization step: Find the center (mean) of each class.
Natural image processing without segmentation
a) Frame of a movie taken from driving car.
b) Flow data along the dotted line.
c) Smooth global approximation of motion (along the line).
EM natural image processing
a) The same picture.
b) Rigid-like model segmentation.
c) EM result.
Textureless Regions
Homogeneous regions have no clear layer preference (stay gray in ownership plots).
Wrong segmentation decisions for “similar” motions (squares example).
Probabilistic resolution of ambiguities (bars example).
Illustration
Bars: Probabilistic solution, vertical vs. horizontal.
2 squares moving diagonally right: No segmentation for vertical lines. No segmentation for background. Motion directions identified correctly.
Hand: Noisy segmentation.
Energy formulation of EM
$$E_{eff}(\theta, g, \sigma) = \sum_{r,k} g_k(r)\, D_k^2(r) + \sigma^2 \sum_{r,k} g_k(r) \log g_k(r)$$
E-step: optimization with respect to gk(r). First term prefers hard decision. Second term (entropy) prefers no decision.
M-step: optimization with respect to θ (embedded in D).
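As a worked step, the E-step update can be recovered from the energy. The derivation assumes the energy is the sum over r and k of g D² plus σ² times the sum of g log g, and that the ownerships at each pixel sum to one over k; the Lagrange multipliers λ_r are introduced here only for the derivation.

```latex
\begin{aligned}
L &= \sum_{r,k} g_k(r)\, D_k^2(r) + \sigma^2 \sum_{r,k} g_k(r) \log g_k(r)
     + \sum_r \lambda_r \Big( \sum_k g_k(r) - 1 \Big) \\
\frac{\partial L}{\partial g_k(r)} &= D_k^2(r) + \sigma^2 \big( \log g_k(r) + 1 \big) + \lambda_r = 0 \\
g_k(r) &= \frac{\exp\left(-D_k^2(r)/\sigma^2\right)}{\sum_j \exp\left(-D_j^2(r)/\sigma^2\right)}
\end{aligned}
```

The last line follows by solving the second line for g_k(r) and fixing λ_r through the normalization; it is the classical EM update with equal priors.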
Part 4: EM Modifications for Motion Segmentation
Proposed modifications
POEM – Perceptually Organized EM.
Combines local & neighbor motion with static analysis. Regional grouping. Color & intensity data.
Contour ownership.
Outlier detection: T-junction points. Statistical outliers.
POEM algorithm idea
Determine the segmentation based on: Local pixel flow (standard EM). Neighbor pixel segmentation. Static data (optionally).
Reason: Neighboring pixels have a higher probability of belonging to the same object. Similar pixels have an even higher probability of belonging to the same object.
PO
Window of influence:
$$w(r, s) = \exp\left( -\frac{\|r - s\|^2}{\sigma_1^2} - \frac{\|I(r) - I(s)\|^2}{\sigma_2^2} \right)$$
Neighbor votes:
$$V_k(r) = \sum_s g_k(s)\, w(r, s)$$
POE step
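Basic equation:
$$g_k(r) = \frac{\pi_k(r) \exp\left(-D_k^2(r)/\sigma^2\right)}{\sum_j \pi_j(r) \exp\left(-D_j^2(r)/\sigma^2\right)}$$
πk estimation:
$$\hat{\pi}_k(r) = \frac{\exp(V_k(r))}{\sum_j \exp(V_j(r))}$$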
POE step alternative representation
$$g(r) = \mathrm{softmin}\left( \frac{D_1^2(r)}{\sigma^2} - V_1(r),\; \frac{D_2^2(r)}{\sigma^2} - V_2(r),\; \ldots \right)$$
The solution is computationally intensive.
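One pass of the PO E step can be sketched as below. The sketch assumes a Gaussian window of influence in both pixel position and intensity (with scales `sigma1` and `sigma2`), votes that sum the neighbors' ownerships weighted by that window, and a prior proportional to exp(votes). The function names, the dense all-pairs window (which is also why the cost grows quadratically with the number of pixels), and the inclusion of each pixel's own vote are assumptions made for illustration.

```python
import numpy as np

def window_of_influence(coords, intensity, sigma1, sigma2):
    """w(r, s) for all pixel pairs. coords: (n, d) positions, intensity: (n,)."""
    d_pos = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    d_int = (intensity[:, None] - intensity[None, :]) ** 2
    return np.exp(-d_pos / sigma1**2 - d_int / sigma2**2)

def po_e_step(dist_sq, g_prev, w, sigma):
    """One perceptually organized E step. dist_sq and g_prev: (K, n)."""
    votes = g_prev @ w.T  # votes[k, r] = sum_s g_prev[k, s] * w[r, s]
    # Prior proportional to exp(votes); its normalizer cancels in g.
    logits = votes - dist_sq / sigma**2
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability only
    g = np.exp(logits)
    return g / g.sum(axis=0, keepdims=True)
```

Passing an all-zero `w` removes the votes and gives back the classical E step, which is a convenient way to produce the initial `g_prev`.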
M step in POEM
The M step is unchanged:
$$\theta_k = \arg\min_{\theta_k} \left[ J(\theta_k) + \sum_r g_k(r)\, D^2(r, \theta_k) \right]$$
POEM Energy formulation
PO represented by the additional (last) term:
$$E_{eff}(\theta, g, \sigma) = \sum_{r,k} g_k(r)\, D_k^2(r) + \sigma^2 \sum_{r,k} g_k(r) \log g_k(r) - \sum_{r,k} \sum_s w(r, s)\, g_k(r)\, g_k(s)$$
Contour ownership
Implemented by modification of the PO function:
Step 1: preliminary segmentation & border detection.
Step 2: contour ownership determination – by relative depth, consistent with T-junctions.
Step 3: combining in voting procedure.
Equation modification: the window of influence between pixels, w(r,s), gives additional weight to pixels on the segment’s borders.
Results of POEM
Advantages:
Resolves regions without texture by propagating information from borders.
More robust to noise.
Illustration
2 squares moving diagonally right: Partially correct segmentation for vertical lines.
Moving hand: Smooth solution.
Bars results
Input
Classical EM segmentation
POEM segmentation without contour ownership
POEM segmentation with contour ownership
Flow outliers
Segmentation into k layers and an additional layer of “outliers”.
Probability of being an outlier – function of: Prior – e.g. for T-junction. Likelihood – likelihood to be outlier.
Outliers don’t participate directly in PO. The outlier layer is neither smooth nor slow.
Part 5: Summary
Advantages
The system showed relatively good results: For natural images. For artificial, challenging images.
Simple - has few parameters. Universal.
Modular - enables improvements: Utilizing additional data (mostly static). Optimizing parameters. Using advanced convergence methods. Altering priors to fit non-symmetric biological phenomena.
Drawbacks
Some images weren’t resolved completely: Edge deviations in ‘hand’ image.
The system includes an input-dependent parameter σ. No process to determine the value of σ was proposed.
The ‘optic flow constraint’ measure is appropriate only for instantaneous motion.
Other distance measures are much more difficult to solve.
Conclusions
The system tackles ambiguity & noise. It estimates the degree of ambiguity. It assumes slow & smooth motions.
The system is able to explain its input by segmentation into separate layers of motion. It exploits static data to improve the segmentation.