back to lucas-kanadertc12/cse598g/lkintrocontinued.pdf · cse486 robert collins klt algorithm 1....
TRANSCRIPT
CSE486Robert Collins
Back to Lucas-Kanade
CSE486Robert Collins
Step-by-Step Derivation
PPyxWI ))];,([([
The key to the derivation is Taylor series approximation:
PP
WIPyxWI
))];,([([~~
We will derive this step-by-step. First, we need two background formula:
CSE486Robert Collins
Step-by-Step Derivation
First consider the expansion for a single variable p
CSE486Robert Collins
Step-by-Step Derivation
Note that each variable parameter pi contributes a term of the form
CSE486Robert Collins
Step-by-Step Derivation
Now let’s rewrite the expression as a matrix equation. For each term,we can rewrite:
So that we have:
CSE486Robert Collins
Step-by-Step DerivationFurther collecting the dw/dpi terms into a matrix, we can write:
PPyxWI ))];,([([ PP
WIPyxWI
))];,([([~~
which are the terms in the matrix equation:
CSE486Robert Collins
Let W([x,y]; P) be a warping function. P is the parameter (vector) of the warping function.
Case 1: P = [u, v]. This is the simple point “optic flow” case
Case 2: P represents parameters of an affine transform i.e.
11
1)];,([
642
531 y
x
ppp
pppPyxW
Dr. Ng Teck Khim
2])],([))];,([([ yxTPyxWIpatchimagewithin
y
x
The objective function for finding the best match is
Summary: the forward LK update equations
CSE486Robert Collins
The objective function for finding the best match is
The Lucas-Kanade algorithm minimizes the above objective function in a Gauss-Newton gradient descent non-linear optimization process.
2])],([))];,([([ yxTPyxWIpatchimagewithin
y
x
warptheofJacobiantheisP
W
PyxWatIimageofgradienttheisy
I
x
II
where
yxTPP
WIPyxWI
yxTPPyxWIE
)];,([][
]),())];,([([
]),())];,([([
2
2
Dr. Ng Teck Khim
CSE486Robert Collins
For affine warping, n = 6 and so
1000
0100
642
531
yx
yxP
PyPyxP
PyPxPx
P
W
Dr. Ng Teck Khim
n
yyyy
n
xxxx
yx
P
W
P
W
P
W
P
WP
W
P
W
P
W
P
W
P
W
WWPyxWLet
321
321
],[)];,([
CSE486Robert Collins
})()],())];,([([][{
)],())];,([([][2
PP
WI
P
WIyxTPyxWI
P
WI
yxTPP
WIPyxWI
P
WI
P
E
TT
T
Setting to zero givesP
E
)))];,([(),(()(
)))]];,([(),(()([])([
1
1
PyxWIyxTP
WIH
PyxWIyxTP
WI
P
WI
P
WIP
T
TT
Update P = P+P and iterate
Dr. Ng Teck Khim
yxTPP
WIPyxWI
yxTPPyxWIE
]),())];,([([
]),())];,([([
2
2
Hessian
CSE486Robert Collins
Iterate Warp I to obtain I(W([x y];P))
Compute the error image T(x) – I(W([x y]; P))
Warp the gradient with W([x y]; P)
Evaluate at ([x y]; P)
Compute steepest descent images
Compute Hessian matrix
Compute
Compute P
Update P P + P
I
P
W
P
WI
)()(P
WI
P
WI T
)))];,([(),(()( PyxWIyxTP
WI T
Until P magnitude is negligible
Source: “Lucas-Kanade 20 years on: A unifying framework” Baker and Mathews, IJCV 04
Dr. Ng Teck Khim
CSE486Robert Collins
Source: “Lucas-Kanade 20 years on: A unifying framework” Baker and Mathews, IJCV 04
Algorithm At a Glance
CSE486Robert Collins Steepest Descent Images, Another Look
image gradX gradY
dW/dp1 dW/dp2 dW/dp3
dW/dp4 dW/dp5 dW/dp6
I
I dW/dp1 I dW/dp2 I dW/dp3. . .
I dW/dp4 I dW/dp5 I dW/dp6. . .
CSE486Robert Collins
Approximating Normalized Correlation
We can remove some potential error sources (changes in illumination or camera gain) if we first normalize the template and the local search window by subtracting off mean and dividing by standard deviation.
Traditional LK is sensitive to changes in brightness (SSD makes a constant brightness assumption!)
CSE486Robert Collins
MultiScale Gradient Descent
. • Traditional LK is just refining a position estimate
• We must be close to the right answer to start with, and we are only guaranteed to descend to a local minimum in the SSD function (might not be best)
• To increase range of LK methods, we can apply them in a coarse to fine image pyramid, so at each level a small refinement is sufficient.
CSE486Robert Collins
Baker, Matthews, CMU
State of the Art Lucas Kanade Tracking
QuickTime™ and aYUV420 codec decompressor
are needed to see this picture.
Tracking facial mesh models
CSE486Robert Collins
Using LK to Align the Background
CSE486Robert Collins
LK for Aligning the Background
General idea: We can also apply LK to align the whole image, which amounts to stabilizing the background of a moving camera.
CSE486Robert Collins
Destination image
Recall: Projective Warping
from Hartley & Zisserman
Source Image
H
Usually need at least a projective warp to align the image, because field of view is larger. Assumption: 1) background is approximately planar OR 2) camera motion is mostly rotational (small translation)
CSE486Robert Collins
Summary: Planar Projection
x
y
Point on planeRotation + Translation
Perspectiveprojection
Pixel coordsu
v
Internalparams
Homography
CSE486Robert Collins
Estimating a Homography
Matrix Form:
Equations:
= Wx(x,y; h11, ... , h33)
= Wy(x,y; h11, ... , h33)
CSE486Robert Collins
Degrees of Freedom?
There are 9 numbers h11,…,h33 , so are there 9 DOF?
No. Note that we can multiply all hij by nonzero kwithout changing the equations:
CSE486Robert Collins
Estimating a Homography
Matrix Form:
Equations:
= Wx(x,y; h11, ... , h32)
= Wy(x,y; h11, ... , h32)
Actually, we need to enforce that there are only eight degrees of freedom. Is usually suffices to set h33=1.
1
1
1
CSE486Robert Collins
Application: Stabilization
Given a sequence of video frames, warp theminto a common image coordinate system.
This “stabilizes” the video to appear as if thecamera is not moving.
frame k k+1 k+2k-1k-2
destinationframe
CSE486Robert Collins
Motivation for Stabilization
• Make it easier to distinguish object motionfrom camera motion
• Allows simpler object motion prediction models (e.g. constant velocity) to be more accurate.
• Easier for people to look at
CSE486Robert Collins
Stabilization Example
QuickTime™ and aCinepak decompressor
are needed to see this picture.
QuickTime™ and aCinepak decompressor
are needed to see this picture.
VIVID project
CSE486Robert Collins
Stabilization by Chaining
What if the reference image does not overlap with all the source images? As long as there are pairwise overlaps, wecan chain (compose) pairwise homographies.
frame k k+1 k+2
destination
k+3 k+4
Not recommended for long sequences, as alignment errors accumulate over time.
H43H3
2H21H1
0
H40 = H1
0 * H21 * H3
2 * H43
CSE486Robert Collins
Application: Mosaicing
from Hartley & Zisserman
Destination imageSource 1 Source 2
Mosaic
H1 H2
CSE486Robert Collins
Fixation via Mosaicing
QuickTime™ and aCinepak decompressor
are needed to see this picture.
incoming image is aligned to a background mosiac and adjustments to pan/tilt are made to keep a feature near center of the image.
Sarnoff, VSAM project
CSE486Robert Collins
Note on Background Stabilization
Homography assumes scene is roughly planar.
What if scene isn’t planar? Alignmentwill not be good if significant 3D relief
“parallax”
CSE486Robert Collins
Parallax due to Nonplanarity
QuickTime™ and aCinepak decompressor
are needed to see this picture.
Sarnoff, VSAM project
CSE486Robert Collins
Pan/Tilt Mosiac Tracking
Dellaert and Collins
QuickTime™ and aCinepak decompressor
are needed to see this picture.
http://www.cs.cmu.edu/~rcollins/Frank/framerate/framerate.html
CSE486Robert Collins
An alternate, feature-based approach
CSE486Robert Collins
Idea: Sparse Feature Matching
I(x,y,t) I(x,y,t+1)
General idea: Find corners in one image (because these are
good areas to estimate displacement)Use translation-only LK to estimate displacementFit higher-order parametric motion to the sparse
displacements to estimate the warp
CSE486Robert Collins
KLT Algorithm
Public-domain code by Stan Birchfield.Available from
http://www.ces.clemson.edu/~stb/klt/
Tracking corner features through 2 or more frames
CSE486Robert Collins
KLT Algorithm
1. Find corners in first image2. Extract intensity patch around each corner3. Use Lucas-Kanade algorithm to estimate
constant displacement of pixels in patch
1. Iteration and multi-resolution to handle large motions 2. Subpixel displacement estimates (bilinear interp warp)3. Can track feature through a whole sequence of frames4. Ability to add new features as old features get “lost”
Niceties
CSE486Robert Collins
Fit Homography using Least Squares
We now have a set of point correspondences(x , y) --> (x’, y’)
Assume related by homography:
Solve for h11,...,h32 using least squares.
CSE486Robert Collins
L.S. using Algebraic Distance
Multiplying through by denominator
Rearrange
CSE486Robert Collins
Algebraic Distance, h33=1 (cont)
Point 1
additionalpoints
2N x 8 8 x 1 2N x 1
Point 4
Point 3
Point 2
CSE486Robert Collins
Algebraic Distance, h33=1 (cont)
A h = b2Nx8 8x1 2Nx1Linear
equations
Solve: AT A h = AT b 2Nx8 8x1 2Nx18x2N 8x2N
(AT A) h = (AT b)8x18x8 8x1
h = (AT A) (AT b)-1
Matlab: h = A \ b
CSE486Robert Collins
Caution: Numeric Conditioning
R.Hartley: “In Defense of the Eight Point Algorithm”
Observation: Linear estimation of projective transformation parameters from point correspondences often suffer from poor“conditioning” of the matrices involves. This means the solutionis sensitive to noise in the points (even if there are no outliers).
To get better answers, precondition the matrices by performinga normalization of each point set by:• translating center of mass to the origin • scaling so that average distance of points from origin is sqrt(2).• do this normalization to each point set independently
Read the paper if you care (it is a nice paper). But usually in matlab, using double float, it doesn’t matter.