Image Manifolds
16-721: Learning-based Methods in Vision
Alexei Efros, CMU, Spring 2007
© A.A. Efros
With slides by Dave Thompson
Images as Vectors
[Figure: an m×n image unrolled into a single vector of length n·m]
Importance of Alignment
[Figure: two m×n images unrolled into n·m-vectors; comparing images as vectors is only meaningful if they are aligned]
Text Synthesis
[Shannon, ’48] proposed a way to generate English-looking text using N-grams:
• Assume a generalized Markov model
• Use a large text to compute the probability distribution of each letter given the N-1 previous letters
• Starting from a seed, repeatedly sample this Markov chain to generate new letters
• Also works for whole words
WE NEED TO EAT CAKE
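The N-gram procedure above can be sketched in a few lines. This is a toy illustration, not Shannon's original code; the "corpus" here is just the seed sentence, so the output quickly loops:

```python
import random
from collections import defaultdict

def build_model(text, n=3):
    """Map each (n-1)-letter context to the letters that follow it."""
    model = defaultdict(list)
    for i in range(len(text) - n + 1):
        model[text[i:i + n - 1]].append(text[i + n - 1])
    return model

def generate(model, seed, n=3, length=20):
    """Start from a seed and repeatedly sample the Markov chain."""
    out = seed
    for _ in range(length):
        choices = model.get(out[-(n - 1):])
        if not choices:            # dead end: context never seen in corpus
            break
        out += random.choice(choices)
    return out

corpus = "WE NEED TO EAT CAKE "
model = build_model(corpus, n=3)
text = generate(model, "WE", n=3, length=20)
```

With a large real corpus and larger N, the same loop produces the Shannon-style samples quoted below.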
Mark V. Shaney (Bell Labs)
Results (using the alt.singles corpus):
• “As I've commented before, really relating to someone involves standing next to impossible.”
• “One morning I shot an elephant in my arms and kissed him.”
• “I spent an interesting evening recently with a grain of salt”
Video Textures
Arno Schödl
Richard Szeliski
David Salesin
Irfan Essa
Microsoft Research, Georgia Tech
Video textures
Our approach
• How do we find good transitions?
Finding good transitions
• Compute the L2 distance D_ij between all pairs of frames
• Similar frames make good transitions
[Figure: frame i vs. frame j]
Markov chain representation
[Figure: Markov chain over frames 1–4]
Similar frames make good transitions
Transition costs
• Transition from i to j if the successor of i is similar to j
• Cost function: C_ij = D_{i+1, j}
[Figure: jumping from frame i to frame j compares frame i+1 with frame j]
Transition probabilities
• Probability of transition P_ij is inversely related to cost:
• P_ij ∝ exp(−C_ij / σ²)
[Figure: low cost → high probability; high cost → low probability]
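Sampling the resulting Markov chain might look like this. It is a sketch, not the paper's implementation: for simplicity the last frame is dropped as a source and target, which glosses over the paper's handling of sequence ends:

```python
import numpy as np

def transition_probs(D, sigma):
    """Row-stochastic transition matrix: P_ij ∝ exp(-C_ij / σ²),
    where C_ij = D_{i+1, j}.  The last frame is excluded as a source
    and as a target so the random walk never runs off the end."""
    C = D[1:, :-1]                          # C_ij = D_{i+1, j}
    P = np.exp(-C / sigma**2)
    return P / P.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
D = rng.random((20, 20))                    # stand-in frame-distance matrix
D = (D + D.T) / 2
np.fill_diagonal(D, 0)

P = transition_probs(D, sigma=0.5)

# Play the video texture: repeatedly sample the next frame.
frame, sequence = 0, [0]
for _ in range(30):
    frame = rng.choice(P.shape[1], p=P[frame])
    sequence.append(frame)
```

Smaller σ concentrates probability on the very best transitions (closer to simple playback); larger σ trades quality for variety.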
Preserving dynamics
• Cost for transition i → j
• C_ij = Σ_{k=−N}^{N−1} w_k D_{i+k+1, j+k}
[Figure: a window of frame pairs around the transition is compared term by term: D_{i−1,j−2}, D_{i,j−1}, D_{i+1,j}, D_{i+2,j+1} for N = 2]
Preserving dynamics – effect
• Cost for transition i → j
• C_ij = Σ_{k=−N}^{N−1} w_k D_{i+k+1, j+k}
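A direct (unoptimized) implementation of the windowed cost could look like this. The binomial weights w_k are an assumption for illustration; the slide leaves the weights unspecified:

```python
import math
import numpy as np

def dynamics_cost(D, N=2):
    """Windowed cost C_ij = sum over k = -N..N-1 of w_k * D[i+k+1, j+k].

    Matching a window of 2N frame pairs, rather than a single pair,
    forces the motion across the transition to agree, not just the
    appearance.  Binomial weights w_k are an assumed choice here.
    """
    T = D.shape[0]
    w = np.array([math.comb(2 * N - 1, k + N) for k in range(-N, N)],
                 dtype=float)
    w /= w.sum()
    C = np.full((T, T), np.inf)          # inf where the window overruns the clip
    for i in range(N - 1, T - N):
        for j in range(N, T - N + 1):
            C[i, j] = sum(w[k + N] * D[i + k + 1, j + k]
                          for k in range(-N, N))
    return C

rng = np.random.default_rng(0)
D = rng.random((12, 12))                 # stand-in frame-distance matrix
D = (D + D.T) / 2
np.fill_diagonal(D, 0)
C = dynamics_cost(D, N=2)
```

Note that C[i, i+1] is always zero: continuing along the original frame order costs nothing, as it should.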
Video sprite extraction
[Figure: blue-screen matting and velocity estimation extract sprite frames for animation]
Video sprite control
• Augmented transition cost:
• C′_ij = C_ij + w · angle(velocity vector, vector to mouse pointer)
  (similarity term + control term)
Interactive fish
Advanced Perception David R. Thompson
manifold learning with applications to object recognition
plenoptic function
manifolds in vision
appearance variation
manifolds in vision
images from hormel corp.
deformation
manifolds in vision
images from www.golfswingphotos.com
Find a low-dimensional basis for describing high-dimensional data: X ≈ X′ such that dim(X′) ≪ dim(X)
uncovers the intrinsic dimensionality
manifold learning
If we knew all pairwise distances…
          Chicago  Raleigh  Boston  Seattle   S.F.  Austin  Orlando
Chicago         0
Raleigh       641        0
Boston        851      608       0
Seattle      1733     2363    2488        0
S.F.         1855     2406    2696      684      0
Austin        972     1167    1691     1764   1495       0
Orlando       994      520    1105     2565   2458    1015        0
Distances calculated with geobytes.com/CityDistanceTool
Multidimensional Scaling (MDS)
For n data points and an n×n distance matrix D with entries D_ij = distance between points i and j, we can construct an m-dimensional embedding that preserves inter-point distances by taking the top m eigenvectors of the double-centered squared-distance matrix, scaled by the square roots of their eigenvalues.
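Classical MDS can be sketched as follows; note the eigen-decomposition is applied to the double-centered squared-distance matrix, not to D itself. The demo uses the city-distance table above:

```python
import numpy as np

def classical_mds(D, m=2):
    """Embed points in R^m from a pairwise-distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # Gram matrix of centered points
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:m]    # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))

D = np.array([
    [   0,  641,  851, 1733, 1855,  972,  994],   # Chicago
    [ 641,    0,  608, 2363, 2406, 1167,  520],   # Raleigh
    [ 851,  608,    0, 2488, 2696, 1691, 1105],   # Boston
    [1733, 2363, 2488,    0,  684, 1764, 2565],   # Seattle
    [1855, 2406, 2696,  684,    0, 1495, 2458],   # S.F.
    [ 972, 1167, 1691, 1764, 1495,    0, 1015],   # Austin
    [ 994,  520, 1105, 2565, 2458, 1015,    0],   # Orlando
], dtype=float)
Y = classical_mds(D, m=2)                 # 2-D city coordinates, up to rotation
```

The embedding is recovered only up to rotation and reflection, which is why MDS plots of cities sometimes come out mirrored.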
MDS result in 2D
Actual plot of cities
Don’t know distances
why do manifold learning?
1. data compression
2. “curse of dimensionality”
3. de-noising
4. visualization
5. reasonable distance metrics
reasonable distance metrics
[Figure: two images of the same object; what is a reasonable distance between them?]
linear interpolation
[Figure: interpolating linearly in pixel space produces a ghosted blend]
manifold interpolation
[Figure: interpolating along the image manifold produces plausible intermediate images]
Isomap for images
• Build a data graph G: vertices are images; (u, v) is an edge iff SSD(u, v) is small
• For any two images, approximate the distance between them by the “shortest path” on G
Isomap
1. Build a sparse graph with K-nearest neighbors
D_g = (distance matrix is sparse)
Isomap
2. Infer other interpoint distances by finding shortest paths on the graph (Dijkstra's algorithm).
D_g = (missing entries filled in with shortest-path distances)
Isomap: shortest distance on a graph is easy to compute
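A compact sketch of steps 1 and 2, using Floyd-Warshall in place of Dijkstra for brevity (fine for small n; O(n³) in general). `isomap_distances` is an illustrative helper, not library code:

```python
import numpy as np

def isomap_distances(X, k=5):
    """Estimate geodesic distances: build a k-NN graph on the points,
    then compute all-pairs shortest paths over it."""
    n = X.shape[0]
    D = np.sqrt(((X[:, None] - X[None]) ** 2).sum(-1))  # Euclidean distances
    G = np.full((n, n), np.inf)                         # sparse graph weights
    nn = np.argsort(D, axis=1)[:, 1:k + 1]              # k nearest, skipping self
    for i in range(n):
        G[i, nn[i]] = D[i, nn[i]]
        G[nn[i], i] = D[nn[i], i]                       # keep the graph symmetric
    np.fill_diagonal(G, 0.0)
    for m in range(n):                                  # Floyd-Warshall relaxation
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    return G
```

Feeding the resulting geodesic distance matrix into MDS completes the Isomap embedding.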
Isomap results: hands
Isomap: pro and con
+ preserves global structure
+ few free parameters
− sensitive to noise and noisy edges
− computationally expensive (dense matrix eigen-reduction)
Leakage problem
Locally Linear Embedding
Find a mapping that preserves local linear relationships between neighbors
LLE: Two key steps
1. Find the weight matrix W of linear coefficients that best reconstruct each point from its neighbors, enforcing the sum-to-one constraint Σ_j W_ij = 1
2. Find projected vectors Y that minimize the same reconstruction error (must be solved for the whole dataset simultaneously)
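Both steps can be sketched as follows. This is a minimal version: the regularization constant and neighbor count are illustrative choices, not prescribed by the slides:

```python
import numpy as np

def lle(X, k=6, d=2, reg=1e-3):
    """Locally Linear Embedding in two steps.

    Step 1: for each point, solve for the weights over its k nearest
    neighbors that best reconstruct it, constrained to sum to one.
    Step 2: find low-D coordinates Y minimizing the same reconstruction
    error, via the bottom eigenvectors of M = (I - W)^T (I - W),
    skipping the constant eigenvector.
    """
    n = X.shape[0]
    D = np.sqrt(((X[:, None] - X[None]) ** 2).sum(-1))
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]            # skip self
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                           # neighbors centered on x_i
        C = Z @ Z.T
        C += reg * np.trace(C) * np.eye(k)              # regularize (assumed constant)
        w = np.linalg.solve(C, np.ones(k))
        W[i, nbrs[i]] = w / w.sum()                     # enforce sum-to-one
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:d + 1]                             # bottom non-constant eigvecs

rng = np.random.default_rng(0)
X = rng.random((40, 3))
Y = lle(X, k=6, d=2)
```

Solving step 2 as one eigenproblem over the whole dataset is what makes the coupling between points global, even though each weight fit in step 1 is purely local.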
LLE result: preserves local topology
[Figure: 2-D embeddings compared: PCA vs. LLE]
LLE: pro and con
+ no local minima, one free parameter
+ incremental & fast
+ simple linear algebra operations
− can distort global structure