Computational Photography and Capture:
Video Texture Synthesis
Gabriel Brostow & Tim Weyrich
TA: Frederic Besse
Week Date Topic Hours
1 12-Jan Introduction to Computational Photography and Capture 1
1 14-Jan Intro + More on Cameras, Sensors and Color 2
2 19-Jan No lecture! (Go capture bracketed photos?) -
2 21-Jan Blending, Compositing, Poisson Editing 2
3 26-Jan Time-Lapse 1
3 28-Jan Carving, Warping, and Morphing 2
4 02-Feb High-Dynamic-Range Imaging and Tone Mapping 1
4 04-Feb Hybrid Images, Flash and Multi-Flash Photography 2
5 09-Feb Colourisation and Colour Transfer 1
5 11-Feb Image Inpainting and Texture Synthesis 2
7 23-Feb Rendering a Scene From a Single Photo 1
7 25-Feb Video Based Rendering of Scenes 2
8 02-Mar Video Texture Synthesis 1
8 04-Mar Video Sprites 2
9 09-Mar Deblurring/Dehazing and Coded Aperture Imaging 1
9 11-Mar Image Based Rendering 2
10 16-Mar Motion Capture guest lecture by Doug Griffin
10 18-Mar Capturing Geometry with Active Lighting 2
11 23-Mar Intrinsic Images 1
11 25-Mar Dual Photography and Reflectance Analysis 2
Markov Chains
• Probability of going from state i to state j in n time steps:
  $p_{ij}^{(n)} = \Pr(X_n = j \mid X_0 = i)$
and the single-step transition as:
  $p_{ij} = \Pr(X_1 = j \mid X_0 = i)$
The n-step transition satisfies the Chapman-Kolmogorov equation, that for any 0<k<n:
  $p_{ij}^{(n)} = \sum_{r \in S} p_{ir}^{(k)} \, p_{rj}^{(n-k)}$
• Regular Markov chain: class of Markov chains where the starting state of the chain has little or no impact on the p(X) after many steps.
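The Chapman-Kolmogorov equation and the "regular chain" property can both be checked numerically. The sketch below assumes a hypothetical 3-state transition matrix `P` (illustrative values, each row sums to 1) and relies on the fact that the n-step transition matrix is the n-th matrix power of `P`. The helper name `n_step` is made up for this example.

```python
import numpy as np

# Hypothetical 3-state transition matrix (illustrative values; each row sums to 1).
P = np.array([[0.3, 0.6, 0.1],
              [0.4, 0.3, 0.3],
              [0.2, 0.4, 0.4]])

def n_step(P, n):
    """n-step transition matrix: entry (i, j) is Pr(X_n = j | X_0 = i)."""
    return np.linalg.matrix_power(P, n)

# Chapman-Kolmogorov: P^(n) = P^(k) P^(n-k) for any 0 < k < n.
n, k = 5, 2
ck_holds = np.allclose(n_step(P, n), n_step(P, k) @ n_step(P, n - k))

# Regular chain: after many steps every row is (nearly) the same distribution,
# so the starting state no longer matters.
P_many = n_step(P, 50)
rows_agree = np.allclose(P_many, P_many[0])

print(ck_holds, rows_agree)
```

Each row of `P_many` approximates the long-run distribution over states, whichever state the chain started in.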
Markov Chains
$P = \begin{pmatrix} 0.3 & 0.6 & 0.1 \\ 0.4 & 0.3 & 0.3 \\ 0.2 & 0.4 & 0.4 \end{pmatrix}$
Markov Chain
What if we know today and yesterday’s weather?
Text Synthesis
• [Shannon,’48] proposed a way to generate English-looking text using N-grams:
– Assume a generalized Markov model
– Use a large text to compute prob. distributions of each letter given N-1 previous letters
– Starting from a seed repeatedly sample this Markov chain to generate new letters
– Also works for whole words
WE NEED TO EAT CAKE
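The N-gram procedure above can be sketched at the word level. This is a minimal illustration, not Shannon's original procedure: the tiny corpus, the function names `build_model` and `generate`, and the choice N = 3 are all assumptions made for the example.

```python
import random
from collections import defaultdict

def build_model(words, n=3):
    """Map each (n-1)-word context to the list of words that follow it (n >= 2).

    Repeated successors are kept, so sampling uniformly from the list
    follows the empirical conditional distribution.
    """
    model = defaultdict(list)
    for i in range(len(words) - n + 1):
        context = tuple(words[i:i + n - 1])
        model[context].append(words[i + n - 1])
    return model

def generate(model, seed, length, rng):
    """Sample the chain: repeatedly draw a successor of the last n-1 words."""
    out = list(seed)
    order = len(seed)                    # must equal n-1 used in build_model
    for _ in range(length):
        successors = model.get(tuple(out[-order:]))
        if not successors:               # dead end: context has no recorded successor
            break
        out.append(rng.choice(successors))
    return out

# Hypothetical toy corpus; a real run would use a large text.
corpus = "we need to eat cake and we need to bake bread and we like to eat bread".split()
model = build_model(corpus, n=3)
text = generate(model, seed=("we", "need"), length=10, rng=random.Random(0))
print(" ".join(text))
```

Every three consecutive words of the output occur somewhere in the corpus, which is why the result looks locally plausible even when it is globally meaningless.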
Mark V. Shaney (Bell Labs)
• Results (using alt.singles corpus):
– “As I've commented before, really relating to someone involves standing next to impossible.”
– “One morning I shot an elephant in my arms and kissed him.”
– “I spent an interesting evening recently with a grain of salt”
Video Textures (1D predecessor to Graphcut Textures)
Arno Schödl, Richard Szeliski, David Salesin, Irfan Essa
Microsoft Research, Georgia Tech
Link to local version Gondry Example
Still photos
Video clips
Video textures
Problem statement
video clip → video texture
Our approach
• How do we find good transitions?
Finding good transitions
• Compute L2 distance $D_{i,j}$ between all frames
Similar frames make good transitions
[Figure: frame i vs. frame j]
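Computing the L2 distance between every pair of frames can be sketched with NumPy. The function name `frame_distances` and the random test clip are assumptions for illustration; real frames would be loaded from a video file.

```python
import numpy as np

def frame_distances(frames):
    """Pairwise L2 distances D[i, j] between frames.

    frames: array of shape (T, H, W) or (T, H, W, C).
    """
    X = np.asarray(frames, dtype=np.float64).reshape(len(frames), -1)
    sq = np.sum(X * X, axis=1)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b ; clip tiny negatives from rounding
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.sqrt(D2)

rng = np.random.default_rng(0)
frames = rng.random((6, 4, 4, 3))    # hypothetical tiny clip: 6 frames of 4x4 RGB
D = frame_distances(frames)
print(D.shape)
```

The matrix is symmetric with a zero diagonal. Small off-diagonal entries mark pairs of frames that look alike and are therefore candidates for transitions.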
Markov chain representation
[Figure: Markov chain over frames 1, 2, 3, 4]
Similar frames make good transitions
Transition costs
• Transition from i to j if successor of i is similar to j
• Cost function: $C_{i \to j} = D_{i+1,\,j}$
[Figure: a jump from frame i to frame j is judged by comparing frame i+1 with frame j, i.e. $D_{i+1,\,j}$]
Transition probabilities
• Probability for transition $P_{i \to j}$ inversely related to cost:
• $P_{i \to j} \sim \exp(-C_{i \to j} / \sigma^2)$
[Figure: transition probabilities for high σ vs. low σ]
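Turning a distance matrix into transition costs and probabilities can be sketched as follows. The parameter name `sigma` for the constant that scales the cost is an assumption, as are the function name and the 1-D "pendulum" frames used as test data.

```python
import numpy as np

def transition_probabilities(D, sigma):
    """Return (C, P) with C[i, j] = D[i+1, j] and P[i, j] ~ exp(-C[i, j] / sigma**2)."""
    D = np.asarray(D, dtype=float)
    C = D[1:, :]                            # last frame has no successor, so C has T-1 rows
    W = np.exp(-C / sigma**2)
    P = W / W.sum(axis=1, keepdims=True)    # normalise each row to sum to 1
    return C, P

# Hypothetical 1-D "frames": a value that swings up and down like a pendulum.
x = np.array([0.0, 1.0, 2.0, 1.0, 0.0, 1.0, 2.0, 1.0])
D = np.abs(x[:, None] - x[None, :])

C, P_sharp = transition_probabilities(D, sigma=0.5)   # low sigma: only the best transitions
_, P_flat = transition_probabilities(D, sigma=5.0)    # high sigma: more random jumps
print(P_sharp[0].round(3))
print(P_flat[0].round(3))
```

A low `sigma` concentrates probability on the cheapest transitions, while a high `sigma` spreads it out and makes the playback more random.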
Preserving dynamics
• Cost for transition i→j:
  $C_{i \to j} = \sum_{k=-N}^{N-1} w_k \, D_{i+k+1,\, j+k}$
[Figure: for N = 2 the sum covers $D_{i-1,\,j-2}$, $D_{i,\,j-1}$, $D_{i+1,\,j}$, $D_{i+2,\,j+1}$]
• Filter with diagonal kernel, weights w.
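The diagonal filtering step can be sketched as a direct double loop. The binomial weights and the random distance matrix are assumptions for illustration; marking border entries as `inf` is one possible convention, not necessarily the one used in the original system.

```python
import numpy as np

def filter_costs(D, w):
    """C[i, j] = sum_{k=-N}^{N-1} w[k + N] * D[i + k + 1, j + k].

    w holds 2N weights for k = -N .. N-1. Entries whose window would leave
    the matrix are set to +inf (no valid transition there).
    """
    T = D.shape[0]
    N = len(w) // 2
    C = np.full((T, T), np.inf)
    for i in range(N - 1, T - N):            # keeps 0 <= i+k+1 <= T-1
        for j in range(N, T - N + 1):        # keeps 0 <= j+k   <= T-1
            C[i, j] = sum(w[k + N] * D[i + k + 1, j + k] for k in range(-N, N))
    return C

rng = np.random.default_rng(0)
D = rng.random((8, 8))                       # hypothetical frame-distance matrix
w = np.array([1.0, 3.0, 3.0, 1.0]) / 8.0     # assumed binomial weights, N = 2
C = filter_costs(D, w)
print(C[3, 4])
```

With `w = [0, 1]` (N = 1) the filter reduces to the unfiltered cost `D[i+1, j]`, which is a quick sanity check on the indexing.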
Dead ends
• No good transition at the end of sequence
[Figure: Markov chain over frames 1, 2, 3, 4]
Future cost
• Propagate future transition costs backward
• Iteratively compute new cost
• $F_{i \to j} = C_{i \to j} + \alpha \min_k F_{j \to k}$
• Q-learning
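The backward propagation of future costs can be sketched as a fixed-point iteration. The discount factor `alpha` (0 < alpha < 1) is an assumption here; it keeps the iteration bounded. The 4-frame cost matrix is a made-up example in which the last frame is a dead end.

```python
import numpy as np

def future_costs(C, alpha=0.99, tol=1e-9, max_iter=10_000):
    """Iterate F[i, j] = C[i, j] + alpha * min_k F[j, k] until it stops changing.

    C must be a square matrix of finite costs (use a large value for bad jumps).
    """
    C = np.asarray(C, dtype=float)
    F = C.copy()
    for _ in range(max_iter):
        best_next = F.min(axis=1)                   # best_next[j] = min_k F[j, k]
        F_new = C + alpha * best_next[None, :]      # column j gets + alpha * best_next[j]
        if np.max(np.abs(F_new - F)) < tol:
            return F_new
        F = F_new
    return F

# Hypothetical costs for frames 0..3: playing forward is free, every other jump costs 5,
# the loop 2 -> 0 is cheap, and leaving the last frame is always expensive.
T = 4
C = np.full((T, T), 5.0)
for i in range(T - 1):
    C[i, i + 1] = 0.0
C[2, 0] = 1.0
C[3, :] = 10.0

F = future_costs(C)
print(C[2].argmin(), F[2].argmin())
```

Judged by immediate cost alone, frame 2 would continue into frame 3. The future cost accounts for the expensive exit from frame 3, so the loop back to frame 0 wins.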
Future cost – effect
Finding good loops
• Alternative to random transitions
• Precompute set of loops up front
Visual discontinuities
• Problem: Visible “Jumps”
Crossfading
• Solution: Crossfade from one sequence to the other.
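[Figure: frames $A_{i-2}, \dots, A_{i+1}$ fade into $B_{j-2}, \dots, B_{j+1}$; the three overlapping frames are blended as $\tfrac{3}{4}A_{i-1} + \tfrac{1}{4}B_{j-2}$, $\tfrac{2}{4}A_{i} + \tfrac{2}{4}B_{j-1}$, $\tfrac{1}{4}A_{i+1} + \tfrac{3}{4}B_{j}$]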
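A crossfade over a short window can be sketched as a linear blend of two aligned runs of frames. The function name `crossfade`, the three-frame window, and the synthetic constant-valued frames are assumptions chosen for illustration.

```python
import numpy as np

def crossfade(a_frames, b_frames):
    """Blend two aligned runs of n frames, sliding the weight from A to B.

    Output frame m (0-based) is (1 - t) * A[m] + t * B[m] with t = (m + 1) / (n + 1),
    so n = 3 gives the weights 3/4 + 1/4, 2/4 + 2/4, 1/4 + 3/4.
    """
    a = np.asarray(a_frames, dtype=float)
    b = np.asarray(b_frames, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both runs must have the same shape")
    n = len(a)
    t = np.arange(1, n + 1) / (n + 1)
    t = t.reshape((n,) + (1,) * (a.ndim - 1))   # broadcast over pixel dimensions
    return (1.0 - t) * a + t * b

# Hypothetical clip: frame t is a 2x2 image filled with the value t.
video = np.stack([np.full((2, 2), float(t)) for t in range(10)])
i, j = 6, 2                                     # transition from frame i to frame j
blended = crossfade(video[i - 1:i + 2], video[j - 2:j + 1])
print(blended[:, 0, 0])
```

The blended frames replace the hard cut, trading a visible jump for a brief blur.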
Morphing
• Interpolation task: $\tfrac{2}{5}A + \tfrac{2}{5}B + \tfrac{1}{5}C$
• Compute correspondence between pixels of all frames
• Interpolate pixel position and color in morphed frame
• based on [Shum+Szeliski IJCV 2000]
Results – crossfading/morphing
[Videos: Jump Cut, Crossfade, Morph]
Crossfading
Frequent jump & crossfading
Video portrait
• Useful for web pages
• Combine with IBR techniques
Video portrait – 3D
Region-based analysis
• Divide video up into regions
• Generate a video texture for each region
Automatic region analysis
User selects target range, S
[Slider: slow, variable, fast]
$P_{i \to j} \propto \exp\!\left(-\left(D''_{i+1,\,j}/\sigma + w \cdot \mathrm{distance}(j, S)\right)\right)$
Video-based animation
• Like sprites in computer games
• Extract sprites from real video
• Interactively control desired motion
©1985 Nintendo of America Inc.
Video sprite extraction
• Blue screen matting and velocity estimation
Video sprite control
• Augmented transition cost:
  $C'_{i \to j} = \alpha\, C_{i \to j} + \beta \cdot \mathrm{angle}$
  – Similarity term: $\alpha\, C_{i \to j}$
  – Control term: $\beta \cdot \mathrm{angle}$, where angle is measured between the sprite's velocity vector and the vector to the mouse pointer
[Figure: future costs $F_{i \to j}$ for each goal direction: SW, W, NW, N, NE, E, SE, S]
Video sprite control
• Need future cost computation
• Precompute future costs for a few angles.
• Switch between precomputed angles according to user input
• Continued in VideoSprites
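The control term can be sketched as an angle penalty added to the similarity cost. Everything named here is an assumption for illustration: the weights `alpha` and `beta`, the function names, and the choice to measure the angle using the velocity of the destination frame j.

```python
import numpy as np

def angle_between(v, u):
    """Unsigned angle in radians between 2-D vectors v and u."""
    cross = v[0] * u[1] - v[1] * u[0]
    return abs(np.arctan2(cross, np.dot(v, u)))

def control_cost(C, velocities, sprite_pos, mouse_pos, alpha=1.0, beta=1.0):
    """Augmented cost: alpha * C[i, j] + beta * angle_j.

    angle_j is the angle between the velocity of destination frame j
    and the vector from the sprite to the mouse pointer.
    """
    to_mouse = np.asarray(mouse_pos, dtype=float) - np.asarray(sprite_pos, dtype=float)
    angles = np.array([angle_between(np.asarray(v, dtype=float), to_mouse)
                       for v in velocities])
    return alpha * np.asarray(C, dtype=float) + beta * angles[None, :]

# Hypothetical sprite with three frames moving right, up, and left.
velocities = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
C = np.zeros((3, 3))                                   # ignore similarity for the demo
C_aug = control_cost(C, velocities, sprite_pos=(0.0, 0.0), mouse_pos=(5.0, 0.0))
print(C_aug[0])
```

With the mouse to the right of the sprite, the frame moving right gets no penalty and the frame moving left gets the largest one. Precomputing future costs for a few such directions, as the slide suggests, avoids rerunning the future-cost iteration on every mouse move.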
Interactive fish
Discussion
• Video clips → video textures
– define Markov process
– preserve dynamics
– avoid dead-ends
– disguise visual discontinuities
Discussion
• Some things are relatively easy
Discussion
• Some are hard
Siggraph “2000” example
Video Textures for Motion Capture
• Resulted in 4 papers at SIGGRAPH 2002
– Motion Graphs, by Kovar et al.
– Interactive Motion Generation from Examples, by Arikan & Forsyth
– Interactive Control of Avatars Animated with Human Motion Data, by Lee et al.
– Motion capture assisted animation: Texturing and synthesis, by Pullen and Bregler
Motion Graphs
Kovar, Gleicher, Pighin, SIGGRAPH 2002
Top: Real data;
Bottom: Synthesized to match yellow line
Motion Graphs
Distance matrix of mocap frames: based on point cloud over 1/3 sec, ground-plane transform T
Motion Graphs
Plain path-fitting video
Multi-style path-fitting video
Interactive control video
Remaining Challenges of Video-Textures-for-Mocap
• Is this the right distance metric?
• How to interpolate poses?
• How long should the transition be?
• Pose vs. style?
• What to capture?
Flow-based Video Synthesis and Editing
K. Bhat et al. SIGGRAPH 2004
<Main video>
Video control using particle systems video
Adding video texture to CG scene <video>
Chemical Brothers’ “Star Guitar”
Directed by Michel Gondry
http://youtube.com/watch?v=qUEs1BwVXGA
Star Guitar (local copy)
Making of Star Guitar (1)
Making of Star Guitar (2)
“Hand” Made Videos by Guillaume Reymond
Computational Photography and Capture:
Video Sprite Animation
Gabriel Brostow & Tim Weyrich
TA: Frederic Besse
Video Sprites
• Web-page
• Local link to Video Sprites summary
Controlled Animation of Video Sprites
Schoedl & Essa, SCA 2002
• Optimize animation w.r.t. user-defined costs
• Account for some perspective projection
*Arno Schoedl is now at Think-Cell
Treat object as billboard in 3D
• 1st linear classifier trained on features of:
– Sprite velocity, average color, area, eccentricity
• 2nd (as cascade): alpha and color per pixel
Cost Function
• Start with same transition cost on smoothness
• Total cost of frame sequence S (what we’re optimizing):
Control Cost Function
Iterated Subsequence Replacement
• Q-Learning restricted to overlapping loops and short look-ahead
• Beam-search only adds 1 frame at a time, so no scope for multiple sprites
• Must iterate to optimize frame sequences of multiple sprites jointly + for very long look-ahead!
– Precompute Forward and Backward costs based only on smoothness, every time a subsequence is chosen for replacement
Control Cost CC
• Sum of costs over all constraints and time steps i
• State of sprite at frame i is (p,v,f)i
– p: location of the sprite
– v: sprite’s velocity
– f: input frame current sprite is copied from
• Example of location constraint:
• Constraint types:
– Location, Path, Anti-collision, Frame group
Hamster Path, 2 Locations, Locations & Anti-collision, Location + Group, Formation
Trainable Videorealistic Speech Animation
• SIGGRAPH 2002 paper by Tony Ezzat, Gadi Geiger, and Tomaso Poggio
– At Center for Biological and Computational Learning, MIT
• MikeTalk link (from 1998)
• Mary101 (web page of this research)
1. Correct motion for the phonemes
2. Smooth transitions
3. Dynamics of plosives (‘b’ and ‘p’)
4. Co-articulation effects
Process
• Stabilize (all 15 min.)
• Phonemes
• MMM: Multidimensional Morphable Model
– EM-PCA [Roweis98], keep 15 dimensions
– K-means (N=46)
– Flow, via Dijkstra on “corpus graph” made with kNN
• Synthesis
– Trajectory, Render, Composite
Process
• Stabilize
• Phonemes
• Prototype:
– EM-PCA
– K-means
– Flow (via Dijkstra)
• Synthesis
Jump-Off Point to Further Research
• Video Textures/Sprites
• Direct Manipulation of Video
• Free-Viewpoint Characters in 3D
– Starck + Hilton: Video Based Character Animation (v)
– de Aguiar et al. 2008
– Vlasic et al. 2008
– Ballan et al. 2008
• Authoring of Cartoons
– Video Puppetry, Barnes et al. 2008 (v)
– Cartoon Textures by de Juan + Bodenheimer, 2004
– Accessible Animation and Customizable Graphics via Simplicial Configuration Modeling, Ngo et al. 2000
Direct Manipulation of Video
• DimP: Video Browsing by Direct Manipulation
– Dragicevic et al., CHI 2008 (v)
• DRAGON: A Direct Manipulation Interface for Frame-Accurate In-Scene Video Navigation
– Karrer et al. CHI 2008 (v)
• Interactive Video Object Annotation
– Goldman et al. 2007 (v1-short, v2-long)
• How to map (2D) gestures to object motions?