
Computational Photography and Capture:

Video Texture Synthesis

Gabriel Brostow & Tim Weyrich

TA: Frederic Besse

Week  Date    Topic                                                   Hours

1     12-Jan  Introduction to Computational Photography and Capture   1
1     14-Jan  Intro + More on Cameras, Sensors and Color              2
2     19-Jan  No lecture! (Go capture bracketed photos?)              -
2     21-Jan  Blending, Compositing, Poisson Editing                  2
3     26-Jan  Time-Lapse                                              1
3     28-Jan  Carving, Warping, and Morphing                          2
4     02-Feb  High-Dynamic-Range Imaging and Tone Mapping             1
4     04-Feb  Hybrid Images, Flash and Multi-Flash Photography        2
5     09-Feb  Colourisation and Colour Transfer                       1
5     11-Feb  Image Inpainting and Texture Synthesis                  2
7     23-Feb  Rendering a Scene From a Single Photo                   1
7     25-Feb  Video Based Rendering of Scenes                         2
8     02-Mar  Video Texture Synthesis                                 1
8     04-Mar  Video Sprites                                           2
9     09-Mar  Deblurring/Dehazing and Coded Aperture Imaging          1
9     11-Mar  Image Based Rendering                                   2
10    16-Mar  Motion Capture guest lecture by Doug Griffin
10    18-Mar  Capturing Geometry with Active Lighting                 2
11    23-Mar  Intrinsic Images                                        1
11    25-Mar  Dual Photography and Reflectance Analysis               2

Markov Chains

• Probability of going from state i to state j in n time steps:

\[ p_{ij}^{(n)} = \Pr(X_n = j \mid X_0 = i) \]

and the single-step transition as:

\[ p_{ij} = \Pr(X_1 = j \mid X_0 = i) \]

The n-step transition satisfies the Chapman-Kolmogorov equation, that for any 0<k<n:

\[ p_{ij}^{(n)} = \sum_{r \in S} p_{ir}^{(k)} \, p_{rj}^{(n-k)} \]

http://en.wikipedia.org/wiki/Chapman-Kolmogorov_equation
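As a quick numerical sanity check of the Chapman-Kolmogorov equation, the sketch below builds n-step transition matrices by repeated multiplication and compares the n-step matrix against the product of the k-step and (n-k)-step matrices. The 3-state matrix `P` and the helper names `mat_mul` and `n_step` are illustrative assumptions, not part of the lecture.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]


def n_step(p, n):
    """Return the n-step transition matrix p^(n) for n >= 1."""
    result = p
    for _ in range(n - 1):
        result = mat_mul(result, p)
    return result


# Hypothetical 3-state chain; every row sums to 1.
P = [[0.3, 0.6, 0.1],
     [0.4, 0.3, 0.3],
     [0.2, 0.4, 0.4]]

n, k = 5, 2
lhs = n_step(P, n)                             # p^(n)
rhs = mat_mul(n_step(P, k), n_step(P, n - k))  # sum over r of p_ir^(k) p_rj^(n-k)

# Largest entry-wise difference; only floating-point rounding should remain.
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
print(max_gap)
```

The sum over intermediate states r in the equation is exactly the inner sum of a matrix product, which is why the check reduces to comparing matrix powers.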





• Regular Markov chain: class of Markov chains where the starting state of the chain has little or no impact on the p(X) after many steps.
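One way to see this property is to push two very different starting distributions through the same chain and watch them agree. The sketch below does that for a hypothetical 3-state matrix with all entries positive; the matrix, the step count of 50, and the function names are assumptions chosen for illustration.

```python
def step(dist, p):
    """Advance a distribution one time step: new[j] = sum_i dist[i] * p[i][j]."""
    n = len(p)
    return [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]


def evolve(dist, p, steps):
    """Apply `step` repeatedly and return the resulting distribution."""
    for _ in range(steps):
        dist = step(dist, p)
    return dist


# Hypothetical transition matrix; all entries are positive, so the chain is regular.
P = [[0.3, 0.6, 0.1],
     [0.4, 0.3, 0.3],
     [0.2, 0.4, 0.4]]

from_state_0 = evolve([1.0, 0.0, 0.0], P, 50)  # start surely in state 0
from_state_2 = evolve([0.0, 0.0, 1.0], P, 50)  # start surely in state 2

# After many steps both runs land on (almost) the same distribution p(X).
gap = max(abs(a - b) for a, b in zip(from_state_0, from_state_2))
print(from_state_0, gap)
```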

Markov Chains

\[ \begin{pmatrix} 0.3 & 0.6 & 0.1 \\ 0.4 & 0.3 & 0.3 \\ 0.2 & 0.4 & 0.4 \end{pmatrix} \]

Markov Chain

What if we know today and yesterday’s weather?

Text Synthesis

• [Shannon,’48] proposed a way to generate English-looking text using N-grams:

– Assume a generalized Markov model

– Use a large text to compute the probability distribution of each letter given the N-1 previous letters

– Starting from a seed repeatedly sample this Markov chain to generate new letters

– Also works for whole words
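The steps above can be sketched at word level as follows. This is a minimal illustration, not Shannon's or Mark V. Shaney's actual program: the tiny corpus, the function names `build_model` and `generate`, and the fixed random seed are all assumptions made for the example.

```python
import random
from collections import defaultdict


def build_model(words, n):
    """Map each (n-1)-word context to the list of words that followed it.

    Repeated successors stay in the list, so sampling uniformly from it
    reproduces the empirical conditional distribution.
    """
    model = defaultdict(list)
    for i in range(len(words) - n + 1):
        context = tuple(words[i:i + n - 1])
        model[context].append(words[i + n - 1])
    return model


def generate(model, seed, length, rng):
    """Extend `seed` word by word until `length` words or an unseen context."""
    out = list(seed)
    order = len(seed)
    while len(out) < length:
        successors = model.get(tuple(out[-order:]))
        if not successors:      # dead end: this context never occurred
            break
        out.append(rng.choice(successors))
    return out


# Toy corpus (hypothetical); a real run would use a large text.
corpus = ("we need to eat cake and we need to bake bread "
          "and we want to eat bread").split()
model = build_model(corpus, n=3)
text = generate(model, seed=("we", "need"), length=10, rng=random.Random(0))
print(" ".join(text))
```

Every three consecutive words of the output occur somewhere in the corpus, which is why the result looks locally plausible while drifting globally.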

WE NEED TO EAT CAKE

Mark V. Shaney (Bell Labs)

• Results (using alt.singles corpus):

– “As I've commented before, really relating to someone involves standing next to impossible.”

– “One morning I shot an elephant in my arms and kissed him.”

– “I spent an interesting evening recently with a grain of salt”

Video Textures (1D predecessor to Graphcut Textures)

Arno Schödl, Richard Szeliski, David Salesin, Irfan Essa

Microsoft Research, Georgia Tech

Link to local version Gondry Example

VideoTextures\talk\slides.ppt

Still photos

Video clips

Video textures

Problem statement

video clip → video texture

Our approach

• How do we find good transitions?

Finding good transitions

• Compute L2 distance $D_{i,j}$ between all frames

Similar frames make good transitions
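A minimal sketch of the all-pairs frame distance, assuming each frame has been flattened to a list of pixel intensities. Plain lists keep the example dependency-free; a real implementation would more likely use an array library. The toy frames and function names are assumptions.

```python
import math


def l2_distance(frame_a, frame_b):
    """Euclidean (L2) distance between two frames of equal size."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)))


def distance_matrix(frames):
    """D[i][j] = L2 distance between frame i and frame j."""
    n = len(frames)
    return [[l2_distance(frames[i], frames[j]) for j in range(n)]
            for i in range(n)]


# Toy "video": 4 frames of 3 pixels each (hypothetical data).
frames = [[0.0, 0.0, 0.0],
          [1.0, 0.0, 0.0],
          [1.0, 1.0, 0.0],
          [0.1, 0.0, 0.0]]
D = distance_matrix(frames)

# Frame 3 looks almost like frame 0, so D[0][3] is small.
print(D[0][3], D[0][1])
```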

frame i vs. frame j

Markov chain representation

[Figure: Markov chain with one state per frame: 1, 2, 3, 4]

Similar frames make good transitions

Transition costs

• Transition from i to j if the successor of i is similar to j

• Cost function: $C_{i \to j} = D_{i+1,\, j}$

[Figure: frames i, i+1 and j-1, j; the transition i → j is scored by $D_{i+1,\, j}$]

Transition probabilities

• Probability for transition $P_{i \to j}$ inversely related to cost:

• $P_{i \to j} \sim \exp(-C_{i \to j} / \sigma^2)$

[Figure: transition probabilities for high σ and low σ]
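The mapping from costs to probabilities can be sketched as below. It assumes the cost of a transition i → j is the distance between frame i+1 and frame j, and that the exponent divides the cost by a squared scale parameter, called `sigma` here; the distance values and the two sigma settings are made up for illustration.

```python
import math


def transition_probabilities(D, sigma):
    """P[i][j] proportional to exp(-D[i+1][j] / sigma**2), rows normalised.

    The last frame has no successor, so it gets no row in this sketch.
    """
    n = len(D)
    P = []
    for i in range(n - 1):
        weights = [math.exp(-D[i + 1][j] / sigma ** 2) for j in range(n)]
        total = sum(weights)
        P.append([w / total for w in weights])
    return P


# Hypothetical symmetric frame-distance matrix for a 4-frame clip.
D = [[0.0, 1.0, 2.0, 0.2],
     [1.0, 0.0, 1.0, 1.1],
     [2.0, 1.0, 0.0, 2.1],
     [0.2, 1.1, 2.1, 0.0]]

P_low = transition_probabilities(D, sigma=0.3)   # nearly deterministic playback
P_high = transition_probabilities(D, sigma=3.0)  # many transitions are likely
print(P_low[0], P_high[0])
```

A small sigma concentrates almost all probability on the best transition (here the natural successor), while a large sigma spreads it out and gives more varied but less smooth playback.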

Preserving dynamics

• Cost for transition i → j:

\[ C_{i \to j} = \sum_{k=-N}^{N-1} w_k \, D_{i+k+1,\, j+k} \]

[Figure: frames i-1, i, i+1, i+2 aligned with frames j-2, j-1, j, j+1; the compared pairs are $D_{i-1,\, j-2}$, $D_{i,\, j-1}$, $D_{i+1,\, j}$ and $D_{i+2,\, j+1}$]

• Filter with diagonal kernel, weights w.
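A sketch of the diagonal filtering step, assuming the windowed cost sums weighted distances D[i+k+1][j+k] for k from -N to N-1. Entries whose window would leave the clip are marked `None`. The equal weights and the toy distance matrix are assumptions; the slide only says "weights w".

```python
def dynamics_cost(D, weights):
    """C[i][j] = sum over k in [-N, N-1] of w_k * D[i+k+1][j+k].

    `weights` holds w_{-N} ... w_{N-1}, so its length is 2N.
    Entries whose window falls outside the clip are left as None.
    """
    n = len(D)
    N = len(weights) // 2
    C = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            rows_ok = i + 1 - N >= 0 and i + N <= n - 1
            cols_ok = j - N >= 0 and j + N - 1 <= n - 1
            if rows_ok and cols_ok:
                C[i][j] = sum(weights[k + N] * D[i + k + 1][j + k]
                              for k in range(-N, N))
    return C


# Toy distances: frames differ by how far apart they are in time.
n = 6
D = [[float(abs(i - j)) for j in range(n)] for i in range(n)]
weights = [0.25, 0.25, 0.25, 0.25]   # N = 2, equal weights (assumption)
C = dynamics_cost(D, weights)
print(C[2][3], C[1][4])
```

Because the sum runs along a diagonal of D, a transition is cheap only when the frames before and after the jump also match, which is what preserves the direction of motion.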


Dead ends

• No good transition at the end of sequence


Future cost

• Propagate future transition costs backward

• Iteratively compute new cost

• $F_{i \to j} = C_{i \to j} + \min_k F_{j \to k}$

• Q-learning
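The backward propagation can be sketched as a fixed-point iteration. One assumption is added here that does not appear in the formula above: a discount factor `alpha` below 1 multiplies the future term so that the iteration converges on clips with loops. The 4-frame cost matrix is hypothetical, with `math.inf` marking transitions that are not allowed.

```python
import math


def future_costs(C, alpha=0.9, iterations=200):
    """Iterate F[i][j] = C[i][j] + alpha * min_k F[j][k] from F = C."""
    n = len(C)
    F = [row[:] for row in C]
    for _ in range(iterations):
        best_from = [min(F[j]) for j in range(n)]   # min_k F[j][k]
        F = [[C[i][j] + alpha * best_from[j] for j in range(n)]
             for i in range(n)]
    return F


inf = math.inf
# Hypothetical clip: playing on (i -> i+1) is free, frame 2 can jump back
# to frame 0 for a small cost, and the last frame only has a poor exit.
C = [[inf, 0.0, inf, inf],
     [inf, inf, 0.0, inf],
     [1.0, inf, inf, 0.0],
     [10.0, inf, inf, inf]]

F = future_costs(C)
# Immediate cost prefers 2 -> 3, but the future cost prefers 2 -> 0,
# which steers playback away from the near dead end at frame 3.
print(C[2][3], C[2][0], F[2][3], F[2][0])
```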


Future cost – effect

Finding good loops

• Alternative to random transitions

• Precompute set of loops up front

Visual discontinuities

• Problem: Visible “Jumps”

Crossfading

• Solution: Crossfade from one sequence to the other.
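A minimal crossfade sketch, assuming the frames around the jump are available as two equally long lists and that the blend weight moves linearly from sequence A to sequence B. With three overlapping frames this gives weights of 3/4, 2/4 and 1/4 for A. The function name and the single-pixel frames are assumptions for illustration.

```python
def crossfade(seq_a, seq_b):
    """Blend two equally long frame lists, shifting weight from A to B."""
    steps = len(seq_a)
    blended = []
    for t, (frame_a, frame_b) in enumerate(zip(seq_a, seq_b), start=1):
        w_b = t / (steps + 1)            # weight of B grows with each frame
        blended.append([(1 - w_b) * a + w_b * b
                        for a, b in zip(frame_a, frame_b)])
    return blended


# Three single-pixel frames from each sequence (hypothetical values).
seq_a = [[4.0], [4.0], [4.0]]
seq_b = [[0.0], [0.0], [0.0]]
print(crossfade(seq_a, seq_b))
```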

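[Figure: crossfade from sequence A (frames $A_{i-2}$, $A_{i-1}$, $A_i$, $A_{i+1}$) to sequence B (frames $B_{j-2}$, $B_{j-1}$, $B_j$, $B_{j+1}$); the three blended frames use weights 3/4 A + 1/4 B, 2/4 A + 2/4 B and 1/4 A + 3/4 B]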

Morphing

• Interpolation task: $\tfrac{2}{5} A + \tfrac{2}{5} B + \tfrac{1}{5} C$

• Compute correspondence between pixels of all frames

• Interpolate pixel position and color in morphed frame

• based on [Shum+Szeliski IJCV 2000]

Results – crossfading/morphing


Jump Cut Crossfade Morph

Crossfading

Frequent jump & crossfading

Video portrait

• Useful for web pages

• Combine with IBR techniques

Video portrait – 3D

Region-based analysis

• Divide video up into regions

• Generate a video texture for each region

Automatic region analysis

User selects target range, S

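[Figure: slider labelled slow / variable / fast]

[Equation, partially recoverable: $P_{ij}$ is proportional to an exponential that combines the distance term $D''_{i+1,\, j}$ with a penalty $w \cdot \mathrm{distance}(j, S)$]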

Video-based animation

• Like sprites in computer games

• Extract sprites from real video

• Interactively control desired motion

©1985 Nintendo of America Inc.

Video sprite extraction

• Blue screen matting and velocity estimation

Video sprite control

• Augmented transition cost:

\[ C^{\text{Animation}}_{i \to j} = \underbrace{C_{i \to j}}_{\text{similarity term}} + \underbrace{\text{angle}}_{\text{control term}} \]

• angle: between the sprite's velocity vector and the vector to the mouse pointer

[Figure: future costs $F_{i \to j}$ for goal directions SW, W, NW, N, NE, E, SE, S]

Video sprite control

• Need future cost computation

• Precompute future costs for a few angles.

• Switch between precomputed angles according to user input

• Continued in VideoSprites
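The control term can be sketched as the angle between the sprite's velocity and the direction to the mouse pointer, added to the usual similarity cost. The `weight` parameter that balances the two terms, the 2D tuple representation, and the function names are assumptions for illustration.

```python
import math


def angle_between(v, target):
    """Unsigned angle in radians between two 2D vectors."""
    dot = v[0] * target[0] + v[1] * target[1]
    cross = v[0] * target[1] - v[1] * target[0]
    return abs(math.atan2(cross, dot))


def control_cost(similarity_cost, velocity, sprite_pos, pointer_pos, weight=1.0):
    """Similarity term plus a weighted penalty for heading away from the pointer."""
    to_pointer = (pointer_pos[0] - sprite_pos[0], pointer_pos[1] - sprite_pos[1])
    return similarity_cost + weight * angle_between(velocity, to_pointer)


# Sprite at the origin moving right (hypothetical values).
toward = control_cost(0.5, (1, 0), (0, 0), (5, 0))    # pointer straight ahead
away = control_cost(0.5, (1, 0), (0, 0), (-5, 0))     # pointer directly behind
print(toward, away)
```

Precomputing future costs for a handful of goal directions, as the bullets describe, would amount to running the future-cost iteration once per direction with this augmented cost and picking the table closest to the current pointer direction.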

Interactive fish

Discussion

• Video clips → video textures

– define Markov process

– preserve dynamics

– avoid dead-ends

– disguise visual discontinuities

Discussion

• Some things are relatively easy

• Some are hard

Siggraph “2000” example

VideoTextures for Motion Capture

• Resulted in 4 papers at SIGGRAPH 2002

– Motion Graphs, by Kovar et al.: http://www.cs.wisc.edu/graphics/Gallery/kovar.vol/MoGraphs/

– Interactive Motion Generation from Examples, by Arikan & Forsyth: http://www.okanarikan.com/papers/s2002/motionSynthesis.php

– Interactive Control of Avatars Animated with Human Motion Data, by Lee et al.: http://graphics.cs.cmu.edu/projects/Avatar/

– Motion capture assisted animation: Texturing and synthesis, by Pullen and Bregler

Motion Graphs

Kovar, Gleicher, Pighin, SIGGRAPH 2002

Top: Real data;

Bottom: Synthesized to match yellow line


Distance Matrix of Mocap frames: Based on point-cloud over 1/3 sec, ground-plane transform T


Plain path-fitting video: MotionGraphs\pathFit.avi

Multi-style path-fitting video: MotionGraphs\pathFitMultiStyle.avi

Interactive control video: MotionGraphs\interactive.avi

Remaining Challenges of Video-Textures-for-Mocap

• Is this the right distance metric?

• How to interpolate poses?

• How long should the transition be?

• Pose vs. style?

• What to capture?

Flow-based Video Synthesis and Editing

K. Bhat et al. SIGGRAPH 2004

http://graphics.cs.cmu.edu/projects/flow/

Main video: bhatSig04HiRes.avi

Video control using particle systems: BhatExample_control.avi

Adding video texture to CG scene: BhatExample_cityMakingOf.avi

Chemical Brothers’ “Star Guitar”

Directed by Michel Gondry

http://youtube.com/watch?v=qUEs1BwVXGA

Star Guitar (local copy): Chemical Brothers - Star Guitar (High Quality Ver.).MP4

Making of Star Guitar (1): making of Star Guitar.mp4

Making of Star Guitar (2): MakingOf_starguitar2.MP4

“Hand” Made Videos by Guillaume Reymond

The Original Human SPACE INVADERS Performance_Guillaume REYMOND.MP4

The Original Human TETRIS Performance by Guillaume Reymond.MP4

Computational Photography and Capture:

Video Sprite Animation

Gabriel Brostow & Tim Weyrich

TA: Frederic Besse

Video Sprites

• Web-page: http://www.cc.gatech.edu/cpl/projects/videotexture/SCA02/index.html

• Local link to Video Sprites summary: VideoSprites\Character Animation from Video.htm

Controlled Animation of Video Sprites

Schoedl & Essa, SCA 2002

• Optimize animation w.r.t. user-defined costs

• Account for some perspective projection

*Arno Schoedl is now at Think-Cell

http://www.think-cell.com/




Treat object as billboard in 3D

• 1st linear classifier trained on features of:

– Sprite velocity, average color, area, eccentricity

• 2nd (as cascade): alpha and color per pixel

Cost Function

• Start with same transition cost on smoothness

• Total cost of frame sequence S (what we’re optimizing):

Control Cost Function

Iterated Subsequence Replacement

• Q-Learning restricted to overlapping loops and short look-ahead

• Beam-search only adds 1 frame at a time, so no scope for multiple sprites

• Must iterate to optimize frame sequences of multiple sprites jointly + for very long look-ahead!

– Precompute Forward and Backward costs based only on smoothness, every time a subsequence is chosen for replacement

Control Cost $C_C$

• Sum of costs over all constraints and time steps i

• State of sprite at frame i is $(p, v, f)_i$

– p: location of the sprite

– v: sprite’s velocity

– f: input frame current sprite is copied from

• Example of location constraint:

• Constraint types:

– Location, Path, Anti-collision, Frame group

Hamster Path, 2 Locations, Locations & Anti-collision, Location + Group, Formation

VideoSprites\hamster_circle.mpeg

VideoSprites\hamster_books.mpeg

VideoSprites\2hamsters_books.mpeg



VideoSprites\2hamsters_standup.mpeg

VideoSprites\flies.mpeg

Trainable Videorealistic Speech Animation

• SIGGRAPH 2002 paper by Tony Ezzat, Gadi Geiger, and Tomaso Poggio

– At Center for Biological and Computational Learning, MIT

• MikeTalk link (from 1998): http://people.csail.mit.edu/tonebone/research/miketalk/miketalk.html

• Mary101 (web page of this research): http://people.csail.mit.edu/tonebone/research/mary101/

EzzatPoggioTrainableSpeechAnim

1. Correct motion for the phonemes

2. Smooth transitions

3. Dynamics of plosives (‘b’ and ‘p’)

4. Co-articulation effects


Process

• Stabilize (all 15 min.)

• Phonemes

• MMM: Multidimensional Morphable Model

– EM-PCA [Roweis98], keep 15 dimensions

– K-means (N=46)

– Flow, via Dijkstra on “corpus graph” made with kNN

• Synthesis

– Trajectory, Render, Composite

Process

• Stabilize

• Phonemes

• Prototype:

– EM-PCA

– K-means

– Flow (via Dijkstra)

• Synthesis


Jump-Off Point to Further Research

Video Textures/Sprites

Direct Manipulation of Video

Free-Viewpoint Characters in 3D

Authoring of Cartoons


– Starck + Hilton: Video Based Character Animation (v): http://info.ee.surrey.ac.uk/Personal/J.Starck/ (local video: ..\..\..\VideoBasedCharacterAnimation_SCA2005_Starck_Hilton_video-renderer.avi)

– de Aguiar et al. 2008: http://www.mpi-inf.mpg.de/resources/perfcap/

– Vlasic et al. 2008: http://people.csail.mit.edu/drdaniel/mesh_animation/index.html

– Ballan et al. 2008: http://www.inf.ethz.ch/personal/lballan/mhmc.html

– Video Puppetry, Barnes et al. 2008 (v): http://vis.berkeley.edu/papers/vpuppet/ (local video: ..\..\..\VideoObjectManip\vpuppetPrinceton.mov)

– Cartoon Textures, by de Juan + Bodenheimer, 2004: http://www.vuse.vanderbilt.edu/~bobbyb/pubs/ct04.html

– Accessible Animation and Customizable Graphics via Simplicial Configuration Modeling, Ngo et al. 2000: http://graphics.stanford.edu/papers/simplicial-animation/

Direct Manipulation of Video

• DimP: Video Browsing by Direct Manipulation

– Dragicevic et al., CHI 2008 (v): http://www.aviz.fr/dimp/ (local video: ..\..\..\VideoObjectManip\dimp-long.avi)

• DRAGON: A Direct Manipulation Interface for Frame-Accurate In-Scene Video Navigation

– Karrer et al. CHI 2008 (v): http://hci.rwth-aachen.de/dragon (local video: ..\..\..\VideoObjectManip\1057-karrer.mov)

• Interactive Video Object Annotation

– Goldman et al. 2007 (v1-short, v2-long): http://grail.cs.washington.edu/projects/ivoa/tr07/

• How to map (2D) gestures to object motions?

..\..\..\VideoObjectManip\InteractiveVideoObjectManipulation.mov

..\..\..\VideoObjectManip\VideoObjectAnnot_Goldman_ivoa.tr07.mov

..\..\..\VideoObjectManip\Daniel Chesterfield.MP4

..\..\..\VideoObjectManip\ProXFadeAd.MP4
