Modelling Motion Patterns with Video Epitomes

Machine Learning Group Meeting
University of Toronto
Oct. 18, 2004

Vincent Cheung

Probabilistic and Statistical Inference Group
Electrical & Computer Engineering
University of Toronto
Toronto, Ontario, Canada

Advisor: Dr. Brendan J. Frey

Cheung 2 / 18

ML Group Meeting, Oct. 18, 2004

Outline

● Image epitome
  ► What?
  ► Why?
  ► How?

● Implementation computation issues
  ► Efficiently implementing the learning algorithm

● Video epitome
  ► Extension to videos
  ► Filling-in missing information

Image Epitome

● Jojic, N., Frey, B., & Kannan, A. (2003). Epitomic analysis of appearance and shape. In Proc. IEEE ICCV.

● Miniature, condensed version of the image

● Models the image’s shape and textural components.

● Applications
  ► object detection
  ► texture segmentation
  ► image retrieval
  ► compression

Image Epitome Examples

Learning the Image Epitome (1)

Learning the Image Epitome (2)

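[Diagram: Input image → sample patches → training set → unsupervised learning → epitome]

Bayesian network: the epitome e and the mappings T1, T2, …, TM generate the image patches Z1, Z2, …, ZM.

e – epitome
Tk – mapping
Zk – image patch

E: [equation not recoverable]
M: [equation not recoverable]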
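
The pipeline on this slide (sample patches from the input image, then alternate E and M updates) can be sketched roughly as follows. This is a minimal brute-force illustration in the spirit of Jojic et al. (2003), not the implementation from the talk. It assumes a grayscale image, a Gaussian mean and variance at each epitome pixel, a uniform prior over mappings, and no wrap-around at the epitome boundary. The names `sample_patches`, `em_step`, and `min_var` are hypothetical.

```python
import numpy as np

def sample_patches(image, P, M, rng):
    """Draw M random P x P patches Z_1..Z_M from a 2D grayscale image."""
    H, W = image.shape
    rows = rng.integers(0, H - P + 1, size=M)
    cols = rng.integers(0, W - P + 1, size=M)
    return np.stack([image[r:r + P, c:c + P] for r, c in zip(rows, cols)])

def em_step(patches, mu, var, min_var=1e-4):
    """One EM iteration over all patches; returns the updated (mu, var)."""
    P = patches.shape[1]
    S_r, S_c = mu.shape[0] - P + 1, mu.shape[1] - P + 1  # mappings per axis
    sum_q = np.zeros_like(mu)    # sum of posterior weights per epitome pixel
    sum_qz = np.zeros_like(mu)   # weighted sum of patch values
    sum_qzz = np.zeros_like(mu)  # weighted sum of squared patch values
    for Z in patches:
        # E-step: posterior q(T) over mappings T = (i, j), uniform prior (assumed)
        logp = np.empty((S_r, S_c))
        for i in range(S_r):
            for j in range(S_c):
                m, v = mu[i:i + P, j:j + P], var[i:i + P, j:j + P]
                logp[i, j] = -0.5 * np.sum(np.log(2 * np.pi * v) + (Z - m) ** 2 / v)
        q = np.exp(logp - logp.max())
        q /= q.sum()
        # Accumulate the sufficient statistics needed by the M-step
        for i in range(S_r):
            for j in range(S_c):
                sum_q[i:i + P, j:j + P] += q[i, j]
                sum_qz[i:i + P, j:j + P] += q[i, j] * Z
                sum_qzz[i:i + P, j:j + P] += q[i, j] * Z ** 2
    # M-step: posterior-weighted mean and variance at each epitome pixel
    covered = sum_q > 1e-12
    w = np.where(covered, sum_q, 1.0)
    new_mu = np.where(covered, sum_qz / w, mu)
    new_var = np.where(covered, sum_qzz / w - new_mu ** 2, var)
    return new_mu, np.maximum(new_var, min_var)

rng = np.random.default_rng(0)
image = rng.random((32, 32))                   # stand-in for a real input image
patches = sample_patches(image, P=4, M=50, rng=rng)
mu, var = rng.random((8, 8)), np.ones((8, 8))  # random initial epitome
for _ in range(5):
    mu, var = em_step(patches, mu, var)
```

The nested loops over every mapping are what make the naive version expensive, which is presumably the motivation for the efficiency work on the following slides.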

Shifted Cumulative Sum Algorithm

[Diagram: K × K patch X, N × N epitome e(i,j), squared terms, and a Cumsum over an N × N array.]

[Diagram: shifting the P × P window from position (1, 1), (i, j) to (1, 2), (i, j+1); (2, 1), (i+1, j); and (2, 2), (i+1, j+1). Update terms shown: – col 1 + col (P+1); – row 1 + row (P+1); + pixel (1,1) + pixel (P+1, P+1). Corner signs: + − / − +.]
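
The window-shifting diagram on this slide appears to describe obtaining the sum over every P × P window from a cumulative sum, so that each window costs four lookups (with signs +, −, −, +) instead of P² additions. The sketch below illustrates that general idea only. `window_sums` is a hypothetical name, and wrap-around at the array boundary is not handled.

```python
import numpy as np

def window_sums(a, P):
    """Sum of every P x P window of the 2D array a, via a 2D cumulative sum."""
    H, W = a.shape
    # c[i, j] holds the sum of a[:i, :j]; the extra zero row/column avoids edge cases
    c = np.zeros((H + 1, W + 1), dtype=np.float64)
    c[1:, 1:] = a.cumsum(axis=0).cumsum(axis=1)
    # Window with top-left corner (i, j): four corner lookups with signs + - - +
    return c[P:, P:] - c[:-P, P:] - c[P:, :-P] + c[:-P, :-P]

rng = np.random.default_rng(0)
a = rng.random((8, 8))
sums = window_sums(a, P=3)   # shape (6, 6): one sum per window position
```

One plausible use: in a Gaussian log-likelihood, the terms that depend only on the epitome (for example the sum of squared means divided by the variances over a window) could be computed for all mappings at once this way.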

Collecting Sufficient Statistics

[Diagram: P × P patch X, P × P region of the epitome e, mapping Ta]

Extending Epitomes to Videos

● Desire a miniature, condensed version of a video sequence

● Want the epitome to model the basic shape, textural, and motion patterns of the video

● Applications
  ► optic flow
  ► segmentation
  ► texture transfer
  ► layer separation
  ► compression
  ► noise reduction
  ► fill-in / inpainting

Video Epitome

[Diagram: Input video (Frame 1, Frame 2, Frame 3) → sample patches → training set → unsupervised learning → video epitome]
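
As a rough illustration of the patch-sampling stage in this diagram, the sketch below draws random blocks from a video array. It assumes the patches are 3D blocks spanning several frames, and that the video is stored as a `(T, H, W)` or `(T, H, W, C)` array. `sample_video_patches` and its parameters are hypothetical names.

```python
import numpy as np

def sample_video_patches(video, patch_shape, M, rng):
    """Draw M random spatio-temporal blocks of shape (pt, ph, pw) from a video."""
    T, H, W = video.shape[:3]
    pt, ph, pw = patch_shape
    t0 = rng.integers(0, T - pt + 1, size=M)  # first frame of each block
    r0 = rng.integers(0, H - ph + 1, size=M)  # top row of each block
    c0 = rng.integers(0, W - pw + 1, size=M)  # left column of each block
    return np.stack([video[t:t + pt, r:r + ph, c:c + pw]
                     for t, r, c in zip(t0, r0, c0)])

rng = np.random.default_rng(0)
video = rng.random((10, 24, 32))   # stand-in for a real clip: 10 frames of 24 x 32
train = sample_video_patches(video, (3, 5, 5), M=100, rng=rng)
```

Because each block covers several consecutive frames, a model fit to such patches sees how appearance changes over time as well as across space.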

Video Epitome Example

Temporally Compressed

Spatially Compressed

Video Inpainting (1)

● Fill in missing portions of a video
  ► damaged films
  ► occluding objects

● Reconstruct the missing pixels from the video epitome

Video Inpainting (2)

Filling-in Missing Data

Likelihood

Joint

Variational Approx

E-Step

M-Step
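
The equations for this slide are not available in the transcript, so the following is only a simplified sketch of the general idea and not the variational derivation from the talk. It assumes a fixed Gaussian epitome (`mu`, `var`), a uniform prior over mappings, and a 2D patch. The mapping posterior is evaluated on the observed pixels only, and each missing pixel is filled with the posterior-weighted epitome mean. `fill_patch` is a hypothetical name.

```python
import numpy as np

def fill_patch(Z, observed, mu, var):
    """Fill entries of patch Z where observed is False, using the epitome (mu, var)."""
    P_r, P_c = Z.shape
    S_r, S_c = mu.shape[0] - P_r + 1, mu.shape[1] - P_c + 1
    # Log-likelihood of each mapping, summed over observed pixels only
    logp = np.empty((S_r, S_c))
    for i in range(S_r):
        for j in range(S_c):
            m, v = mu[i:i + P_r, j:j + P_c], var[i:i + P_r, j:j + P_c]
            ll = -0.5 * (np.log(2 * np.pi * v) + (Z - m) ** 2 / v)
            logp[i, j] = ll[observed].sum()
    q = np.exp(logp - logp.max())
    q /= q.sum()
    # Posterior-weighted prediction of the whole patch from the epitome means
    pred = np.zeros(Z.shape)
    for i in range(S_r):
        for j in range(S_c):
            pred += q[i, j] * mu[i:i + P_r, j:j + P_c]
    # Keep observed values, use the prediction only where data is missing
    return np.where(observed, Z, pred)

mu = np.arange(36, dtype=float).reshape(6, 6)  # toy epitome means
var = np.full((6, 6), 0.01)
Z = mu[1:4, 2:5].copy()                        # patch taken from the epitome
observed = np.ones((3, 3), dtype=bool)
observed[1, 1] = False                         # hide the centre pixel
Z[1, 1] = np.nan
filled = fill_patch(Z, observed, mu, var)
```

In a full treatment the missing pixels would be hidden variables re-estimated on each iteration alongside the epitome. For video, the same masking would apply with an added time axis.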

Missing Channels

● Generalization of the video inpainting problem

● Inpainting
  ► Missing entire pixels

● Missing Channels
  ► Missing one or more of the red, green, or blue (RGB) components of a given pixel

● The epitome must combine multiple patches to piece together the missing channel information
  ► No training patch contains all the channel information
  ► Use the epitome to fill in the missing data
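
The difference between the two settings can be expressed as an observation mask with one entry per pixel and channel. Inpainting hides all three channels of a pixel together, while the missing-channels case hides only some of them. The sketch below simulates the latter on synthetic data. Hiding exactly one channel per pixel is an assumption made for illustration, and `drop_one_channel` is a hypothetical name.

```python
import numpy as np

def drop_one_channel(image, rng):
    """Hide one randomly chosen colour channel at every pixel of an (H, W, 3) image."""
    H, W, C = image.shape
    dropped = rng.integers(0, C, size=(H, W))  # index of the hidden channel per pixel
    observed = np.arange(C)[None, None, :] != dropped[:, :, None]
    damaged = np.where(observed, image, np.nan)  # NaN marks a missing component
    return damaged, observed

rng = np.random.default_rng(0)
image = rng.random((16, 16, 3))                # synthetic stand-in for an RGB image
damaged, observed = drop_one_channel(image, rng)
```

With such a mask, every patch drawn from `damaged` is incomplete, which is why information from many patches has to be pooled in the epitome.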

Image Missing Channels Fill-in

Video Missing Channels Fill-in

Conclusion

● Improved the efficiency of learning image epitomes

● Extended the concept of epitomes to video sequences

● Demonstrated the ability of video epitomes to model motion patterns through video inpainting

Future Work

● Determining the size of the epitome
  ► Dependent on the complexity of the image / video
    ■ Minimum description length
    ■ Variational Bayesian

● Optimal patch size(s)
  ► Problem specific

● Additional transformations into the epitome
  ► Rotation
  ► Scale

● Additional video epitome applications
  ► Super-resolution
  ► Layer separation
  ► Object recognition
