
Need volunteers…

From Monday’s paper: a simple story about representations

Input signal: a moving edge.

Model it with an auto-regressive model, using two different representations for the observations y:

Representation 1: image-based.

Representation 2: position-based.

[Model diagram: the input signal, with dynamics matrix Axx and observation matrix Cxy.]
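The matrix labels above suggest a linear state-space (auto-regressive) form; a minimal sketch of that assumed model, with notation inferred from the labels Axx and Cxy (the noise terms and dimensions are assumptions, not given in the transcript):

```latex
% Assumed state-space form behind the labels A_{xx} (dynamics) and C_{xy} (observation):
x_t = A_{xx}\, x_{t-1} + w_t   % hidden state evolves linearly
y_t = C_{xy}\, x_t + v_t       % observation y (image or position) is a linear map of the state
```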

[Representation 1 figures: learned bases and dynamics for n = 8, n = 20, and n = 50.]

Representation 1: What happens next?

Representing the edge position

Input signal: y = [1:100]

What dimension of an auto-regressive model do we need to describe that signal?

Representation 2, n = 1

Can only show exponentially decaying position (scalar dynamics a_xx).
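A one-line sketch of why, using the scalar dynamics a_xx labeled on the slide:

```latex
% n = 1: scalar state with dynamics a_{xx}
x_t = a_{xx}\, x_{t-1} \;\Rightarrow\; x_t = a_{xx}^{\,t}\, x_0
% The trajectory is geometric (decaying for |a_{xx}| < 1), so no choice of a_{xx}
% can follow the uniform translation x_t = x_0 + t of the moving edge.
```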

N = 2

A 2-d model can handle uniform translation exactly.

Axx

110

11

111 xx

Representation 2
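A minimal sketch of how a 2-d model captures the moving edge's position exactly, assuming a (position, velocity) state; the slide's exact A_xx is not legible in this transcript, so the matrix below is an illustration:

```python
import numpy as np

# Assumed 2-d state: (position, velocity). With this dynamics matrix the position
# increases by a constant velocity every step, i.e. uniform translation.
A_xx = np.array([[1.0, 1.0],
                 [0.0, 1.0]])
C_xy = np.array([[1.0, 0.0]])   # observe position only

x = np.array([1.0, 1.0])        # start at position 1, velocity 1
ys = []
for _ in range(100):
    ys.append(float(C_xy @ x))
    x = A_xx @ x

print(ys[:5])                              # [1.0, 2.0, 3.0, 4.0, 5.0]
print(np.allclose(ys, np.arange(1, 101)))  # True: matches y = [1:100] exactly
```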

The simple story

For a simple, canonical signal like a moving edge, modelled with an AR model:

The pixel-based representation requires a high-dimensional state vector, and even then doesn’t work very well.

The position-based representation works perfectly with a 2-dimensional state vector.

Separating style and content with bilinear models

Bill Freeman, MIT AI Lab.

Josh Tenenbaum, MIT Dept. Brain and Cognitive Sciences

[Diagram: a content factor (character) and a style factor (font) are rendered into an observation, e.g. character #1 in Matura MT produces the letter "A". Content and style are not observed; only the rendered observation is observed. Synthesis maps factors to observations; analysis maps observations back to factors.]

Style and content example

Domain               Content        Style
typography           letter         font
face recognition     identity       head orientation
shape from shading   shape          lighting
color perception     object color   illum. color
speech recognition   words          speaker

Many perception problems have this two-factor structure.

Color constancy demo

How much of what we may consider to be (high-level) visual style can we account for by a simple, low-level statistical model?

Given: observations that are the result of two strongly interacting factors,

can we separately analyze or manipulate those two factors?

Perceptual tasks

Common form of observations: a grid of observations (A, B, C, D, E, F, G, H, I, ...) indexed by factor 1 along one axis and factor 2 along the other.

General case

Observations form a matrix indexed by style ("a" values) and content class ("b" values):

    f(a1,b1)  f(a1,b2)  f(a1,b3)  ...
    f(a2,b1)  f(a2,b2)  f(a2,b3)  ...
    ...       ...       ...

Account for observations by a rendering function f(a,b).

Asymmetric bilinear model

y^{sc} = f(A^s, b^c) = A^s b^c

where y^{sc} is the observation vector in style s and content c, A^s is a matrix for style s, and b^c is a vector for content element c.

Asymmetric bilinear model, with identity as the style factor.

Symmetric bilinear model

y^{sc}_k = f(a^s, b^c) = (a^s)^T W_k b^c

where y^{sc}_k is the k-th element of the observation vector in style s and content c, W_k is a matrix for element k of the observation vector, a^s is a vector for style s, and b^c is a vector for content element c.


Fitting model to training observations

Iterate SVDs (Magnus and Neudecker, 1988).

[Figure: the stacked observation matrix is factored by repeated SVDs.]
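The symmetric model is fit by iterating SVDs, as cited above. As a simpler, hedged illustration of the linear-algebra fit, here is a sketch of the single-SVD fit for the asymmetric model; the dimensions and variable names are made up for the example, not taken from the talk:

```python
import numpy as np

# Hypothetical training data: y[s, c] is the K-dim observation of content c in style s.
S, C, K, J = 5, 11, 30, 4           # styles, contents, observation dim, model dim (all assumed)
rng = np.random.default_rng(0)
y = rng.normal(size=(S, C, K))      # stand-in for real observations

# Stack styles vertically: row block s holds the observations of every content in style s.
Y = y.transpose(0, 2, 1).reshape(S * K, C)          # (S*K) x C

# One truncated SVD gives the asymmetric fit: Y ~= (stacked style matrices) @ (content vectors).
U, sv, Vt = np.linalg.svd(Y, full_matrices=False)
A_stacked = U[:, :J] * sv[:J]       # (S*K) x J, style matrices stacked vertically
B = Vt[:J, :]                       # J x C, one content vector b_c per column

A = A_stacked.reshape(S, K, J)      # A[s] is the K x J style matrix for style s
y_hat = np.einsum('skj,jc->sck', A, B)   # reconstruction: y_hat[s, c] = A[s] @ B[:, c]
print(np.linalg.norm(y_hat - y) / np.linalg.norm(y))   # relative reconstruction error
```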

[Figure comparing the two models on face data, with head pose as style and identity as content:
asymmetric model: y^{sc} = A^s b^c
symmetric model:  y^{sc}_k = (a^s)^T W_k b^c]

Related Work, bilinear models

Koenderink and Van Doorn, 1991, 1996

Tomasi and Kanade, 1992

Faugeras, 1993

Magnus and Neudecker, 1988

Marimont and Wandell, 1992

Turk and Pentland, 1991

Ullman and Basri, 1991

Murase and Nayar, 1995

Related Work, analyzing style

Hofstadter, 1995 and earlier papers.

Grebert et al, 1992

SIGGRAPH papers regarding controls for animation or line style. Typically hand-crafted, not learned.

Brand and Hertzmann, 2000

Hertzmann et al, 2001

Efros and Freeman, 2001

Procedure

(1) Fit a bilinear model to the training data of content elements observed across different styles, using linear algebra techniques.

(2) Use new data to find the parameters for a new, unknown style, or to classify new observations, or to generalize both style and content.

Task: Classification. Domain: vowel phonemes.

[Figure: a training set of utterances ("ah eh ou ...") arranged in a phoneme-by-speaker grid, plus utterances from a new speaker to be classified.]

Benchmark dataset

CMU machine learning repository

Training: 8 speakers saying 11 different vowel phonemes.

Testing: 7 new speakers

Data representation: LPC coefficients.

Classification using bilinear models

Use the EM (expectation-maximization) algorithm: build up a model of the new speaker’s style simultaneously with classification of the content.

y_observed = A_new-speaker b_phoneme

where y_observed is vowel data from a speaker in a new style, A_new-speaker is a matrix describing the unknown style of the new speaker, and b_phoneme are the previously learned vowel (content) descriptors.
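A minimal sketch of this adaptation loop under assumptions (isotropic Gaussian noise, previously learned content vectors B; function and variable names are illustrative, not the talk's implementation): the E-step soft-assigns each utterance to a phoneme, and the M-step re-estimates the new speaker's style matrix by weighted least squares.

```python
import numpy as np

def adapt_and_classify(Y_new, B, n_iters=20, sigma2=1.0):
    """Y_new: (N, K) utterances from the new speaker; B: (J, C) learned content vectors."""
    N, K = Y_new.shape
    J, C = B.shape
    A = np.zeros((K, J))                    # unknown style matrix for the new speaker
    resp = np.full((N, C), 1.0 / C)         # soft phoneme assignments

    for _ in range(n_iters):
        # M-step: weighted least squares for A given the soft assignments.
        W = resp.sum(axis=0)                          # (C,) total weight per phoneme
        X = (B * W) @ B.T                             # sum_c w_c b_c b_c^T, shape (J, J)
        T = Y_new.T @ resp @ B.T                      # sum_{n,c} r_nc y_n b_c^T, shape (K, J)
        A = T @ np.linalg.pinv(X)

        # E-step: responsibilities from squared reconstruction error under each phoneme.
        err = ((Y_new[:, None, :] - (A @ B).T[None, :, :]) ** 2).sum(axis=2)  # (N, C)
        logp = -0.5 * err / sigma2
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)

    return resp.argmax(axis=1), A           # hard phoneme labels and adapted style matrix
```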

Example problem for the Expectation-Maximization (EM) algorithm

“Find the probability that each point came from one of two random spatial processes.”

Estimate the underlying probability distributions; assign class-membership probabilities.

EM algorithm: alternate the E-step and the M-step.
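A compact sketch of that toy problem, assuming the two spatial processes are isotropic Gaussians (an assumption; the talk only shows it as a figure):

```python
import numpy as np

rng = np.random.default_rng(1)
# Points generated by two random spatial processes (assumed isotropic Gaussians).
pts = np.vstack([rng.normal([0.0, 0.0], 1.0, (100, 2)),
                 rng.normal([4.0, 4.0], 1.0, (100, 2))])

mu = pts[rng.choice(len(pts), 2, replace=False)]   # initial means
var = np.ones(2)                                   # per-process variance
pi = np.array([0.5, 0.5])                          # mixing proportions

for _ in range(50):
    # E-step: probability that each point came from each process.
    d2 = ((pts[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)       # (N, 2)
    log_r = np.log(pi) - d2 / (2 * var) - np.log(var)
    r = np.exp(log_r - log_r.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)

    # M-step: re-estimate each process from its softly assigned points.
    Nk = r.sum(axis=0)
    mu = (r.T @ pts) / Nk[:, None]
    d2 = ((pts[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    var = (r * d2).sum(axis=0) / (2 * Nk)          # 2-d data, so divide by 2*Nk
    pi = Nk / len(pts)

print(mu)   # recovered process centres, near (0, 0) and (4, 4)
```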

Classification results: performance comparison

Multi-layer perceptron:               51%
1-nearest neighbor (nn):              56%
Discrm. adapt. nn:                    62%
Bilinear model, data not grouped:     69%
Bilinear model, grouped by speaker:   76%

Task: Classification. Domain: faces and pose.

Nearest-neighbor matching: 53%

Bilinear model (estimate A^s while classifying b^c with EM): 74%

Face pose classification results

Given observations of a new face, what % of the poses can we identify correctly?

Task: Extrapolation. Domain: typography.

[Figure: training letters shown in the fonts Chicago, Zaph, Times, Mistral, Times Bold, and Monaco. The rest of the alphabet, used in training, is not shown.]

Coulomb warp representation

Describe each shape by the warp that a square of ink particles would have to undergo to form the shape.

Coulomb warping

[Figure: a reference shape and a target shape, with + charges and - charges defining the warp between them.]
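The transcript does not spell out the warp computation; as a hedged sketch of one way a Coulomb-style warp field could be realized (the function and details below are my illustration, not the authors' algorithm): each "ink particle" of the reference square is pulled toward charges placed on the target shape, with inverse-square weighting so nearby parts of the target dominate.

```python
import numpy as np

def coulomb_warp_field(reference_pts, target_pts, eps=1e-6):
    """Displacement field on the reference 'ink particles'.

    Each reference particle is attracted toward charges placed on the target shape,
    with inverse-square (Coulomb-like) weighting.  reference_pts: (N, 2), target_pts: (M, 2).
    """
    diff = target_pts[None, :, :] - reference_pts[:, None, :]       # (N, M, 2)
    r2 = (diff ** 2).sum(axis=2) + eps                               # squared distances
    w = 1.0 / r2                                                     # Coulomb-like weights
    # Weighted average direction toward the target charges for each particle.
    pull = (w[:, :, None] * diff / np.sqrt(r2)[:, :, None]).sum(axis=1)
    return pull / w.sum(axis=1, keepdims=True)                       # (N, 2) displacement field

# Example: particles on a unit square, charges on a circle (stand-in shapes).
ref = np.stack(np.meshgrid(np.linspace(0, 1, 10), np.linspace(0, 1, 10)), -1).reshape(-1, 2)
ang = np.linspace(0, 2 * np.pi, 50, endpoint=False)
tgt = 0.5 + 0.4 * np.stack([np.cos(ang), np.sin(ang)], axis=1)
warp = coulomb_warp_field(ref, tgt)
print(warp.shape)   # (100, 2): one displacement vector per reference particle
```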

Coulomb warp representation

[Figure: the average of shapes S1 and S2 computed in the pixel representation versus in the Coulomb warp representation.]

Basis functions for the asymmetric bilinear model

[Figure: the content vector b for the letter "C" is combined with the style matrices A_Chicago, A_Zaph, and A_Mistral (y = A^s b^c) to produce the letter in each font.]

Controlling complexity in calculating the style matrix for the new font

[Figure rows: asymmetric model using the symmetric model as a prior; asymmetric model (173,280 parameters to fit); symmetric model (5 parameters to fit).]

[Figure: synthetic versus actual Monaco (the true font).]

Results of extrapolation to a new style

[Figure: extrapolation results for Chicago, Zaph, Times, Mistral, Times Bold, and Monaco.]

Leave-one-out results

[Figure: leave-one-out results for Chicago, Zaph Chancery, Times, Mistral, Times Bold, and Monaco.]

Task: Translation. Domain: shape and lighting.

Factor 1: lighting. Factor 2: identity (face shape).

(1) Fit a symmetric bilinear model to the training data (pixel representation).
(2) Solve for the parameters describing the face and lighting of a new image.

[Figure: training and generalization examples.]
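A minimal sketch of how step (2) might be carried out, assuming the fitted W_k matrices are available: since the symmetric model is linear in each factor when the other is held fixed, the style vector a (lighting) and content vector b (face) can be recovered by alternating linear least-squares solves. Names and dimensions below are illustrative, not from the talk.

```python
import numpy as np

def solve_style_content(y, W, I, J, n_iters=50):
    """Given one new observation y (K,) and fitted tensor W (K, I, J) with
    y_k = a^T W_k b, recover the style vector a (I,) and content vector b (J,)
    by alternating least squares (the model is linear in each factor alone)."""
    rng = np.random.default_rng(0)
    a = rng.normal(size=I)
    b = rng.normal(size=J)
    for _ in range(n_iters):
        Mb = W @ b                              # (K, I): row k is (W_k b)^T, so y ~= Mb @ a
        a, *_ = np.linalg.lstsq(Mb, y, rcond=None)
        Ma = np.einsum('i,kij->kj', a, W)       # (K, J): row k is a^T W_k, so y ~= Ma @ b
        b, *_ = np.linalg.lstsq(Ma, y, rcond=None)
    return a, b

# Tiny check on synthetic data: the recovered factors should reproduce y.
K, I, J = 40, 3, 4
rng = np.random.default_rng(2)
W = rng.normal(size=(K, I, J))
a_true, b_true = rng.normal(size=I), rng.normal(size=J)
y = np.einsum('i,kij,j->k', a_true, W, b_true)
a_hat, b_hat = solve_style_content(y, W, I, J)
resid = np.einsum('i,kij,j->k', a_hat, W, b_hat) - y
print(np.linalg.norm(resid) / np.linalg.norm(y))   # small relative residual
```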

Translation Results

[Figure: Factor 1: lighting. Factor 2: identity (face shape).]

Conclusion: bilinear models are useful for translation, classification, and extrapolation perceptual tasks.

factor 1      factor 2        observation
letter #1     Matura MT       "A"
phoneme       speaker         "ahh"
pose 3        Hiro            [face image]
illuminant    surface color   eye cone responses

End. Extra pages follow.

The following slides are extras….

Style and content

Mention: an unsupervised version would be a good class project. Josh or I would be into working with someone on it.

Increase dimensionality to represent non-linearities

Say f(x) = p x^2 + q x + r. This parabola varies non-linearly with x, but is a linear function of the vector (x^2, x, 1). (Like “homogeneous coordinates” in graphics.)
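A minimal sketch of that lifting trick (illustrative code, not from the talk): with features (x^2, x, 1), an ordinary linear least-squares fit recovers the parabola's coefficients.

```python
import numpy as np

# Lift x to (x^2, x, 1): the parabola p*x^2 + q*x + r is linear in these features.
x = np.linspace(-2, 2, 50)
y = 3.0 * x**2 - 1.5 * x + 0.5                       # "observations" from a known parabola

Phi = np.column_stack([x**2, x, np.ones_like(x)])    # lifted, 3-d representation
coeffs, *_ = np.linalg.lstsq(Phi, y, rcond=None)     # ordinary linear least squares
print(coeffs)                                        # ~ [3.0, -1.5, 0.5] = (p, q, r)
```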

Fitting parabolas

[Figure: fits of the parabola with 1-d, 2-d, and 3-d models.]

Reconstruction from low-dimensional model

Eigenfaces for each pose

Factor 1: head pose. Factor 2: identity.

Task: Classification. Domain: faces and pose.

We build a bilinear model of how head pose and identity modify face appearance.

[Figure: basis images for each pose.]

Pose-dependent basis functions for face appearance.

One set of coefficients will reconstruct the same person in different poses.
