style/content separation

Style/Content separation

Evgeniy Bart, Dan Levi

April 13, 2003

Artistic styles: Photograph

Artistic styles: Impressionist

Artistic styles: Expressionist

Artistic styles: Pointillist

Photographic styles

*Pictures by Aya Aner-Wolf

Fonts

[Figure: the letters A B C D E rendered in several fonts, arranged in a grid with axes labeled Style and Content]

Faces

*Images from FERET database

Tasks: Extrapolation

• Extrapolation of familiar style to new content

Tasks: Extrapolation

• Extrapolation of familiar content to new style

Tasks: Translation

Task specification by analogy

A : A' :: B : ?

Image analogies, Hertzmann et al.

Wei & Levoy

Ashikhmin

Region growing

Somewhat similar to quilting

*Picture from presentation by Tal and Zeev

Region growing

Combining the two …

… hierarchically

Selecting best match

[Equations: $d_A$ and $d_{WL}$, the squared-norm distances of the Ashikhmin and Wei & Levoy candidates]

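Arbitrating

$d_A \le d_{WL}$ ?
• yes: use Ashikhmin value
• no: use Wei & Levoy value

Arbitrating

$d_A \le d_{WL} (1 + \kappa)$ ?
• yes: use Ashikhmin value
• no: use Wei & Levoy value

Arbitrating

$d_A \le d_{WL} (1 + c_l \kappa)$ ?, where $c_l$ is a power of 2 that depends on the pyramid level $l$
• yes: use Ashikhmin value
• no: use Wei & Levoy value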
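The arbitration rule can be sketched in a few lines. This is a minimal illustration, not the presenters' code: the function name `choose_match` and its arguments are hypothetical, and the level scaling `2 ** (level - finest_level)` is assumed from the Image Analogies paper of Hertzmann et al., so it may differ from the exact factor shown on the slide.

```python
def choose_match(d_a, d_wl, kappa=0.0, level=0, finest_level=0):
    """Pick between the Ashikhmin (coherence) and Wei & Levoy (approximate) candidates.

    d_a, d_wl : distances of the two candidates to the target neighborhood.
    kappa     : coherence parameter; larger values favor the Ashikhmin candidate.
    level, finest_level : pyramid levels (assumed convention: the finest level
                          has the largest index, so kappa is attenuated at
                          coarser levels).
    """
    scale = 2.0 ** (level - finest_level)  # assumed level-dependent factor
    if d_a <= d_wl * (1.0 + scale * kappa):
        return "ashikhmin"
    return "wei_levoy"


# With kappa = 0 the better-matching candidate simply wins.
plain = choose_match(1.2, 1.0)
# A positive kappa tolerates a somewhat worse Ashikhmin match.
biased = choose_match(1.2, 1.0, kappa=0.5)
# Two levels below the finest level the bias is four times weaker.
coarse = choose_match(1.2, 1.0, kappa=0.5, level=0, finest_level=2)
```

With `kappa = 0` the rule reduces to the first slide's plain comparison; increasing `kappa` trades match accuracy for coherence.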


Results – artistic filters

Super-resolution: Training 1

Super-resolution: Training 2

Super-resolution: Training 3

Super-resolution: Results

A lonely pine is standing
In the North where high winds blow.
He sleeps; and the whitest blanket
wraps him in ice and snow.

He dreams - dreams of a palm-tree
that far in an Orient land
Languishes, lonely and drooping,
Upon the burning sand.

H. Heine, translated by L. Untermeyer

Texture by numbers


Parameters

annEpsilon: [float] = 1.000000
ashLastLevel: [bool] = false
biasPenalty: [float] = 0.000000
cheesyBoundaries: [bool] = true
coherenceEps: [float] = 5.000000
coherencePow: [float] = 2.000000
createSrcLocHisto: [bool] = false
decayWeight: [double] = 0.000000
filterColorspace: [enum] = {Lab, Luv, RGB, XYZ}
filterMM: [string] = (none!)
filterModeMask: [string] = (none!)
filterProcedure: [enum] = {Copy, Synthesize}
filteredFeatureType: [enum] = {Difference, Raw}
filteredPyramidType: [enum] = {Gaussian, Laplacian}
finalSourceFac: [float] = -1.000000
gainPenalty: [float] = 0.000000
heurAnnEpsilon: [float] = 1.000000
heurMaxTSVQDepth: [int] = 7
histogramEq: [bool] = false
levelWeighting: [float] = 1.000000
matchBtoA: [bool] = false
matchGrayHistogram: [bool] = false
matchMeanVariance: [bool] = false
maxTSVQDepth: [int] = 20
maxTSVQError: [float] = 0.000000
modeMaskWeight: [float] = 0.010000
neighborhoodWidth: [int] = 5
numHiddenNeurons: [int] = 20
numLevels: [int] = 2
numPasses: [int] = 1
numTSVQBacktracks: [int] = 8
onePixelSource: [bool] = false
oneway: [bool] = false
pyramidHeight: [int] = 4
pyramidType: [enum] = {Gaussian, Laplacian, Steerable}
samplerEpsilon: [float] = 0.100000
searchMethod: [enum] = {ANN, Ash, HeurANN, HeurTSVQ, Image, MLP, TSVQ, TSVQR, Vector}
sourceColorspace: [enum] = {Lab, Luv, RGB, XYZ}
srcWeight: [float] = 1.000000
targetMM: [string] = (none!)
targetModeMask: [string] = (none!)
useBias: [bool] = false
useFilter: [bool] = true
useFilterModeMask: [bool] = false
useGain: [bool] = false
useInterface: [bool] = true
useRandomStart: [bool] = true
useSigmoidalDecay: [bool] = false
useSplineWeights: [bool] = true
useTargetModeMask: [bool] = false
useYIQ: [bool] = false

3D rotation


What went wrong?

There is some structure, but not simple correspondence

Need more knowledge about objects

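Rectangular parallelepipeds (cuboids)

Representation by 3D point coordinates: 8 vertices × 3 coordinates = a vector of 24 numbers

Linear classes, Vetter & Poggio

May combine linearly:

$(x_1, x_2, \dots, x_{24})^T + (y_1, y_2, \dots, y_{24})^T = (z_1, z_2, \dots, z_{24})^T$

Only 3 dimensions:

$\alpha_1$ [cuboid 1] + $\alpha_2$ [cuboid 2] + $\alpha_3$ [cuboid 3] = [new cuboid]

Call it linear class:

$X = \sum_{i=1}^{d} \alpha_i B_i$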

Linear operators

• Linear operator L:

$L(X) = \sum_{i=1}^{d} \alpha_i L(B_i)$ (here $d = 3$)

Example: rotation

If $X = \sum_{i=1}^{d} \alpha_i B_i$

Then (rotation) $X^r = \sum_{i=1}^{d} \alpha_i B_i^r$

Example: projection

If $X = \sum_{i=1}^{d} \alpha_i B_i$

Then (projection) $x = \sum_{i=1}^{d} \alpha_i b_i$

Example: projection + rotation

If $X = \sum_{i=1}^{d} \alpha_i B_i$

Then (rotation + projection) $x^r = \sum_{i=1}^{d} \alpha_i b_i^r$

But also $x = \sum_{i=1}^{d} \alpha_i b_i$

We may work entirely in 2D domain!
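A small numerical check of this argument, as a sketch under stated assumptions: cuboids are origin-centered, projection is orthographic, and both views are generic (not axis-aligned). The helper names (`cuboid`, `rot_x`, `rot_y`, `view`) and all numeric values are illustrative and not from the slides. The coefficients are estimated from one 2D view only and then reused to predict a second 2D view.

```python
import numpy as np

def cuboid(w, h, d):
    """Return the 8 vertices of an origin-centered cuboid, shape (8, 3)."""
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
                     dtype=float)
    return 0.5 * signs * np.array([w, h, d])

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def view(points, R):
    """Rotate by R, project orthographically (drop z), flatten to 16 numbers."""
    return (points @ R.T)[:, :2].ravel()

# Three basis cuboids B_i and a novel cuboid X from the same linear class.
basis = [cuboid(1, 2, 3), cuboid(2, 1, 1), cuboid(3, 3, 1)]
novel = cuboid(2.5, 0.7, 1.8)

# Two generic views: the known view and the rotated view.
R_a = rot_x(0.35) @ rot_y(0.5)
R_b = rot_x(-0.2) @ rot_y(1.1)

# Estimate alpha from 2D data only: x = sum_i alpha_i b_i in view a.
b_a = np.stack([view(B, R_a) for B in basis], axis=1)   # (16, 3)
x_a = view(novel, R_a)
alpha, *_ = np.linalg.lstsq(b_a, x_a, rcond=None)

# Reuse the same alpha in view b: x^r = sum_i alpha_i b_i^r.
b_b = np.stack([view(B, R_b) for B in basis], axis=1)
x_b_pred = b_b @ alpha
x_b_true = view(novel, R_b)
```

The known view must be generic: in an exactly frontal view the depth of the cuboid is not observed, so the coefficients would not be uniquely determined.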

Working in 2D

Results

Can we apply this idea to faces?

• Linear class assumption:
  • Object may be represented as linear combination of other (similar) objects
• Use raw images as basis
• Reconstruction quality will be poor
• PCA reconstruction is much better
• Can we use PCA?

Using linear classes

• Eigenfaces do not correspond
• Cannot use the same coefficients

So far: Solving Specific Tasks (Image analogies), Linear Classes

Goal: General Style/Content Framework

• Learn Interaction
• Generalize To New Examples

Content vectors: $b^1, \dots, b^C$
Style vectors: $a^1, \dots, a^S$

$y^{sc} = F(a^s, b^c)$

Motivation: Linear Models

• Face images form a linear subspace: eigenfaces (Turk, Pentland '91)
• Illumination variations (of same face) can be modeled by a low-dimensional linear space (Hallinan '94)

[Figure: eigenvector basis; reconstructed faces]

Model: Linear In Style And In Content

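Bilinear Models (Tenenbaum, Freeman 2000, 97)

$f : U \times V \to W$ is bilinear if, for $x \in U$, $y \in V$:

Linear in x:
$f(x_1 + x_2, y) = f(x_1, y) + f(x_2, y)$
$f(\lambda x, y) = \lambda f(x, y)$

Linear in y:
$f(x, y_1 + y_2) = f(x, y_1) + f(x, y_2)$
$f(x, \lambda y) = \lambda f(x, y)$

y constant: $f_y : U \to W$ is linear

Bilinear Forms: Example
$x, y \in \mathbb{R}$, $f(x, y) = xy$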

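Bilinear Forms To Model Style And Content

2 Models: Symmetric and Asymmetric

Symmetric Bilinear Model

$y^{sc} = F(a^s, b^c)$

$y^{sc}_k = (a^s)^T W_k \, b^c$

• $y^{sc}$: image in style s and content c (K pixels)
• $a^s$: style vector (dimension I)
• $b^c$: content vector (dimension J)
• $W_k$: I × J interaction matrix, one per pixel k

Grouping $(a^s)^T W_k$ into a pixel style vector $a^s_k$ (dimension J) gives

$y^{sc}_k = a^s_k \cdot b^c$, that is, $y^{sc} = F(a^s, b^c) = A^s b^c$ with style matrix $A^s$ (K × J)

Written out in components:

$y^{sc}_k = (a^s)^T W_k \, b^c = \sum_{i,j} a^s_i \, b^c_j \, W_{ijk}$

$w_{ij}$, $i = 1..I$, $j = 1..J$: basis vectors

$y^{sc} = \sum_{i,j} a^s_i \, b^c_j \, w_{ij}$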
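The symmetric model and its relation to the style-matrix form can be checked numerically. The following is a minimal sketch with random, made-up values; the dimensions and variable names are illustrative assumptions, not data from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 3, 4, 6                    # style dim, content dim, number of pixels
W = rng.normal(size=(I, J, K))       # interaction tensor, W[:, :, k] is W_k
a_s = rng.normal(size=I)             # style vector a^s
b_c = rng.normal(size=J)             # content vector b^c

# Symmetric model: y_k = sum_{i,j} a_i b_j W_ijk
y_sym = np.einsum("i,j,ijk->k", a_s, b_c, W)

# The same thing pixel by pixel: y_k = a^T W_k b
y_loop = np.array([a_s @ W[:, :, k] @ b_c for k in range(K)])

# Folding the style vector into W gives the K x J style matrix A^s,
# whose k-th row is the pixel style vector a_k^s = W_k^T a^s.
A_s = np.einsum("i,ijk->kj", a_s, W)
y_asym = A_s @ b_c
```

All three ways of computing the image agree, which is the sense in which the asymmetric form is the symmetric model with the style factor absorbed into a matrix.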

Toy Example: Symmetric Model

• Style: Color, Content: Shape
• Images: (0,0,1,1) (2,2,2,0) (0,3,0,3)
• Content $b^c$: (0,0,1,1) (1,0,1,0) (0,1,1,1)
• Style $a^s$: 1, 2, 3
• Basis vectors $w_{ij}$

$y^{sc} = \sum_{i,j} a^s_i \, b^c_j \, w_{ij}$

Face Example - Symmetric
Style: Pose, Content: Person

Asymmetric Bilinear Model

$y^{sc}_k = (a^s)^T W_k \, b^c = a^s_k \cdot b^c$, where $a^s_k = W_k^T a^s$ is the pixel style vector (dimension J)

$y^{sc} = A^s b^c$

• $A^s$: style matrix (K × J), with rows $a^s_k$
• $b^c$: content vector (dimension J)
• $A^s$: Content → Images

$y^{sc} = A^s b^c = \sum_{j=1}^{J} b^c_j \, a^s_j$

A style-specific basis $a^s_1, \dots, a^s_J$ (the columns of $A^s$), mixed by content coefficients

Toy Example: Asymmetric Model

• Content $b^c$: (0,0,1,1) (1,0,1,0) (0,1,1,1)
• Style: basis vectors $a^s_j$

$y^{sc} = \sum_j b^c_j \, a^s_j$

Face Example - Asymmetric
Style: Pose, Content: Person

Training

Style Content Image Matrix:

[Figure: grid of training images with axes labeled Person and Illumination]

Training – Model Fitting

• Problem: Given $\{y^{sc}(t)\}_{t = 1..T}$, find model parameters
• Error Minimization:

Asymmetric:
• Closed-form SVD solution or Quasi-Newton methods: $A^s$, $b^c$
• Free parameter: J, the content vector dimension

Symmetric:
• Iterative solution using SVD: $a^s$, $W_k$, $b^c$
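The closed-form fit of the asymmetric model can be sketched as a truncated SVD of the stacked image matrix. This is an illustration under assumptions: one (mean) image per style/content pair, images stacked so that the K rows of style s occupy rows s·K to (s+1)·K − 1, and a hypothetical function name `fit_asymmetric`. The test data is synthetic.

```python
import numpy as np

def fit_asymmetric(Y, S, K, J):
    """Fit y^{sc} = A^s b^c by truncated SVD.

    Y : (S*K, C) matrix; column c stacks the images of content c in all S styles.
    Returns A with shape (S, K, J) and B with shape (J, C).
    """
    U, sing, Vt = np.linalg.svd(Y, full_matrices=False)
    A = (U[:, :J] * sing[:J]).reshape(S, K, J)   # style matrices A^s
    B = Vt[:J]                                   # content vectors b^c as columns
    return A, B


# Synthetic check: data generated exactly by a J-dimensional asymmetric model.
rng = np.random.default_rng(1)
S, K, C, J = 3, 10, 6, 2
A_true = rng.normal(size=(S, K, J))
B_true = rng.normal(size=(J, C))
Y = np.vstack([A_true[s] @ B_true for s in range(S)])   # (S*K, C)

A_fit, B_fit = fit_asymmetric(Y, S, K, J)
Y_fit = np.vstack([A_fit[s] @ B_fit for s in range(S)])
```

The factors are only determined up to an invertible J × J transform, so `A_fit` and `B_fit` need not equal the generating factors even though their product reproduces the data.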

Translation - Faces
Content: person, Style: illumination

Asymmetric model cannot handle translation!

Translation – Symmetric Model

• Problem: C = 23 (faces), S = 3 (illuminations)
• Training: Fit a symmetric model using iterative SVD with I = S, J = C: $a^s$, $W_k$, $b^c$
• Generalization: find $a^{s'}$, $b^{c'}$ that minimize
  $E^* = \sum_k | y^{s'c'}_k - (a^{s'})^T W_k \, b^{c'} |^2$
• Alternating Iterative Linear Solution
• Translation: Produce $(a^{s'})^T W_k \, b^c$ and $(a^s)^T W_k \, b^{c'}$ for each s and c.
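The alternating linear solution for a new style and new content can be sketched as follows. This is a minimal illustration, assuming the interaction tensor W is already learned and a single new image y is given; the function name `fit_style_content`, the random initialization, and the synthetic data are assumptions for the example.

```python
import numpy as np

def fit_style_content(y, W, n_iter=50, seed=0):
    """Alternate least-squares solves for the style vector a and content vector b.

    y : (K,) observed image in an unseen style and unseen content.
    W : (I, J, K) learned interaction tensor.
    Returns a, b and the squared error after each iteration.
    """
    rng = np.random.default_rng(seed)
    I, J, K = W.shape
    a = rng.normal(size=I)
    b = rng.normal(size=J)
    errors = []
    for _ in range(n_iter):
        # With b fixed, y_k = sum_i a_i (W_k b)_i is linear in a.
        M_a = np.einsum("ijk,j->ki", W, b)            # (K, I)
        a, *_ = np.linalg.lstsq(M_a, y, rcond=None)
        # With a fixed, y_k = sum_j (a^T W_k)_j b_j is linear in b.
        M_b = np.einsum("i,ijk->kj", a, W)            # (K, J)
        b, *_ = np.linalg.lstsq(M_b, y, rcond=None)
        errors.append(float(np.sum((y - M_b @ b) ** 2)))
    return a, b, errors


# Synthetic example: an image generated by the model from unknown factors.
rng = np.random.default_rng(2)
I, J, K = 3, 4, 30
W = rng.normal(size=(I, J, K))
y_new = np.einsum("i,j,ijk->k", rng.normal(size=I), rng.normal(size=J), W)

a_new, b_new, errors = fit_style_content(y_new, W)

# Translation: render a known content vector in the newly estimated style.
b_known = rng.normal(size=J)
y_translated = np.einsum("i,j,ijk->k", a_new, b_known, W)
```

Each half-step is an exact least-squares solve, so the error never increases, but the procedure may stop in a local minimum, and a and b are only determined up to a reciprocal scale factor.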

Translation - Results

Extrapolation
Content: Letter, Style: Font

• Main Problem: Image Representation
• Linear combinations of letters should look like a letter.

[Figure: letter A with displacement vectors; warp map]

Coulomb Warp Map

• For unique mapping: physical model of electrostatic forces.
• Linear combination of letters looks like a letter

Extrapolation Scheme

• Fit an asymmetric bilinear model
  • S = 5 training fonts (styles), C = 62 characters (content), K = 2888 data dimensions
  • Closed-form SVD: $A^s$, $b^c$
• Given letters $C = \{c_1, \dots, c_M\}$ in a new style s', find the best fitting $A^{s'}$
• Minimize: $E^* = \sum_c \| y^{s'c} - A^{s'} b^c \|^2$, $\partial E^* / \partial A^{s'} = 0$
• Set J high (~60)
• Overfitting on test data (J × K = 60 × 2888 = 173,280 degrees of freedom!)

Constraint: Close To Symmetric

• $A_{OLC} = \sum_s \alpha_s A^s$ (Optimal Linear Combination)
• $\alpha_s$: style parameters (symmetric)
• Minimize: $E^* = \sum_c \| y^{s'c} - A^{s'} b^c \|^2 + \lambda \| A^{s'} - A_{OLC} \|^2$
• $\partial E^* / \partial A^{s'} = 0$
• Extrapolate missing letters by: $y^{s'c} = A^{s'} b^c$
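Setting the derivative to zero gives a closed-form solution, sketched below. The sketch assumes squared (Frobenius) norms in E*, which makes the stationarity condition linear: $A^{s'} (B B^T + \lambda I) = Y B^T + \lambda A_{OLC}$, where the columns of Y are the observed letters and the columns of B their content vectors. The function name `fit_new_style` and the test data are hypothetical.

```python
import numpy as np

def fit_new_style(Y, B, A_olc, lam):
    """Regularized least-squares estimate of the new style matrix.

    Y     : (K, M) observed letters in the new style, one per column.
    B     : (J, M) content vectors of those letters.
    A_olc : (K, J) prior style matrix (optimal linear combination).
    lam   : regularization weight lambda > 0.
    """
    J = B.shape[0]
    lhs = B @ B.T + lam * np.eye(J)      # symmetric positive definite
    rhs = Y @ B.T + lam * A_olc
    # Solve A @ lhs = rhs; since lhs is symmetric this is lhs @ A.T = rhs.T.
    return np.linalg.solve(lhs, rhs.T).T


# Synthetic example with fewer observed letters (M) than content dimensions (J),
# where the unregularized problem would be underdetermined.
rng = np.random.default_rng(3)
K, J, M = 8, 5, 3
Y = rng.normal(size=(K, M))
B = rng.normal(size=(J, M))
A_olc = rng.normal(size=(K, J))
lam = 0.1

A_new = fit_new_style(Y, B, A_olc, lam)
# Gradient of E* at the solution; it should vanish.
grad = -2 * (Y - A_new @ B) @ B.T + 2 * lam * (A_new - A_olc)
# A very large lambda pulls the solution onto the prior.
A_prior_dominated = fit_new_style(Y, B, A_olc, 1e9)
```

The regularizer is what makes the system solvable when only a few letters of the new font are available.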

Results

Symmetric vs. Asymmetric

Symmetric:
• Can reduce dimensionality of factors
• Learns the structure of factor interactions: handles translation

Asymmetric:
• More flexible
• Too flexible: overfitting
• Cannot handle translation

Can be overcome by combining both

Bilinear Models (Tenenbaum, Freeman 2000, 97)

Pros:
• General framework for two-factor problems
• Explicit parameterized representations of each factor and their interaction
• Natural generalization for extrapolation and translation tasks
• Fast algorithms (SVD)

Cons:
• Assumes linearity in each factor
  • Find clever input representations
  • Decompose to sub-problems

Example-Based Style Synthesis (Ido Drori, Hezi Yeshurun, Daniel Cohen-Or '03)

Algorithm Outline

1. Divide image into overlapping tiles
2. Find best match in each scene
3. Synthesize tiles by bilinear model
4. Image quilting

Algorithm Outline

1. Decompose image into tiles
4. Image quilting
5. Image analogies

• Create Gaussian pyramids for examples and input images
• Apply algorithm to each level, from coarse to fine

Finding Best Matching Fragment

• Similar geometry
• Agreeing boundaries

$V_{search}$ = ( , , , , , ): a vector of six image components, labeled Luminance, Gradient and Laplacian

• For each training scene: create V in every position and orientation
• Search for nearest neighbor
