Style/Content separation
Evgeniy Bart, Dan Levi
April 13, 2003
Artistic styles: Photograph
Artistic styles: Impressionist
Artistic styles: Expressionist
Artistic styles: Pointillist
Photographic styles
*Pictures by Aya Aner-Wolf
Fonts
[Figure: the letters A B C D E rendered in several fonts; axes labeled Style and Content]
Faces
*Images from FERET database
Tasks: Extrapolation
• Extrapolation of familiar style to new content
Tasks: Extrapolation
• Extrapolation of familiar content to new style
Tasks: Translation
Task specification by analogy
[Figure: image : image :: image : ?]
Image analogies, Hertzmann et al.
Wei & Levoy
Ashikhmin
Region growing
Somewhat similar to quilting
*Picture from presentation by Tal and Zeev
Combining the two … hierarchically
[Figures: analogy examples at successive levels]
Selecting best match
[Formula: distances d_A (Ashikhmin candidate) and d_WL (Wei & Levoy candidate)]
Arbitrating
[Decision diagram: test on d_A and d_WL. Yes: use Wei&Levoy value. No: use Ashikhmin value.]
Arbitrating
[Decision diagram: same test, with a factor (1 + …) applied to one of the distances.]
Arbitrating
[Decision diagram: same test, with the factor made dependent on the level l through a power of 2.]
Results – artistic filters
[Figures: three analogy examples]
Super-resolution
[Figures: Training 1, Training 2, Training 3]
Super-resolution: Results
[Figures: two result examples]
A lonely pine is standing In the North where high winds blow. He sleeps; and the whitest blanket wraps him in ice and snow.
He dreams - dreams of a palm-tree that far in an Orient land Languishes, lonely and drooping, Upon the burning sand.
H. Heine, translated by L. Untermeyer
Texture by numbers
[Figure: analogy example]
Parameters
annEpsilon: [float] = 1.000000
ashLastLevel: [bool] = false
biasPenalty: [float] = 0.000000
cheesyBoundaries: [bool] = true
coherenceEps: [float] = 5.000000
coherencePow: [float] = 2.000000
createSrcLocHisto: [bool] = false
decayWeight: [double] = 0.000000
filterColorspace: [enum] = {Lab, Luv, RGB, XYZ}
filterMM: [string] = (none!)
filterModeMask: [string] = (none!)
filterProcedure: [enum] = {Copy, Synthesize}
filteredFeatureType: [enum] = {Difference, Raw}
filteredPyramidType: [enum] = {Gaussian, Laplacian}
finalSourceFac: [float] = -1.000000
gainPenalty: [float] = 0.000000
heurAnnEpsilon: [float] = 1.000000
heurMaxTSVQDepth: [int] = 7
histogramEq: [bool] = false
levelWeighting: [float] = 1.000000
matchBtoA: [bool] = false
matchGrayHistogram: [bool] = false
matchMeanVariance: [bool] = false
maxTSVQDepth: [int] = 20
maxTSVQError: [float] = 0.000000
modeMaskWeight: [float] = 0.010000
neighborhoodWidth: [int] = 5
pyramidType: [enum] = {Gaussian, Laplacian, Steerable}
samplerEpsilon: [float] = 0.100000
searchMethod: [enum] = {ANN, Ash, HeurANN, HeurTSVQ, Image, MLP, TSVQ, TSVQR, Vector}
sourceColorspace: [enum] = {Lab, Luv, RGB, XYZ}
srcWeight: [float] = 1.000000
targetMM: [string] = (none!)
targetModeMask: [string] = (none!)
useBias: [bool] = false
useFilter: [bool] = true
useFilterModeMask: [bool] = false
useGain: [bool] = false
useInterface: [bool] = true
useRandomStart: [bool] = true
useSigmoidalDecay: [bool] = false
useSplineWeights: [bool] = true
useTargetModeMask: [bool] = false
useYIQ: [bool] = false
numHiddenNeurons: [int] = 20
numLevels: [int] = 2
numPasses: [int] = 1
numTSVQBacktracks: [int] = 8
onePixelSource: [bool] = false
oneway: [bool] = false
pyramidHeight: [int] = 4
3D rotation
[Figures: five image-analogy examples of 3D rotation]
What went wrong?
There is some structure, but not simple correspondence
Need more knowledge about objects
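Rectangular parallelepipeds (cuboids)
Representation by 3D point coordinates (a vector of 24 coordinates)
Linear classes, Vetter & Poggio
May combine linearly
$$(x_1, x_2, \dots, x_{24})^T + (y_1, y_2, \dots, y_{24})^T = (z_1, z_2, \dots, z_{24})^T$$
Only 3 dimensions
$$\alpha_1 B_1 + \alpha_2 B_2 + \alpha_3 B_3 = X$$
Call it linear class
$$X = \sum_{i=1}^{d} \alpha_i B_i$$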
Linear operators
($d = 3$)
• Linear operator L
$$L X = \sum_{i=1}^{d} \alpha_i L(B_i)$$
Example: rotation
$$\alpha_1 B_1 + \alpha_2 B_2 + \alpha_3 B_3 = X$$
If $X = \sum_{i=1}^{d} \alpha_i B_i$
Then (rotation) $X^r = \sum_{i=1}^{d} \alpha_i B_i^r$
Example: projection
$$\alpha_1 B_1 + \alpha_2 B_2 + \alpha_3 B_3 = X$$
If $X = \sum_{i=1}^{d} \alpha_i B_i$
Then (projection) $x = \sum_{i=1}^{d} \alpha_i b_i$
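Example: projection + rotation
$$\alpha_1 B_1 + \alpha_2 B_2 + \alpha_3 B_3 = X$$
If $X = \sum_{i=1}^{d} \alpha_i B_i$
Then (rotation + projection) $x^r = \sum_{i=1}^{d} \alpha_i b_i^r$
But also $x = \sum_{i=1}^{d} \alpha_i b_i$
We may work entirely in 2D domain!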
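The claim above can be checked numerically. The following is a minimal sketch, assuming orthographic projection, rotation about the y-axis, and three hand-picked basis cuboids (all illustrative choices, not taken from the slides). The coefficients are estimated from one 2D view only and then reused on the 2D basis views in the new pose.

```python
import numpy as np

def cuboid(w, h, d):
    """Return the 8 vertices (8 x 3) of a cuboid centered at the origin."""
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    return 0.5 * signs * np.array([w, h, d])

def view(points, angle):
    """Rotate about the y-axis by `angle`, then drop z (orthographic projection)."""
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return (points @ R.T)[:, :2].ravel()  # 16-vector of 2D coordinates

# Hypothetical basis objects B_i and a novel object X from the same linear class.
basis_3d = [cuboid(1, 2, 3), cuboid(2, 1, 1), cuboid(1, 1, 4)]
novel_3d = cuboid(3, 2, 5)

theta_in, theta_out = 0.3, 1.0  # input pose and target pose (radians)

# 2D views b_i of the basis in both poses; columns are basis views.
b_in = np.stack([view(B, theta_in) for B in basis_3d], axis=1)    # 16 x 3
b_out = np.stack([view(B, theta_out) for B in basis_3d], axis=1)  # 16 x 3

# Estimate alpha from the single 2D view of the novel object.
x_in = view(novel_3d, theta_in)
alpha, *_ = np.linalg.lstsq(b_in, x_in, rcond=None)

# Predict the rotated 2D view without ever touching 3D coordinates.
x_out_pred = b_out @ alpha
x_out_true = view(novel_3d, theta_out)
print(np.allclose(x_out_pred, x_out_true))
```

The input pose is deliberately not frontal: in a frontal orthographic view the depth of a cuboid does not appear, so the coefficients would not be uniquely determined.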
Working in 2D
[Figure: $\alpha_1 b_1 + \alpha_2 b_2 + \alpha_3 b_3$ applied to 2D views]
Results
[Figures: two result examples]
Can we apply this idea to faces?
• Linear class assumption:
• Object may be represented as linear combination of other (similar) objects
[Figure: $\alpha_1 B_1 + \alpha_2 B_2 + \alpha_3 B_3 = X$ with face images]
• Use raw images as basis
• Reconstruction quality will be poor
• PCA reconstruction is much better
• Can we use PCA?
Using linear classes
[Figures: $\alpha_1 B_1 + \alpha_2 B_2 + \alpha_3 B_3$ with eigenfaces]
• Eigenfaces do not correspond
• Cannot use the same coefficients
So far: Solving Specific Tasks (Image analogies), Linear Classes
Goal: General Style/Content Framework
Learn Interaction, Generalize To New Examples
Goal: General Style/Content Framework
Style vectors $a^1, \dots, a^S$; content vectors $b^1, \dots, b^C$
$$y^{sc} = F(a^s, b^c)$$
Motivation : Linear Models
• Face images form a linear subspace
• Illumination variations (of the same face) can be modeled by a low-dimensional linear space (Hallinan '94)
Eigenvector Basis
Reconstruct
faces
Model: Linear In Style And In Content
eigenfaces (Turk , Pentland ‘91)
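Bilinear Models (Tenenbaum, Freeman 2000, 97)
$f : U \times V \to W$ is bilinear if, for $x \in U$, $y \in V$:
Linear in x
$$f(x_1 + x_2, y) = f(x_1, y) + f(x_2, y), \qquad f(\lambda x, y) = \lambda f(x, y)$$
Linear in y
$$f(x, y_1 + y_2) = f(x, y_1) + f(x, y_2), \qquad f(x, \lambda y) = \lambda f(x, y)$$
y constant: $f_y = f(\cdot, y) : U \to W$ is linear
Bilinear Forms: Example
$x, y \in \mathbb{R}$, $f(x, y) = xy$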
Bilinear Forms To Model Style And Content
2 Models: Symmetric and Asymmetric
Symmetric Bilinear Model
Image in style s, content c: $y^{sc} = F(a^s, b^c)$ (dimension K)
Style vector $a^s$ (dimension I), content vector $b^c$ (dimension J), interaction matrix $W_k$ (I × J)
$$y_k^{sc} = a^{sT} W_k b^c$$
Symmetric Bilinear Model
Pixel style vector $a_k^s = a^{sT} W_k$ (dimension J), so $y_k^{sc} = a_k^s b^c$
Style matrix $A^s$ (K × J): $y^{sc} = A^s b^c$
Symmetric Bilinear Model
$$y_k^{sc} = a^{sT} W_k b^c = \sum_{i,j} a_i^s b_j^c W_{ijk}$$
$w_{ij}$, $i = 1..I$, $j = 1..J$: basis vectors
$$y^{sc} = \sum_{i,j} a_i^s b_j^c w_{ij}$$
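The symmetric model can be written as a single tensor contraction. The following is a minimal sketch with random parameters; the sizes I, J, K and the array names are illustrative assumptions. It also forms the style matrix obtained by contracting the style vector with the interaction tensor, and confirms that both routes give the same image.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 3, 4, 10               # style dim, content dim, image dim (assumed sizes)

W = rng.normal(size=(I, J, K))   # interaction tensor: W[:, :, k] plays the role of W_k
a_s = rng.normal(size=I)         # style vector a^s
b_c = rng.normal(size=J)         # content vector b^c

# y_k = sum_{i,j} a_i b_j W_ijk for every pixel k at once.
y_sc = np.einsum("i,ijk,j->k", a_s, W, b_c)

# Absorb the style vector into W to get a style-specific K x J matrix A^s.
A_s = np.einsum("i,ijk->kj", a_s, W)
y_via_A = A_s @ b_c

print(np.allclose(y_sc, y_via_A))
```

Scaling either factor scales the image by the same amount, which is the bilinearity property stated earlier.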
Toy Example: Symmetric Model
• Images:
$b^c$
• Style: Color, Content: Shape
(0,0,1,1) (2,2,2,0) (0,3,0,3)
Content:
(0,0,1,1) (1,0,1,0) (0,1,1,1)
Style:
1 2 3
$a^s$
$w_{ij}$
$$y^{sc} = \sum_{i,j} a_i^s b_j^c w_{ij}$$
Face Example - Symmetric
Style: Pose, Content: Person
Asymmetric Bilinear Model
Image in style s, content c: $y^{sc}$ (dimension K)
Style vector $a^s$ (dimension I), content vector $b^c$ (dimension J), interaction matrix $W_k$ (I × J)
$$y_k^{sc} = a^{sT} W_k b^c$$
Asymmetric Bilinear Model
Pixel style vector $a_k^s = a^{sT} W_k$ (dimension J), so $y_k^{sc} = a_k^s b^c$
Style matrix $A^s$ (K × J): $y^{sc} = A^s b^c$
$A^s$: Content → Images
Asymmetric Bilinear Model
$$y^{sc} = A^s b^c$$
Asymmetric Bilinear Model
$$y^{sc} = A^s b^c = \sum_{j=1}^{J} b_j^c a_j^s$$
A style-specific basis $a_1^s, \dots, a_J^s$, mixed by content coefficients
Toy Example: Asymmetric Model
Content $b^c$: (0,0,1,1) (1,0,1,0) (0,1,1,1)
Style: $a_j^s$
$$y^{sc} = \sum_j b_j^c a_j^s$$
Face Example - Asymmetric
Style: Pose, Content: Person
Training
Style Content Image Matrix (axes: Person, Illumination)
Training – Model Fitting
• Problem: Given $\{y^{sc}(t)\}$, $t = 1..T$, find model parameters
• Error Minimization:
Asymmetric:
• Closed-form SVD solution or Quasi-Newton methods
• $A^s$, $b^c$
• Free parameter: J, the content vector dimension.
Symmetric:
• Iterative solution using SVD
• $a^s$, $W_k$, $b^c$
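The closed-form fit of the asymmetric model can be sketched as a truncated SVD of the stacked observation matrix. This is a minimal sketch under assumptions not stated on the slides: one noise-free image per style/content pair, styles stacked vertically into an (S·K) × C matrix, and synthetic data. The function name `fit_asymmetric` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
S, C, K, J = 3, 6, 20, 2          # styles, contents, image dim, content-vector dim

# Synthetic ground truth, used only to generate data for the sketch.
A_true = rng.normal(size=(S, K, J))
B_true = rng.normal(size=(J, C))

# Stack the images of each style vertically: Y has shape (S*K, C).
Y = np.concatenate([A_true[s] @ B_true for s in range(S)], axis=0)

def fit_asymmetric(Y, S, J):
    """Rank-J factorization Y ~ A B; A holds the stacked style matrices A^s."""
    U, sing, Vt = np.linalg.svd(Y, full_matrices=False)
    A = (U[:, :J] * sing[:J]).reshape(S, -1, J)  # one K x J matrix per style
    B = Vt[:J]                                   # J x C, columns are b^c
    return A, B

A_fit, B_fit = fit_asymmetric(Y, S, J)
Y_fit = np.concatenate([A_fit[s] @ B_fit for s in range(S)], axis=0)
print(np.allclose(Y_fit, Y))
```

The recovered factors match the true ones only up to an invertible J × J transformation, so the check compares reconstructed images rather than parameters.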
Translation - Faces
Content: person
Style: illumination
Translation – Symmetric Model
Asymmetric Model Cannot handle translation!
• Problem: C = 23 (faces), S = 3 (illuminations)
• Training: Fit a symmetric model using iterative SVD with I = S, J = C, giving $a^s$, $W_k$, $b^c$
• Generalization: find $a^{s'}$, $b^{c'}$ that minimize
$$E^* = \sum_k \left| y_k^{s'c'} - a^{s'T} W_k b^{c'} \right|^2$$
• Alternating Iterative Linear Solution
• Translation: Produce $a^{s'T} W_k b^c$ and $a^{sT} W_k b^{c'}$ for each s and c.
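The alternating linear solution can be sketched as follows: with the interaction tensor held fixed, the image is linear in the style vector when the content vector is fixed, and vice versa, so each half-step is an ordinary least-squares problem. This is a minimal sketch with a random tensor, random initialization, and 50 sweeps, all of which are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
I, J, K = 3, 4, 50
W = rng.normal(size=(I, J, K))        # stands in for the trained W_k

# A single observed image in an unknown style s' and unknown content c'.
y = np.einsum("i,ijk,j->k", rng.normal(size=I), W, rng.normal(size=J))

def residual(a, b):
    """Squared error E* between the observation and the model prediction."""
    return np.sum((y - np.einsum("i,ijk,j->k", a, W, b)) ** 2)

a = rng.normal(size=I)                # initial guess for a^{s'}
b = rng.normal(size=J)                # initial guess for b^{c'}
errors = [residual(a, b)]

for _ in range(50):
    # Fix b: y = M_b a with M_b[k, i] = sum_j W[i, j, k] b[j].
    M_b = np.einsum("ijk,j->ki", W, b)
    a, *_ = np.linalg.lstsq(M_b, y, rcond=None)
    # Fix a: y = M_a b with M_a[k, j] = sum_i a[i] W[i, j, k].
    M_a = np.einsum("i,ijk->kj", a, W)
    b, *_ = np.linalg.lstsq(M_a, y, rcond=None)
    errors.append(residual(a, b))

print(errors[0], errors[-1])
```

Each half-step cannot increase the error, but the procedure may stop at a local minimum. The pair is also determined only up to a reciprocal scale (scaling a up and b down by the same factor leaves the image unchanged), so some normalization is needed before mixing a recovered vector with trained ones.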
Translation - Results
Extrapolation
Content: Letter, Style: Font
• Main Problem: Image Representation
• Linear combinations of letters should look like a letter.
Warp Map
[Figure: letter A with a displacement vector]
Coulomb Warp Map
• For unique mapping: physical model of electrostatic forces.
• Linear combination of letters looks like a letter
Extrapolation Scheme
• Fit an asymmetric bilinear model
• S = 5 training fonts (styles), C = 62 characters (content), K = 2888 data dimensions
• Closed-form SVD gives $A^s$, $b^c$
• Given letters $C = \{c_1, \dots, c_M\}$ in a new style s', find the best-fitting $A^{s'}$
• Minimize: $E^* = \sum_c \| y^{s'c} - A^{s'} b^c \|^2$, with $\partial E^* / \partial A^{s'} = 0$
• Set J high (~60)
• Overfitting on test data (173,280 degrees of freedom!)
Constraint: Close To Symmetric
• $A^{OLC} = \sum_s \alpha_s A^s$ (Optimal Linear Combination)
• $\alpha_s$: style parameters (symmetric)
• Minimize: $E^* = \sum_c \| y^{s'c} - A^{s'} b^c \|^2 + \lambda \| A^{s'} - A^{OLC} \|^2$
• $\partial E^* / \partial A^{s'} = 0$
• Extrapolate missing letters by: $y^{s'c} = A^{s'} b^c$
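Setting the derivative to zero has a closed-form solution if both norms are read as squared (Frobenius) norms, which is an assumption here. Under that reading the new style matrix A satisfies A (B Bᵀ + λI) = Y Bᵀ + λ A_OLC, where the columns of Y are the observed letters and the columns of B are their content vectors. The following is a minimal sketch with random data; `fit_regularized_style` and all sizes are hypothetical, and A_OLC is simply taken as given.

```python
import numpy as np

rng = np.random.default_rng(3)
K, J, M = 30, 8, 5                    # image dim, content dim, observed letters (M < J)

B = rng.normal(size=(J, M))           # content vectors b^c of the observed letters
Y = rng.normal(size=(K, M))           # observed letters y^{s'c} in the new style
A_olc = rng.normal(size=(K, J))       # prior style matrix (assumed already computed)
lam = 0.5                             # regularization weight lambda

def fit_regularized_style(Y, B, A_olc, lam):
    """Minimize ||Y - A B||_F^2 + lam * ||A - A_olc||_F^2 over A."""
    G = B @ B.T + lam * np.eye(B.shape[0])   # J x J, symmetric positive definite
    R = Y @ B.T + lam * A_olc                # K x J
    return np.linalg.solve(G, R.T).T         # solves A G = R using symmetry of G

A_new = fit_regularized_style(Y, B, A_olc, lam)

def objective(A):
    return np.sum((Y - A @ B) ** 2) + lam * np.sum((A - A_olc) ** 2)

# Gradient of the objective at the solution; it should vanish.
grad = -2 * (Y - A_new @ B) @ B.T + 2 * lam * (A_new - A_olc)
print(np.abs(grad).max())
```

With fewer observed letters than content dimensions the unregularized problem is underdetermined; the λ term makes the system well posed and pulls the unconstrained directions toward A_OLC.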
Results
Symmetric vs. Asymmetric
Symmetric:
• Can reduce dimensionality of factors
• Learns the structure of factor interactions: handles translation
Asymmetric:
• More flexible
• Too flexible: overfitting
• Cannot handle translation
Can be overcome by combining both
Bilinear Models (Tenenbaum, Freeman 2000, 97)
Pros:
• General framework for two factor problems
• Explicit parameterized representations of each factor and their interaction
• Natural generalization for extrapolation and translation tasks
• Fast algorithms (SVD)
Cons:
• Assumes Linearity In Each Factor
• Find Clever Input Representations
• Decompose To Sub Problems
Example-Based Style Synthesis (Ido Drori, Hezi Yeshurun, Daniel Cohen-Or 03)
Algorithm Outline
1. Divide Image To Overlapping Tiles
2. Find Best Match In Each Scene
3. Synthesize Tiles By Bilinear Model
4. Image Quilting
5. Image Analogies
Create Gaussian Pyramids For Examples And Input Images
Apply Algorithm To Each Level From Coarse To Fine
Finding Best Matching Fragment
Similar Geometry, Agreeing Boundaries
$V_{search}$ = ( , , , , , ), with components labeled Luminance, Gradient, Laplacian
For Each Training Scene: Create V In Every Position And Orientation
Search For Nearest Neighbor
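The search step can be sketched as a brute-force nearest-neighbor scan over tile feature vectors. This is a simplified sketch, not the authors' implementation: it assumes a grayscale scene, builds the feature vector from luminance, gradient magnitude, and a finite-difference Laplacian, scans positions only (orientations are omitted), and uses squared Euclidean distance. The names `features` and `nearest_tile` are hypothetical.

```python
import numpy as np

def features(tile):
    """Concatenate luminance, gradient magnitude, and Laplacian of a 2D tile."""
    gy, gx = np.gradient(tile)                        # derivatives along rows, columns
    lap = np.gradient(gx, axis=1) + np.gradient(gy, axis=0)
    return np.concatenate([tile.ravel(), np.hypot(gx, gy).ravel(), lap.ravel()])

def nearest_tile(scene, query, size):
    """Return the (row, col) of the best-matching tile in `scene` and its distance."""
    q = features(query)
    best_dist, best_pos = np.inf, None
    rows, cols = scene.shape
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            v = features(scene[r:r + size, c:c + size])
            dist = np.sum((v - q) ** 2)
            if dist < best_dist:
                best_dist, best_pos = dist, (r, c)
    return best_pos, best_dist

rng = np.random.default_rng(4)
scene = rng.random((24, 24))                          # stand-in training scene
query = scene[5:13, 7:15].copy()                      # tile cut from a known position
pos, dist = nearest_tile(scene, query, 8)
print(pos, dist)
```

In a coarse-to-fine scheme the same search would be run at each pyramid level; an approximate nearest-neighbor structure would replace the double loop for realistic image sizes.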