conditional generative adversarial networks (cgans) · 0.196 0.238 0.332 optimization igans:...

52
Conditional Generative Adversarial Networks (cGANs) Prof. Leal-Taixé and Prof. Niessner 1

Upload: others

Post on 17-Jul-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Conditional Generative Adversarial

Networks (cGANs)Prof. Leal-Taixé and Prof. Niessner 1

Page 2: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Conditional GANs (cGANs)• Gain control of output

• Modeling (e.g., sketch-based modeling, etc.)– Add semantic meaning to latent space manifold

• Domain transfer– Labels on A -> transfer to B, train network on ‘B’, test on B– More later

Prof. Leal-Taixé and Prof. Niessner 2

Page 3: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

GAN Manifold

Prof. Leal-Taixé and Prof. Niessner 3[Radford et al. 15]

Train Data Sampled Data -> G(z)

Page 4: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

GAN Manifold

Prof. Leal-Taixé and Prof. Niessner 4[Bojanowski et al 17]

Page 5: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

GAN Manifold

Prof. Leal-Taixé and Prof. Niessner 5

Page 6: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

GAN Manifold

Prof. Leal-Taixé and Prof. Niessner 6[Radford et al. 15]

𝐺(𝑧0) 𝐺(𝑧1)Linear interpolation in z space: 𝐺(𝑧0 + 𝑡 ⋅ 𝑧1 − 𝑧0 )

Page 7: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Conditional GANs (cGANs)

Prof. Leal-Taixé and Prof. Niessner 7Slide credit Zhu

Page 8: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

iGANs: Overview

original photo

projection on manifold

Project Edit Transfer

transition between the original and edited projection

different degree of image manipulation

Editing UI

Slide credit Zhu / [Zhu et al. 16]

Page 9: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

iGANs: Overview

original photo

projection on manifold

Project Edit Transfer

transition between the original and edited projection

different degree of image manipulation

Editing UI

Slide credit Zhu / [Zhu et al. 16]

Page 10: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

0.196 0.238 0.332

Optimization

iGANs: Projecting an Image onto the Manifold

Input: real image 𝑥𝑅

Output: latent vector z

Generative model 𝐺(𝑧)Reconstruction loss 𝐿

Slide credit Zhu / [Zhu et al. 16]

Page 11: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

0.196 0.238 0.332

Optimization

iGANs: Projecting an Image onto the Manifold

Inverting Network z = 𝑃 𝑥

0.218 0.242 0.336Auto-encoderwith a fixed decoder G

Input: real image 𝑥𝑅

Output: latent vector z

Slide credit Zhu / [Zhu et al. 16]

Page 12: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

0.196 0.238 0.332

Optimization

iGANs: Projecting an Image onto the Manifold

Inverting Network z = 𝑃 𝑥

0.218 0.242 0.336

0.153 0.167

Hybrid MethodUse the network as initialization

for the optimization problem0.268

Input: real image 𝑥𝑅

Output: latent vector z

Slide credit Zhu / [Zhu et al. 16]

Page 13: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

iGANs: Overview

original photo

projection on manifold

Project Edit Transfer

transition between the original and edited projection

different degree of image manipulation

Editing UI

Slide credit Zhu / [Zhu et al. 16]

Page 14: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

iGANs: Manipulating the Latent Vector

Objective:

𝐺(𝑧)

Guidance 𝑣𝑔

𝑧0

user guidance imageconstraint violation loss 𝐿𝑔

Slide credit Zhu / [Zhu et al. 16]

Page 15: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

iGANs: Overview

original photo

projection on manifold

Project Edit Transfer

transition between the original and edited projection

different degree of image manipulation

Editing UI

Slide credit Zhu / [Zhu et al. 16]

Page 16: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

iGANs: Edit Transfer

𝐺(𝑧1)𝐺(𝑧0)

Input

Motion (u, v)+ Color (𝑨𝟑×𝟒): estimate per-pixel geometric and color variation

Linear Interpolation in 𝑧 space

Slide credit Zhu / [Zhu et al. 16]

Page 17: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

iGANs: Edit Transfer

𝐺(𝑧1)𝐺(𝑧0)

Input

Linear Interpolation in 𝑧 space

Motion (u, v)+ Color (𝑨𝟑×𝟒): estimate per-pixel geometric and color variation

Slide credit Zhu / [Zhu et al. 16]

Page 18: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

iGANs: Edit Transfer

Result

𝐺(𝑧1)𝐺(𝑧0)

Input

Motion (u, v)+ Color (𝑨𝟑×𝟒): estimate per-pixel geometric and color variation

Linear Interpolation in 𝑧 space

Page 19: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

cGANs: Interactive GANs

https://github.com/junyanz/iGAN [Zhu et al. 16.]

Interactive GANs: projection to GAN embedding

Prof. Leal-Taixé and Prof. Niessner 19

Page 20: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

cGANs: Interactive GANs

Prof. Leal-Taixé and Prof. Niessner 20https://github.com/junyanz/iGAN [Zhu et al. 16.]

Page 21: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

cGANs: Interactive GANs

Prof. Leal-Taixé and Prof. Niessner 21https://github.com/junyanz/iGAN [Zhu et al. 16.]

Page 22: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Mapping in Latent Space is Difficult!• Semantics are missing• In most cases, no labels available • Ideally, need some unsupervised disentangled rep.

Prof. Leal-Taixé and Prof. Niessner 22InfoGAN [Chen et al. 16]

Page 23: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Paired vs Unpaired Setting

Prof. Leal-Taixé and Prof. Niessner 23

Page 24: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

pix2pix: Image-to-Image Translation

slides credit: Isola / Zhu

Page 25: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

real or fake?

[Goodfellow et al. 2014]

Discriminator

z G(z)

D

Generator

G

slides credit: Isola / Zhu

min𝐺

max𝐷

𝔼𝑧,𝑥 log𝐷(𝐺 𝑧 ) + log(1 − 𝐷 𝑥 )

Page 26: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

real or fake?

Discriminator

x G(x)

D

Generator

G

min𝐺

max𝐷

𝔼𝑥,𝑦 log 𝐷(𝐺 𝑥 ) + log(1 − 𝐷 𝑦 )

slides credit: Isola / Zhu

Page 27: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Real!

Discriminator

x G(x)

D

Generator

G

min𝐺

max𝐷

𝔼𝑥,𝑦 log 𝐷(𝐺 𝑥 ) + log(1 − 𝐷 𝑦 )

slides credit: Isola / Zhu

Page 28: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Discriminator

x G(x)

D

Generator

G Real too!

min𝐺

max𝐷

𝔼𝑥,𝑦 log 𝐷(𝐺 𝑥 ) + log(1 − 𝐷 𝑦 )

slides credit: Isola / Zhu

Page 29: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

min𝐺

max𝐷

𝔼𝑥,𝑦 log 𝐷(𝑥, 𝐺 𝑥 ) + log(1 − 𝐷 𝑥, 𝑦 )

real or fake pair ?

x G(x)

G

D

match joint distribution p G x , y ∼ p(x, y)fake pair real pair

slides credit: Isola / Zhu

Page 30: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Edges → Images

Input Output Input Output Input Output

Edges from [Xie & Tu, 2015]slides credit: Isola / Zhu

Page 31: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Pix2Pix: Paired Setting

• Great when we have ‘free’ training data

• Often called self-supervised

• Think about these settings

31

Page 32: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Sketches → Images

Input Output Input Output Input Output

Trained on Edges → Images

Data from [Eitz, Hays, Alexa, 2012]slides credit: Isola / Zhu

Page 33: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

#edges2cats [Christopher Hesse]

Ivy Tasi @ivymyt

@gods_tail

@matthematician

https://affinelayer.com/pixsrv/

Vitaly Vidmirov @vvid

slides credit: Isola / Zhu

Page 34: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Input Output Groundtruth

Data from[maps.google.com]

slides credit: Isola / Zhu

Page 35: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

BW → Color

Input Output Input Output Input Output

Data from [Russakovsky et al. 2015]slides credit: Isola / Zhu

Page 36: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

- Expensive to collect pairs.- Impossible in many scenarios.

Label ↔ photo: per-pixel labeling

Paired

Horse ↔ zebra: how to get zebras?

slides credit: Isola / Zhu

Page 37: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

………

Paired Unpaired

slides credit: Isola / Zhu

Page 38: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

x G(x)

Generator

G

D

No input-output pairs!

slides credit: Isola / Zhu

Page 39: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Discriminator

x G(x)

D

Generator

G Real!

slides credit: Isola / Zhu

Page 40: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Discriminator

x G(x)

D

Generator

G Real too!

GANs doesn’t force output to correspond to input

slides credit: Isola / Zhu

Page 41: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

mode collapse!

slides credit: Isola / Zhu

Page 42: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Cycle-Consistent Adversarial Networks

[Zhu*, Park*, Isola, and Efros, ICCV 2017]slides credit: Isola / Zhu

Page 43: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Cycle-Consistent Adversarial Networks

[Mark Twain, 1903]

⋯ ⋯

[Zhu*, Park*, Isola, and Efros, ICCV 2017]slides credit: Isola / Zhu

Page 44: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Cycle Consistency Loss

G(x) F(G x )x

F G x − x1

[Zhu*, Park*, Isola, and Efros, ICCV 2017]

DY(G x )

Reconstructionerror

slides credit: Isola / Zhu

Page 45: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Cycle Consistency Loss

G(x) F(G x )x

F G x − x1

Large cycle lossSmall cycle loss

[Zhu*, Park*, Isola, and Efros, ICCV 2017]

DY(G x )

Reconstructionerror

slides credit: Isola / Zhu

Page 46: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

G(x) F(G x )x F(y) G(F x )𝑦

Cycle Consistency Loss

F G x − x1

G F y − 𝑦1

[Zhu*, Park*, Isola, and Efros, ICCV 2017]

DY(G x )

Reconstructionerror

Reconstructionerror

DG(F x )

slides credit: Isola / Zhu

Page 47: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Cycle GAN - Overview

https://junyanz.github.io/CycleGAN/ [Zhu et al. 17.]Prof. Leal-Taixé and Prof. Niessner 48

Page 48: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Monet’s paintings → photos

slides credit: Isola / Zhu

Page 49: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

slides credit: Isola / Zhu

Page 50: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

slides credit: Isola / Zhu

Page 51: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

Next Lectures• Next Monday 24th,

– Xmas GANs– No Lecture

• Jan 14th -> No lecture, but office hours

• Next Lecture -> Jan 14th

• We are still working on feedback for presentations – will send around asap…

• Keep working on the projects!

Prof. Leal-Taixé and Prof. Niessner 52

Page 52: Conditional Generative Adversarial Networks (cGANs) · 0.196 0.238 0.332 Optimization iGANs: Projecting an Image onto the Manifold Inverting Network z=𝑃 0.218 0.242 0.336 0.153

See you next year

Prof. Leal-Taixé and Prof. Niessner 53