![Page 1: Deep Feedforward Generative Models - GitHub Pagesaliensunmin.github.io/project/accv16tutorial/media/generative.pdf · •We will focus on deep feedforward generative models. ... perceptual](https://reader034.vdocument.in/reader034/viewer/2022051600/5aa352e27f8b9ada698e19ed/html5/thumbnails/1.jpg)
Deep Feedforward Generative Models
Ming-Yu Liu, NVIDIA
Deep Feedforward Generative Models
• A generative model is a model for randomly generating data.
• Many deep learning-based generative models exist, including Restricted Boltzmann Machines (RBM), Deep Boltzmann Machines (DBM), Deep Belief Networks (DBN), ….
• We will focus on deep feedforward generative models.
• We will focus on models that map a random sample z from a parametric probability distribution to an image 𝑥.
• Variational Autoencoders (Kingma and Welling 2014)
• Generative Adversarial Networks (Goodfellow et al 2014)
Comparison
[Figure: a deep feedforward discriminative network with class labels (Formosan Mountain Dog, Husky, Persian cat) alongside a deep feedforward generative network.]
Comparison
| Model | Input | Output | Operation |
| --- | --- | --- | --- |
| Deep Feedforward Discriminative Networks | Image; high-dimensional | A probability distribution of class labels; low-dimensional | “Compression”; many down-sampling operations |
| Deep Feedforward Generative Networks | Random sample from a parametric probability distribution; low-dimensional | A probability distribution of images; high-dimensional | “Decompression”; many up-sampling operations |
Manifold Hypothesis
• Structured high-dimensional data (images) lie on a low-dimensional manifold.
Image credit: Goodfellow et al. 2016
Autoencoder
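Encoder network: $f_\theta^{enc}(x_i)$
Latent representation: $z_i$
Decoder network: $f_\theta^{dec}(z_i)$
Learning is usually done by solving
$$\min_\theta \sum_{x_i \in D} \left\| f_\theta(x_i) - x_i \right\|_2^2$$
where $f_\theta(x_i) = f_\theta^{dec}(f_\theta^{enc}(x_i))$.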
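The reconstruction objective on this slide can be made concrete with a very small sketch. Everything here is illustrative and not from the slides: the encoder and decoder are fixed linear maps sharing one direction `w`, the latent is one-dimensional, and the dataset is three hand-picked 2-D points.

```python
def encode(x, w):
    """Linear encoder f_enc: project a 2-D point onto direction w (1-D latent)."""
    return x[0] * w[0] + x[1] * w[1]


def decode(z, w):
    """Linear decoder f_dec: map the 1-D latent back to 2-D along w."""
    return (z * w[0], z * w[1])


def reconstruction_loss(data, w):
    """Sum over the dataset of ||f_dec(f_enc(x_i)) - x_i||_2^2."""
    total = 0.0
    for x in data:
        x_hat = decode(encode(x, w), w)
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total


# Toy dataset whose points all lie on the line x2 = x1 (a 1-D manifold in 2-D).
data = [(1.0, 1.0), (2.0, 2.0), (-3.0, -3.0)]
s = 0.5 ** 0.5  # (s, s) is the unit vector along the data line

loss_aligned = reconstruction_loss(data, (s, s))         # latent follows the line
loss_misaligned = reconstruction_loss(data, (1.0, 0.0))  # latent ignores x2
```

When the latent direction matches the line the data lies on, the loss is essentially zero; a mismatched direction leaves a large residual. Training would search over `w` (and, in a deep model, over all network weights) to drive this sum down.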
Remarks on Autoencoder
• One of the architectures that led to the renaissance of neural networks in 2007.
• To avoid learning a trivial identity function, the input sample is corrupted with noise: the Denoising Autoencoder (Vincent et al. 2010).
• The hierarchical representation learned in the encoder can be used as a feature extractor for a supervised learning task.
• However, it is difficult to sample from the latent space.
• Poor generalization: the decoder often just remembers the input samples.
Variational Autoencoder
• Put a constraint on the latent space to make sampling easier.
• Constrain the encoder to output a conditional Gaussian distribution for an input sample $x_i$:
$$f_\theta^{enc}(x_i) = q_\theta(z|x_i) = \mathcal{N}(z|\mu_\theta(x_i), I)$$
• The decoder reconstructs the input from a random sample drawn from the conditional distribution, $z_i \sim q_\theta(z|x_i)$:
$$f_\theta^{dec}(z_i) = p_\theta(x_i \mid z_i \sim q_\theta(z|x_i))$$
• Train the encoder to output a zero-mean Gaussian distribution and the decoder to reconstruct the input:
$$\min_\theta \sum_{x_i \in D} D_{KL}\big(\mathcal{N}(z|\mu_\theta(x_i), I) \,\|\, \mathcal{N}(z|0, I)\big) + E_{z_i \sim q_\theta(z|x_i)}\left[\tfrac{1}{2}\left\| f_\theta^{dec}(z_i) - x_i \right\|_2^2\right]$$
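Variational Lower Bound
$$\begin{aligned}
L(\theta|D) &= \sum_{x_i \in D} \log p_\theta(x_i) \\
&= \sum_{x_i \in D} \int_z q_\theta(z|x_i) \log p_\theta(x_i) \\
&= \sum_{x_i \in D} \int_z q_\theta(z|x_i) \log \frac{p_\theta(z, x_i)}{p_\theta(z|x_i)} \\
&= \sum_{x_i \in D} \int_z q_\theta(z|x_i) \log \left( \frac{p_\theta(z, x_i)}{q_\theta(z|x_i)} \cdot \frac{q_\theta(z|x_i)}{p_\theta(z|x_i)} \right) \\
&= \sum_{x_i \in D} \left[ \int_z q_\theta(z|x_i) \log \frac{p_\theta(z, x_i)}{q_\theta(z|x_i)} + \int_z q_\theta(z|x_i) \log \frac{q_\theta(z|x_i)}{p_\theta(z|x_i)} \right] \\
&= L_V(\theta|D) + \sum_{x_i \in D} D_{KL}\big(q_\theta(z|x_i) \,\|\, p_\theta(z|x_i)\big) \\
&\geq L_V(\theta|D)
\end{aligned}$$
$L_V(\theta|D)$ is the variational lower bound of the log-likelihood function $L(\theta|D)$.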
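The decomposition of the log-likelihood into the lower bound plus a KL term can be checked numerically. The sketch below is only an illustration under stated assumptions: the latent variable is discrete with three states (so the integral over z becomes a sum), there is a single observation, and all probability values are made up for the example.

```python
import math

prior = [0.5, 0.3, 0.2]          # p(z), hypothetical values
likelihood = [0.10, 0.40, 0.70]  # p(x | z) for one fixed observation x
q = [0.2, 0.3, 0.5]              # q(z | x), an arbitrary approximate posterior

# Joint p(z, x), evidence p(x), and true posterior p(z | x).
joint = [pz * px for pz, px in zip(prior, likelihood)]
evidence = sum(joint)
posterior = [j / evidence for j in joint]

log_evidence = math.log(evidence)
# Variational lower bound: sum_z q(z|x) log( p(z, x) / q(z|x) ).
lower_bound = sum(qz * math.log(j / qz) for qz, j in zip(q, joint))
# Gap term: D_KL( q(z|x) || p(z|x) ), which is never negative.
kl_gap = sum(qz * math.log(qz / pz) for qz, pz in zip(q, posterior))
```

Here `log_evidence` equals `lower_bound + kl_gap` up to floating-point error, and the gap closes only when `q` matches the true posterior.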
Maximize the Variational Lower Bound
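$$\begin{aligned}
L_V(\theta|D) &= \sum_{x_i \in D} \int_z q_\theta(z|x_i) \log \frac{p_\theta(z, x_i)}{q_\theta(z|x_i)} \\
&= \sum_{x_i \in D} \int_z q_\theta(z|x_i) \log \frac{p_\theta(x_i|z)\, p(z)}{q_\theta(z|x_i)} \\
&= \sum_{x_i \in D} \left[ \int_z q_\theta(z|x_i) \log \frac{p(z)}{q_\theta(z|x_i)} + \int_z q_\theta(z|x_i) \log p_\theta(x_i|z) \right] \\
&= \sum_{x_i \in D} \left[ -D_{KL}\big(q_\theta(z|x_i) \,\|\, p(z)\big) + E_{z_i \sim q_\theta(z|x_i)}\big[\log p_\theta(x_i|z_i)\big] \right]
\end{aligned}$$
$$\max_\theta L_V(\theta|D) \iff \min_\theta \sum_{x_i \in D} \underbrace{D_{KL}\big(\mathcal{N}(z|\mu_\theta(x_i), I) \,\|\, \mathcal{N}(z|0, I)\big)}_{\text{Regularization}} + \underbrace{E_{z_i \sim q_\theta(z|x_i)}\left[\tfrac{1}{2}\left\| f_\theta^{dec}(z_i) - x_i \right\|_2^2\right]}_{\text{Reconstruction}}$$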
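The reconstruction term follows if the decoder likelihood is taken to be a unit-variance Gaussian centered at the decoder output, an assumption the slide leaves implicit. Under that assumption, and writing $n$ (a symbol introduced here) for the dimension of $x_i$:

```latex
\log p_\theta(x_i \mid z_i)
  = \log \mathcal{N}\big(x_i \mid f_\theta^{dec}(z_i),\, I\big)
  = -\tfrac{1}{2}\,\big\| f_\theta^{dec}(z_i) - x_i \big\|_2^2 \;-\; \tfrac{n}{2}\log(2\pi)
```

The last term does not depend on $\theta$, so maximizing the expected log-likelihood is the same as minimizing the expected squared error.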
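Implementation of VAE
• $D_{KL}\big(\mathcal{N}(z|\mu_\theta(x_i), I) \,\|\, \mathcal{N}(z|0, I)\big) = \frac{1}{2}\sum_d \big(\mu_{d,\theta}(x_i)\big)^2$
• Sample approximation (up to an additive constant): $-E_{z_i \sim q_\theta(z|x_i)}\big[\log p_\theta(x_i|z_i)\big] \approx \frac{1}{L}\sum_{l=1}^{L} \frac{1}{2}\left\| f_\theta^{dec}\big(z_i^{(l)}\big) - x_i \right\|_2^2$
where $z_i^{(l)} = \mu_\theta(x_i) + \varepsilon^{(l)}$ and $\varepsilon^{(l)} \sim \mathcal{N}(0, I)$
[Figure: encoder network $f_\theta^{enc}(x_i)$ and decoder network $f_\theta^{dec}(z_i)$, with noise samples $\varepsilon_1$ and $\varepsilon_2$ added to the encoder output.]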
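The two ingredients on this slide, the closed-form KL term and the sampled reconstruction term, can be put together as in the following sketch. It is a minimal illustration, not the tutorial's implementation: the "networks" `toy_encoder_mean` and `toy_decoder` are hypothetical fixed linear functions, the encoder variance is the identity as on the slide, and no gradient step is taken.

```python
import random


def kl_to_standard_normal(mu):
    """Closed-form D_KL(N(mu, I) || N(0, I)) = 0.5 * sum_d mu_d^2."""
    return 0.5 * sum(m * m for m in mu)


def sample_latent(mu, rng):
    """Draw z = mu + eps with eps ~ N(0, I), so z ~ N(mu, I)."""
    return [m + rng.gauss(0.0, 1.0) for m in mu]


def vae_loss(x, encoder_mean, decoder, num_samples, rng):
    """KL regularizer plus an L-sample estimate of the reconstruction term."""
    mu = encoder_mean(x)
    recon = 0.0
    for _ in range(num_samples):
        z = sample_latent(mu, rng)
        x_hat = decoder(z)
        recon += 0.5 * sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return kl_to_standard_normal(mu) + recon / num_samples


# Hypothetical stand-ins for the encoder mean and the decoder networks.
def toy_encoder_mean(x):
    return [0.5 * (x[0] + x[1])]


def toy_decoder(z):
    return [z[0], z[0]]


rng = random.Random(0)
loss = vae_loss([1.0, 1.0], toy_encoder_mean, toy_decoder, num_samples=1000, rng=rng)
```

Writing the sample as the mean plus independent noise keeps the noise outside the network, which is what lets gradients flow through the encoder mean in an actual implementation.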
Conditional Variational Autoencoder
[Figure: VAE diagram with noise samples $\varepsilon_1$, $\varepsilon_2$ and an additional input labeled “Code or attributes”.]
Kingma et al. “Semi-supervised learning with deep generative models.” NIPS 2014
Attribute2Image
Yan et al. “Attribute2Image: Conditional Image Generation from Visual Attributes”, ECCV 2016
Drawback of VAE
• Euclidean loss is not a good perceptual loss.
• It regresses to the mean and renders blurry images.
• Difficult to hand-craft a good perceptual loss function.
• Why not learn one?
[Figure: Euclidean loss (y-axis, ×10^4) versus translation in pixels (x-axis, 0 to 200), relative to a reference image.]
The blue curve plots the Euclidean loss between a reference image and its translations. The red bar is the Euclidean loss between the reference image and a background image. The Euclidean loss suggests that the background image is more similar to the reference image than the translated copies are.
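The effect described in the caption is easy to reproduce on a one-dimensional toy signal. The sketch below is an illustration only: the "image" is a row of 100 zeros with a bright bar of ones, and the bar width and shift are arbitrary choices made for the example.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length signals."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def bar_signal(length, start, width):
    """1-D stand-in for an image: zeros with a bright bar of ones."""
    return [1.0 if start <= i < start + width else 0.0 for i in range(length)]


reference = bar_signal(100, 20, 10)
translated = bar_signal(100, 35, 10)  # same bar shifted by 15 pixels
background = [0.0] * 100              # no object at all

loss_translated = squared_euclidean(reference, translated)
loss_background = squared_euclidean(reference, background)
```

Once the shift exceeds the bar width the two bars no longer overlap, so the translated copy is penalized twice (once for the missing bar, once for the extra bar) while the empty background is penalized only once.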
Generative Adversarial Networks
• Forget about designing an image similarity loss. Let us use a deep feedforward discriminative network to verify whether a generated image is similar to a real image (Goodfellow et al. 2014).
[Figure: generator/decoder network $f_\phi^{gen}(z_i)$ feeding a discriminator network $f_\varphi^{dis}(x_i)$, which outputs “true or fake image”.]
Goodfellow et al. “Generative Adversarial Networks”, NIPS 2014
Generative Adversarial Networks
• Generator: maps a random sample from a Gaussian distribution to an image.
• Discriminator: differentiates a generated image from a real image.
[Figure: generator/decoder network $f_\phi^{gen}(z_i)$ feeding a discriminator network $f_\varphi^{dis}(x_i)$, which outputs “true or fake image”.]
Generative Adversarial Networks
• The generator and the discriminator are playing a zero-sum game.
• $\min_\phi \max_\varphi \; E_{x \sim p_{data}(x)}\big[\log f_\varphi^{dis}(x)\big] + E_{z \sim p_Z(z)}\big[\log\big(1 - f_\varphi^{dis}(f_\phi^{gen}(z))\big)\big]$
Image credit: Goodfellow et al. 2014
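Generative Adversarial Networks
• What does this optimization do?
• For a fixed generator $f_\phi^{gen}(z)$, the optimal discriminator is
$$f_\varphi^{dis}(x) = \frac{p_{data}(x)}{p_{data}(x) + f_\phi^{gen}(z)},$$
where, with a slight abuse of notation, $f_\phi^{gen}(z)$ also denotes the density of the generated images.
$$\begin{aligned}
&\min_\phi \max_\varphi \; E_{x \sim p_{data}(x)}\big[\log f_\varphi^{dis}(x)\big] + E_{z \sim p_Z(z)}\big[\log\big(1 - f_\varphi^{dis}(f_\phi^{gen}(z))\big)\big] \\
&= \min_\phi \; E_{x \sim p_{data}(x)}\left[\log \frac{p_{data}(x)}{p_{data}(x) + f_\phi^{gen}(z)}\right] + E_{z \sim p_Z(z)}\left[\log \frac{f_\phi^{gen}(z)}{p_{data}(x) + f_\phi^{gen}(z)}\right] \\
&= \min_\phi \; D_{KL}\left(p_{data}(x) \,\Big\|\, \frac{p_{data}(x) + f_\phi^{gen}(z)}{2}\right) + D_{KL}\left(f_\phi^{gen}(z) \,\Big\|\, \frac{p_{data}(x) + f_\phi^{gen}(z)}{2}\right) - \log 4 \\
&= \min_\phi \; 2\,D_{JS}\big(p_{data}(x) \,\|\, f_\phi^{gen}(z)\big) - \log 4
\end{aligned}$$
Here $D_{JS}$ is the Jensen-Shannon divergence.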
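The link between the GAN objective at the optimal discriminator and the Jensen-Shannon divergence can be verified numerically. This sketch makes several assumptions that are not in the slides: the data and generator distributions are discrete over three outcomes with made-up probabilities, `p_gen` is a name introduced here for the generator's output distribution, and the Jensen-Shannon divergence uses the standard definition with one half on each KL term (under which the two KL terms in the derivation add up to twice the divergence).

```python
import math


def kl(p, q):
    """D_KL(p || q) for discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js(p, q):
    """Jensen-Shannon divergence with the standard 1/2 weights."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def value_at_optimal_discriminator(p_data, p_gen):
    """Objective V evaluated with D*(x) = p_data(x) / (p_data(x) + p_gen(x))."""
    total = 0.0
    for pd, pg in zip(p_data, p_gen):
        d_star = pd / (pd + pg)
        if pd > 0:
            total += pd * math.log(d_star)
        if pg > 0:
            total += pg * math.log(1.0 - d_star)
    return total


p_data = [0.1, 0.4, 0.5]  # hypothetical data distribution
p_gen = [0.3, 0.3, 0.4]   # hypothetical generator distribution

v_star = value_at_optimal_discriminator(p_data, p_gen)
identity_rhs = 2.0 * js(p_data, p_gen) - math.log(4.0)
v_matched = value_at_optimal_discriminator(p_data, p_data)  # generator equals data
```

`v_star` agrees with `identity_rhs`, and the value reaches its minimum of `-log 4` exactly when the generator distribution matches the data distribution.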
Generative Adversarial Network Training
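• $V(\varphi, \phi) = E_{x \sim p_{data}(x)}\big[\log f_\varphi^{dis}(x)\big] + E_{z \sim p_Z(z)}\big[\log\big(1 - f_\varphi^{dis}(f_\phi^{gen}(z))\big)\big]$
• $\min_\phi \max_\varphi V(\varphi, \phi)$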
• Alternating gradient descent
• Fix 𝜙 (generator), apply a stochastic gradient ascent step on 𝑉 𝜑,𝜙 .
• Fix 𝜑 (discriminator), apply a stochastic gradient descent step on 𝑉 𝜑,𝜙 .
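The alternating updates can be sketched on a one-dimensional toy problem with hand-derived gradients. Everything specific here is an assumption made for illustration: the real data are samples from a Gaussian with mean 3, the generator only shifts its input noise by a scalar `shift`, the discriminator is a logistic function with parameters `w` and `b`, and the step size and batch size are arbitrary. Real GAN training uses deep networks and automatic differentiation, and this sketch makes no claim about convergence.

```python
import math
import random


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    w, b = 0.1, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + b)
    shift = 0.0      # generator parameter: G(z) = z + shift
    for _ in range(steps):
        real = [rng.gauss(3.0, 1.0) for _ in range(batch)]
        fake = [rng.gauss(0.0, 1.0) + shift for _ in range(batch)]

        # Generator fixed: one stochastic gradient ASCENT step on V.
        d_real = [sigmoid(w * x + b) for x in real]
        d_fake = [sigmoid(w * x + b) for x in fake]
        grad_w = (sum((1.0 - d) * x for d, x in zip(d_real, real))
                  - sum(d * x for d, x in zip(d_fake, fake))) / batch
        grad_b = (sum(1.0 - d for d in d_real) - sum(d_fake)) / batch
        w += lr * grad_w
        b += lr * grad_b

        # Discriminator fixed: one stochastic gradient DESCENT step on V.
        fake = [rng.gauss(0.0, 1.0) + shift for _ in range(batch)]
        d_fake = [sigmoid(w * x + b) for x in fake]
        grad_shift = -sum(d * w for d in d_fake) / batch
        shift -= lr * grad_shift
    return w, b, shift
```

Calling `train_toy_gan(steps=50)` returns the three parameters after 50 alternating rounds; early in training the discriminator's slope `w` is positive, so each generator step pushes `shift` toward the real data.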
Deep Convolutional Generative Adversarial Networks
Radford et al. “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016
InfoGAN: Interpretable Representation Learning by Information Maximizing GAN
Chen et al. “InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets”, NIPS 2016
[Figure: GAN diagram with an additional code input $c$; the discriminator outputs “true or fake image”.]
In addition to the adversarial loss, InfoGAN also maximizes the mutual information $I\big(c; f_\phi^{gen}(z, c)\big)$, which is done by maximizing the lower bound $I\big(c; f_\varphi^{dis,c}(x)\big) \leq I\big(c; f_\phi^{gen}(z, c)\big)$.
(Check out the data processing inequality in information theory.)
VAE and GAN Comparison
| Model | Optimization | Image Quality | Generalization |
| --- | --- | --- | --- |
| Variational Autoencoders (VAE) | Stochastic gradient descent; converges to a local minimum; easier | Smooth; blurry | Tend to remember input images |
| Generative Adversarial Networks (GAN) | Alternating stochastic gradient descent; converges to saddle points; harder | Sharp; artifacts | Generate new unseen images |
VAE/GAN Model
Larsen et al. “Autoencoding beyond pixels using a learned similarity metric”, ICML 2016
[Figure: VAE/GAN architecture. The VAE encoder maps $x$ to $z$ and the VAE decoder maps $z$ to $\hat{x}$; the VAE decoder shares weights with the GAN generator. The GAN discriminator labels $x$ and $\hat{x}$ as true or false, and a weight-sharing copy of it acts as a feature extractor producing $f(x)$ and $f(\hat{x})$. Losses labeled in the diagram: prior loss, content loss, and style loss.]
Applications
• Image Super-resolution
• Inpainting
• Image Editing
• Domain Adaptation
Application: Image Super-resolution
Ledig et al., “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network”, arXiv 1609.0480
Minimize
• Adversarial loss
• Content loss
• TV-norm
[Figure: super-resolution results annotated with PSNR/SSIM.]
Image Inpainting
Yeh et al., “Semantic Image Inpainting with Perceptual and Contextual Losses”, arXiv 1607.07539
Let $\bar{x}$ be a corrupted image. By solving
$$z^* = \arg\min_z \; \log\big(1 - f_\varphi^{dis}(f_\phi^{gen}(z))\big) + \left\| M \odot f_\phi^{gen}(z) - M \odot \bar{x} \right\|_2^2$$
we can get the inpainted image by
$$x = f_\phi^{gen}(z^*)$$
[Figure: inpainted images with and without the perceptual loss.]
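The inpainting objective can be evaluated directly once a generator and a discriminator are available. The sketch below is a toy illustration under loudly stated assumptions: `toy_generator` and `toy_discriminator` are hypothetical stand-ins for trained networks, $M$ is assumed to be a binary mask that is 1 on observed pixels, the "image" has three pixels, and a coarse grid search over a scalar z replaces the gradient-based search one would use in practice.

```python
import math


def inpainting_objective(z, generator, discriminator, mask, corrupted):
    """log(1 - D(G(z))) + ||M * G(z) - M * x_bar||_2^2 for one candidate z."""
    generated = generator(z)
    adversarial = math.log(1.0 - discriminator(generated))
    contextual = sum((m * g - m * c) ** 2
                     for m, g, c in zip(mask, generated, corrupted))
    return adversarial + contextual


# Hypothetical stand-ins for trained networks.
def toy_generator(z):
    return [z, 2.0 * z, 3.0 * z]


def toy_discriminator(x):
    return 0.5  # treats every image as equally likely to be real or fake


mask = [1.0, 1.0, 0.0]       # the third pixel is missing
corrupted = [1.0, 2.0, 0.0]  # the observed pixels happen to match G(1.0)

candidates = [i / 10.0 for i in range(0, 21)]
z_star = min(candidates, key=lambda z: inpainting_objective(
    z, toy_generator, toy_discriminator, mask, corrupted))
inpainted = toy_generator(z_star)
```

The masked term only compares pixels that were actually observed, so the missing pixel is filled in by whatever the generator produces at the selected code.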
Generative Visual Manipulation on the Natural Image Manifold
Zhu et al., “Generative Visual Manipulation on the Natural Image Manifold”, ECCV 2016
Let $x_0$ be an input image. Find the hidden code that the generator would use:
$$z_0 = \arg\min_z L(x_0, G(z))$$
where $L$ is a perceptual loss.
The user then makes some edits, which are given as constraints. We then solve an optimization problem to find a new hidden code that resembles the original image while satisfying the constraints.
Coupled Generative Adversarial Networks
• $p(X_1, X_2, \ldots, X_N)$, where $X_i$ are images of the scene in different modalities.
• Ex. $p(X_{color}, X_{thermal}, X_{depth})$
[Figure: a camera views a scene through the image plane, giving a color image, a depth image (near to far), and a thermal image (cool to hot).]
Learn the joint distribution of multi-domain images without any corresponding images in the different domains.
Liu et al., “Coupled Generative Adversarial Networks”, NIPS 2016
Coupled Generative Adversarial Networks
• Define domain by attribute.
• Multi-domain images are views of an object with different attributes.
[Figure: example domain pairs: non-smiling/smiling, young/senior, non-beard/beard, summer/winter, images/hand-drawings, Font#1/Font#2.]
Coupled Generative Adversarial Networks
[Figure: two GANs coupled by weight sharing. GAN1 is paired with $D_1$, the dataset of training images in Domain 1, and $x_1$; GAN2 is paired with $D_2$, the dataset of training images in Domain 2, and $x_2$.]
[Figures: further Coupled GAN slides, one labeled NYU.]
Application: Unsupervised Domain Adaptation
[Figure: two GANs coupled by weight sharing. GAN1 is paired with $D_1$, the dataset of training images in Domain 1, whose samples $(x_1, y)$ carry a class label $y$; GAN2 is paired with $D_2$, the dataset of training images in Domain 2, with samples $x_2$.]
Liu et al., “Coupled Generative Adversarial Networks”, NIPS 2016
Conclusions
• We discussed two popular deep generative models:
• Variational Autoencoders
• Generative Adversarial Networks
• We discussed their pros and cons and how to take the best from both.
• We discussed several computer vision applications of these models.
• Many other applications and interesting properties of these deep generative models are waiting for your exploration.