![Page 1: Saypraseuth Mounsaveng Jérôme Rony€¦ · Sound Generation C. Donahue, J. McAuley, and M. Puckette, “Synthesizing audio with generative adversarial networks,” CoRR, vol. abs/1802.04208,](https://reader036.vdocument.in/reader036/viewer/2022081405/5f07ed2e7e708231d41f742d/html5/thumbnails/1.jpg)
Data Augmentation and Semi-Supervised Learning with Generative Adversarial Networks
Jérôme Rony, Saypraseuth Mounsaveng
The GAN Framework
Principle and Applications
How it all began...
Source: www.les3brasseurs.ca, scholar.google.fr
Source: central photo: www.freeimageslive.co.uk; cat: cc0.photo
Source: bottom cat: www.pinterest.ca/pin/45669383692759692/
Razvan Pascanu
Ian Goodfellow
Generative Adversarial Networks
Source: https://medium.com/@devnag/generative-adversarial-networks-gans-in-50-lines-of-code-pytorch-e81b79659e3f
Generative Adversarial Networks
Source: towardsdatascience.com
A 2-player minimax game
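Training means solving (the standard objective of Goodfellow et al., 2014):
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]$$
Where: $G$ is the generator, $D$ is the discriminator, $p_{\text{data}}$ is the distribution of real images and $p_z$ is the prior over the random vectors $z$.
In practice, alternate two steps.
Discriminator step:
● Sample a minibatch of random vectors z and generate a minibatch of images with G
● Sample a minibatch of real images
● Compute loss of D as a binary classifier with real and fake images, backprop and optimize
Generator step:
● Sample a minibatch of random vectors z and generate a minibatch of images with G
● Compute loss of G by feeding D with fake images, backprop and optimize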
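The two alternating steps above can be sketched with the usual binary cross-entropy losses. This is a minimal, framework-free sketch rather than the presenters' code: `discriminator_loss` and `generator_loss` are hypothetical helper names, the inputs are the probabilities D assigns to each sample being real, and the generator uses the common non-saturating loss -log D(G(z)), which is an assumption since the slide does not say which variant was used.

```python
import math

EPS = 1e-12  # avoids log(0)


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy of D: real images have target 1, fakes target 0.

    d_real / d_fake are lists of D's predicted probabilities of "real".
    """
    real_term = -sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss (assumption): G wants D to say "real"."""
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)


# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
d_real = [0.5, 0.5, 0.5, 0.5]
d_fake = [0.5, 0.5, 0.5, 0.5]
print(discriminator_loss(d_real, d_fake))  # approximately 2 * log(2)
print(generator_loss(d_fake))              # approximately log(2)
```

In a real training loop each loss would be backpropagated through its own network only, with the other network's parameters held fixed for that step.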
Advantages
● Flexibility on the type of networks used for the generator and discriminator
○ MLP, CNN or VAE
● Subjectively better visual quality than other generative models
○ VAE images are blurry
● Faster generation: no sequential process involved like in autoregressive models
○ Easier exploration of the latent space
● Adaptation to other tasks like classification
Pitfalls
● Unstable training: the Nash equilibrium is difficult to reach with SGD optimisation because of saddle points
● Mode collapse
● Difficulty handling discrete data (e.g. text)
Source: Unrolled generative adversarial networks (Metz et al., 2017)
Source: Wikipedia
Conditional Generation
monarch butterfly, goldfinch, daisy, redshank, grey whale
128×128 images from ImageNet
A. Odena, C. Olah, and J. Shlens. Conditional image synthesis with auxiliary classifier GANs. arXiv preprint arXiv:1610.09585, 2016
Domain and Style Transfer
P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-Image translation with conditional adversarial networks. In CVPR, 2017
Live demo at https://affinelayer.com/pixsrv/
Domain Transfer at High-Resolution
T.-C. Wang, M.-Y. Liu, J.-Y. Zhu, A. Tao, J. Kautz, and B. Catanzaro. High-resolution image synthesis and semantic manipulation with conditional gans. arXiv preprint arXiv:1711.11585, 2017
2048×1024 images from Cityscapes Dataset
Domain Transfer at High-Resolution
Super-Resolution
C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint arXiv:1609.04802, 2016
Input SR-GAN Original
Sound Generation
● C. Donahue, J. McAuley, and M. Puckette, “Synthesizing audio with generative adversarial networks,” CoRR, vol. abs/1802.04208, 2018. http://wavegan-v1.s3-website-us-east-1.amazonaws.com/
● E. Hosseini-Asl, Y. Zhou, C. Xiong, R. Socher, “A Multi-Discriminator CycleGAN for Unsupervised Non-Parallel Speech Domain Adaptation,” arXiv preprint arXiv:1804.00522. https://einstein.ai/research/a-multi-discriminator-cyclegan-for-unsupervised-non-parallel-speech-domain-adaptation
And Much More!
https://github.com/hindupuravinash/the-gan-zoo
Semi-supervised image classification with GANs
Source: Oliver et al., 2018
Multi-agent architecture
Source: towardsdatascience.com
Generation task
Classification task
Architecture with 2 agents learning a different task and helping each other in an adversarial setup
Image generation
● For image generation, D helps G approximate the true data distribution and generate better images
Source: https://blog.openai.com/generative-models/
Image classification with GANs
● For classification, D is extended to a (K+1)-class classifier, and G helps D by generating additional samples (Salimans et al., 2016; Odena, 2016)
○ True samples are classified into one of the K real classes
○ Generated samples are classified into the (K+1)-th class
Source: https://github.com/buriburisuri/ac-gan
Image classification with GANs
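New loss function of D (following Salimans et al., 2016):
$$L = L_{\text{supervised}} + L_{\text{unsupervised}}$$
where
$$L_{\text{supervised}} = -\mathbb{E}_{x, y \sim p_{\text{data}}(x, y)} \log p_{\text{model}}(y \mid x, y < K+1)$$
and
$$L_{\text{unsupervised}} = -\left\{ \mathbb{E}_{x \sim p_{\text{data}}(x)} \log\left[1 - p_{\text{model}}(y = K+1 \mid x)\right] + \mathbb{E}_{x \sim G} \log p_{\text{model}}(y = K+1 \mid x) \right\}$$
● $L_{\text{supervised}}$ pushes the predicted class of real data to one of the K real classes
● The first term of $L_{\text{unsupervised}}$ pushes the predicted class of real data away from the K+1 class
● The second term of $L_{\text{unsupervised}}$ pushes the predicted class of generated data to the K+1 class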
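The three effects listed for the discriminator loss can be illustrated numerically. The sketch below assumes D outputs a (K+1)-way softmax vector per sample, with index `k` reserved for the generated ("fake") class; `supervised_loss` and `unsupervised_loss` are hypothetical names, and the formulas follow the K+1 formulation of Salimans et al. (2016) rather than any code shown in the talk.

```python
import math

EPS = 1e-12  # avoids log(0) and division by zero


def supervised_loss(probs, labels, k):
    """Cross-entropy over the K real classes only, i.e. -log p(y | x, y < K+1).

    probs: list of (K+1)-way probability vectors; index k is the fake class.
    labels: true class indices in [0, k).
    """
    total = 0.0
    for p, y in zip(probs, labels):
        real_mass = sum(p[:k])  # renormalise over the K real classes
        total -= math.log(p[y] / (real_mass + EPS) + EPS)
    return total / len(probs)


def unsupervised_loss(probs_real, probs_fake, k):
    """Real data is pushed away from class K+1, generated data toward it."""
    real_term = -sum(math.log(1.0 - p[k] + EPS) for p in probs_real) / len(probs_real)
    fake_term = -sum(math.log(p[k] + EPS) for p in probs_fake) / len(probs_fake)
    return real_term + fake_term


# K = 2 real classes, so each prediction has 3 entries.
real_preds = [[0.7, 0.2, 0.1]]
fake_preds = [[0.1, 0.1, 0.8]]
total = supervised_loss(real_preds, [0], 2) + unsupervised_loss(real_preds, fake_preds, 2)
print(total)
```

The unlabeled real samples only enter through `unsupervised_loss`, which is what lets them contribute to training without labels.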
Semi-supervised image classification with GANs
Assumption: a limited amount of labeled data and a large amount of unlabeled data
Problem A: Increase the usefulness of generated samples for D
Problem B: Leverage information contained in the unlabeled samples
Semi-supervised image classification with GANs
Good Semi-supervised Learning That Requires a Bad GAN (Dai et al., 2017)
Problem A: Increase the usefulness of generated samples for D
Perfect generator generates samples around labeled data
No improvement compared to fully supervised learning
Idea: Learn a “complementary distribution”
Complementary distribution is defined as
Generation of low-density samples leveraged by
Semi-supervised image classification with GANs
Good Semi-supervised Learning That Requires a Bad GAN (Dai et al., 2017)
Problem B: Leverage information contained in the unlabeled samples
Idea: Feature matching = reduce the distance between the features of generated samples and those of unlabeled samples
Idea: Reinforce true/fake discrimination for unlabeled data by minimizing the entropy of the predicted class over the K real classes, so that D makes confident predictions
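Both ideas reduce to simple quantities computed on a minibatch. The sketch below is illustrative only: the feature vectors are assumed to come from an intermediate layer of D, `feature_matching_loss` and `conditional_entropy` are hypothetical names, and the entropy is computed over the K real-class probabilities without renormalisation, which is an assumption about the exact form used.

```python
import math


def mean_features(batch):
    """Average each feature dimension over a batch of feature vectors."""
    n = len(batch)
    return [sum(col) / n for col in zip(*batch)]


def feature_matching_loss(fake_features, unlabeled_features):
    """Squared L2 distance between the mean features of the two batches."""
    mu_fake = mean_features(fake_features)
    mu_real = mean_features(unlabeled_features)
    return sum((a - b) ** 2 for a, b in zip(mu_fake, mu_real))


def conditional_entropy(probs, k):
    """Mean entropy of D's predictions restricted to the K real classes."""
    total = 0.0
    for p in probs:
        total -= sum(pi * math.log(pi) for pi in p[:k] if pi > 0)
    return total / len(probs)


fake = [[0.0, 1.0], [2.0, 3.0]]
unlabeled = [[1.0, 2.0], [1.0, 2.0]]
print(feature_matching_loss(fake, unlabeled))  # 0.0, the batch means coincide

# A confident prediction has lower entropy than an undecided one.
print(conditional_entropy([[0.98, 0.01, 0.01]], 2) < conditional_entropy([[0.45, 0.45, 0.1]], 2))  # True
```

Feature matching is a term of the generator objective, while the entropy term acts on the discriminator's predictions for unlabeled data.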
Semi-supervised image classification with GANs
Good Semi-supervised Learning That Requires a Bad GAN (Dai et al., 2017)
Other issue addressed: Generator mode collapse
Idea: Maximize entropy of generated samples
Semi-supervised image classification with GANs
New objective function for D:
● Pushes predicted class of real data to one of the K real classes
● Pushes predicted class of generated data to K+1 class
● Pushes predicted class of unlabeled data to one of the K real classes
● Reinforces true/fake belief on unlabeled data
Semi-supervised image classification with GANs
New objective function for G:
● Minimizes mode collapse
● Generates samples closer to unlabeled data
● Generates low-density samples
Semi-supervised image classification with GANs
Results:
Number of labeled samples: 100 for MNIST, 1000 for SVHN, 4000 for CIFAR-10
Data Augmentation with GANs
PottedPlant, Horse, Bus, ChurchOutdoor, Bicycle, TVMonitor, Sofa
T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of gans for improved quality, stability, and variation," arXiv preprint arXiv:1710.10196, 2017
Why Data Augmentation with GANs?
Learning the distribution of real data while maintaining high image quality
Figure labels: Real Sample, Synthetic Sample, Interpolation, Data distribution, Learnt distribution
What do you mean “not stable”?
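Let’s start with another formulation: Wasserstein GAN with Gradient Penalty
$$L = \underbrace{\mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}\left[D(\tilde{x})\right]}_{(1)} - \underbrace{\mathbb{E}_{x \sim \mathbb{P}_r}\left[D(x)\right]}_{(2)} + \underbrace{\lambda\, \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right]}_{(3)}$$
● (1) Pushes the samples toward the distribution of the real samples
● (2) Defines the distribution of the real samples
● (3) Prevents gradient explosion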
I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville. Improved Training of Wasserstein GANs. arXiv preprint arXiv:1704.00028, 2017
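A small numerical sketch of the WGAN-GP critic loss may help make the three roles concrete. It is not the authors' implementation: `wgan_gp_critic_loss` is a hypothetical name, the critic and its input gradient are passed in as plain functions (a real setup would obtain the gradient by automatic differentiation), and the penalty weight of 10 is the default reported by Gulrajani et al.

```python
import math
import random


def wgan_gp_critic_loss(critic, critic_grad, real, fake, lam=10.0, rng=random):
    """E[D(fake)] - E[D(real)] + lam * E[(||grad D(x_hat)||_2 - 1)^2].

    critic(x) returns a scalar score; critic_grad(x) returns dD/dx as a list.
    x_hat is drawn uniformly on the segment between a real and a fake sample.
    real and fake must contain the same number of samples.
    """
    n = len(real)
    wasserstein = sum(critic(x) for x in fake) / n - sum(critic(x) for x in real) / n
    penalty = 0.0
    for x_r, x_f in zip(real, fake):
        eps = rng.random()
        x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(x_r, x_f)]
        grad_norm = math.sqrt(sum(g * g for g in critic_grad(x_hat)))
        penalty += (grad_norm - 1.0) ** 2
    return wasserstein + lam * penalty / n


# Toy linear critic D(x) = w . x, whose gradient with respect to x is w itself.
w = [1.0, 0.0]
critic = lambda x: sum(wi * xi for wi, xi in zip(w, x))
critic_grad = lambda x: list(w)
real = [[2.0, 0.0], [4.0, 0.0]]
fake = [[0.0, 0.0], [1.0, 0.0]]
print(wgan_gp_critic_loss(critic, critic_grad, real, fake))  # -2.5: unit-norm gradient, no penalty
```

With a steeper critic (for example `w = [2.0, 0.0]`) the gradient norm departs from 1 and the penalty term grows, which is how the loss keeps the critic close to 1-Lipschitz.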
Problem at High Resolutions?
At low resolutions G and D = simple functions
WGAN-GP is based on the Lipschitz-continuity of D
Problem at High Resolutions?
WGAN-GP is based on the Lipschitz-continuity of D
At high resolutions G and D = complex functions
Solution: Progressive Growing (and other details)
T. Karras, T. Aila, S. Laine, and J. Lehtinen. Progressive growing of gans for improved quality, stability, and variation. In arXiv preprint arXiv:1710.10196, 2017.
Solution: Progressive Growing (and other details)
Figure labels: resolutions 4×4, 8×8, 16×16, 32×32 (×2 at each step); Equalized Learning Rate; Convolution: 3×3 / 1; Pixel normalization; Upsampling (nearest neighbor); toRGB; Convolution 1×1 / 1
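Three of the building blocks named in this figure are simple enough to sketch without a deep learning framework. The functions below are illustrative assumptions, not the presenters' code: `upsample_nearest` doubles a 2-D grid, `pixel_norm` follows the per-pixel feature normalisation described by Karras et al., and `equalized_lr_scale` returns the runtime weight scale sqrt(2 / fan_in) used for the equalized learning rate.

```python
import math


def upsample_nearest(image, factor=2):
    """Nearest-neighbour upsampling of a 2-D grid given as a list of rows."""
    out = []
    for row in image:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def pixel_norm(features, eps=1e-8):
    """Rescale one pixel's feature vector so its mean squared value is about 1."""
    mean_sq = sum(f * f for f in features) / len(features)
    return [f / math.sqrt(mean_sq + eps) for f in features]


def equalized_lr_scale(fan_in):
    """Per-layer constant applied to N(0, 1) weights at runtime."""
    return math.sqrt(2.0 / fan_in)


print(upsample_nearest([[1, 2], [3, 4]]))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
print(pixel_norm([3.0, 4.0]))
print(equalized_lr_scale(8))  # 0.5
```

For a 3×3 convolution with `c_in` input channels, `fan_in` would be `9 * c_in`.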
And in practice?
High-resolution images: https://www.youtube.com/watch?v=G06dEcZ-QTg
“This looks too good to be true”, Y. Bengio
Improved results on smaller images as well
Figure labels: Fake, Real
Improved variability
Results on CIFAR-10 in unsupervised mode. The Inception Score is the only “standardized” way of measuring image quality and diversity.

| Method | Inception Score |
|---|---|
| ALI (Dumoulin et al., 2016) | 5.34 ± 0.05 |
| GMAN (Durugkar et al., 2016) | 6.00 ± 0.19 |
| Improved GAN (Salimans et al., 2016) | 6.86 ± 0.06 |
| CEGAN-Ent-VI (Dai et al., 2017) | 7.07 ± 0.07 |
| LR-GAN (Yang et al., 2017) | 7.17 ± 0.17 |
| DFM (Warde-Farley & Bengio, 2017) | 7.72 ± 0.13 |
| WGAN-GP (Gulrajani et al., 2017) | 7.86 ± 0.07 |
| Splitting GAN (Grinblat et al., 2017) | 7.90 ± 0.09 |
| PG-GAN (best run) | 8.80 ± 0.05 |
| PG-GAN (from 10 runs) | 8.56 ± 0.06 |
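For readers unfamiliar with the metric, the Inception Score is the exponential of the average KL divergence between each image's predicted class distribution p(y|x) and the marginal p(y) over all generated images. The sketch below assumes the class probabilities have already been produced by a pretrained classifier (an Inception network in the original definition); here they are supplied by hand, and `inception_score` is a hypothetical name.

```python
import math


def inception_score(probs):
    """exp of the mean KL divergence between p(y|x) and the marginal p(y).

    probs: list of class-probability vectors, one per generated image.
    """
    n = len(probs)
    marginal = [sum(col) / n for col in zip(*probs)]
    total_kl = 0.0
    for p in probs:
        total_kl += sum(pi * math.log(pi / mi) for pi, mi in zip(p, marginal) if pi > 0)
    return math.exp(total_kl / n)


# Confident predictions spread evenly over 2 classes: score close to 2.
print(inception_score([[1.0, 0.0], [0.0, 1.0]]))
# Undecided predictions: score of 1, the minimum.
print(inception_score([[0.5, 0.5], [0.5, 0.5]]))
```

The score is high only when individual predictions are confident (quality) and the classes are covered evenly (diversity), which is why it is read as a joint measure of both.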
Fake Conditional Generation - Pre-Training
Figure labels: Subjects, Fake, Real
Fake Conditional Generation - Fine Tuning
Figure labels: Glasses, Illumination, Hairstyle
And When Training is Successful, Interpolation is Fun!
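Assuming interpolation here means walking between two latent vectors and decoding each intermediate point with G, the path itself can be sketched as below. `lerp` and `interpolation_path` are hypothetical helper names, linear interpolation is only one possible choice (spherical interpolation is also common), and the generator call is left out.

```python
def lerp(z0, z1, t):
    """Linear interpolation between two latent vectors, with t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]


def interpolation_path(z0, z1, steps):
    """Evenly spaced latent vectors from z0 to z1, both included (steps >= 2)."""
    return [lerp(z0, z1, i / (steps - 1)) for i in range(steps)]


path = interpolation_path([0.0, 0.0], [1.0, 2.0], 3)
print(path)  # [[0.0, 0.0], [0.5, 1.0], [1.0, 2.0]]
```

Feeding each vector of `path` to a trained generator would produce the sequence of images shown in such demos.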
Can a GAN really generate new data?
Nearest neighbors found from the training data, based on feature-space distance. We used activations from five VGG layers. Only the crop highlighted in bottom right image was used for comparison in order to exclude image background and focus the search on matching facial features.
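The nearest-neighbor check described above amounts to a search in feature space. The sketch below assumes each image has already been turned into a feature vector (the caption mentions VGG activations; any fixed feature extractor would fit this code), and `nearest_training_index` is a hypothetical name.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_training_index(query_features, training_features):
    """Index of the training sample closest to the query in feature space."""
    return min(
        range(len(training_features)),
        key=lambda i: squared_distance(query_features, training_features[i]),
    )


training = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
generated = [0.9, 1.2]
print(nearest_training_index(generated, training))  # 1
```

If the closest training image still differs visibly from the generated one, that is evidence the generator is not simply memorising the training set.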
Thank You For Your Attention!
Any Questions?
Supplementary Material / Recommended Reading
● I. Goodfellow. NIPS 2016 tutorial: Generative adversarial networks. arXiv preprint arXiv:1701.00160, 2016.
● A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
● M. Mirza and S. Osindero. Conditional generative adversarial nets. arXiv:1411.1784v1, 2014.
● X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In NIPS, 2016.
● P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. In CVPR, 2017.
● J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV, 2017.
● X. Mao, Q. Li, H. Xie, R. Y. K. Lau, and Z. Wang. Least squares generative adversarial networks. arXiv preprint arXiv:1611.04076, 2016.
● M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.
● I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville. Improved training of Wasserstein GANs. arXiv:1704.00028v2, 2017.
● A. Odena, C. Olah, and J. Shlens. Conditional image synthesis with auxiliary classifier GANs. In ICML, 2017.
● D. Warde-Farley and Y. Bengio. Improving generative adversarial networks with denoising feature matching. In ICLR, 2017.
● T. Karras, T. Aila, S. Laine, and J. Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.
● R. D. Hjelm, A. P. Jacob, T. Che, K. Cho, and Y. Bengio. Boundary-seeking generative adversarial networks. arXiv preprint arXiv:1702.08431, 2017.