
Data Augmentation and Semi-Supervised Learning with Generative Adversarial Networks

Jérôme Rony, Saypraseuth Mounsaveng

The GAN Framework

Principle and Applications


How it all began...

Source: www.les3brasseurs.ca, scholar.google.fr

Source: central photo: www.freeimageslive.co.uk; cat: cc0.photo

Source: bottom cat: www.pinterest.ca/pin/45669383692759692/

Razvan Pascanu

Ian Goodfellow


Source: https://medium.com/@devnag/generative-adversarial-networks-gans-in-50-lines-of-code-pytorch-e81b79659e3f


Source: towardsdatascience.com


A 2-player minimax game

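Training means solving:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

Where:

● D(x) is the probability, estimated by the discriminator D, that x is a real sample

● G(z) is the sample produced by the generator G from the random vector z

● p_data is the distribution of the real data and p_z is the prior over z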

In practice:

Discriminator step:

● Sample a minibatch of random vectors z and generate a minibatch of images with G

● Sample a minibatch of real images

● Compute the loss of D as a binary classifier on real and fake images, backprop and optimize

Generator step:

● Sample a minibatch of random vectors z and generate a minibatch of images with G

● Compute the loss of G by feeding D with the fake images, backprop and optimize
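The alternating updates above can be sketched on a toy 1-D problem. This is only an illustrative sketch, not the presenters' code: the linear generator, the logistic discriminator, the Gaussian "real" data, the hand-derived gradients and the non-saturating generator loss (-log D(G(z))) are all assumptions made to keep the example self-contained.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def d_loss(d_real, d_fake):
    # Binary cross-entropy of D: real images labelled 1, fake images labelled 0
    return -np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake))

def g_loss(d_fake):
    # Non-saturating generator loss (assumption): maximise log D(G(z))
    return -np.mean(np.log(d_fake))

def train_toy_gan(steps=200, batch=64, lr=0.05, seed=0):
    """Alternate D and G updates on 1-D data; returns (w, c, a, b)."""
    rng = np.random.default_rng(seed)
    w, c = 0.1, 0.0   # toy discriminator D(x) = sigmoid(w * x + c)
    a, b = 1.0, 0.0   # toy generator G(z) = a * z + b
    for _ in range(steps):
        # --- Discriminator step ---
        z = rng.standard_normal(batch)          # minibatch of random vectors z
        x_fake = a * z + b                      # minibatch generated with G
        x_real = rng.normal(3.0, 0.5, batch)    # minibatch of toy "real" data
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        # Gradient of d_loss with respect to (w, c), derived by hand
        grad_w = np.mean(-(1.0 - d_real) * x_real) + np.mean(d_fake * x_fake)
        grad_c = np.mean(-(1.0 - d_real)) + np.mean(d_fake)
        w -= lr * grad_w
        c -= lr * grad_c
        # --- Generator step ---
        z = rng.standard_normal(batch)
        d_fake = sigmoid(w * (a * z + b) + c)
        # Gradient of g_loss with respect to (a, b), back-propagated through D
        grad_a = np.mean(-(1.0 - d_fake) * w * z)
        grad_b = np.mean(-(1.0 - d_fake) * w)
        a -= lr * grad_a
        b -= lr * grad_b
    return w, c, a, b
```

In a real setting G and D would be neural networks trained with automatic differentiation; only the order of the two steps carries over from this sketch.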


Advantages

● Flexibility in the type of networks used for the generator and discriminator

○ MLP, CNN or VAE

● Subjectively better visual quality than other generative models

○ VAE images are blurry

● Faster generation: no sequential process involved like in autoregressive models

○ Easier exploration of the latent space

● Adaptation to other tasks like classification


Pitfalls

● Unstable training: the Nash equilibrium is difficult to reach with SGD optimisation due to saddle points

● Mode collapse

● Difficulty handling discrete data (e.g. text)

Source: Unrolled generative adversarial networks (Metz et al., 2017)

Source: Wikipedia

Conditional Generation

Classes shown: monarch butterfly, goldfinch, daisy, redshank, grey whale

128×128 images from ImageNet

A. Odena, C. Olah, and J. Shlens. Conditional image synthesis with auxiliary classifier GANs. arXiv preprint arXiv:1610.09585, 2016

Domain and Style Transfer

P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-Image translation with conditional adversarial networks. In CVPR, 2017

Live demo at https://affinelayer.com/pixsrv/


Domain Transfer at High-Resolution

T.-C. Wang, M.-Y. Liu, J.-Y. Zhu, A. Tao, J. Kautz, and B. Catanzaro. High-resolution image synthesis and semantic manipulation with conditional gans. arXiv preprint arXiv:1711.11585, 2017

2048×1024 images from Cityscapes Dataset


Super-Resolution

C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint arXiv:1609.04802, 2016

Panels: input, SR-GAN, original

Sound Generation

● C. Donahue, J. McAuley, and M. Puckette, “Synthesizing audio with generative adversarial networks,” CoRR, vol. abs/1802.04208, 2018. http://wavegan-v1.s3-website-us-east-1.amazonaws.com/

● E. Hosseini-Asl, Y. Zhou, C. Xiong, and R. Socher, “A Multi-Discriminator CycleGAN for Unsupervised Non-Parallel Speech Domain Adaptation,” arXiv preprint arXiv:1804.00522, 2018. https://einstein.ai/research/a-multi-discriminator-cyclegan-for-unsupervised-non-parallel-speech-domain-adaptation

And Much More!

https://github.com/hindupuravinash/the-gan-zoo

Semi-supervised image classification with GANs

Source: Oliver et al., 2018


Multi-agent architecture

Source: towardsdatascience.com

Generation task

Classification task


Architecture with two agents, each learning a different task and helping the other in an adversarial setup

● For image generation, D helps G approximate the true data distribution and generate better images

Source: https://blog.openai.com/generative-models/

Image generation

Image classification with GANs

● For classification, D is extended to a (K+1)-class classifier, and G helps D by generating additional samples (Salimans et al., 2016; Odena, 2016)

○ True samples are classified into one of the K real classes

○ Generated samples are classified into the (K+1)-th class

Source: https://github.com/buriburisuri/ac-gan

Image classification with GANs

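New loss function of D (following Salimans et al., 2016):

$$L = L_{supervised} + L_{unsupervised}$$

where

$$L_{supervised} = -\mathbb{E}_{x, y \sim p_{data}(x, y)} \log p_{model}(y \mid x, y \le K)$$

and

$$L_{unsupervised} = -\left\{ \mathbb{E}_{x \sim p_{data}(x)} \log\left[1 - p_{model}(y = K+1 \mid x)\right] + \mathbb{E}_{x \sim G} \log\left[p_{model}(y = K+1 \mid x)\right] \right\}$$

● L_supervised pushes the predicted class of real data to one of the K real classes

● The first term of L_unsupervised pushes the predicted class of real data away from the K+1 class

● The second term of L_unsupervised pushes the predicted class of generated data to the K+1 class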
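As a rough illustration of the K+1 loss described above, the sketch below computes a supervised and an unsupervised term from raw classifier outputs. It is a minimal NumPy sketch under stated assumptions: the function names, the convention that the last logit column is the "generated" class, and the use of an unlabeled batch as the real data in the unsupervised term are illustrative choices, not taken from the slides.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def k_plus_one_loss(logits_labeled, labels, logits_unlabeled, logits_fake):
    """Logits have K+1 columns; the last column is the 'generated' class."""
    eps = 1e-12
    p_lab = softmax(logits_labeled)
    p_unl = softmax(logits_unlabeled)
    p_fake = softmax(logits_fake)
    # Supervised term: class probability renormalised over the K real classes
    p_real = p_lab[:, :-1] / (p_lab[:, :-1].sum(axis=1, keepdims=True) + eps)
    rows = np.arange(len(labels))
    l_sup = -np.mean(np.log(p_real[rows, labels] + eps))
    # Unsupervised term: real data away from class K+1, generated data toward it
    l_unsup = -(np.mean(np.log(1.0 - p_unl[:, -1] + eps))
                + np.mean(np.log(p_fake[:, -1] + eps)))
    return l_sup, l_unsup
```

With all-zero logits and K = 3, every class has probability 0.25, so the supervised term is log 3 and the unsupervised term is -(log 0.75 + log 0.25).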

Semi-supervised image classification with GANs

Assumption: a limited amount of labeled data and a large amount of unlabeled data

Problem A: Increase the usefulness of generated samples for D

Problem B: Leverage information contained in the unlabeled samples



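Good Semi-supervised Learning That Requires a Bad GAN (Dai et al., 2017)

Problem A: Increase the usefulness of generated samples for D

A perfect generator generates samples around the labeled data

No improvement compared to fully supervised learning

Idea: Learn a “complementary distribution”

The complementary distribution is defined as

$$p^*(x) = \begin{cases} \dfrac{1}{Z}\,\dfrac{1}{p(x)} & \text{if } p(x) > \epsilon \text{ and } x \in B_x \\ C & \text{if } p(x) \le \epsilon \text{ and } x \in B_x \end{cases}$$

where p(x) is the data density, ε is a threshold, Z is a normalizer, C is a constant and B_x is the bounded data space.

Generation of low-density samples is encouraged by the generator term

$$\mathbb{E}_{x \sim p_G}\left[\log p(x)\, \mathbb{1}[p(x) > \epsilon]\right]$$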



Problem B: Leverage information contained in the unlabeled samples

Idea: Feature matching = reduce the distance between the features of generated samples and those of unlabeled samples

Idea: Reinforce true/fake discrimination for unlabeled data by minimizing the conditional entropy of the predicted class over the K real classes



Other issue addressed: Generator mode collapse

Idea: Maximize entropy of generated samples



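New objective function for D (following Dai et al., 2017), with labeled set L, unlabeled set U and generator distribution p_G:

$$\max_D \; \mathbb{E}_{x, y \sim \mathcal{L}} \log P_D(y \mid x, y \le K) + \mathbb{E}_{x \sim p_G} \log P_D(K+1 \mid x) + \mathbb{E}_{x \sim \mathcal{U}} \log P_D(y \le K \mid x) + \mathbb{E}_{x \sim \mathcal{U}} \sum_{k=1}^{K} P_D(k \mid x) \log P_D(k \mid x)$$

● First term: pushes the predicted class of real data to one of the K real classes

● Second term: pushes the predicted class of generated data to the K+1 class

● Third term: pushes the predicted class of unlabeled data to one of the K real classes

● Fourth term: reinforces the true/fake belief on unlabeled data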
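The four annotated terms of the discriminator objective can be sketched numerically as follows. This is a hedged NumPy sketch, not the authors' implementation: the function name, the argument layout and the convention that the last logit column is the K+1 ("generated") class are assumptions, and the fourth term is written as a negative conditional entropy over the K real classes.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def discriminator_objective(logits_l, labels, logits_u, logits_g):
    """Value to maximise with respect to D; logits have K+1 columns."""
    eps = 1e-12
    p_l, p_u, p_g = softmax(logits_l), softmax(logits_u), softmax(logits_g)
    rows = np.arange(len(labels))
    # 1. Labeled data: log P(y | x, y among the K real classes)
    p_cond = p_l[:, :-1] / p_l[:, :-1].sum(axis=1, keepdims=True)
    term_labeled = np.mean(np.log(p_cond[rows, labels] + eps))
    # 2. Generated data: log P(class K+1 | x)
    term_generated = np.mean(np.log(p_g[:, -1] + eps))
    # 3. Unlabeled data: log P(y among the K real classes | x)
    term_unlabeled = np.mean(np.log(p_u[:, :-1].sum(axis=1) + eps))
    # 4. Unlabeled data: sum_k P(k|x) log P(k|x), i.e. minus the entropy
    term_entropy = np.mean(
        np.sum(p_u[:, :-1] * np.log(p_u[:, :-1] + eps), axis=1))
    return term_labeled + term_generated + term_unlabeled + term_entropy
```

A training loop would take a gradient ascent step on this value (or descent on its negative) for D while holding G fixed.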



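New objective function for G (following Dai et al., 2017), with feature extractor f, data density p and threshold ε:

$$\min_G \; -\mathcal{H}(p_G) + \left\| \mathbb{E}_{x \sim p_G} f(x) - \mathbb{E}_{x \sim \mathcal{U}} f(x) \right\|^2 + \mathbb{E}_{x \sim p_G}\left[\log p(x)\, \mathbb{1}[p(x) > \epsilon]\right]$$

● First term (entropy of p_G): minimizes mode collapse

● Second term (feature matching): generates samples closer to the unlabeled data

● Third term: generates low-density samples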
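Two of the generator terms annotated above, feature matching and the low-density term, are simple enough to sketch directly. The sketch below assumes that features and log-densities have already been computed elsewhere (for instance by an intermediate layer of D and by a separate density model); the function names are hypothetical, and the entropy term is left out because it needs its own estimator.

```python
import numpy as np

def feature_matching(feat_generated, feat_unlabeled):
    """Squared distance between the mean features of the two batches."""
    diff = feat_generated.mean(axis=0) - feat_unlabeled.mean(axis=0)
    return float(np.sum(diff ** 2))

def low_density_term(log_density, log_threshold):
    """Mean of log p(x) * 1[p(x) > eps] over generated samples.

    Only samples whose estimated density exceeds the threshold are
    penalised, which pushes the generator toward low-density regions.
    """
    mask = log_density > log_threshold
    return float(np.mean(log_density * mask))
```

For example, with log-densities [-1, -5] and a log-threshold of -2, only the first sample is above the threshold, so the term equals -0.5.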

Results:

Number of labeled samples: 100 for MNIST, 1,000 for SVHN, 4,000 for CIFAR-10

Data Augmentation with GANs

Classes shown: potted plant, horse, bus, church (outdoor), bicycle, TV monitor, sofa

T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation," arXiv preprint arXiv:1710.10196, 2017

Why Data Augmentation with GANs?

Learning the distribution of real data while maintaining high image quality

Figure labels: real sample, synthetic sample, interpolation, data distribution, learnt distribution

What do you mean “not stable”?


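Let’s start with another formulation: Wasserstein GAN with Gradient Penalty

$$L = \mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}[D(\tilde{x})] - \mathbb{E}_{x \sim \mathbb{P}_r}[D(x)] + \lambda\, \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}\left[\left(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1\right)^2\right]$$

● First term (generated samples): pushes the samples toward the distribution of the real samples

● Second term (real samples): defines the distribution of the real samples

● Third term (gradient penalty): prevents gradient explosion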

I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville. Improved Training of Wasserstein GANs. arXiv preprint arXiv:1704.00028, 2017
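To make the gradient penalty concrete, here is a minimal sketch of the WGAN-GP critic loss for a linear critic D(x) = x · w. The linear critic is an assumption chosen so that the gradient with respect to the input is simply w and no automatic differentiation is needed; λ = 10 is the value used by Gulrajani et al., and the function name is hypothetical.

```python
import numpy as np

def wgan_gp_critic_loss(w, x_real, x_fake, lam=10.0, seed=0):
    """Critic loss and penalty for a linear critic D(x) = x @ w."""
    rng = np.random.default_rng(seed)
    # Random points on the segments between real and generated samples
    eps = rng.uniform(size=(x_real.shape[0], 1))
    x_hat = eps * x_real + (1.0 - eps) * x_fake
    # For a linear critic the gradient of D at every x_hat is w itself
    grad = np.broadcast_to(w, x_hat.shape)
    grad_norm = np.linalg.norm(grad, axis=1)
    # Penalise deviations of the gradient norm from 1
    penalty = lam * np.mean((grad_norm - 1.0) ** 2)
    wasserstein = np.mean(x_fake @ w) - np.mean(x_real @ w)
    return wasserstein + penalty, penalty
```

With a unit-norm w the penalty vanishes; with w = (3, 4) the gradient norm is 5 and the penalty is 10 · (5 − 1)² = 160. For a neural-network critic the gradient at x_hat would have to come from an autodiff framework instead.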

Problem at High Resolutions?

WGAN-GP is based on the Lipschitz continuity of D

At low resolutions, G and D are simple functions

At high resolutions, G and D are no longer simple functions

Solution: Progressive Growing (and other details)

T. Karras, T. Aila, S. Laine, and J. Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.

[Figure: progressive growing. The resolution doubles (×2) at each stage: 4×4, 8×8, 16×16, 32×32. Legend: Convolution 3×3 / 1 with equalized learning rate and pixel normalization; upsampling (nearest neighbor); toRGB = Convolution 1×1 / 1]
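Three of the building blocks named in the figure (pixel normalization, nearest-neighbor upsampling, equalized learning rate) can be sketched in a few lines. This is a NumPy sketch under assumptions: tensors are laid out as (batch, channels, height, width), the epsilon of 1e-8 and the He-style constant sqrt(2 / fan_in) follow the usual description of the method, and the function names are hypothetical.

```python
import numpy as np

def pixel_norm(x, eps=1e-8):
    """Scale each pixel's feature vector to unit mean square. x: (N, C, H, W)."""
    return x / np.sqrt(np.mean(x ** 2, axis=1, keepdims=True) + eps)

def upsample_nearest(x, factor=2):
    """Nearest-neighbor upsampling: repeat every pixel along height and width."""
    return x.repeat(factor, axis=2).repeat(factor, axis=3)

def equalized_lr_scale(fan_in):
    """Per-layer constant multiplied into the weights at run time."""
    return np.sqrt(2.0 / fan_in)
```

Growing from 4×4 to 8×8 then amounts to upsampling the 4×4 feature maps and applying new convolution layers at the higher resolution.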

And in practice?

High-resolution images: https://www.youtube.com/watch?v=G06dEcZ-QTg

“This looks too good to be true”, Y. Bengio


Improved results on smaller images as well

Panels: fake, real

Improved variability

Method                                    Inception Score
ALI (Dumoulin et al., 2016)               5.34 ± 0.05
GMAN (Durugkar et al., 2016)              6.00 ± 0.19
Improved GAN (Salimans et al., 2016)      6.86 ± 0.06
CEGAN-Ent-VI (Dai et al., 2017)           7.07 ± 0.07
LR-GAN (Yang et al., 2017)                7.17 ± 0.17
DFM (Warde-Farley & Bengio, 2017)         7.72 ± 0.13
WGAN-GP (Gulrajani et al., 2017)          7.86 ± 0.07
Splitting GAN (Grinblat et al., 2017)     7.90 ± 0.09
PG-GAN (best run)                         8.80 ± 0.05
PG-GAN (from 10 runs)                     8.56 ± 0.06

Results on CIFAR-10 in unsupervised mode. The Inception Score is the only “standardized” way of measuring image quality and diversity.
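For reference, the Inception Score is exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) are the class probabilities a pretrained classifier assigns to a generated image and p(y) is their average over all generated images. The sketch below assumes those probabilities are already available as an array; in practice they come from an Inception network, which is not included here, and the function name is hypothetical.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (N, K) class probabilities p(y|x) for N generated images."""
    p_y = probs.mean(axis=0, keepdims=True)  # marginal class distribution p(y)
    # KL divergence between p(y|x) and p(y) for every image
    kl = np.sum(probs * (np.log(probs + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))
```

The score is 1 when every image gets the same prediction distribution (no quality or no diversity signal) and reaches K when predictions are fully confident and evenly spread over the K classes.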

Fake Conditional Generation - Pre-Training

Panels: subjects, fake, real

Fake Conditional Generation - Fine Tuning

Attributes shown: glasses, illumination, hairstyle

And When Training is Successful, Interpolation is Fun!
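The slides do not say how the interpolation is computed; a common choice, assumed here, is to interpolate linearly between two latent vectors and feed each intermediate vector to the trained generator. The helper name below is hypothetical and the generator itself is not part of the sketch.

```python
import numpy as np

def interpolate_latents(z_start, z_end, num_steps):
    """Return num_steps latent vectors evenly spaced from z_start to z_end."""
    t = np.linspace(0.0, 1.0, num_steps)[:, None]  # interpolation weights
    return (1.0 - t) * z_start + t * z_end
```

Each row of the result would be passed through G to obtain one frame of the interpolation sequence.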


Can a GAN really generate new data?

Nearest neighbors found from the training data, based on feature-space distance. We used activations from five VGG layers. Only the crop highlighted in bottom right image was used for comparison in order to exclude image background and focus the search on matching facial features.
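The nearest-neighbor check described above can be sketched as a search in feature space. The sketch assumes feature vectors have already been extracted for generated and training images (for example from VGG activations, which are not computed here) and uses squared Euclidean distance; both the distance and the function name are assumptions.

```python
import numpy as np

def nearest_training_neighbors(gen_features, train_features, k=1):
    """Indices of the k closest training samples for each generated sample."""
    # Pairwise squared Euclidean distances, shape (num_generated, num_training)
    diff = gen_features[:, None, :] - train_features[None, :, :]
    dist = np.sum(diff ** 2, axis=2)
    return np.argsort(dist, axis=1)[:, :k]
```

If the closest training images are visibly different from the generated ones, the generator is not simply memorising its training set.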


Thank You For Your Attention!

Any Questions?

Supplementary Material / Recommended Reading

● I. Goodfellow. NIPS 2016 tutorial: Generative adversarial networks. arXiv preprint arXiv:1701.00160, 2016.

● A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.

● M. Mirza and S. Osindero. Conditional generative adversarial nets. arXiv:1411.1784v1, 2014.

● X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In NIPS, 2016.

● P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. In CVPR, 2017.

● J. Zhu, T. Park, P. Isola, and A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In International Conference on Computer Vision (ICCV), to appear, 2017.

● X. Mao, Q. Li, H. Xie, R. Y. K. Lau, and Z. Wang. Least squares generative adversarial networks. arXiv:1611.04076, 2016.

● M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.

● I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville. Improved training of Wasserstein GANs. arXiv:1704.00028v2, 2017.

● A. Odena, C. Olah, and J. Shlens. Conditional image synthesis with auxiliary classifier GANs. In ICML, 2017.

● D. Warde-Farley and Y. Bengio. Improving generative adversarial networks with denoising feature matching. In ICLR, 2017.

● T. Karras, T. Aila, S. Laine, and J. Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.

● R. D. Hjelm, A. P. Jacob, T. Che, K. Cho, and Y. Bengio. Boundary-seeking generative adversarial networks. arXiv preprint arXiv:1702.08431, 2017.
