![Page 1: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/1.jpg)
Progress on Generative Adversarial Networks
Wangmeng Zuo
Vision Perception and Cognition Centre
Harbin Institute of Technology
![Page 2: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/2.jpg)
Content
• Image generation: problem formulation
• Three issues about GAN
• Discriminate a complex distribution from another one
• Improve the training of generator
• Reveal the connection between input and output
![Page 3: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/3.jpg)
Image generation
![Page 4: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/4.jpg)
Image generation
• Goal: learn a generative model that transforms the input into an image from a specific distribution
• Input: an image or a variable from the input distribution
• Output: an image from the desired distribution
• One typical setting: generate an image from Gaussian noise
![Page 5: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/5.jpg)
Image translation (Zhu et al., Arxiv 2017)
![Page 6: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/6.jpg)
Image restoration
Super-resolution (Sajjadi et al., Arxiv 2016)
Deblocking (Guo & Chao, Arxiv 2016)
![Page 7: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/7.jpg)
Face editing
Face hallucination (Sønderby et al., ICLR 2017)
Gender transfer (Li et al., Arxiv 2016)
![Page 8: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/8.jpg)
Domain Adaptation
Refining synthetic images (Shrivastava et al., Arxiv 2016)
Generating realistic images from rendered images (Bousmalis et al., Arxiv 2016)
![Page 9: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/9.jpg)
Image captioning
Dai et al., Arxiv 2017
![Page 10: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/10.jpg)
Three issues you should know about GAN
• Goal: transform a sample or variable from the input distribution into a sample from the desired distribution
• Distribution discrepancy measurement: How to evaluate the closeness between the output and the desired distributions?
• Generation network design: How to design and train the generator?
• How to connect the input and output?
![Page 11: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/11.jpg)
Distribution discrepancy measurement
![Page 12: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/12.jpg)
How to evaluate the closeness between two distributions
• KL-Divergence?
• Discriminator
• Unbiased Look at Dataset Bias (A. Torralba, A. Efros, CVPR 2011)
• $P(x_s)$, $P(x_t)$
![Page 13: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/13.jpg)
Generative Adversarial Networks (Goodfellow et al., NIPS 2014)
• Update the generator to generate more realistic images
• Improve the discriminator to distinguish synthetic images from real ones
![Page 14: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/14.jpg)
Generative Adversarial Nets (Goodfellow et al., NIPS 2014)
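For reference, the two-player game played by the generator and the discriminator is usually written as the minimax objective below. This is the standard form from Goodfellow et al. (NIPS 2014); the notation ($p_{\text{data}}$ for the real-image distribution, $p_z$ for the noise prior) is assumed here and may differ from the figure on the original slide.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The first term rewards $D$ for scoring real images highly, and the second rewards $D$ for rejecting generated images $G(z)$ while $G$ is trained to make that term small.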
![Page 15: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/15.jpg)
Mode Collapse
• D in inner loop: convergence to correct distribution
• G in inner loop: place all mass on most likely point
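One common way to state the two bullets above is that the order of the two optimizations matters. The sketch below assumes $V(D, G)$ denotes the GAN value function of Goodfellow et al. (NIPS 2014); the symbol is not defined on this slide.

```latex
\min_G \max_D V(D, G) \;\neq\; \max_D \min_G V(D, G)
```

With $D$ optimized in the inner loop (left-hand side), the generator is pushed toward the data distribution. With $G$ optimized in the inner loop (right-hand side), the generator can minimize the value by mapping every input to the single point that the current $D$ considers most likely to be real, which is the mode-collapse behavior.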
![Page 16: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/16.jpg)
MMD for measuring distribution discrepancy
• Maximum Mean Discrepancy (MMD) (Borgwardt et al., Bioinformatics 2006)
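As a reminder of what MMD measures, the standard definition is reproduced below. The symbols are assumptions made for this sketch, since the slide's own formula did not survive extraction: $\phi$ is a feature map into a reproducing kernel Hilbert space $\mathcal{H}$, and $k(x, y) = \langle \phi(x), \phi(y) \rangle_{\mathcal{H}}$ is the associated kernel.

```latex
\mathrm{MMD}^2(P, Q)
  = \bigl\| \mathbb{E}_{x \sim P}[\phi(x)] - \mathbb{E}_{y \sim Q}[\phi(y)] \bigr\|_{\mathcal{H}}^2
  = \mathbb{E}_{x, x' \sim P}[k(x, x')]
    - 2\, \mathbb{E}_{x \sim P,\, y \sim Q}[k(x, y)]
    + \mathbb{E}_{y, y' \sim Q}[k(y, y')]
```

The second form depends only on kernel evaluations, so it can be estimated directly from two sets of samples.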
![Page 17: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/17.jpg)
MMD
• Choosing the feature mapping (kernel)
• Linear and Gaussian RBF kernel
• Multiple kernel (Gretton et al., NIPS 2012)
• Let the feature mapping be a CNN (Salimans et al., NIPS 2016)
• Adversarial Learning
• Fix the feature mapping, update the generator to minimize MMD
• Fix the generator, update the feature mapping to maximize MMD
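To make the kernel choices above concrete, here is a minimal pure-Python sketch of the biased (V-statistic) estimate of squared MMD with a linear or Gaussian RBF kernel. The function names, the `sigma` bandwidth parameter, and the toy samples are hypothetical choices for illustration; this is not code from any of the cited papers.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: k(x, y) = <x, y>."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel: k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, kernel):
    """Average of k(x, y) over all pairs (x, y) with x in xs and y in ys."""
    total = sum(kernel(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2_biased(xs, ys, kernel=rbf_kernel):
    """Biased estimate of MMD^2: E[k(x,x')] - 2 E[k(x,y)] + E[k(y,y')]."""
    return (mean_kernel(xs, xs, kernel)
            - 2.0 * mean_kernel(xs, ys, kernel)
            + mean_kernel(ys, ys, kernel))


# Toy 1-D samples (assumed data): `near` is a small shift of `real`,
# `far` is a large shift, so its MMD to `real` should be larger.
real = [[0.0], [1.0]]
near = [[0.1], [1.1]]
far = [[5.0], [6.0]]
print("MMD^2(real, near):", mmd2_biased(real, near))
print("MMD^2(real, far): ", mmd2_biased(real, far))
```

With the linear kernel the estimate reduces to the squared distance between the two sample means, which is why richer kernels (or a learned CNN feature mapping) are needed to detect differences beyond the mean.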
![Page 18: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/18.jpg)
MMD in image generation
• Generative Moment Matching Networks (GMMN) (Li et al., ICML 2015)
![Page 19: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/19.jpg)
Weighted GMMN
![Page 20: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/20.jpg)
MMD in image generation
• Improved GAN (Salimans et al., NIPS 2016)
• Wasserstein GAN (Arjovsky et al., Arxiv 2017)
• Wasserstein-1 distance
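For context, the Wasserstein-1 (earth mover's) distance and its Kantorovich-Rubinstein dual are usually written as follows. The notation ($P_r$ for the real distribution, $P_g$ for the generator distribution, $\Pi(P_r, P_g)$ for the set of joint distributions with those marginals) is assumed for this sketch.

```latex
W(P_r, P_g)
  = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}\bigl[\|x - y\|\bigr]
  = \sup_{\|f\|_L \le 1} \; \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{x \sim P_g}[f(x)]
```

The dual form has the same shape as MMD, a difference of expectations maximized over a function class, except that the class is the set of 1-Lipschitz functions rather than a unit ball in a kernel space.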
![Page 21: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/21.jpg)
Use MMD to evaluate the performance of a generative model (Sutherland et al., 2016)
![Page 22: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/22.jpg)
Design and train the generator
![Page 23: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/23.jpg)
DCGAN (Radford et al., ICLR 2016)
• Fully convolutional networks
• Apply batch normalization (BN) to most layers, except the last layer of the generator and the first layer of the discriminator
• The two mini-batches for the discriminator (real and generated) are normalized separately
![Page 24: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/24.jpg)
Stacked generator
• Zhang et al., Arxiv 2016; Huang et al., Arxiv 2016
![Page 25: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/25.jpg)
Image enhancement: ResNet
• Super-resolution (Ledig et al., Arxiv 2016)
• Facial attribute transfer (Li et al., Arxiv 2016)
![Page 26: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/26.jpg)
Image translation: U-Net
• Image translation (Isola et al., Arxiv 2016)
• Guided face completion (Zhao et al., 2017)
![Page 27: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/27.jpg)
Image captioning
• Dai et al., Arxiv 2017
![Page 28: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/28.jpg)
Connect Input and Output
![Page 29: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/29.jpg)
InfoGAN (Chen et al., NIPS 2016)
• GAN
• InfoGAN (Chen et al., NIPS 2016)
• Input: z, c
• Interpretable and disentangled representations
• Easy to train
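The way InfoGAN ties the latent code $c$ to the output can be summarized by its regularized objective, sketched below as given in Chen et al. (NIPS 2016). Here $V(D, G)$ is the ordinary GAN value function, $I(\cdot\,;\cdot)$ is mutual information, and $\lambda$ is a weighting hyperparameter; the symbols are assumed, since the slide's formula was not extracted.

```latex
\min_G \max_D \; V_I(D, G) = V(D, G) - \lambda\, I\bigl(c;\, G(z, c)\bigr)
```

Maximizing the mutual information between $c$ and the generated image prevents the generator from ignoring $c$, which is what yields the interpretable and disentangled codes. In practice the paper optimizes a variational lower bound on $I$ rather than the term itself.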
![Page 30: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/30.jpg)
Perceptual loss (Li & Wand, Arxiv 2016)
![Page 31: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/31.jpg)
Adaptive perceptual loss (Li et al., Arxiv 2016)
![Page 32: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/32.jpg)
Conditional GAN (Isola et al., Arxiv 2016)
• Supervised GAN learning
• Positive pair: (input, ground truth)
• Negative pair: (input, synthesis)
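The positive and negative pairs map directly onto the two terms of the conditional adversarial loss. The form below follows Isola et al. (Arxiv 2016), with $x$ the input image, $y$ the ground-truth output, and $z$ a noise variable; this notation is assumed for the sketch.

```latex
\mathcal{L}_{cGAN}(G, D)
  = \mathbb{E}_{x, y}\bigl[\log D(x, y)\bigr]
  + \mathbb{E}_{x, z}\bigl[\log\bigl(1 - D(x, G(x, z))\bigr)\bigr]
```

The first expectation is taken over positive pairs (input, ground truth) and the second over negative pairs (input, synthesis), so the discriminator judges whether an output is plausible for that particular input rather than merely realistic on its own.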
![Page 33: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/33.jpg)
Extra Guidance (Zhao et al., 2017)
![Page 34: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/34.jpg)
Cycle-Consistent supervision
• Shen & Liu, Arxiv 2016
• Zhu et al., Arxiv 2017
• Liu et al., Arxiv 2017
• Yi et al., Arxiv 2017
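As one concrete instance of cycle-consistent supervision, the loss used by Zhu et al. (Arxiv 2017) is sketched below. It assumes two mappings, $G : X \to Y$ and $F : Y \to X$, trained on unpaired samples from the two domains; the other papers listed above use related but not identical formulations.

```latex
\mathcal{L}_{cyc}(G, F)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\|F(G(x)) - x\|_1\bigr]
  + \mathbb{E}_{y \sim p_{\text{data}}(y)}\bigl[\|G(F(y)) - y\|_1\bigr]
```

Requiring each translated image to map back to its source connects input and output without paired training data.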
![Page 35: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/35.jpg)
Summary
• How to evaluate the closeness between the output and the desired distributions?
• Classifier and MMD
• How to design and train the generator?
• Problem-specific
• How to connect the input and output?
• Paired, unpaired, guidance
![Page 36: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/36.jpg)
Reference
• Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: NIPS. 2014
• Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks, ICLR 2016.
• P. Isola, J.Y. Zhu, T. Zhou, A.A. Efros, Image-to-image translation with conditional adversarial networks, Arxiv 2016.
• M. S. M. Sajjadi, B. Schölkopf, M. Hirsch, EnhanceNet: Single Image Super-Resolution through Automated Texture Synthesis, Arxiv 2016.
• C. Ledig, L. Theis, F. Huszár, J. Caballero, et al., Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, Arxiv 2016.
• H. Zhang, T. Xu, H. Li, S. Zhang, X. Huang, X. Wang, D. Metaxas, StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, Arxiv 2016.
![Page 37: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/37.jpg)
• C. K. Sønderby, J. Caballero, L. Theis, W. Shi, F. Huszár, Amortised MAP Inference for image super-resolution, ICLR 2017.
• M. Li, W. Zuo, D. Zhang, Deep Identity-aware Transfer of Facial Attributes, Arxiv 2016.
• K. Bousmalis, N. Silberman, D. Dohan, D. Erhan, D. Krishnan, Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks, Arxiv 2016.
• X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, P. Abbeel, InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets, NIPS 2016.
• T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, X. Chen, Improved techniques for training GANs, NIPS 2016.
![Page 38: Progress on Generative Adversarial Networksvalser.org/2017/ppt/APR/12GAN_zwm.pdf · MMD •Choosing •Linear and Gaussian RBF kernel •Multiple kernel (Gretton et al., NIPS 2012)](https://reader033.vdocument.in/reader033/viewer/2022041711/5e481fda1a73a72f782c7e14/html5/thumbnails/38.jpg)
• H. Yan, Y. Ding, P. Li, Q. Wang, Y. Xu, W. Zuo, Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised Domain Adaptation, CVPR 2017.
• A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf, and A. Smola. A kernel two-sample test. Journal of Machine Learning Research, 2012.
• A. Gretton, D. Sejdinovic, H. Strathmann, S. Balakrishnan, M. Pontil, K. Fukumizu, and B. K. Sriperumbudur. Optimal kernel choice for large-scale two-sample tests. NIPS 2012.
• Y. Li, K. Swersky, and R. Zemel. Generative moment matching networks. ICML 2015.
• J.-Y. Zhu, T. Park, P. Isola, A. A. Efros, Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, Arxiv 2017.