Data Augmentation and Semi-Supervised learning with
Generative Adversarial Networks
Jérôme Rony, Saypraseuth Mounsaveng
The GAN Framework
Principle and Applications
2
How it all began...
Source: www.les3brasseurs.ca, scholar.google.fr
Source: central photo: www.freeimageslive.co.uk; cat: cc0.photo
Source: bottom cat: www.pinterest.ca/pin/45669383692759692/
Razvan Pascanu
Ian Goodfellow
3
Generative Adversarial Networks
4
Source: https://medium.com/@devnag/generative-adversarial-networks-gans-in-50-lines-of-code-pytorch-e81b79659e3f
Generative Adversarial Networks
Source: towardsdatascience.com
5
A 2-player minimax game
Training means solving:
$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$
Where:
● G maps a noise vector $z \sim p_z(z)$ to a generated image $G(z)$
● D outputs the probability that its input is a real image rather than a generated one
In practice, D and G are updated alternately:
● Sample a minibatch of random vectors z and generate a minibatch of images with G
● Sample a minibatch of real images
● Compute the loss of D as a binary classifier on real and fake images, backprop and optimize
● Sample a minibatch of random vectors z and generate a minibatch of images with G
● Compute the loss of G by feeding D the fake images, backprop and optimize
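The alternating loss computations above can be sketched numerically. A minimal NumPy sketch, assuming `d_real` and `d_fake` stand in for the discriminator's outputs on real and generated minibatches, and using the non-saturating generator loss that is common in practice:

```python
import numpy as np

def d_loss(d_real, d_fake):
    # Discriminator's binary cross-entropy as a real/fake classifier:
    # -E_x[log D(x)] - E_z[log(1 - D(G(z)))]
    return float(-(np.log(d_real).mean() + np.log(1.0 - d_fake).mean()))

def g_loss(d_fake):
    # Non-saturating generator loss: -E_z[log D(G(z))]
    return float(-np.log(d_fake).mean())

# Toy discriminator outputs for a confident D
d_real = np.full(64, 0.9)   # D(x) on a minibatch of real images
d_fake = np.full(64, 0.1)   # D(G(z)) on a minibatch of generated images
print(round(d_loss(d_real, d_fake), 4))  # -> 0.2107
print(round(g_loss(d_fake), 4))          # -> 2.3026
```

In a real training loop these losses would be backpropagated through D and G in turn; here the discriminator outputs are fixed toy values.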
6
Advantages
● Flexibility on the type of networks used for the generator and discriminator
○ MLP, CNN or VAE
● Subjectively better visual quality than other generative models
○ VAE images are blurry
● Faster generation: no sequential process involved like in autoregressive models
○ Easier exploration of the latent space
● Adaptation to other tasks like classification
7
Pitfalls
● Unstable training: Nash equilibrium difficult to reach with SGD optimization due to saddle points
● Mode collapse
● Difficulty to handle discrete data (e.g. text)
8
Source: Unrolled generative adversarial networks (Metz et al., 2017)
Source: Wikipedia
Conditional Generation
Classes: monarch butterfly, goldfinch, daisy, redshank, grey whale
128×128 images from ImageNet
A. Odena, C. Olah, and J. Shlens. Conditional image synthesis with auxiliary classifier GANs. arXiv preprint arXiv:1610.09585, 2016
9
Domain and Style Transfer
P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-Image translation with conditional adversarial networks. In CVPR, 2017
Live demo at https://affinelayer.com/pixsrv/
10
Domain Transfer at High-Resolution
T.-C. Wang, M.-Y. Liu, J.-Y. Zhu, A. Tao, J. Kautz, and B. Catanzaro. High-resolution image synthesis and semantic manipulation with conditional GANs. arXiv preprint arXiv:1711.11585, 2017
2048×1024 images from Cityscapes Dataset
11
Domain Transfer at High-Resolution
12
Super-Resolution
C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint arXiv:1609.04802, 2016
Columns: input, SR-GAN output, original
13
Sound Generation
● C. Donahue, J. McAuley, and M. Puckette. Synthesizing audio with generative adversarial networks. CoRR, abs/1802.04208, 2018. http://wavegan-v1.s3-website-us-east-1.amazonaws.com/
● E. Hosseini-Asl, Y. Zhou, C. Xiong, and R. Socher. A multi-discriminator CycleGAN for unsupervised non-parallel speech domain adaptation. arXiv preprint arXiv:1804.00522, 2018. https://einstein.ai/research/a-multi-discriminator-cyclegan-for-unsupervised-non-parallel-speech-domain-adaptation
14
And Much More!
15
https://github.com/hindupuravinash/the-gan-zoo
Semi-Supervised Image Classification with GANs
Source: Oliver et al., 2018
16
Multi-agent architecture
Source: towardsdatascience.com
Generation task
Classification task
17
Architecture with two agents, each learning a different task and helping the other in an adversarial setup
● For image generation, D helps G approximate the true data distribution and generate better images
Source: https://blog.openai.com/generative-models/
Image generation
18
Image classification with GANs
● For classification, D is extended to a (K+1)-class classifier, and G helps D by generating additional samples (Salimans et al., 2016; Odena, 2016)
○ True samples are classified into one of the K real classes
○ Generated samples are classified into the (K+1)-th class
19
Source: https://github.com/buriburisuri/ac-gan
Image classification with GANs
New loss function of D:
$L_D = L_{\text{supervised}} + L_{\text{unsupervised}}$
where
$L_{\text{supervised}} = -\mathbb{E}_{x,y \sim p_{\text{data}}} \log p_D(y \mid x, y \le K)$
pushes the predicted class of real data to one of the K real classes, and
$L_{\text{unsupervised}} = -\mathbb{E}_{x \sim p_{\text{data}}} \log\left(1 - p_D(K{+}1 \mid x)\right) - \mathbb{E}_{x \sim p_G} \log p_D(K{+}1 \mid x)$
whose first term pushes the predicted class of real data away from the K+1 class, and whose second term pushes the predicted class of generated data to the K+1 class
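A hedged NumPy sketch of this (K+1)-class discriminator loss, assuming `logits_real` and `logits_fake` are the discriminator's raw outputs over K real classes plus one fake class at index K:

```python
import numpy as np

def log_softmax(logits):
    # numerically stable log-softmax over the class axis
    z = logits - logits.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def d_loss_kplus1(logits_real, labels, logits_fake, K):
    lp_real = log_softmax(logits_real)
    lp_fake = log_softmax(logits_fake)
    # supervised: push labeled real data to its true class among the K real ones
    l_label = -lp_real[np.arange(len(labels)), labels].mean()
    # push real data away from the fake class (index K)
    l_real = -np.log(1.0 - np.exp(lp_real[:, K])).mean()
    # push generated data toward the fake class
    l_fake = -lp_fake[:, K].mean()
    return float(l_label + l_real + l_fake)
```

With well-separated logits (real data confident on its label, fake data confident on class K), this loss is near zero; confusing the two drives it up.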
20
Semi-supervised image classification with GANs
Hypothesis: limited amount of labeled data, large amount of unlabeled data
Problem A: Increase the usefulness of generated samples for D
Problem B: Leverage information contained in the unlabeled samples
21
Semi-supervised image classification with GANs
Good Semi-supervised Learning That Requires a Bad GAN (Dai et al., 2017)
Problem A: Increase the usefulness of generated samples for D
A perfect generator would generate samples around the labeled data
No improvement compared to fully supervised learning
Idea: learn a “complementary distribution” instead
The complementary distribution is defined as
$\bar{p}(x) = \begin{cases} \frac{1}{\tau}\frac{1}{p(x)} & \text{if } p(x) > \epsilon \text{ and } x \in \mathcal{B}_x \\ C & \text{otherwise} \end{cases}$
Generation of low-density samples is encouraged by the term $\mathbb{E}_{x \sim p_G} \log p(x)\,\mathbb{1}[p(x) > \epsilon]$ in the generator objective
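The complementary-distribution idea can be illustrated with a toy sampler. A sketch assuming a hypothetical 1-D mixture density `p_data` and threshold `eps`; it keeps only proposals that fall in low-density regions (the bounding set B_x and constant C of the full definition are omitted for simplicity):

```python
import numpy as np

rng = np.random.default_rng(0)

def p_data(x):
    # toy 1-D data density: mixture of two Gaussian-shaped bumps at +/-2
    return 0.5 * np.exp(-(x - 2) ** 2 / 0.5) + 0.5 * np.exp(-(x + 2) ** 2 / 0.5)

eps = 0.05  # density threshold epsilon
proposals = rng.uniform(-4.0, 4.0, size=10_000)
# keep only proposals in low-density regions of the data distribution
complement = proposals[p_data(proposals) <= eps]
```

The retained samples avoid the two data modes, which is exactly where a "bad" complement generator should place its mass.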
22
Semi-supervised image classification with GANs
Good Semi-supervised Learning That Requires a Bad GAN (Dai et al., 2017)
Problem B: Leverage information contained in the unlabeled samples
Idea: feature matching, i.e. reduce the distance between the mean features of generated samples and unlabeled samples
Idea: reinforce the true/fake discrimination on unlabeled data by maximizing the entropy of the predicted class over the K real classes
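Feature matching compares mean feature activations of the two batches. A minimal sketch, where `h_fake` and `h_unlabeled` are assumed to be batches of intermediate discriminator features:

```python
import numpy as np

def feature_matching_loss(h_fake, h_unlabeled):
    # squared L2 distance between the mean feature activations of a
    # generated batch and an unlabeled batch (features h from D's hidden layer)
    diff = h_fake.mean(axis=0) - h_unlabeled.mean(axis=0)
    return float(diff @ diff)
```

Identical feature statistics give a zero loss; minimizing this term pulls generated samples toward the unlabeled data in feature space.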
23
Semi-supervised image classification with GANs
Good Semi-supervised Learning That Requires a Bad GAN (Dai et al., 2017)
Other issue addressed: generator mode collapse
Idea: maximize the entropy of the generated samples
24
Semi-supervised image classification with GANs
New objective function for D:
$\max_D \; \mathbb{E}_{x,y \sim \mathcal{L}} \log p_D(y \mid x, y \le K) + \mathbb{E}_{x \sim p_G} \log p_D(K{+}1 \mid x) + \mathbb{E}_{x \sim \mathcal{U}} \log\left(1 - p_D(K{+}1 \mid x)\right) + \mathbb{E}_{x \sim \mathcal{U}} \sum_{k=1}^{K} p_D(k \mid x) \log p_D(k \mid x)$
● First term: pushes the predicted class of labeled real data to one of the K real classes
● Second term: pushes the predicted class of generated data to the K+1 class
● Third term: pushes the predicted class of unlabeled data to one of the K real classes
● Fourth term: reinforces the true/fake belief on unlabeled data
25
Semi-supervised image classification with GANs
New objective function for G:
$\min_G \; -\mathcal{H}(p_G) + \mathbb{E}_{x \sim p_G} \log p(x)\,\mathbb{1}[p(x) > \epsilon] + \left\| \mathbb{E}_{x \sim p_G} h(x) - \mathbb{E}_{x \sim \mathcal{U}} h(x) \right\|^2$
● The entropy term $-\mathcal{H}(p_G)$ minimizes mode collapse
● The feature-matching term generates samples closer to the unlabeled data
● The indicator term generates low-density samples
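The low-density term can be sketched in isolation. Assuming `px` holds an (estimated) data density evaluated at generated samples, the indicator masks out samples that are already in low-density regions:

```python
import numpy as np

def low_density_term(px, eps):
    # E_{x~G}[ log p(x) * 1(p(x) > eps) ]: penalizes generating samples
    # in high-density regions of the data distribution; samples already
    # below the threshold eps contribute nothing
    mask = px > eps
    return float(np.where(mask, np.log(np.maximum(px, 1e-12)), 0.0).mean())
```

For a batch with densities [0.5, 0.01] and eps = 0.05, only the first sample is penalized, giving log(0.5)/2 ≈ -0.3466.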
26
Semi-supervised image classification with GANs
Results:
27
# of labeled samples: 100 for MNIST, 1000 for SVHN, 4000 for CIFAR-10
Data Augmentation with GANs
Classes: potted plant, horse, bus, church (outdoor), bicycle, TV monitor, sofa
T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of gans for improved quality, stability, and variation," arXiv preprint arXiv:1710.10196, 2017
28
Why Data Augmentation with GANs?
Learning the distribution of real data while maintaining high image quality
Figure labels: real sample, synthetic sample, interpolation; data distribution vs. learnt distribution
29
What do you mean “not stable”?
30
Let’s start with another formulation: Wasserstein GAN with Gradient Penalty (WGAN-GP)
$L = \mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}[D(\tilde{x})] - \mathbb{E}_{x \sim \mathbb{P}_r}[D(x)] + \lambda\, \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}\left[ \left(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1\right)^2 \right]$
● The generator term pushes the samples toward the distribution of the real samples
● The critic term on real data defines the distribution of the real samples
● The gradient penalty prevents gradient explosion by keeping D close to 1-Lipschitz
I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville. Improved Training of Wasserstein GANs. arXiv preprint arXiv:1704.00028, 2017
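The gradient penalty can be checked on a toy critic. A sketch using a hypothetical linear critic D(x) = w·x, whose gradient with respect to the input is the constant w, so the penalty on random real/fake interpolates has a closed form:

```python
import numpy as np

rng = np.random.default_rng(0)

w = np.array([0.6, 0.8])           # linear critic D(x) = w . x, grad_x D = w
real = rng.normal(1.0, 0.1, (16, 2))
fake = rng.normal(-1.0, 0.1, (16, 2))
t = rng.uniform(size=(16, 1))
x_hat = t * real + (1 - t) * fake  # random interpolates between real and fake
grads = np.tile(w, (16, 1))        # gradient of a linear critic is constant
penalty = ((np.linalg.norm(grads, axis=1) - 1.0) ** 2).mean()
# here ||w||_2 = 1, so the penalty is zero: this critic is exactly 1-Lipschitz
```

In an actual WGAN-GP implementation the gradient at `x_hat` is obtained by automatic differentiation through the critic network; the linear critic just makes the computation transparent.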
31
Problem at High Resolutions?
At low resolutions G and D = simple functions
WGAN-GP is based on the Lipschitz-continuity of D
32
Problem at High Resolutions?
WGAN-GP is based on the Lipschitz-continuity of D
At high resolutions G and D = complex functions, making the Lipschitz constraint on D harder to enforce
33
Solution: Progressive Growing (and other details)
34
T. Karras, T. Aila, S. Laine, and J. Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017
Solution: Progressive Growing (and other details)
35
Architecture sketch: the generator starts at 4×4 resolution and grows by ×2 stages (4×4 → 8×8 → 16×16 → 32×32 → …) using nearest-neighbor upsampling
Each block uses 3×3 / stride 1 convolutions with pixel normalization and an equalized learning rate
A toRGB layer (1×1 / stride 1 convolution) maps the feature maps to an image at each resolution
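The fade-in between resolution stages can be sketched as a blend of the upsampled previous stage and the new block's output (the shapes and the `alpha` schedule here are illustrative assumptions, not the paper's exact code):

```python
import numpy as np

def upsample_nn(img):
    # nearest-neighbor x2 upsampling between resolution stages, (H, W, C)
    return img.repeat(2, axis=0).repeat(2, axis=1)

def fade_in(prev_rgb, new_rgb, alpha):
    # smoothly blend the upsampled previous stage with the new block's
    # toRGB output while the new layers are faded in (alpha goes 0 -> 1)
    return alpha * new_rgb + (1.0 - alpha) * upsample_nn(prev_rgb)

low = np.zeros((4, 4, 3))   # toRGB output of the 4x4 stage
high = np.ones((8, 8, 3))   # toRGB output of the new 8x8 stage
out = fade_in(low, high, alpha=0.5)
```

At alpha = 0 the network still outputs the upsampled low-resolution image; at alpha = 1 only the new block contributes, so training transitions smoothly between resolutions.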
And in practice?
36
High-resolution images: https://www.youtube.com/watch?v=G06dEcZ-QTg
37
“This looks too good to be true”, Y. Bengio
38
Improved results on smaller images as well
39
Figure labels: fake, real
Improved variability
Method Inception Score
ALI (Dumoulin et al., 2016) 5.34 ± 0.05
GMAN (Durugkar et al., 2016) 6.00 ± 0.19
Improved GAN (Salimans et al., 2016) 6.86 ± 0.06
CEGAN-Ent-VI (Dai et al., 2017) 7.07 ± 0.07
LR-AGN (Yang et al., 2017) 7.17 ± 0.17
DFM (Warde-Farley & Bengio, 2017) 7.72 ± 0.13
WGAN-GP (Gulrajani et al., 2017) 7.86 ± 0.07
Splitting GAN (Grinblat et al. 2017) 7.90 ± 0.09
PG-GAN (best run) 8.80 ± 0.05
PG-GAN (from 10 runs) 8.56 ± 0.06
Results on CIFAR-10 in unsupervised mode. The Inception Score is the only “standardized” way of measuring image quality and diversity.
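Given class probabilities from a pretrained classifier, the Inception Score itself is simple to compute. A sketch with toy probability matrices (the real metric uses the Inception network's softmax outputs on generated images):

```python
import numpy as np

def inception_score(probs):
    # probs: (N, C) softmax outputs p(y|x) of a pretrained classifier
    # IS = exp( E_x [ KL( p(y|x) || p(y) ) ] )
    p_y = probs.mean(axis=0)  # marginal class distribution over the batch
    kl = (probs * (np.log(probs) - np.log(p_y))).sum(axis=1)
    return float(np.exp(kl.mean()))

# uniform predictions: no confidence, no diversity beyond the marginal -> 1.0
print(inception_score(np.full((10, 5), 0.2)))  # -> 1.0
# confident and diverse predictions push the score toward the number of classes
print(round(inception_score(np.eye(5) * 0.95 + 0.01), 2))  # -> 4.0
```

A high score thus requires each image to be classified confidently (low-entropy p(y|x)) while the batch covers many classes (high-entropy p(y)).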
40
Conditional Generation - Pre-Training
Figure labels: subjects, fake, real
41
Conditional Generation - Fine-Tuning
Attribute labels: glasses, illumination, hairstyle
42
And When Training is Successful, Interpolation is Fun!
43
44
Can a GAN really generate new data?
Nearest neighbors found in the training data, based on feature-space distance. We used activations from five VGG layers. Only the crop highlighted in the bottom-right image was used for the comparison, in order to exclude the image background and focus the search on matching facial features.
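The feature-space nearest-neighbor search described in the caption can be sketched as follows, assuming precomputed feature vectors (`query_feats` for the generated crop, `train_feats` for the training crops; the names are illustrative):

```python
import numpy as np

def nearest_training_index(query_feats, train_feats):
    # query_feats: (D,) concatenated features of the generated crop
    # train_feats: (N, D) same features for each training crop
    # returns the index of the closest training sample in feature space (L2)
    d = ((train_feats - query_feats) ** 2).sum(axis=1)
    return int(d.argmin())
```

If the nearest training image is visually distinct from the generated one, the GAN is producing genuinely new samples rather than memorizing its training set.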
45
Thank You For Your Attention!
Any Questions?
46
Supplementary Material / Recommended Reading
● I. Goodfellow. NIPS 2016 tutorial: Generative adversarial networks. arXiv preprint arXiv:1701.00160, 2016.
● A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
● M. Mirza and S. Osindero. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784, 2014.
● X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In NIPS, 2016.
● P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. In CVPR, 2017.
● J. Zhu, T. Park, P. Isola, and A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In International Conference on Computer Vision (ICCV), to appear, 2017.
● X. Mao, Q. Li, H. Xie, R. Y. K. Lau, and Z. Wang. Least squares generative adversarial networks. arXiv preprint arXiv:1611.04076, 2016.
● M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.
● I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville. Improved training of Wasserstein GANs. arXiv:1704.00028v2, 2017.
● A. Odena, C. Olah, and J. Shlens. Conditional image synthesis with auxiliary classifier GANs. In ICML, 2017.
● D. Warde-Farley and Y. Bengio. Improving generative adversarial networks with denoising feature matching. In ICLR, 2017.
● T. Karras, T. Aila, S. Laine, and J. Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.
● R. D. Hjelm, A. P. Jacob, T. Che, K. Cho, and Y. Bengio. Boundary-seeking generative adversarial networks. arXiv preprint arXiv:1702.08431, 2017.