Generative Model

This paper list is a bit different from others: I add some opinions and a short summary to each entry. However, to understand a whole paper, you still have to read it yourself!
Any pull request or discussion is welcome!

Paper

Improved Techniques for Training GANs [NIPS 2016]
- Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen
- Code for the paper
- Feature matching: instead of directly maximizing the output of the discriminator, the generator is trained to match the statistics of the features on an intermediate layer of the discriminator
- Minibatch discrimination:
  - Motivation: because the discriminator processes each example independently, there is no coordination between its gradients, and thus no mechanism to tell the outputs of the generator to become more dissimilar to each other
  - Allow the discriminator to look at multiple data examples in combination, and perform what the authors call minibatch discrimination
  - Compute the L1 distance between the features of each pair of samples in the minibatch, then concatenate the resulting statistics with each sample's own features
  - The goal is for the generated images to be diverse 👉 lower chance of mode collapse
- Historical averaging to stabilize the training process
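
The two ideas above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the nested-list tensor `T`, and the choice to include each sample's similarity to itself in the sum are all made up for this example, and a real model would learn `T` and work on framework tensors.

```python
import math


def feature_matching_loss(real_feats, fake_feats):
    """Squared L2 distance between batch-mean features of real and generated samples.

    Both arguments are lists of feature vectors, assumed to come from the
    same intermediate discriminator layer.
    """
    dim = len(real_feats[0])
    mean_real = [sum(f[k] for f in real_feats) / len(real_feats) for k in range(dim)]
    mean_fake = [sum(f[k] for f in fake_feats) / len(fake_feats) for k in range(dim)]
    return sum((r - g) ** 2 for r, g in zip(mean_real, mean_fake))


def minibatch_discrimination(feats, T):
    """Append cross-sample similarity statistics to each sample's features.

    feats: list of n feature vectors of length A.
    T: hypothetical learned tensor, nested lists of shape A x B x C.
    Returns n vectors of length A + B.
    """
    A, B, C = len(T), len(T[0]), len(T[0][0])
    # Project each feature vector to a B x C matrix:
    # M[i][b][c] = sum_a feats[i][a] * T[a][b][c]
    M = [[[sum(f[a] * T[a][b][c] for a in range(A)) for c in range(C)]
          for b in range(B)] for f in feats]
    out = []
    for i, f in enumerate(feats):
        stats = []
        for b in range(B):
            # Sum exp(-L1 distance) between row b of sample i and row b of
            # every sample j (j == i is included and contributes exp(0) = 1).
            total = 0.0
            for j in range(len(feats)):
                l1 = sum(abs(M[i][b][c] - M[j][b][c]) for c in range(C))
                total += math.exp(-l1)
            stats.append(total)
        out.append(list(f) + stats)
    return out
```

When every sample in the batch is identical, each appended statistic reaches its maximum (the batch size), which gives the discriminator a direct signal that the generator has collapsed.
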
Semantic Image Inpainting with Perceptual and Contextual Losses [arXiv 2016]
- Raymond Yeh, Chen Chen, Teck Yian Lim, Mark Hasegawa-Johnson, Minh N. Do
- Semantic inpainting can be viewed as constrained image generation
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks [ICLR 2016]
- Alec Radford, Luke Metz, Soumith Chintala
- Explores architectural changes that make deeper generative models trainable
  - all-convolutional layers: let the network learn its own upsampling
  - eliminate the fully connected layer: increases model stability but hurts convergence speed
  - use batchnorm: helps a deep generator begin learning and prevents it from collapsing all samples to a single point
  - ReLU activation: in the generator, it helps the model converge faster and cover the color space. In the discriminator, use leaky ReLU
- Uses fractionally-strided convolutions (often called deconvolutions). To see how a fractionally-strided convolution works, here's the link
- The model should generalize instead of memorizing the training data
- Use the discriminator as a feature extractor (learned without supervision) and apply it to supervised learning tasks. This produces comparable results
- Official source code: Torch version, Theano version
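
To make the fractionally-strided convolution concrete, here is a minimal 1-D sketch in plain Python. It is only an illustration: the function name is made up, there is no padding or bias, and real DCGAN layers are 2-D with learned kernels.

```python
def fractionally_strided_conv1d(x, kernel, stride=2):
    """1-D fractionally-strided (transposed) convolution without padding.

    Each input value scatters a scaled copy of the kernel into the output.
    Copies are placed `stride` positions apart, so the output is longer
    than the input: (len(x) - 1) * stride + len(kernel) elements.
    """
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for j, weight in enumerate(kernel):
            # Overlapping kernel copies are summed.
            out[i * stride + j] += value * weight
    return out


upsampled = fractionally_strided_conv1d([1.0, 2.0], [1.0, 1.0, 1.0], stride=2)
print(upsampled)  # [1.0, 1.0, 3.0, 2.0, 2.0]
```

With `stride=2` a 2-element input becomes a 5-element output, which is how the generator grows a small feature map into a full image while still learning the upsampling weights.
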
Generative Adversarial Networks [NIPS 2014]
- Scenario: The generative model can be thought of as analogous to a team of counterfeiters, trying to produce fake currency and use it without detection, while the discriminative model is analogous to the police, trying to detect the counterfeit currency.
- In other words, D and G play a two-player minimax game with value function V(D, G)
- Look for the Nash equilibrium by alternating gradient updates on D and G
- Nice post from Eric Jang, Generative Adversarial Nets in TensorFlow
- Another post about GAN: Generating Faces with Torch
- Official source code: Theano version
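
The value function mentioned in the notes above is not reproduced in this list. For reference, the form given in the GAN paper is the following, where `p_data` is the data distribution and `p_z` is the noise prior fed to the generator:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

D tries to push both terms up by assigning high probability to real samples and low probability to generated ones, while G tries to push the second term down.
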
Deep multi-scale video prediction beyond mean square error [ICLR 2016]
- Earlier work only uses MSECriterion to minimize the L2 (or L1) distance, which produces blurry output. This work proposes the GDL (gradient difference loss), which aims to keep the sharp parts of the image.
- Adversarial training: create two networks (a discriminative model D and a generative model G). The goal of D is to tell whether an image is fake or not. The goal of G is to generate images that D cannot tell apart from real ones. => Adversarial
- The D model outputs a scalar, while the G model outputs an image
- Use a multi-scale architecture to work around the limitation of convolution (kernel size is limited, e.g. 3*3)
- Still immature. Used on the UCF101 dataset because of its fixed backgrounds
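
The gradient difference loss can be sketched for a single-channel image in plain Python. This is a rough illustration, not the paper's implementation: the function names are made up, images are nested lists, and `alpha=1.0` is just a default chosen for the example.

```python
def gradient_difference_loss(target, pred, alpha=1.0):
    """Penalize differences between the image gradients of target and prediction.

    Both images are 2-D nested lists of the same shape. For every pair of
    neighbouring pixels, compare the absolute intensity change in the target
    with the absolute intensity change in the prediction.
    """
    rows, cols = len(target), len(target[0])
    loss = 0.0
    for i in range(rows):
        for j in range(cols):
            if i > 0:  # vertical neighbour
                g_t = abs(target[i][j] - target[i - 1][j])
                g_p = abs(pred[i][j] - pred[i - 1][j])
                loss += abs(g_t - g_p) ** alpha
            if j > 0:  # horizontal neighbour
                g_t = abs(target[i][j] - target[i][j - 1])
                g_p = abs(pred[i][j] - pred[i][j - 1])
                loss += abs(g_t - g_p) ** alpha
    return loss


def mse(target, pred):
    """Mean squared error over all pixels, for comparison."""
    n = len(target) * len(target[0])
    return sum((t - p) ** 2 for rt, rp in zip(target, pred) for t, p in zip(rt, rp)) / n


sharp = [[0.0, 1.0], [0.0, 1.0]]   # vertical edge
blurry = [[0.5, 0.5], [0.5, 0.5]]  # the "average" prediction that MSE tolerates
print(mse(sharp, blurry), gradient_difference_loss(sharp, blurry))  # 0.25 2.0
```

The flat gray prediction has a small MSE but loses the edge entirely, and the gradient term is what penalizes that.
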

Suggested papers

Adversarial examples in the physical world [arXiv 2016]
- Alexey Kurakin, Ian Goodfellow, Samy Bengio
