KimRass / VAE

PyTorch implementation of 'VAE' (Kingma and Welling, 2014) and training it on MNIST

1. Pre-trained Parameters

  • Trained on MNIST for 84 epochs (vae_mnist.pth)
    • seed=888, recon_weight=600, lr=0.0005, batch_size=64
    • val_recon_loss=0.1085, val_kld_loss=7.3032
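A minimal sketch of inspecting the released vae_mnist.pth with plain PyTorch; the checkpoint layout assumed here (a flat state_dict of parameter tensors) is a guess, not something stated in this README:

import torch

# Load the pre-trained parameters; map_location="cpu" makes this work without a GPU.
state_dict = torch.load("vae_mnist.pth", map_location="cpu")

# If the file stores a full training checkpoint rather than a bare state_dict,
# the parameter tensors may sit under a sub-key (e.g. "model"); that detail is
# an assumption, not confirmed by this repository.
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))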

2. Visualization

1) Encoder Output

# e.g., (`--seed` and `--batch_size` are optional; `--target` is `"mean"` or `"std"`)
python3 vis/encoder_output/main.py \
    --seed=888 \
    --batch_size=64 \
    --target="mean" \
    --model_params="/.../datasets/vae/vae_mnist.pth" \
    --data_dir="/.../datasets" \
    --save_dir="/.../workspace/VAE/vis/encoder_output"
  • Mean and STD of the MNIST Test Set (a sketch of how this plot can be produced follows below)
    • For the means, the digits 4 and 9, and 3 and 5, overlap heavily.
    • The standard deviation was trained to move toward 1, but the values stay close to 0. This visualization does not seem to offer much insight.
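Roughly, the figure above is built by pushing the test set through the encoder and scatter-plotting the 2-D means (or standard deviations) colored by digit. A minimal sketch, assuming the model exposes an `encode(x)` method returning `(mean, log_var)`; this interface is an assumption, not the repo's actual API:

import matplotlib.pyplot as plt
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

@torch.no_grad()
def plot_encoder_means(model, data_dir, save_path, batch_size=64):
    # Assumed interface: `model.encode(x)` returns (mean, log_var) for a
    # 2-D latent space; the actual method name in this repo may differ.
    test_set = datasets.MNIST(data_dir, train=False, download=True,
                              transform=transforms.ToTensor())
    means, labels = [], []
    for x, y in DataLoader(test_set, batch_size=batch_size):
        mean, _ = model.encode(x)
        means.append(mean)
        labels.append(y)
    means = torch.cat(means).numpy()
    labels = torch.cat(labels).numpy()

    # One point per test image, colored by its digit class.
    plt.figure(figsize=(6, 6))
    sc = plt.scatter(means[:, 0], means[:, 1], c=labels, cmap="tab10", s=2)
    plt.colorbar(sc, label="digit")
    plt.savefig(save_path)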

2) Decoder Output

# e.g., (`--seed`, `--latent_min`, `--latent_max`, and `--n_cells` are optional)
python3 vis/decoder_output/main.py \
    --seed=888 \
    --latent_min=-4 \
    --latent_max=4 \
    --n_cells=32 \
    --model_params="/.../datasets/vae/vae_mnist.pth" \
    --data_dir="/.../datasets" \
    --save_dir="/.../workspace/VAE/vis/decoder_output"
  • latent_min=-4, latent_max=4, n_cells=32
    • The result closely resembles the distribution of the encoder output means. (A sketch of this grid decoding follows below.)
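The decoder-output figure is typically produced by decoding an evenly spaced grid of latent points between `latent_min` and `latent_max`. A minimal sketch, assuming a `decode(z)` method that maps `(N, 2)` latents to `(N, 1, 28, 28)` images (an assumed interface, not necessarily this repo's):

import torch
from torchvision.utils import save_image

@torch.no_grad()
def plot_decoder_grid(model, save_path, latent_min=-4, latent_max=4, n_cells=32):
    # Evenly spaced n_cells x n_cells grid over the 2-D latent space.
    axis = torch.linspace(latent_min, latent_max, n_cells)
    grid_y, grid_x = torch.meshgrid(axis, axis, indexing="ij")
    z = torch.stack([grid_x.reshape(-1), grid_y.reshape(-1)], dim=1)  # (n_cells**2, 2)

    # Assumed interface: model.decode(z) -> (n_cells**2, 1, 28, 28).
    images = model.decode(z)
    save_image(images, save_path, nrow=n_cells)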

3) Image Reconstruction

# e.g., (`--seed` and `--batch_size` are optional)
python3 vis/reconstruct/main.py \
    --seed=888 \
    --batch_size=128 \
    --model_params="/.../datasets/vae/vae_mnist.pth" \
    --data_dir="/.../datasets" \
    --save_dir="/.../workspace/VAE/vis/reconstruct"
  • MNIST Test Set
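Reconstruction amounts to encoding a batch, sampling $z$ with the reparameterization trick, and decoding it back. A minimal sketch using the same assumed `encode`/`decode` interface as above:

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.utils import save_image

@torch.no_grad()
def reconstruct(model, data_dir, save_path, batch_size=128):
    test_set = datasets.MNIST(data_dir, train=False, download=True,
                              transform=transforms.ToTensor())
    x, _ = next(iter(DataLoader(test_set, batch_size=batch_size, shuffle=True)))

    # Assumed interface: encode(x) -> (mean, log_var), decode(z) -> images.
    mean, log_var = model.encode(x)
    # Reparameterization trick: z = mean + std * eps with eps ~ N(0, I).
    z = mean + torch.exp(0.5 * log_var) * torch.randn_like(mean)
    recon = model.decode(z)

    # Originals in the top row, reconstructions below them.
    save_image(torch.cat([x[:8], recon[:8]]), save_path, nrow=8)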

3. Theoretical Background

1) Bayes' Theorem [3]

$$P(A \vert B) = \frac{P(B \vert A)P(A)}{P(B)}, \text{ if } P(B) \neq 0$$

  • $P(A \vert B)$ is the conditional probability, or posterior probability, of $A$ given $B$.
  • $P(A)$ and $P(B)$ are known as the prior probability and the marginal probability, respectively.
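Applied to the VAE's latent variable, Bayes' theorem gives the true posterior over $z$; it is intractable because the marginal $P(x)$ integrates over all $z$, which motivates the approximate posterior $q_{\phi}(z \vert x)$ used in the next subsection:

$$P(z \vert x) = \frac{P(x \vert z)P(z)}{P(x)}, \qquad P(x) = \int P(x \vert z)P(z)dz$$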

2) ELBO (Evidence Lower BOund)

Since the approximate posterior $q_{\phi}(z \vert x)$ integrates to 1,

$$\int q_{\phi}(z \vert x)dz = 1,$$

the log-evidence can be decomposed as

$$
\begin{align}
\ln(P(x)) &= \int \ln(P(x))q_{\phi}(z \vert x)dz\\
&= \int \ln \bigg(\frac{P(z, x)}{P(z \vert x)}\bigg)q_{\phi}(z \vert x)dz\\
&= \int \ln \bigg(\frac{P(z, x)}{q_{\phi}(z \vert x)}\frac{q_{\phi}(z \vert x)}{P(z \vert x)}\bigg)q_{\phi}(z \vert x)dz\\
&= \int \ln \bigg(\frac{P(z, x)}{q_{\phi}(z \vert x)}\bigg)q_{\phi}(z \vert x)dz + \int \ln \bigg(\frac{q_{\phi}(z \vert x)}{P(z \vert x)}\bigg)q_{\phi}(z \vert x)dz\\
&= \text{ELBO} + D_{KL}\big(q_{\phi}(z \vert x) \Vert P(z \vert x)\big)
\end{align}
$$

  • A basic result in variational inference is that minimizing the KL divergence $D_{KL}\big(q_{\phi}(z \vert x) \Vert P(z \vert x)\big)$ is equivalent to maximizing the ELBO, since their sum $\ln(P(x))$ does not depend on $\phi$ [2]. Expanding $P(z, x) = P(x \vert z)P(z)$:

$$
\begin{align}
\text{ELBO} &= \int \ln \bigg(\frac{P(z, x)}{q_{\phi}(z \vert x)}\bigg)q_{\phi}(z \vert x)dz\\
&= \int \ln \bigg(\frac{P(x \vert z)P(z)}{q_{\phi}(z \vert x)}\bigg)q_{\phi}(z \vert x)dz\\
&= \int \ln \big(P(x \vert z)\big)q_{\phi}(z \vert x)dz + \int \ln \bigg(\frac{P(z)}{q_{\phi}(z \vert x)}\bigg)q_{\phi}(z \vert x)dz
\end{align}
$$
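In this decomposition the first integral is the expected reconstruction log-likelihood and the second equals $-D_{KL}\big(q_{\phi}(z \vert x) \Vert P(z)\big)$, which has a closed form for a diagonal-Gaussian $q_{\phi}$ and a standard normal prior. Below is a minimal sketch of how the two terms typically become a training loss; the binary cross-entropy reconstruction term, the reduction, and the recon_weight weighting mirror the settings listed in section 1 but are assumptions about this repo's exact loss:

import torch
import torch.nn.functional as F

def vae_loss(recon, x, mean, log_var, recon_weight=600.0):
    """Negative ELBO: weighted reconstruction term plus KL to the N(0, I) prior.

    The weighting and reduction here are assumptions mirroring the settings
    listed above (recon_weight=600); the repo's exact loss may differ.
    """
    # Reconstruction term: per-pixel binary cross-entropy, averaged over the batch.
    recon_loss = F.binary_cross_entropy(recon, x, reduction="mean")

    # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian q:
    # -0.5 * sum(1 + log_var - mean^2 - exp(log_var)), averaged over the batch.
    kld_loss = -0.5 * torch.mean(
        torch.sum(1 + log_var - mean.pow(2) - log_var.exp(), dim=1)
    )
    return recon_weight * recon_loss + kld_loss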

4. References
