KimRass / DDPM

PyTorch implementation of 'DDPM' (Ho et al., 2020), trained on CelebA 64×64

1. Pre-trained Models

  • ddpm_celeba_32×32.pth: trained on CelebA 32×32
  • ddpm_celeba_64×64.pth: trained on CelebA 64×64

2. Sampling

1) "normal" mode

# e.g.,
python3 ../sample.py\
    --mode="normal"\
    --model_params="....pth"\
    --save_path="samples/normal/0.jpg"\
    --img_size=64\
    --batch_size=100

2) "denoising_process" mode

# e.g.,
python3 ../sample.py\
    --mode="denoising_process"\
    --model_params="....pth"\
    --save_path="samples/denoising_process/0.gif"\
    --img_size=64\
    --batch_size=100

3) "interpolation" mode

# e.g.,
python3 ../sample.py\
    --mode="interpolation"\
    --model_params="....pth"\
    --save_path="samples/interpolation/0.jpg"\
    --img_size=64\
    --data_dir="/Users/jongbeomkim/Documents/datasets/"\
    --image_idx1=50\
    --image_idx2=100
  • interpolate_at=500

4) "coarse_to_fine" mode

  • Please refer to "Figure 9" in the paper for the meaning of each row and column.
# e.g.,
python3 ../sample.py\
    --mode="coarse_to_fine"\
    --model_params="....pth"\
    --save_path="samples/coarse_to_fine/0.jpg"\
    --img_size=64\
    --data_dir="/Users/jongbeomkim/Documents/datasets/"\
    --image_idx1=50\
    --image_idx2=100

3. Evaluation

# e.g., (--n_cpus, --padding, and --n_cells are optional)
python3 eval.py\
    --ckpt_path=".....pth"\
    --real_data_dir="../img_align_celeba/"\
    --gen_data_dir="../ddpm_eval_images/"\
    --batch_size=32\
    --n_eval_imgs=28000\
    --n_cpus=4\
    --padding=1\
    --n_cells=100

4. Theoretical Background

1) Forward (Diffusion) Process

$$q(x_{t} \vert x_{t - 1}) = \mathcal{N}(x_{t}; \sqrt{1 - \beta_{t}}x_{t - 1}, \beta_{t}I)$$
$$q(x_{t} \vert x_{0}) = \mathcal{N}(x_{t}; \sqrt{\bar{\alpha}_{t}}x_{0}, (1 - \bar{\alpha}_{t})I)$$
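
Because the marginal $q(x_{t} \vert x_{0})$ has a closed form, $x_{t}$ can be drawn directly from $x_{0}$ without simulating every intermediate step. The following is a minimal sketch in plain Python, not the repository's code. It assumes the linear $\beta$ schedule from Ho et al. (2020), running from 1e-4 to 0.02 over 1000 steps, which this repository may or may not use. The names `linear_betas`, `alpha_bars`, and `q_sample` are illustrative only.

```python
import math
import random


def linear_betas(n_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (assumed; values from Ho et al., 2020)."""
    step = (beta_end - beta_start) / (n_steps - 1)
    return [beta_start + step * i for i in range(n_steps)]


def alpha_bars(betas):
    """Cumulative products of alpha_s = 1 - beta_s, one entry per timestep."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def q_sample(x0, t, abar, noise=None):
    """Draw x_t ~ q(x_t | x_0) for a flat list of pixel values.

    t is 1-indexed, so abar[t - 1] holds alpha_bar_t.
    """
    if noise is None:
        noise = [random.gauss(0.0, 1.0) for _ in x0]
    mean_scale = math.sqrt(abar[t - 1])
    std = math.sqrt(1.0 - abar[t - 1])
    return [mean_scale * x + std * e for x, e in zip(x0, noise)]


abar = alpha_bars(linear_betas())
x0 = [0.5, -0.25, 1.0]  # toy "image" with values scaled to [-1, 1]
x_500 = q_sample(x0, t=500, abar=abar)
```

Passing `noise` explicitly makes the draw deterministic, which is convenient when the same noise has to be reused as a training target.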

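  • Why does the image follow a standard Gaussian distribution when the timestep becomes very large? $$\bar{\alpha}_{t} = \prod_{s=1}^{t}{\alpha_{s}}, \quad \alpha_{s} = 1 - \beta_{s}$$
    • Multiplying many numbers smaller than 1 drives the product toward 0, so the mean $\sqrt{\bar{\alpha}_{t}}x_{0}$ vanishes and the variance $(1 - \bar{\alpha}_{t})I$ approaches $I$.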
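
The decay of the product can be checked numerically. This is a small sketch that again assumes the linear schedule from Ho et al. (2020); the repository's actual values may differ. It records $\bar{\alpha}_{t}$ at a few timesteps and prints the resulting mean coefficient and variance.

```python
import math

T = 1000
# Assumed linear schedule from Ho et al. (2020); the repo may use other values.
betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]

alpha_bar = 1.0
history = {}
for t, beta in enumerate(betas, start=1):
    alpha_bar *= 1.0 - beta  # each factor alpha_t = 1 - beta_t is below 1
    if t in (1, 250, 500, 750, 1000):
        history[t] = alpha_bar

for t, abar_t in history.items():
    # sqrt(alpha_bar_t) scales the mean; 1 - alpha_bar_t is the variance.
    print(f"t={t:4d}  sqrt(abar)={math.sqrt(abar_t):.4f}  var={1.0 - abar_t:.4f}")
```

Under this schedule the final $\bar{\alpha}_{T}$ falls well below 1e-4, so $x_{T}$ is practically indistinguishable from standard Gaussian noise.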

2) Backward (Denoising) Process

$$\mu_{\theta}(x_{t}, t) = \frac{1}{\sqrt{\alpha_{t}}}\Big(x_{t} - \frac{\beta_{t}}{\sqrt{1 - \bar{\alpha}_{t}}}\epsilon_{\theta}(x_{t}, t)\Big)$$
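
The mean above turns into one sampling step once noise is added back. Below is a hedged sketch in plain Python: `predicted_mean` and `reverse_step` are illustrative names, `eps_pred` stands in for the output of the trained network $\epsilon_{\theta}(x_{t}, t)$, and the step assumes $\sigma_{t}^{2} = \beta_{t}$, one of the two variance choices discussed in Ho et al. (2020). The repository may implement this differently.

```python
import math
import random


def predicted_mean(x_t, eps_pred, beta_t, alpha_bar_t):
    """mu_theta(x_t, t), computed elementwise from the predicted noise."""
    alpha_t = 1.0 - beta_t
    coef = beta_t / math.sqrt(1.0 - alpha_bar_t)
    return [(x - coef * e) / math.sqrt(alpha_t) for x, e in zip(x_t, eps_pred)]


def reverse_step(x_t, eps_pred, beta_t, alpha_bar_t, last_step=False):
    """One ancestral sampling step x_t -> x_{t-1}, assuming sigma_t^2 = beta_t."""
    mean = predicted_mean(x_t, eps_pred, beta_t, alpha_bar_t)
    if last_step:  # no noise is added when producing x_0
        return mean
    sigma_t = math.sqrt(beta_t)
    return [m + sigma_t * random.gauss(0.0, 1.0) for m in mean]


# Sanity check at t = 1, where alpha_bar_1 = alpha_1: feeding the true noise
# back in recovers x_0 up to floating-point error.
beta_1 = 1e-4
alpha_bar_1 = 1.0 - beta_1
x0, eps = [0.5, -0.25], [0.3, -1.2]
x1 = [
    math.sqrt(alpha_bar_1) * x + math.sqrt(1.0 - alpha_bar_1) * e
    for x, e in zip(x0, eps)
]
x0_rec = reverse_step(x1, eps, beta_1, alpha_bar_1, last_step=True)
```

In a full sampler this step would run from $t = T$ down to $t = 1$, with a fresh network prediction at every timestep.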

3) FID (Frechet Inception Distance)

$$\text{FID} = \lVert\mu_{X} - \mu_{Y}\rVert^{2}_{2} + \text{Tr}\big(\Sigma_{X} + \Sigma_{Y} - 2\sqrt{\Sigma_{X}\Sigma_{Y}}\big)$$
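
The formula can be evaluated from two sets of feature vectors. The sketch below uses NumPy and is not taken from `eval.py`: `frechet_distance` is an illustrative name, and the random arrays stand in for the Inception features that a real FID computation would extract from real and generated images. It relies on the fact that the trace of $\sqrt{\Sigma_{X}\Sigma_{Y}}$ equals the sum of the square roots of the eigenvalues of $\Sigma_{X}\Sigma_{Y}$.

```python
import numpy as np


def frechet_distance(feats_x, feats_y):
    """Frechet distance between Gaussians fitted to two (n, d) feature arrays."""
    mu_x, mu_y = feats_x.mean(axis=0), feats_y.mean(axis=0)
    sigma_x = np.atleast_2d(np.cov(feats_x, rowvar=False))
    sigma_y = np.atleast_2d(np.cov(feats_y, rowvar=False))
    # Eigenvalues of a product of two PSD matrices are real and non-negative;
    # clip tiny negative values caused by round-off before the square root.
    eigvals = np.linalg.eigvals(sigma_x @ sigma_y)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = float(((mu_x - mu_y) ** 2).sum())
    return mean_term + float(np.trace(sigma_x) + np.trace(sigma_y) - 2.0 * tr_sqrt)


rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 4))           # placeholder "real" features
fake_feats = rng.normal(loc=0.5, size=(500, 4))  # placeholder "generated" features
fid = frechet_distance(real_feats, fake_feats)
```

Identical feature sets give a distance of 0 up to round-off, and shifting every feature by a constant changes only the mean term.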
