thanhluantrinh / LDDGAN

Table of contents
  1. Installation
  2. Dataset preparation
  3. How to run
  4. Results
  5. Evaluation
  6. Acknowledgments
  7. Contacts

Official PyTorch implementation of "Latent Denoising Diffusion GANs: Faster sampling, Higher image quality" (IEEE Access)

LDDGAN is a novel diffusion scheme. Experimental results on CelebA-HQ, CIFAR-10, LSUN-Church, and STL-10 datasets show that LDDGAN provides state-of-the-art training and inference speed, which serves as a stepping-stone to offering real-time and high-fidelity diffusion models.

Details of the model architecture and experimental results can be found in the IEEE Access paper (will be updated soon).

@article{trinhldgan,
 title={Latent Denoising Diffusion GANs: Faster sampling, Higher image quality},
 author={Luan Thanh Trinh and Tomoki Hamagami},
 journal={IEEE Access},
 year={2024}
}

Please CITE our paper whenever this repository is used to help produce published results or is incorporated into other software.

Installation

Python 3.7.13 and PyTorch 1.10.0 are used in this implementation.

We recommend creating a conda environment from the provided environment.yml:

conda env create -f environment.yml
conda activate ldgan

Alternatively, install the necessary libraries with pip:

pip install -r requirements.txt

Autoencoder

Download the pretrained autoencoder weights using the "Autoencoder Checkpoints" links in the Results table and put them in autoencoder/weight.

Dataset preparation

We trained on four datasets: CIFAR10, STL10, LSUN Church Outdoor 256, and CelebA HQ 256.

CIFAR10 and STL10 are downloaded automatically on first execution.

For CelebA HQ (256) and LSUN, please check out here for dataset preparation.

Once a dataset is downloaded, please put it in the data/ directory as follows:

data/
├── celeba
├── cifar-10
└── lsun
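
Before launching a run, it can help to confirm that the layout above is in place. The following is a minimal sketch and not part of the repository: the helper name `missing_dataset_dirs` is hypothetical, and the folder names are simply copied from the tree above. Since CIFAR10 is downloaded automatically, a missing `cifar-10` folder before the first run is expected.

```python
from pathlib import Path

# Folder names copied from the directory tree above.
EXPECTED_DIRS = ("celeba", "cifar-10", "lsun")


def missing_dataset_dirs(root="data"):
    """Return the expected dataset folders that do not exist under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All expected dataset folders are present.")
```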

How to run

We provide a bash script for our experiments on different datasets. The syntax is as follows:

bash run.sh <DATASET> <MODE> <#GPUS>

where:

  • <DATASET>: cifar10, stl10, celeba_256, celeba_512, celeba_1024, and lsun.
  • <MODE>: train or test.
  • <#GPUS>: the number of gpus (e.g. 1, 2, 4, 8).

Note: please set the --exp argument accordingly for both train and test modes. All detailed configurations are already set in run.sh.

GPU allocation: Our experiments were run on NVIDIA A100 40GB GPUs. For train mode, we use a single GPU for CIFAR10 and STL10, 2 GPUs for CelebA-HQ 256, 4 GPUs for LSUN, and 8 GPUs for CelebA-HQ 512 & 1024. For test mode, only a single GPU is required for all experiments.
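
The syntax and GPU allocation above can be combined into a small command builder. This is a sketch under stated assumptions, not a script shipped with the repository: `build_command` and the `TRAIN_GPUS` table are hypothetical names, and the GPU counts are only the ones quoted in the GPU allocation note. It does not touch --exp, which is configured inside run.sh.

```python
# Values copied from the argument list and the GPU allocation note above.
DATASETS = ("cifar10", "stl10", "celeba_256", "celeba_512", "celeba_1024", "lsun")
MODES = ("train", "test")
TRAIN_GPUS = {
    "cifar10": 1,
    "stl10": 1,
    "celeba_256": 2,
    "lsun": 4,
    "celeba_512": 8,
    "celeba_1024": 8,
}


def build_command(dataset, mode, gpus=None):
    """Build the argument list for `bash run.sh <DATASET> <MODE> <#GPUS>`."""
    if dataset not in DATASETS:
        raise ValueError("unknown dataset: %r" % (dataset,))
    if mode not in MODES:
        raise ValueError("unknown mode: %r" % (mode,))
    if gpus is None:
        # Test mode needs a single GPU; train mode uses the counts quoted above.
        gpus = TRAIN_GPUS[dataset] if mode == "train" else 1
    return ["bash", "run.sh", dataset, mode, str(gpus)]


if __name__ == "__main__":
    # Prints: bash run.sh celeba_256 train 2
    print(" ".join(build_command("celeba_256", "train")))
```

The returned list can be passed to `subprocess.run` as-is. Pass `gpus` explicitly to override the defaults on different hardware.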

Results

Model performance and pretrained checkpoints are provided below:

Model                    FID    Recall   Time (s)   Checkpoints   Autoencoder Checkpoints
CIFAR-10                 2.95   0.58     0.08       Here          Here
CelebA-HQ (256 x 256)    5.21   0.40     0.55       Here          Here
LSUN Church              4.67   0.42     1.02       Here          Here

Inference time is computed over 300 trials on a single NVIDIA A5000 GPU with a batch size of 100, except for high-resolution CelebA-HQ (512 & 1024), where it is computed with a batch of 25 samples.
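
A timing measurement of this kind can be sketched as follows. The repository's actual timing code is not shown here, so this is an assumption-laden illustration: it assumes the reported time is the mean over trials, and `mean_runtime` is a hypothetical helper. On a GPU, CUDA kernels are launched asynchronously, so a synchronization callback such as `torch.cuda.synchronize` should be passed as `sync` for the wall-clock time to be meaningful.

```python
import time


def mean_runtime(fn, trials=300, sync=None):
    """Return the average wall-clock seconds per call of `fn` over `trials` calls.

    `sync` is an optional zero-argument callable run before the timer starts
    and before it stops (for example torch.cuda.synchronize on a GPU).
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if sync is not None:
        sync()  # make sure earlier queued work does not leak into the timing
    start = time.perf_counter()
    for _ in range(trials):
        fn()
    if sync is not None:
        sync()  # wait for all timed work to finish
    return (time.perf_counter() - start) / trials
```

For example, `mean_runtime(lambda: sample_batch(100), trials=300, sync=torch.cuda.synchronize)` would time a hypothetical `sample_batch` function that generates one batch of 100 images.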

Downloaded pre-trained models should be put in the saved_info/ld_gan/<DATASET>/<EXP> directory, where <DATASET> is defined in the How to run section and <EXP> corresponds to the folder name of the pre-trained checkpoints.
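
As a small illustration of this layout, the directory for a given dataset and experiment can be assembled like this. `checkpoint_dir` is a hypothetical helper, and `my_exp` in the example is a placeholder, not a real experiment name from the repository.

```python
from pathlib import Path


def checkpoint_dir(dataset, exp, root="saved_info/ld_gan"):
    """Return the directory where checkpoints for <DATASET>/<EXP> are expected."""
    return Path(root) / dataset / exp


if __name__ == "__main__":
    target = checkpoint_dir("cifar10", "my_exp")  # "my_exp" is a placeholder
    target.mkdir(parents=True, exist_ok=True)     # create it before copying files in
    print(target.as_posix())
```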

Evaluation

Inference

Samples can be generated by calling run.sh with test mode.

FID

To compute the FID of a pretrained model at a specific epoch, add the arguments --compute_fid and --real_img_dir /path/to/real/images to the corresponding experiment in run.sh.
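
For reference, the standard definition of the Fréchet Inception Distance is given below. How run.sh computes it internally is not shown in this README, so take this as the usual formula rather than a description of the repository's code. Here $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the mean and covariance of Inception features extracted from the real and generated images, respectively. Lower values are better.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```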

Recall

We adopt the official PyTorch implementation of StyleGAN2-ADA to compute the Recall of generated samples.

Acknowledgments

Thanks to Xiao et al. for releasing their official implementation of the DDGAN paper.

Contacts

If you have any problems, please open an issue in this repository or send an email to luan.trinh.t@gmail.com.
