hiarsal / DAE-GAN

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

DAE-GAN

Pytorch implementation for reproducing DAE-GAN results in the paper [DAE-GAN: Dynamic Aspect-aware GAN for Text-to-Image Synthesis] by Shulan Ruan, Yong Zhang, Kun Zhang, Yanbo Fan, Fan Tang, Qi Liu, Enhong Chen. (This work was performed when Ruan was an intern with Tencent AI Lab).

Dependencies

python 3.6

Pytorch

In addition, please add the project folder to PYTHONPATH and pip install the following packages:

  • python-dateutil
  • easydict
  • pandas
  • torchfile
  • nltk
  • scikit-image

Data

  1. Download our preprocessed metadata for birds coco, name them as captions.pickle and save them to data/birds and data/coco
  2. Download the birds image data. Extract them to data/birds/
  3. Download coco dataset and extract the images to data/coco/

Training

  • Train DAE-GAN models:

    • For bird dataset: python main.py --cfg cfg/bird_DAEGAN.yml --gpu 0
    • For coco dataset: python main.py --cfg cfg/coco_DAEGAN.yml --gpu 0
  • *.yml files are example configuration files for training/evaluation our models.

Pretrained Model

  • DAMSM for bird. Download and save it to DAMSMencoders/
  • DAMSM for coco. Download and save it to DAMSMencoders/
  • DAE-GAN for bird. Download and save it to models/. netG_3s means generation with 3 Generators (2 aspects) in total. netG_4s means generation with 4 Generators (3 aspects) in total.
  • DAE-GAN for coco. Download and save it to models/

Validation

  • To generate images for all captions in the validation dataset, change B_VALIDATION to True in the eval_*.yml. and then run python main.py --cfg cfg/eval_coco.yml --gpu 1
  • We compute inception score for models trained on birds using StackGAN-inception-model.
  • We compute inception score for models trained on coco using improved-gan/inception_score.

Examples generated by DAE-GAN

Citing DAE-GAN

If you find DAE-GAN useful in your research, please consider citing:

@inproceedings{ruan2021dae,
  title={DAE-GAN: Dynamic Aspect-aware GAN for Text-to-Image Synthesis},
  author={Ruan, Shulan and Zhang, Yong and Zhang, Kun and Fan, Yanbo and Tang, Fan and Liu, Qi and Chen, Enhong},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={13960--13969},
  year={2021}
}

Reference

About


Languages

Language:Python 99.4%Language:Starlark 0.6%