kevinwoodward / bird-gan

Research for text-to-image synthesis via modified auxiliary classifier GANs. Incremental modification of model architecture for improved results, fully documented.

BirdGAN

Progressive implementations of GAN architectures applied to the CUB200 dataset to generate unique images conditioned on attributes and caption embeddings.

Prerequisites

  • The CUB200 dataset
  • Captions for the CUB200 dataset
  • Pretrained BERT-large (uncased) model for embedding captions to 1024D vectors
  • bert-as-service for utilizing the pretrained BERT model
  • A Python notebook environment
  • Python 3.7
    • TensorFlow 2.0 or greater
    • Pandas
    • OpenCV 3
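
As a rough sketch of how the caption prerequisites fit together: bert-as-service provides a `BertClient` whose `encode` method maps a list of strings to one fixed-length vector per string, which is 1024-dimensional for BERT-large. The helper names `make_client` and `embed_captions` below are hypothetical and do not come from the notebooks. The sketch assumes a bert-as-service server has already been started against the pretrained model and is reachable on the default local address.

```python
def make_client():
    """Connect to a running bert-as-service server (assumed to be local)."""
    # Imported lazily so the helper below can be used without the package installed.
    from bert_serving.client import BertClient
    return BertClient()


def embed_captions(captions, client, expected_dim=1024):
    """Return one embedding per non-blank caption, checking the vector width.

    `client` is any object with an `encode(list_of_str)` method, such as the
    BertClient returned by make_client(). `expected_dim` is 1024 for BERT-large.
    """
    # Drop blank captions before sending anything to the server.
    cleaned = [c.strip() for c in captions if c.strip()]
    if not cleaned:
        return []
    vectors = client.encode(cleaned)
    for vec in vectors:
        if len(vec) != expected_dim:
            raise ValueError(
                f"expected {expected_dim}D embedding, got {len(vec)}D"
            )
    return vectors
```

With a server running, `embed_captions(["a small bird with a red crown"], make_client())` would yield a single row of 1024 values. Passing the client in as an argument keeps the width check usable with a stand-in object during testing.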

Implementation Categories (ordered old → new)

  1. Vanilla DCGAN
  2. Multilabel ACGAN
  3. Multilabel ACGAN with a split discriminator (for finer tuning)
  4. Multilabel ACGAN with a split discriminator with BERT captions
  5. Multilabel ACGAN with a split discriminator with BERT captions V2
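
The "multilabel" part of the ACGAN variants can be illustrated with a small loss sketch. A standard ACGAN's auxiliary head predicts a single class; a multilabel variant, as assumed here, scores each attribute independently and averages per-attribute binary cross-entropies. This is a generic illustration in plain Python, not the notebooks' code: the function names and the `aux_weight` parameter are hypothetical, and the exact way the repository splits its discriminator is not shown.

```python
import math


def binary_cross_entropy(target, prob, eps=1e-7):
    """Cross-entropy for one binary label, with clipping to avoid log(0)."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def multilabel_aux_loss(attr_targets, attr_probs):
    """Mean binary cross-entropy over independently predicted attributes."""
    if len(attr_targets) != len(attr_probs):
        raise ValueError("targets and predictions must have the same length")
    total = sum(binary_cross_entropy(t, p) for t, p in zip(attr_targets, attr_probs))
    return total / len(attr_targets)


def discriminator_loss(is_real, real_prob, attr_targets, attr_probs, aux_weight=1.0):
    """Per-sample loss: adversarial term plus weighted attribute term.

    `real_prob` is the predicted probability that the image is real;
    `attr_probs` are the predicted per-attribute probabilities.
    """
    adversarial = binary_cross_entropy(1.0 if is_real else 0.0, real_prob)
    auxiliary = multilabel_aux_loss(attr_targets, attr_probs)
    return adversarial + aux_weight * auxiliary
```

Treating each attribute as its own binary problem lets one image carry several attributes at once, which a single softmax over classes cannot express.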

Sample Generations (ordered old → new)

Vanilla DCGAN:

[Image: Vanilla DCGAN sample generations]

Multilabel ACGAN:

[Image: Multilabel ACGAN sample generations]

Multilabel ACGAN w/split Discriminator:

[Image: Multilabel ACGAN w/split Discriminator sample generations]

Multilabel ACGAN w/split Discriminator and Captions:

[Image: Multilabel ACGAN w/split Discriminator and Captions sample generations]

Multilabel ACGAN w/split Discriminator and Captions V2:

[Image: Multilabel ACGAN w/split Discriminator and Captions V2 sample generations]
