johnpaulbin / DALLE-reproduction

Reproducing OpenAI's DALLE model

DALLE-reproduction

This repository shares pre-trained reproductions of OpenAI's DALLE model and shows how to generate images from text with them.

All models were trained with lucidrains/DALLE-pytorch + VQGAN (Taming Transformers), using different training code and a different BPE model.

If you want to train DALLE yourself, please go to lucidrains/DALLE-pytorch and support their work on reproducing better DALLE models ✈️

The notebook includes

1. Text to image generation

2. Pre-trained CLIP reranking

  • CUB200

  • COCO

3. Generate rest of image based on the given cropped image

  • CUB200

  • COCO
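
The reranking step can be pictured as scoring each generated image against the prompt and sorting by that score. The sketch below is only an illustration under assumptions, not the notebook's code: it assumes the text and image embeddings have already been computed by a pre-trained CLIP model, and the names `cosine_similarity` and `rerank` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rerank(text_embedding, image_embeddings, top_k=None):
    """Return (image index, score) pairs, best match first.

    Assumption: the embeddings come from a CLIP text encoder and
    image encoder; this function only does the ranking.
    """
    scored = [
        (cosine_similarity(text_embedding, emb), idx)
        for idx, emb in enumerate(image_embeddings)
    ]
    # Sort by score only, highest first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    ranked = [(idx, score) for score, idx in scored]
    return ranked if top_k is None else ranked[:top_k]


# Toy 2-D vectors standing in for real CLIP embeddings.
text = [1.0, 0.0]
images = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
best_two = rerank(text, images, top_k=2)
```

Keeping only the top few indices is how a reranker would pick which generated samples to display.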

Usage

  1. Install requirements
$ pip install -r requirements
  2. Install DeepSpeed
  • Follow the instructions here to install DeepSpeed

Models

  • Download the models below and save them in the pretrained folder
  • See the link in the Details column for each model's specifics
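
| Dataset | Download | Password | Optimizer | Size  | Details |
|---------|----------|----------|-----------|-------|---------|
| CUB200  | link     | v9ge     | Adam      | 1.1GB | link    |
| CUB200  | link     | 47w1     | AdamW     | 1.1GB | link    |
| COCO    | link     | p3ki     | Adam      | 1.5GB | link    |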
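
After downloading, a quick check that the files landed in the pretrained folder can save a failed notebook run. This is a minimal sketch, assuming the folder sits in the current working directory; the helper name `list_downloaded_models` is hypothetical, and the checkpoint file names are not assumed.

```python
from pathlib import Path


def list_downloaded_models(folder="pretrained"):
    """Return sorted (file name, size in GB) pairs for files directly in `folder`.

    Returns an empty list if the folder does not exist.
    """
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        (path.name, path.stat().st_size / 1e9)
        for path in root.iterdir()
        if path.is_file()
    )


for name, size_gb in list_downloaded_models():
    print(f"{name}: {size_gb:.2f} GB")
```

Comparing the printed sizes with the Size column is a rough way to spot an incomplete download.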
