brunoreisportela / VQGAN-CLIP

VQ-GAN + CLIP

This repository stems from: Original colab

Additional 4x upscaling of the results is done with ESRGAN

VQ-GAN original paper: https://arxiv.org/abs/2012.09841

CLIP original paper: https://arxiv.org/abs/2103.00020

ESRGAN original paper: https://arxiv.org/abs/1809.00219

A very nice introduction to the technique: Alien Dreams

Installation

If you want to run the script on a GPU, first install PyTorch with CUDA support.

git clone https://github.com/openai/CLIP 
git clone https://github.com/CompVis/taming-transformers 
git clone https://github.com/xinntao/ESRGAN
pip install ftfy 
pip install regex
pip install tqdm
pip install omegaconf
pip install pytorch-lightning
pip install kornia 
pip install einops 
pip install imageio-ffmpeg
pip install opencv-python
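
As an optional sanity check after installation, the following one-liner should print True when the CUDA-enabled PyTorch build is picked up, and False on a CPU-only install:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"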

Pretrained models

Copy pretrained models into models/

vqgan_imagenet_f16_16384

model config

Additional links to models - work in progress...
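
Assuming the checkpoint and its config have been downloaded, the expected layout is simply a models/ folder next to the script; the filenames below are only illustrative and should match whatever the downloaded files are actually called:

mkdir -p models
# place the downloaded files here, e.g.:
#   models/vqgan_imagenet_f16_16384.ckpt   (model weights)
#   models/vqgan_imagenet_f16_16384.yaml   (model config)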

Run

python CLIP_VQGAN.py -texts your_text_prompt

Additional run options (an example combining several of them is shown below the list)

  • -width - Image width
  • -height - Image height
  • -model - Pretrained VQ-GAN model to use
  • -display_int - Display interval during image generation
  • -init_image - Initial image to start from instead of random noise
  • -target_images - Target images to use instead of a text prompt
  • -seed - Random seed
  • -max_iterations - Maximum number of optimization iterations
  • -make_video - Make a video from the generated images
  • -upscale - 4x upscale the generated images with ESRGAN
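
Putting several of these options together, an example invocation might look like the following; the prompt and values are purely illustrative, and -upscale is assumed to be a simple switch:

python CLIP_VQGAN.py -texts "a sunset over the ocean" -width 512 -height 512 -seed 42 -max_iterations 400 -upscale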

Additional information

Work in progress...
