brunoreisportela / VQGAN-CLIP

VQ-GAN + CLIP

This repository stems from: Original colab

Additional 4x upscaling of the results is done with ESRGAN

VQ-GAN original paper: https://arxiv.org/abs/2012.09841

CLIP original paper: https://arxiv.org/abs/2103.00020

ESRGAN original paper: https://arxiv.org/abs/1809.00219

A very nice introduction to the technique: Alien Dreams

Installation

If you want to run the script on a GPU, first install PyTorch with CUDA support.

git clone https://github.com/openai/CLIP 
git clone https://github.com/CompVis/taming-transformers 
git clone https://github.com/xinntao/ESRGAN
pip install ftfy 
pip install regex
pip install tqdm
pip install omegaconf
pip install pytorch-lightning
pip install kornia 
pip install einops 
pip install imageio-ffmpeg
pip install opencv-python
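
As an optional sanity check after installation, the following one-liner should print True when the CUDA-enabled PyTorch build is picked up, and False on a CPU-only install:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"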

Pretrained models

Copy pretrained models into models/

vqgan_imagenet_f16_16384

model config

Additional links to models - work in progress...
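
Assuming the checkpoint and its config have been downloaded, the expected layout is simply a models/ folder next to the script; the filenames below are only illustrative and should match whatever the downloaded files are actually called:

mkdir -p models
# place the downloaded files here, e.g.:
#   models/vqgan_imagenet_f16_16384.ckpt   (model weights)
#   models/vqgan_imagenet_f16_16384.yaml   (model config)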

Run

python CLIP_VQGAN.py -texts your_text_prompt

Additional run options (an example combining several of them is shown below the list)

  • -width - Image width
  • -height - Image height
  • -model - Pretrained VQ-GAN model to use
  • -display_int - Display interval during image generation
  • -init_image - Initial image to start from instead of random noise
  • -target_images - Target images to use instead of a text prompt
  • -seed - Random seed
  • -max_iterations - Maximum number of optimization iterations
  • -make_video - Make a video from the generated images
  • -upscale - 4x upscale the generated images with ESRGAN
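
Putting several of these options together, an example invocation might look like the following; the prompt and values are purely illustrative, and -upscale is assumed to be a simple switch:

python CLIP_VQGAN.py -texts "a sunset over the ocean" -width 512 -height 512 -seed 42 -max_iterations 400 -upscale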

Additional information

Work in progress...
