joanrod / ocr-vqgan

OCR-VQGAN is a discrete image encoder (tokenizer and detokenizer) for figure images in the Paper2Fig100k dataset. It implements an OCR perceptual loss for clear text-within-image generation. Forked from VQGAN in CompVis/taming-transformers.

Home Page: https://arxiv.org/abs/2210.11248

joanrod/ocr-vqgan Stargazers