VAR: a new visual generation method that elevates GPT-style models beyond diffusion🚀, with scaling laws observed📈
This is the official PyTorch implementation of Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction.
NOTE: Mark your calendars📅! Our code will be ready before 9:00 AM UTC on 4/4/2024. Feel free to star ⭐ or watch 👓 for the latest updates🤗!
Visual Autoregressive Modeling (VAR) redefines autoregressive learning on images as coarse-to-fine "next-scale prediction" (or "next-resolution prediction"), diverging from the standard raster-scan "next-token prediction".
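To make the contrast concrete, here is a minimal, illustrative sketch of the two generation orders. It is not the VAR implementation: the helper names `raster_scan_steps` and `next_scale_steps` and the scale schedule `[1, 2, 4]` are assumptions made up for this example, and no model is involved. It only lists which token positions each autoregressive step would produce.

```python
def raster_scan_steps(side):
    """Next-token prediction: one step per token, row by row."""
    return [[(r, c)] for r in range(side) for c in range(side)]


def next_scale_steps(sides):
    """Next-scale prediction: one step per scale; each step emits a
    whole side x side token map, from coarse to fine."""
    return [[(r, c) for r in range(s) for c in range(s)] for s in sides]


# Hypothetical scale schedule, chosen only for illustration.
sides = [1, 2, 4]

raster = raster_scan_steps(sides[-1])
multi = next_scale_steps(sides)

print(len(raster))                    # 16 steps for a 4x4 token map
print(len(multi))                     # 3 steps, one per scale
print([len(step) for step in multi])  # [1, 4, 16] tokens per step
```

In this toy setup the raster-scan order needs one step per token, while the coarse-to-fine order needs one step per scale, with every token of a scale produced together in that step.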
For a deep dive into our analyses, discussions, and evaluations, check out our paper.
This project is licensed under the MIT License - see the LICENSE file for details.
If our work assists your research, feel free to give us a star ⭐ or cite us using:
@Article{VAR,
      title={Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction},
      author={Keyu Tian and Yi Jiang and Zehuan Yuan and Bingyue Peng and Liwei Wang},
      year={2024},
      eprint={2404.02905},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}