Zhendong Wang's starred repositories
diffusion-forcing
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
Awesome-Text-to-Image
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
InstanceDiffusion
[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"
RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
DiffusionDPO
Code for "Diffusion Model Alignment Using Direct Preference Optimization"
GaussianCube
GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling
img2img-turbo
One-step image-to-image with Stable Diffusion turbo: sketch2image, day2night, and more
RectifiedFlow
Official Implementation of Rectified Flow (ICLR 2023 Spotlight)
sd-forge-layerdiffuse
[WIP] Layer Diffusion for WebUI (via Forge)
Awesome-Controllable-T2I-Diffusion-Models
A collection of resources on controllable generation with text-to-image diffusion models.
fastcomposer
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
MaskTextSpotterV3
The code of "Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting"
img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.