felixfuu's starred repositories
IP-Adapter
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
VAR
[GPT beats diffusion] [scaling laws in visual generation] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
style-aligned
Official code for "Style Aligned Image Generation via Shared Attention"
mPLUG-DocOwl
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
cv-arxiv-daily
Automatically Update CV Papers Daily using GitHub Actions (Update Every 12 Hours)
infinite-zoom-automatic1111-webui
Infinite zoom effect extension for AUTOMATIC1111's webui - Stable Diffusion
MimicBrush
Official implementation of the paper "Zero-shot Image Editing with Reference Imitation"
InstructCV
[ICLR 2024] Official Codebase for "InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists"
Prompt-Diffusion
Official PyTorch implementation of the paper "In-Context Learning Unlocked for Diffusion Models"
ReMoDiffuse
ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
scaling_on_scales
When do we not need larger vision models?
MS-Diffusion
Official implementation of MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
llmblueprint
[ICLR 2024] Official code for the paper "LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts"
Awesome-Open-Vocabulary-Detection-and-Segmentation
Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future