Xiao Yu's repositories
awsome-distributed-training
Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS.
champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
ComfyUI
A powerful and modular stable diffusion GUI with a graph/nodes interface.
ComfyUI-IF_AI_tools
ComfyUI-IF_AI_tools is a set of custom nodes for ComfyUI that allows you to generate prompts using a local Large Language Model (LLM) via Ollama. This tool enables you to enhance your image generation workflow by leveraging the power of language models.
ComfyUI-Whisper
Transcribe audio and add subtitles to videos using Whisper in ComfyUI, licensed under CC BY-NC-SA 4.0
ComfyUI_StoryDiffusion
You can use StoryDiffusion in ComfyUI
CosyVoice
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
EasyAnimate
An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
edit-one-for-all
Edit One for All: Interactive Batch Image Editing (CVPR 2024)
how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
IDM-VTON
IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild
IMAGDressing
IMAGDressing: Interactive Modular Apparel Generation for Virtual Dressing
InstanceDiffusion
[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"
Intelli-Agent
Chatbot Portal with Agent: Streamlined Workflow for Building Agent-Based Applications
lectures
Material for cuda-mode lectures
LivePortrait
Bring portraits to life!
mamba
Mamba SSM architecture
MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
MIGC
[CVPR 2024 Highlight] "MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis" (Official Implementation)
MimicBrush
Official implementations for paper: Zero-shot Image Editing with Reference Imitation
MotionClone
Official implementation of MotionClone: Training-Free Motion Cloning for Controllable Video Generation
PuLID
Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment
StoryDiffusion
Create Magic Story!
StoryGen
[CVPR 2024] Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models
TPD
This is the official repository for the paper "Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On". CVPR 2024
VAR
[GPT beats diffusion] [scaling laws in visual generation] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
VideoCrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
zest_code
This is the official implementation of ZeST