Xiong Lin's repositories
awesome-3d-diffusion
A collection of papers on diffusion models for 3D generation.
bark
🔊 Text-Prompted Generative Audio Model
champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
CuMo
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
FinePOSE_CVPR2024
FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models
FunClip
Open-source, accurate and easy-to-use video clipping tool
HPT
HPT - Open Multimodal LLMs from HyperGAI
HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
IC-Light
More relighting!
Latte
Latte: Latent Diffusion Transformer for Video Generation.
leptonai
A Pythonic framework to simplify AI service building
lerobot
🤗 LeRobot: State-of-the-art Machine Learning for Real-World Robotics in PyTorch
LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
MagicDance
[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion
OpenLRM
An open-source implementation of Large Reconstruction Models
PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
RADIO
Official repository for "AM-RADIO: Reduce All Domains Into One"
Rip-NeRF
Rip-NeRF: Anti-aliasing Radiance Fields with Ripmap-Encoded Platonic Solids
RoHM
The official PyTorch code for RoHM: Robust Human Motion Reconstruction via Diffusion.
SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
SEED-X
Multimodal Models in Real World
spad
Code for SPAD: Spatially Aware Multiview Diffusers, CVPR 2024
StoryDiffusion
Create Magic Story!
SwissArmyTransformer
SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.
SyncTalk
[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"
unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
VideoCrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
VideoMV
VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model