bruinxiong

Xiong Lin's repositories

AniTalker

Apache-2.0

awesome-3d-diffusion

A collection of papers on diffusion models for 3D generation.

MIT

bark

🔊 Text-Prompted Generative Audio Model

MIT

champ

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

Apache-2.0

CuMo

CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

Apache-2.0

FinePOSE_CVPR2024

FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models

MIT

FunClip

Open-source, accurate and easy-to-use video clipping tool

Language: Python, License: MIT

HPT

HPT - Open Multimodal LLMs from HyperGAI

Apache-2.0

HunyuanDiT

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

NOASSERTION

IC-Light

More relighting!

Language: Python, License: Apache-2.0

ID-Animator

Language: Python

Latte

Latte: Latent Diffusion Transformer for Video Generation.

Apache-2.0

leptonai

A Pythonic framework to simplify AI service building

Apache-2.0

lerobot

🤗 LeRobot: State-of-the-art Machine Learning for Real-World Robotics in Pytorch

Language: Python, License: Apache-2.0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Apache-2.0

MagicDance

[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion

OpenLRM

An open-source impl. of Large Reconstruction Models

Apache-2.0

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

AGPL-3.0

RADIO

Official repository for "AM-RADIO: Reduce All Domains Into One"

NOASSERTION

Rip-NeRF

Rip-NeRF: Anti-aliasing Radiance Fields with Ripmap-Encoded Platonic Solids

Language: Python

RoHM

The official PyTorch code for RoHM: Robust Human Motion Reconstruction via Diffusion.

NOASSERTION

SadTalker

[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Language: Python, License: NOASSERTION

SEED-X

Multimodal Models in Real World

Language: Jupyter Notebook, License: NOASSERTION

spad

Code for SPAD: Spatially Aware Multiview Diffusers, CVPR 2024

StoryDiffusion

Create Magic Story!

Apache-2.0

SwissArmyTransformer

SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.

Apache-2.0

SyncTalk

[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"

NOASSERTION

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python, License: MIT

VideoCrafter

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

NOASSERTION

VideoMV

VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model

MIT