houxuedong's starred repositories

Language: Python | Stargazers: 59 | Issues: 0
Stargazers: 185 | Issues: 0

ChatTTS

ChatTTS is a generative speech model for daily dialogue.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 16260 | Issues: 0

MusePose

MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation

Language: Python | License: NOASSERTION | Stargazers: 1229 | Issues: 0

V-Express

V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.

Language: Python | Stargazers: 1280 | Issues: 0

syncnet_python

Out of time: automated lip sync in the wild

Language: Python | License: MIT | Stargazers: 612 | Issues: 0

ViViD

ViViD: Video Virtual Try-on using Diffusion Models

License: MIT | Stargazers: 244 | Issues: 0
Language: Python | License: MIT | Stargazers: 58 | Issues: 0

LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 400 | Issues: 0

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 33205 | Issues: 0

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 3783 | Issues: 0

Yi-1.5

Yi-1.5 is an upgraded version of Yi, delivering stronger performance in coding, math, reasoning, and instruction-following capability.

License: Apache-2.0 | Stargazers: 299 | Issues: 0

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python | License: MIT | Stargazers: 1210 | Issues: 0

LAION-Face

The human face subset of LAION-400M for large-scale face pretraining.

Language: Python | Stargazers: 253 | Issues: 0

facer

Face analysis tools for modern research, equipped with state-of-the-art Face Parsing and Face Alignment

Language: Python | License: MIT | Stargazers: 286 | Issues: 0

InstantStyle

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥

Language: Jupyter Notebook | Stargazers: 1337 | Issues: 0
Language: Python | Stargazers: 27 | Issues: 0

sep

Code release for "Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models" https://arxiv.org/abs/2402.03659

Language: Python | Stargazers: 55 | Issues: 0

custom-diffusion

Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)

Language: Python | License: NOASSERTION | Stargazers: 1795 | Issues: 0

DiffSHEG

[CVPR'24] DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation

Language: Python | License: BSD-3-Clause | Stargazers: 73 | Issues: 0

StoryImager

StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion

License: MIT | Stargazers: 27 | Issues: 0

MoMA

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

Language: Jupyter Notebook | Stargazers: 109 | Issues: 0

ml-hugs

Official repository of HUGS: Human Gaussian Splats (CVPR 2024)

Language: Python | License: NOASSERTION | Stargazers: 87 | Issues: 0

SiTH

[CVPR 2024] SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion

Language: Python | License: MIT | Stargazers: 61 | Issues: 0

GaussianTalker

Official implementation of “GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting” by Kyusun Cho, Joungbin Lee, Heeji Yoon, Yeobin Hong, Jaehoon Ko, Sangjun Ahn and Seungryong Kim

Language: Python | License: NOASSERTION | Stargazers: 155 | Issues: 0

diff-sampler

[CVPR-2024, Highlight, Top 2.8%] Official implementation for "Fast ODE-based Sampling for Diffusion Models in Around 5 Steps".

Language: Python | License: Apache-2.0 | Stargazers: 56 | Issues: 0

pegasus

Official Repository for CVPR 2024 paper PEGASUS: Personalized Generative 3D Avatars with Composable Attributes

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 36 | Issues: 0

Parts2Whole

[Arxiv 2024] From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation

Language: Python | License: MIT | Stargazers: 130 | Issues: 0
Language: Python | License: MIT | Stargazers: 233 | Issues: 0

PuLID

Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Language: Python | License: Apache-2.0 | Stargazers: 847 | Issues: 0