Beast code in Giters

houxuedong's starred repositories

ChatTTS

ChatTTS is a generative speech model for daily dialogue.

Language: Jupyter Notebook, License: NOASSERTION, 1626000

MusePose

MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation

Language: Python, License: NOASSERTION, 122900

V-Express

V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.

Language: Python, 128000

syncnet_python

Out of time: automated lip sync in the wild

Language: Python, License: MIT, 61200

ViViD

ViViD: Video Virtual Try-on using Diffusion Models

License: MIT, 24400

LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Language: Jupyter Notebook, License: NOASSERTION, 40000

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python, License: Apache-2.0, 3320500

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python, License: Apache-2.0, 378300

Yi-1.5

Yi-1.5 is an upgraded version of Yi, delivering stronger performance in coding, math, reasoning, and instruction-following capability.

License: Apache-2.0, 29900

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python, License: MIT, 121000

LAION-Face

The human face subset of LAION-400M for large-scale face pretraining.

Language: Python, 25300

facer

Face analysis tools for modern research, equipped with state-of-the-art Face Parsing and Face Alignment

Language: Python, License: MIT, 28600

InstantStyle

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥

Language: Jupyter Notebook, 133700

sep

Code release for "Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models" https://arxiv.org/abs/2402.03659

Language: Python, 5500

custom-diffusion

Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)

Language: Python, License: NOASSERTION, 179500

DiffSHEG

[CVPR'24] DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation

Language: Python, License: BSD-3-Clause, 7300

StoryImager

StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion

License: MIT, 2700

MoMA

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

Language: Jupyter Notebook, 10900

ml-hugs

Official repository of HUGS: Human Gaussian Splats (CVPR 2024)

Language: Python, License: NOASSERTION, 8700

SiTH

[CVPR 2024] SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion

Language: Python, License: MIT, 6100

Official implementation of “GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting” by Kyusun Cho, Joungbin Lee, Heeji Yoon, Yeobin Hong, Jaehoon Ko, Sangjun Ahn and Seungryong Kim

Language: Python, License: NOASSERTION, 15500

diff-sampler

[CVPR-2024, Highlight, Top 2.8%] Official implementation for "Fast ODE-based Sampling for Diffusion Models in Around 5 Steps".

Language: Python, License: Apache-2.0, 5600

pegasus

Official Repository for CVPR 2024 paper PEGASUS: Personalized Generative 3D Avatars with Composable Attributes

Language: Jupyter Notebook, License: Apache-2.0, 3600

Parts2Whole

[Arxiv 2024] From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation

Language: Python, License: MIT, 13000

FlashFace

Language: Python, License: MIT, 23300

PuLID

Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Language: Python, License: Apache-2.0, 84700
