xunnew's repositories
gaussian-splatting
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
lerobot
🤗 LeRobot: State-of-the-art Machine Learning for Real-World Robotics in PyTorch
3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
StoryDiffusion
Create Magic Story!
audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio
IDM-VTON
IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild
OpenPromptStudio
🥣 Visual prompt editor for AIGC | OPS | Open Prompt Studio
metahuman-stream
Real-time streaming digital human based on NeRF
MagicTime
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
pytorch3d
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
DynamiCrafter
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
GPT-SoVITS
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
VirtualWife
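VirtualWife is a virtual digital human project. It is still in the incubation stage, and many areas need optimization. The author wants to create a virtual digital human with her own "soul", someone you can get to know like a friend. The author hopes virtual digital humans will become part of human life, serving as relationship coaches and psychological counselors to meet people's emotional needs.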
MeloTTS
High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
ComfyUI-Diffusers
A custom node for ComfyUI that lets you use the Hugging Face Diffusers module with ComfyUI. Stream Diffusion is also available.
Fay
Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guides, broadcasters, assistants, waiters, teachers, and voice or text-based mobile assistants.
OOTDiffusion
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
AnimateDiff
Official implementation of AnimateDiff.
Fooocus
Focus on prompting and generating
edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
DigiHuman
Automatic 3D Character animation using Pose Estimation and Landmark Generation techniques
Orion
Orion-14B is a family of models that includes a multilingual foundation LLM with 14 billion parameters and a series of derived models: a chat model, a long-context model, a quantized model, a RAG fine-tuned model, and an Agent fine-tuned model.
FFmpeg
Mirror of https://git.ffmpeg.org/ffmpeg.git
anything-llm
Open-source ChatGPT experience for LLMs, embedders, and vector databases. Unlimited documents, messages, and concurrent users with permission management in one app.