Yuan-ManX

Shanghai, China

ym1076302261@163.com

Yuan-Man's repositories

ai-audio-datasets

AI Audio Datasets 🎵. A list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.

ai-game-devtools

Here we will keep track of the latest AI Game Development Tools, including LLM, Agent, Code, Writer, Image, Texture, Shader, 3D Model, Animation, Video, Audio, Music, Singing Voice and Analytics. 🔥

audio-development-tools

This is a list of sound, audio and music development tools which contains machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis and more.

MIT 254 100

ai-agent-roadmap

Explore the latest AI Agent Framework!

MIT 30 50

ComfyUI-Tools-Roadmap

Here we will track the latest development tools for ComfyUI, including Image, Mesh, Texture, Animation, Video, Audio, 3D Model, and more!🔥

MIT 22 30

ai-multimodal-timeline

Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Agent, Audio, Image, Video, Music and 3D content. 🔥

MIT 800

ai-audio-startups

Community list of startups working with AI in audio and music technology

Apache-2.0 3 10

Awesome-ChatTTS

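Awesome-ChatTTS collects and summarizes frequently asked questions and related resources for the ChatTTS project. It is the best getting-started guide for ChatTTS.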

NOASSERTION 200

Yuan-ManX

MIT 2 20

ai-voice-agents

AI Voice Agents: Exploring the Next Generation of Human-Machine Interaction! 🎙️🤖🎧

MIT 100

ChatTTS

ChatTTS is a generative speech model for daily dialogue.

Language: Python NOASSERTION 100

Comfyui-MusePose

NOASSERTION 100

elevenlabs-examples

MIT 100

HunyuanDiT

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

NOASSERTION 100

llm-app-stack

1 10

MultiRAG

MIT 100

MuseTalk

MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

NOASSERTION 100

Omost

Your image is almost there!

Language: Python Apache-2.0 100

AudioLLM

Audio Large Language Models

000

awesome-ssm-ml

Reading list for research topics in state-space models

MIT 000

ComfyUI_examples

Examples of ComfyUI workflows

Language: HTML 000

friendly-stable-audio-tools

Refactored and updated version of `stable-audio-tools`, the open-source code for audio/music generative models originally by Stability AI.

MIT 000

LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

MIT 000

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python MIT 000

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python Apache-2.0 000

PyChuck

MIT 000

ragoon

Improve large language model (LLM) retrieval using dynamic web search, based on blazingly fast query generation from Groq chips ⚡

Apache-2.0 000

Scrapegraph-ai

Python scraper based on AI

MIT 000

SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

Language: Python 000

XTTSv2

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python MPL-2.0 000