Yuan-Man's repositories
ai-audio-datasets
AI Audio Datasets 🎵. A list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
ai-game-devtools
Here we will keep track of the latest AI Game Development Tools, including LLM, Agent, Code, Writer, Image, Texture, Shader, 3D Model, Animation, Video, Audio, Music, Singing Voice and Analytics. 🔥
audio-development-tools
A list of sound, audio, and music development tools covering machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis, and more.
ai-agent-roadmap
Explore the latest AI Agent Frameworks!
ComfyUI-Tools-Roadmap
Here we will track the latest development tools for ComfyUI, including Image, Mesh, Texture, Animation, Video, Audio, 3D Model, and more!🔥
ai-multimodal-timeline
Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Agent, Audio, Image, Video, Music and 3D content. 🔥
ai-audio-startups
Community list of startups working with AI in audio and music technology
Awesome-ChatTTS
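Awesome-ChatTTS collects and summarizes frequently asked questions and related resources for the ChatTTS project; it is the best getting-started guide to ChatTTS.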
ai-voice-agents
AI Voice Agents: Exploring the Next Generation of Human-Machine Interaction! 🎙️🤖🎧
HunyuanDiT
Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
AudioLLM
Audio Large Language Models
awesome-ssm-ml
Reading list for research topics in state-space models
ComfyUI_examples
Examples of ComfyUI workflows
friendly-stable-audio-tools
Refactored and updated version of `stable-audio-tools`, the open-source code for audio/music generative models originally by Stability AI.
LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
ragoon
Improve large language model (LLM) retrieval using dynamic web search based on blazingly fast query generation from Groq chips ⚡
Scrapegraph-ai
Python scraper based on AI
SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
XTTSv2
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production