Yuan-Man's repositories
ai-audio-datasets
AI Audio Datasets (AI-ADS) 🎵: speech, music, and sound-effects datasets that can provide training data for generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
audio-development-tools
A list of sound, audio, and music development tools covering machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis, and more.
audio-ai-agent
Tracks the latest audio AI agents, covering speech, music, sound effects, and more.
audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
audio-preprocess
Preprocess Audio for training
game-engine
Explore Game Engine Tools! 🚀
multi-clip
Connecting text, images, audio, and video!
Retrieval-based-Voice-Conversion-WebUI
Even 10 minutes or less of voice data can be used to train a good VC model!
speechtoolkit
[EARLY PUBLIC ALPHA] A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, voice activity detection, and more!
GPT-SoVITS-GUI
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
metavoice-src
AI for human-level speech intelligence
open-interpreter
A natural language interface for computers
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model), but we have only limited resources. We sincerely hope the whole open-source community can contribute to this project.
speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
spleeter
Deezer source separation library including pretrained models.
VisionProTeleop
visionOS app + Python library to stream head / wrist / finger tracking data from Vision Pro to any robot.