Yuan-ManX

Yuan-Man's repositories

ai-audio-datasets

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.

MIT · 395 12 1

audio-development-tools

A list of sound, audio, and music development tools covering machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis, and more.

MIT · 264 110

SouPyX

SouPyX: An Audio Exploration Space.🪐

Language: Python · MIT · 31 2 2

audio-ai-agent

Here we track the latest audio AI agents, covering speech, music, sound effects, and more.

MIT · 11 20

audio-ai-timeline

A timeline of the latest AI models for audio generation, starting in 2023!

5 10

AniPortrait

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Language: Python · Apache-2.0 · 2 10

artnex

ArtNex is a deep learning framework exploring the innovative fusion of art and technology.

Language: Python · MIT · 2 30

riffusion

Stable diffusion for real-time music generation

Language: Python · MIT · 2 10

audio-preprocess

Preprocess Audio for training

Language: Python · Apache-2.0 · 1 10

game-engine

Explore Game Engine Tools! 🚀

MIT · 1 20

multi-clip

Connecting text, images, audio, and video!

MIT · 1 20

ollama

Get up and running with Llama 2, Mistral, and other large language models locally.

Language: Go · MIT · 1 10

open-tts-tracker

1 10

Retrieval-based-Voice-Conversion-WebUI

Voice data <= 10 mins can also be used to train a good VC model!

Language: Python · MIT · 1 10

speechtoolkit

[EARLY PUBLIC ALPHA] A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, voice activity detection, and more!

Language: Python · 1 10

ComfyUI-AudioScheduler

Language: Python · GPL-3.0 · 010

GPT-SoVITS-GUI

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python · MIT · 010

llama3

The official Meta Llama 3 GitHub site

Language: Python · NOASSERTION · 010

mechaAI

MIT · 020

MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

Language: Python · MIT · 010

metavoice-src

AI for human-level speech intelligence

Language: Python · Apache-2.0 · 010

NexEngine

NexEngine Game Engine 🚀

MIT · 020

open-interpreter

A natural language interface for computers

Language: Python · AGPL-3.0 · 010

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model), but we have only limited resources. We sincerely hope the whole open-source community can contribute to this project.

Language: Jupyter Notebook · NOASSERTION · 010