There are 132 repositories under the voice-conversion topic.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Easily train a good VC model with voice data <= 10 mins!
SoftVC VITS Singing Voice Conversion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
so-vits-svc fork with realtime support, improved interface and more features.
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
A simple, high-quality voice conversion tool focused on ease of use and performance.
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
This is now the official location of the Merlin project.
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
A simple AI voice toolbox | A user-friendly audio toolkit for voice recognition, voice transcription, voice conversion, etc.
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
The code for the bark-voicecloning model. Training and inference.
Deep learning for audio processing
Unsupervised Speech Decomposition Via Triple Information Bottleneck
Voice Conversion Tool Kit
Voice Conversion by CycleGAN (voice cloning / voice conversion): CycleGAN-VC2
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
Voice Converter Using CycleGAN and Non-Parallel Data
This is a PyTorch implementation of the paper: StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
A simple JavaFX-based desktop program for subtitle processing, with integrated online translation and voice conversion
The dataset of Speech Recognition
A list of papers and projects on cutting-edge Speech Synthesis, Text-to-Speech (TTS), Singing Voice Synthesis (SVS), Voice Conversion (VC), Singing Voice Conversion (SVC), and related interesting works (such as Music Synthesis, Automatic Music Transcription, Automatic MOS Prediction, SSL-based ASR, etc.).
Audio style transfer with a shallow, random-parameter CNN.