There are 248 repositories under the voice-cloning topic.
Clone a voice in 5 seconds to generate arbitrary speech in real time
As little as 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won the NAACL 2022 Best Demo Award.
A Python/PyTorch app for easily synthesising human voices
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
A web UI for various audio-related neural networks
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
Wunjo AI: Synthesize & clone voices in English, Russian & Chinese, real-time speech recognition, deepfake face & lips animation, face swap with one photo, change video by text prompts, segmentation, and retouching. Open-source, local & free.
PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)
The code for the bark-voicecloning model. Training and inference.
Voice Conversion by CycleGAN (voice cloning/voice conversion): CycleGAN-VC2
This repository contains an implementation of "Neural Voice Cloning With Few Samples"
Phoneme multilingual (Russian-English) voice cloning based on
Tacotron 2 - PyTorch implementation with faster-than-real-time inference, modified to enable cross-lingual voice cloning.
Implementation of the Baidu research paper "Neural Voice Cloning with Few Samples"
☺️ One-shot voice cloning based on Unet-TTS
[WIP] VoiceSmith makes training text-to-speech models easy.
A simple Google Colab notebook which can translate an original video into multiple languages along with lip sync.
This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real time. SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
A guide to cloning anyone's voice and using it as a text-to-speech voice on Android
A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech
The best looking and most functional webui for RVC related tasks. See website for UI demo:
A program to dub non-English media with modern AI speech synthesis, diarization, and voice cloning!
Voice Conversion by CycleGAN (voice cloning/voice conversion): CycleGAN-VC3
XTTSv2 Extension for oobabooga text-generation-webui
This is sample code for an Alexa skill that uses realistic voice cloning powered by Resemble AI's text-to-speech API and OpenAI's GPT-3 engine.
🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning
Takes a YouTube video, clones the voice, and re-creates that video in a different language
MimicMania is a web application that allows you to generate speech and clone voices using text-to-speech technology. With MimicMania, you can create custom voices in a variety of languages and use them for a range of applications, from voiceovers to chatbots.
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io
Korean TTS using Coqui TTS (Glow-TTS and Multi-band MelGAN)
Generative voice cloning model using TTS with state-of-the-art zero-shot multi-speaker functionality. A web API built with the YourTTS model to clone voices and generate realistic audio waveforms