There are 256 repositories under the voice-cloning topic.
Clone a voice in 5 seconds to generate arbitrary speech in real-time
One minute of voice data can also be used to train a good TTS model! (few-shot voice cloning)
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
A Python/PyTorch app for easily synthesising human voices
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
A web UI for various audio-related neural networks
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
Wunjo AI: Synthesize & clone voices in English, Russian & Chinese, real-time speech recognition, deepfake face & lips animation, face swap with one photo, change video by text prompts, segmentation, and retouching. Open-source, local & free.
PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)
The code for the bark-voicecloning model. Training and inference.
Voice Conversion by CycleGAN (voice cloning/voice conversion): CycleGAN-VC2
This repository contains an implementation of "Neural Voice Cloning With Few Samples"
Phoneme multilingual (Russian-English) voice cloning based on
Tacotron 2 - PyTorch implementation with faster-than-realtime inference, modified to enable cross-lingual voice cloning.
Implementation of the "Neural Voice Cloning with Few Samples" research paper by Baidu
One-shot voice cloning based on Unet-TTS
[WIP] VoiceSmith makes training text to speech models easy.
A simple Google Colab notebook which can translate an original video into multiple languages along with lip sync.
A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech
A program to dub non-English media with modern AI speech synthesis, diarization, and voice cloning!
This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time. SV2TTS is a three-stage deep learning framework that makes it possible to create a numerical representation of a voice from a few seconds of audio and to use it to condition a text-to-speech model trained to generalize to new voices.
A guide to cloning anyone's voice and using it as a text-to-speech voice on Android
The best-looking and most functional web UI for RVC-related tasks. See website for UI demo:
Voice Conversion by CycleGAN (voice cloning/voice conversion): CycleGAN-VC3
XTTSv2 Extension for oobabooga text-generation-webui
A ComfyUI custom node for GPT-SoVITS! You can now do voice cloning and TTS in ComfyUI.
🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning
This is sample code for an Alexa skill that uses realistic voice cloning powered by Resemble AI's text-to-speech API and OpenAI's GPT-3 engine.
Pandrator aspires to be a user-friendly app with a graphical interface and a one-click installer that creates high-quality speech from text in multiple languages (audiobooks, speech synchronised with subtitles and more) using local models (XTTS, Silero or VoiceCraft), plus voice cloning, LLM pre-processing, RVC enhancement, and automatic evaluation
Takes a YouTube video, clones the voice, and re-creates that video in a different language
MimicMania is a web application that allows you to generate speech and clone voices using text-to-speech technology. With MimicMania, you can create custom voices in a variety of languages and use them for a range of applications, from voiceovers to chatbots.
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io