There are 108 repositories under the speech-synthesis topic.
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Easy-to-use Speech Toolkit including Self-Supervised Learning models, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won the NAACL 2022 Best Demo Award.
so-vits-svc fork with realtime support, improved interface and more features.
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese and German, and is easy to adapt to other languages)
An Open Source text-to-speech system built by inverting Whisper.
Foundational model for human-like, expressive TTS
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
Lingvo
WaveNet vocoder
An open-source ChatGPT app with a voice
DeepMind's Tacotron-2 TensorFlow implementation
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
Kalliope is a framework that will help you to create your own personal assistant.
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
This is now the official location of the Merlin project.
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing