There are 67 repositories under the speech-synthesis topic.
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Easy-to-use Speech Toolkit including Self-Supervised Learning models, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won the NAACL 2022 Best Demo Award.
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German; easy to adapt to other languages)
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
Lingvo
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
WaveNet vocoder
DeepMind's Tacotron-2 TensorFlow implementation
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java
PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Kalliope is a framework that will help you to create your own personal assistant.
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
This is now the official location of the Merlin project.
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
A JavaScript library for voice control, voice commands, speech recognition, and speech synthesis. Create your own Siri, Google Now, or Cortana with Google Chrome within your website.
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
An open-source implementation of a sequence-to-sequence-based speech processing engine
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
Flowtron is an auto-regressive flow-based generative network for text-to-speech synthesis with control over speech variation and style transfer
GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis
A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)