Repositories under the fastspeech2 topic:
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-time state-of-the-art speech synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German; easy to adapt to other languages)
PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)
Chinese Mandarin text-to-speech using FastSpeech 2, implemented in PyTorch, with WaveGlow as the vocoder, trained on the BiaoBei and AISHELL-3 datasets
A non-autoregressive Transformer-based text-to-speech model, supporting a family of SOTA Transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming toward the ultimate TTS
Trained further on the BiaoBei dataset, with improvements to the original FastSpeech2 model: prosody representations and a prosody prediction module are introduced to make Mandarin speech more vivid and rhythmic
PyTorch implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
AdaSpeech: Adaptive Text to Speech for Custom Voice
A non-autoregressive end-to-end text-to-speech model (text-to-wav), supporting a family of SOTA unsupervised duration modeling approaches. This project grows with the research community, aiming toward the ultimate E2E-TTS
An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for Custom Voice"
Multi-Speaker PyTorch FastSpeech2: Fast and High-Quality End-to-End Text to Speech :fist:
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
An implementation of FastSpeech2 based on PyTorch.
Use FastSpeech2 and HiFi-GAN to easily perform end-to-end Korean speech synthesis.
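Most of the repositories above share the same two-stage inference pattern: an acoustic model (e.g. FastSpeech2) maps text or phoneme IDs to a mel spectrogram, and a neural vocoder (e.g. HiFi-GAN) turns the spectrogram into a waveform. A minimal sketch of that pattern, using stand-in functions rather than any repository's actual API:

```python
import numpy as np

# Stand-in acoustic model: phoneme IDs -> mel spectrogram.
# A real FastSpeech2 also predicts per-phoneme duration, pitch,
# and energy; here each phoneme is expanded to a fixed 5 frames.
def acoustic_model(phoneme_ids, n_mels=80, frames_per_phoneme=5):
    n_frames = len(phoneme_ids) * frames_per_phoneme
    rng = np.random.default_rng(0)
    return rng.standard_normal((n_mels, n_frames))

# Stand-in vocoder: mel spectrogram -> waveform.
# HiFi-GAN upsamples each mel frame to hop_length audio samples.
def vocoder(mel, hop_length=256):
    n_frames = mel.shape[1]
    return np.zeros(n_frames * hop_length, dtype=np.float32)

phonemes = [12, 7, 33, 4]       # toy phoneme ID sequence
mel = acoustic_model(phonemes)  # shape (80, 20)
audio = vocoder(mel)            # shape (20 * 256,) = (5120,)
```

The key design point is the interface between the stages: as long as the vocoder accepts mel spectrograms with the same number of bins and hop length the acoustic model was trained with, the two components can be swapped independently (WaveGlow, HiFi-GAN, and Parallel WaveGAN all fill the vocoder slot in the repositories listed here).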
Refactored version of https://github.com/ming024/FastSpeech2
A TensorFlow implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
This repository contains the code for the main part of my master's thesis at Politecnico di Torino in Data Science & Engineering
Multi-speaker FastSpeech2 applicable to Korean, with detailed descriptions of training and synthesis.
An Android application that allows visually impaired people to hear which bus lines are passing next to them.
Clean and modernized implementation of FastSpeech2/LightSpeech using IPA
Aligning latent space of speaking style with human perception using a re-embedding strategy
An Android application that acts as a speaking assistant for hearing-impaired people.
This repository accompanies my MSc thesis for the Voice Technology degree, storing all referenced data and other relevant resources.
Created as part of the project "Speech Technologies in Indian Languages". About Indic TTS: a project developing text-to-speech (TTS) synthesis systems for Indian languages, improving synthesis quality, and building small-footprint TTS integrated with disability aids and other applications.
Convert images to audio using ViT, GPT, and FastSpeech