hifi-gan

There are 24 repositories under the hifi-gan topic.

open-mmlab / Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
audio-generation audio-synthesis audioldm audit fastspeech2 hifi-gan music-generation naturalspeech2 singing-voice-conversion speech-synthesis text-to-audio text-to-speech vall-e vits voice-conversion
Language:Python 3995
jik876 / hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
deep-learning gan hifi-gan pytorch speech-synthesis text-to-speech tts vocoder
Language:Python 1775
keonlee9420 / PortaSpeech
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
text-to-speech normalizing-flows generative-model deep-neural-networks pytorch tts speech-synthesis neural-tts non-autoregressive portable-tts vae fastspeech hifi-gan non-ar mel-gan high-quality
Language:Python 328
keonlee9420 / Comprehensive-Transformer-TTS
A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate TTS
comprehensive deep-learning fastspeech fastspeech2 hifi-gan mel-gan multi-speaker neural-tts non-ar non-autoregressive pytorch single-speaker sota speech-synthesis supervised text-to-speech transformer tts ultimate-tts unsupervised
Language:Python 314
keonlee9420 / DiffGAN-TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
text-to-speech deep-neural-networks pytorch tts speech-synthesis generative-model ddpm diffusion neural-tts non-autoregressive diffspeech diffgan-tts gan non-ar hifi-gan diffusion-models fastspeech multi-speaker-tts single-speaker-tts
Language:Python 297
NTT123 / vietTTS
Vietnamese Text to Speech library
tts-engines deep-learning tacotron vocoder hifi-gan vietnam vietnamese text-to-speech
Language:Python 184
keonlee9420 / Comprehensive-E2E-TTS
A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate E2E-TTS
deep-learning end-to-end fastspeech2 hifi-gan jets multi-speaker neural-tts non-ar non-autoregressive pytorch single-speaker sota speech-synthesis text-to-speech text-to-wav tts ultimate-tts unsupervised
Language:Python 140
rishikksh20 / Avocodo-pytorch
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
hifi-gan speech-synthesis text-to-speech tts vocoder pytorch avocodo gan generative-adversarial-network
Language:Python 114
nipponjo / tts-arabic-pytorch
TTS models for Arabic (Tacotron2, FastPitch)
arabic hifi-gan pytorch tacotron2 tacotron2-pytorch text-to-speech torchaudio tts python deep-learning fastpitch speech speech-synthesis arabic-tts
Language:Jupyter Notebook 59
Voice-Privacy-Challenge / Voice-Privacy-Challenge-2022
Baseline Recipe for VoicePrivacy Challenge 2022: anonymization systems and evaluation software
anonymization speaker-recognition asr privacy-protection voice-privacy-challenge voice-privacy voice-anonymization privacy attack-model anonymization-metrics metrics de-identification asv speech-recognition voice-conversion mcadams speech-synthesis speech-processing hifi-gan kaldi
Language:Python 57
keonlee9420 / Comprehensive-Tacotron2
PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single- and multi-speaker TTS and several techniques to improve the robustness and efficiency of the model.
text-to-speech tts tacotron tacotron2 pytorch speech-synthesis autoregressive single-speaker multi-speaker robustness efficiency comprehensive neural-tts mel-gan hifi-gan reduction-factor diagonal-guided-attention deep-learning
Language:Python 42
hwRG / End-to-End-TTS-Fine-Tune
Use FastSpeech2 and HiFi-GAN to easily perform end-to-end Korean speech synthesis.
end-to-end fastspeech2 fine-tune hifi-gan tts
Language:Python 24
NTT123 / hifigan-tpu
Train HiFi-GAN on TPU
gan hifi-gan jax pax text-to-speech tts vocoder
Language:Python 10
jik876 / hifi-gan-demo
Audio samples from "HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis"
speech-synthesis tts hifi-gan text-to-speech deep-learning gan
Language:HTML 9
manhph2211 / ViTTS
In this repo, I developed a step-by-step pipeline for a standard multi-speaker text-to-speech system. In general, I used PortaSpeech as the acoustic model and iSTFTNet as the vocoder...
hifi-gan portaspeech text-to-speech mfa deepspeech istftnet realtime-tts mosnet vietnamese-tts multispeaker-speech-synthesis speech-synthesis normalizing-flow vocoder vietnamese-text-to-speech
Language:Python 9
ssmlkl / MnTTS2
This is the experimental description of MnTTS2.
tts fastspeech2 hifi-gan mongolian multi-speaker-tts
Language:Jupyter Notebook 7
nipponjo / tts-german-pytorch
TTS (FastPitch) for German
deep-learning fastpitch german hifi-gan python pytorch speech speech-synthesis text-to-speech torchaudio tts german-language emotional-speech
Language:Python 5
34j / neural-source-filter
Python package for NSF and NSF-HiFi-GAN (unofficial)
hifi-gan nsf tts vocoder voice-conversion neural-source-filter python pytorch mypy
Language:Python 4
mehdihosseinimoghadam / Catalan-Text-to-Speech
Catalan Text to Speech
catalan catalan-language catalan-text-to-speech fastspeech hifi-gan melgan pytorch speech speech-processing speech-synthesis speech-to-text tacotron tacotron2 tacotron2-pytorch wavernn
Language:Python 2
watchstep / glow-tts-jejueo
Jeju language speech synthesis (work in progress)
tts glow-tts jejueo hifi-gan korean
Language:Jupyter Notebook 2
claire-1125 / POSCO_Academy_AI_Project
POSCO Youth AI & Big Data Academy - AI Project
pytorch faceswap hifi-gan stylegan2-ada first-order-motion-model wav2lip glow-tts
Language:Jupyter Notebook 1
khaykingleb / HiFi-GAN
Vocoder for TTS
gan hifi-gan pytorch tts vocoder
Language:Python 1
lordzuko / SpeakingStyle
Aligning latent space of speaking style with human perception using a re-embedding strategy
fastspeech2 pytorch speaking-style speech-synthesis hifi-gan pytorch-distributeddataparallel vocoder blizzard-challenge
Language:Jupyter Notebook 1
hwRG / HiFi-GAN-Pytorch
If you have a wav file and a transcript, you can train HiFi-GAN right now.
gan hifi-gan tts vocoder
Language:Python 0