mnfutao's repositories
AdaIN-VC
An unofficial implementation of the paper "One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization".
AttentionBasedProsodyPrediction
Encoder-decoder and attention-based prosody prediction
autovc-official
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
autovc-unofficial_tw
An unofficial implementation of the paper "AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss".
Comprehensive-Transformer-TTS
A non-autoregressive Transformer-based TTS, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.
dvector
Speaker embedding (d-vector) trained with GE2E loss
Expressive-FastSpeech2
PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS (text to speech, speech synthesis) based on FastSpeech2, supporting English and Korean
FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
FragmentVC
Any-to-any voice conversion by end-to-end extracting and fusing fine-grained voice fragments with attention
fullstop-deep-punctuation-prediction
A model that predicts the punctuation of English, Italian, French and German texts.
g2p
g2p: English Grapheme To Phoneme Conversion
GitHub-Chinese-Top-Charts
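:cn: GitHub Chinese top charts, with separate "Software | Resources" lists for each language to pinpoint good Chinese projects. Take what you need and learn efficiently.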
Model_Fusion_Based_Prosody_Prediction
Model Fusion Based Prosody Prediction
Prosody_Prediction
Predict prosody labels for Chinese sentences.
punctuation_prediction
Chinese sentence punctuation prediction.
pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
PyTSMod
An open-source Python library for audio time-scale modification.
voicefixer
General Speech Restoration
VQMIVC
Official implementation of VQMIVC: One-shot Voice Conversion @ Interspeech 2021
whisper
Robust Speech Recognition via Large-Scale Weak Supervision
ZMM-TTS
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations