Xiaomin Tang's repositories
Audio-Effects
Collection of audio effects plugins implemented from the explanations in the book "Audio Effects: Theory, Implementation and Application" by Joshua D. Reiss and Andrew P. McPherson.
cotatron
Official code for Cotatron @ INTERSPEECH 2020
Cross-Lingual-Voice-Cloning
Tacotron 2 - PyTorch implementation with faster-than-realtime inference, modified to enable cross-lingual voice cloning.
dl-for-emo-tts
:computer: :robot: A summary of our attempts at using Deep Learning approaches for Emotional Text to Speech :speaker:
DurIAN
Implementation of the paper "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf).
DurIAN-1
Implementation of "DurIAN: Duration Informed Attention Network For Multimodal Synthesis".
espnet
End-to-End Speech Processing Toolkit
flowtron
Auto-regressive flow-based generative network for text to speech synthesis
glow-tts
A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Glow_TTS
An implementation of the GlowTTS model. Several modes are added: speaker embedding, prosody encoder (GST), and gradient reversal.
hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
hmm-for-emo-tts
:computer: A repository with comprehensive instructions for using the Festvox toolkit for generating Emotional speech :speaker: from text
LPCNet_parallel
Simulation of parallel synthesis with LPCNet vocoder
ModulateAgoraDemo
A demo integration between Modulate's Voice Skin SDK and Agora's Voice Chat SDK
multi-speaker-tacotron
VCTK multi-speaker tacotron for ICASSP 2020
Multilingual_Text_to_Speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
OpenSeq2Seq
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
python-audio-effects
Apply audio effects such as reverb and EQ directly to audio files or NumPy ndarrays.
python-pinyin
Convert Chinese characters to pinyin (pypinyin)
pytorch-dc-tts
Text to Speech with PyTorch (English and Mongolian)
pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
semi-tts
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation
sonic-annotator
Batch tool for feature extraction and annotation of audio files using Vamp plugins
tacotron2
Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow
tacotron_pytorch
PyTorch implementation of Tacotron speech synthesis model.
valgrind
Experimental Version of Valgrind for macOS 10.14.6 Mojave and 10.15.4 Catalina (10.15.x NOT WORKING)