Charlottecuc

Xiaomin Tang's repositories

adaptive_voice_conversion

Language: Python | License: Apache-2.0 | 010

AlignTTS

Implementation of AlignTTS

Language: Jupyter Notebook | License: MIT | 010

Audio-Effects

Collection of audio effects plugins implemented from the explanations in the book "Audio Effects: Theory, Implementation and Application" by Joshua D. Reiss and Andrew P. McPherson.

Language: C++ | 010

cotatron

Official code for Cotatron @ INTERSPEECH 2020

License: BSD-3-Clause | 000

Cross-Lingual-Voice-Cloning

Tacotron 2 - PyTorch implementation with faster-than-realtime inference, modified to enable cross-lingual voice cloning.

License: BSD-3-Clause | 000

dl-for-emo-tts

A summary of our attempts at using deep learning approaches for emotional text-to-speech

License: MIT | 000

DurIAN

Implementation of the paper "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf).

License: BSD-3-Clause | 000

DurIAN-1

Implementation of "DurIAN: Duration Informed Attention Network For Multimodal Synthesis".

000

espnet

End-to-End Speech Processing Toolkit

License: Apache-2.0 | 000

flowtron

Auto-regressive flow-based generative network for text to speech synthesis

License: Apache-2.0 | 000

glow-tts

A Generative Flow for Text-to-Speech via Monotonic Alignment Search

License: MIT | 000

Glow_TTS

An implementation of the GlowTTS model. Several modes are added: speaker embedding, prosody encoder (GST), and gradient reversal.

License: MIT | 000

hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

License: MIT | 000

hmm-for-emo-tts

A repository with comprehensive instructions for using the Festvox toolkit to generate emotional speech from text

License: MIT | 000

LPCNet_parallel

Simulation of parallel synthesis with LPCNet vocoder

000

ModulateAgoraDemo

A demo integration between Modulate's Voice Skin SDK and Agora's Voice Chat SDK

000

multi-speaker-tacotron

VCTK multi-speaker tacotron for ICASSP 2020

License: BSD-3-Clause | 000

multi_emotional_tacotron

License: MIT | 000

Multilingual_Text_to_Speech

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

License: MIT | 000

OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

License: Apache-2.0 | 000

python-audio-effects

Apply audio effects such as reverb and EQ directly to audio files or NumPy ndarrays.

License: MIT | 000

python-pinyin

Convert Chinese characters to pinyin (pypinyin)

License: MIT | 000

pytorch-dc-tts

Text to Speech with PyTorch (English and Mongolian)

License: MIT | 000

pytorch_xvectors

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

License: MIT | 000

semi-tts

Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation

License: MIT | 000

sonic-annotator

Batch tool for feature extraction and annotation of audio files using Vamp plugins

License: GPL-2.0 | 000

speedyspeech

License: BSD-3-Clause | 000

tacotron2

Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow

License: BSD-3-Clause | 000

tacotron_pytorch

PyTorch implementation of Tacotron speech synthesis model.

License: NOASSERTION | 000

valgrind

Experimental Version of Valgrind for macOS 10.14.6 Mojave and 10.15.4 Catalina (10.15.x NOT WORKING)

License: GPL-2.0 | 000