Fu Guanyu's starred repositories
audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
VITS-BigVGAN-SpanPSP-Chinese
A PyTorch-based VITS-BigVGAN Chinese TTS model, with an added prosody prediction model.
DiffGAN-TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
MB-iSTFT-VITS
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
audiomentations
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
TTS-frontend
TTS-frontend with BERT and CRF/LSTM (for Tacotron)
WeTextProcessing
Text Normalization & Inverse Text Normalization
ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
chinese_text_normalization
Chinese text normalization for speech processing
speech_dataset
A dataset for speech recognition
chinese_speech_pretrain
Chinese speech pretrained models
PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL 2022 Best Demo Award.
forced-alignment-tools
A collection of links and notes on forced alignment tools