Xinsheng Wang's repositories
xinshengwang.github.io
Personal webpage
chinese-xinhua
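:orange_book: Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.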
wesing
An open-source high-quality Mandarin singing voice synthesis corpus
ddsp
DDSP: Differentiable Digital Signal Processing
spectacular-oregano-dc2d0
Jamstack site created with Stackbit
PortaSpeech
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
Tacotron-pytorch
Tacotron-series TTS models implemented in PyTorch
FECNet_extractor
FECNet to extract facial expression features
vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques, applied directly to the raw waveform, that further improve model performance and generalization.
Interspeech2021_submissions_TTS_and_VC
Papers submitted to Interspeech 2021 on text-to-speech (TTS) and voice conversion (VC)
ICASSP2021_paper_list-VC
Voice conversion (VC) papers accepted to ICASSP 2021
ICASSP2021_paper_list-TTS
TTS papers accepted to ICASSP 2021
face-landmark-frontalization
Rotate 3D face landmarks to a frontal view
ObamaNet_Pytorch
PyTorch implementation of ObamaNet
first-order-model
This repository contains the source code for the paper First Order Motion Model for Image Animation
glow-tts
A Generative Flow for Text-to-Speech via Monotonic Alignment Search
edx-SRS
Microsoft edX speech recognition course
No-audio-speech-detection
Code for the No-audio Speech Detection task in MediaEval 2020
Kaldi-Tutorial
Kaldi beginner's tutorial
Word-boundary-discovery
Word boundary discovery in continuous speech signals
Tacotron2_batch_inference
PyTorch Tacotron2 implementation that supports batch inference
academic-kickstart
📝 Easily create a beautiful website using Academic, Hugo, and Netlify