xinshengwang

Delft

https://xinshengwang.github.io/

Xinsheng Wang's repositories

S2IGAN

PyTorch code for S2IGAN

Language: Python | 40 6 6

ICASSP2021_paper_list-VC

Papers accepted to ICASSP 2021 on voice conversion (VC)

Tacotron-pytorch

Tacotron-series TTS models implemented in PyTorch

Language: Python | License: MIT | 900

Show-and-Speak

Language: Jupyter Notebook | 700

Tacotron2_batch_inference

PyTorch Tacotron 2 implementation that supports batch inference

Language: Python | License: BSD-3-Clause | 300

ObamaNet_Pytorch

PyTorch implementation of ObamaNet

Language: Jupyter Notebook | 200

Machine-Learning

Language: Jupyter Notebook | 100

No-audio-speech-detection

Code for the No-audio Speech Detection task at MediaEval 2020

Language: Python | 1 10

opencpop

Language: HTML | 100

Speech-word-embedding

Language: Python | 100

Word-boundary-discovery

Word boundary discovery in continuous speech signals

Language: Jupyter Notebook | 100

academic-kickstart

📝 Easily create a beautiful website using Academic, Hugo, and Netlify

Language: Shell | License: MIT | 000

Automatic-Prosody-Annotation

000

ddsp

DDSP: Differentiable Digital Signal Processing

License: Apache-2.0 | 000

denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, both stationary and non-stationary, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.

License: NOASSERTION | 000

dissertation

Language: HTML | License: NOASSERTION | 000

edx-SRS

Microsoft edX speech recognition course

000

face-landmark-frontalization

Rotate 3D face landmarks to front

Language: Jupyter Notebook | 000

FECNet_extractor

FECNet to extract facial expression features

Language: Python | 000

first-order-model

This repository contains the source code for the paper First Order Motion Model for Image Animation

License: NOASSERTION | 000

glow-tts

A Generative Flow for Text-to-Speech via Monotonic Alignment Search

License: MIT | 000

ICASSP2021_paper_list-TTS

TTS papers accepted to ICASSP 2021

000

Interspeech2021_submissions_TTS_and_VC

Papers submitted to Interspeech 2021 on text-to-speech (TTS) and voice conversion (VC)

000

Kaldi-Tutorial

Kaldi beginner tutorial

000

PortaSpeech

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

License: MIT | 000

spectacular-oregano-dc2d0

Jamstack site created with Stackbit

Language: JavaScript | License: NOASSERTION | 000

videoprocess

CN-Celeb, a large-scale Chinese celebrities dataset published by Center for Speech and Language Technology (CSLT) at Tsinghua University.

000

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

License: MIT | 000

wesing

An open-source high-quality Mandarin singing voice synthesis corpus

000

xinshengwang.github.io

Personal webpage

Language: JavaScript | License: MIT | 000