Xinsheng Wang's repositories

S2IGAN

PyTorch code for S2IGAN

ICASSP2021_paper_list-VC

ICASSP 2021 accepted papers on voice conversion (VC)

Tacotron-pytorch

Tacotron series TTS models implemented in PyTorch

Language: Python | License: MIT | Stargazers: 9 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 7 | Issues: 0 | Issues: 0

Tacotron2_batch_inference

PyTorch Tacotron 2 implementation that supports batch inference

Language: Python | License: BSD-3-Clause | Stargazers: 3 | Issues: 0 | Issues: 0

ObamaNet_Pytorch

PyTorch implementation of ObamaNet

Language: Jupyter Notebook | Stargazers: 2 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 1 | Issues: 0 | Issues: 0

No-audio-speech-detection

Code for the No-audio Speech Detection task in MediaEval 2020

Language: Python | Stargazers: 1 | Issues: 1 | Issues: 0
Language: HTML | Stargazers: 1 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 1 | Issues: 0 | Issues: 0

Word-boundary-discovery

Word boundary discovery in continuous speech signals

Language: Jupyter Notebook | Stargazers: 1 | Issues: 0 | Issues: 0

academic-kickstart

📝 Easily create a beautiful website using Academic, Hugo, and Netlify

Language: Shell | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0

ddsp

DDSP: Differentiable Digital Signal Processing

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0
Language: HTML | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

edx-SRS

Microsoft edX speech recognition course

Stargazers: 0 | Issues: 0 | Issues: 0

face-landmark-frontalization

Rotate 3D face landmarks to a frontal view

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0

FECNet_extractor

Uses FECNet to extract facial expression features

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

first-order-model

This repository contains the source code for the paper First Order Motion Model for Image Animation

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

glow-tts

A Generative Flow for Text-to-Speech via Monotonic Alignment Search

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

ICASSP2021_paper_list-TTS

TTS papers accepted to ICASSP 2021

Stargazers: 0 | Issues: 0 | Issues: 0

Interspeech2021_submissions_TTS_and_VC

Papers submitted to Interspeech 2021 on text-to-speech (TTS) and voice conversion (VC)

Stargazers: 0 | Issues: 0 | Issues: 0

Kaldi-Tutorial

Kaldi beginner tutorial

Stargazers: 0 | Issues: 0 | Issues: 0

PortaSpeech

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

spectacular-oregano-dc2d0

Jamstack site created with Stackbit

Language: JavaScript | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

videoprocess

CN-Celeb, a large-scale Chinese celebrities dataset published by the Center for Speech and Language Technology (CSLT) at Tsinghua University

Stargazers: 0 | Issues: 0 | Issues: 0

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

wesing

An open-source high-quality Mandarin singing voice synthesis corpus

Stargazers: 0 | Issues: 0 | Issues: 0

xinshengwang.github.io

Personal webpage

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0