wangyang199609

wangyang199609's repositories

audio-visual-speech-enhancement

Official Implementation of "Visual Speech Enhancement", Interspeech 2018.

Language: Python

awesome-speech-recognition-speech-synthesis-papers

Speech synthesis, voice conversion, self-supervised learning, music generation, automatic speech recognition, speaker verification, language modeling

License: MIT

bsseval

Audio source separation evaluation metrics

Language: Python, License: MIT

CodingInterviewChinese2

Source code for the second edition of "剑指Offer" (Coding Interviews)

Language: C++, License: NOASSERTION

kaldi

This is the official location of the Kaldi project.

License: NOASSERTION

MultimodalAnalysis_SpeakerDiarization

This project tackles speaker diarization using audio features, face recognition, video features extracted from face images, and mouth tracking.

phasen

An unofficial PyTorch implementation of Microsoft's PHASEN

Language: Python

Speech-measure-SDR-SAR-STOI-PESQ

Speech quality measures (SDR, SAR, STOI, ESTOI, PESQ) in MATLAB

SpEx

Implementation of "SpEx: Multi-Scale Time Domain Speaker Extraction Network".

License: CC0-1.0

SpEx_Plus

SpEx+ (tied) source code

Language: Python

VGGVox

VGGVox models for Speaker Identification and Verification trained on the VoxCeleb (1 & 2) datasets