wangyang199609's repositories
audio-visual-speech-enhancement
Official Implementation of "Visual Speech Enhancement", Interspeech 2018.
awesome-speech-recognition-speech-synthesis-papers
Speech synthesis, voice conversion, self-supervised learning, music generation, automatic speech recognition, speaker verification, language modeling
bsseval
Audio source separation evaluation metrics
CodingInterviewChinese2
Source code for the second edition of 《剑指Offer》 (Coding Interviews)
kaldi
This is the official location of the Kaldi project.
MultimodalAnalysis_SpeakerDiarization
This project tackles speaker diarization using audio features, face recognition, video features extracted from face images, and mouth tracking.
phasen
An unofficial PyTorch implementation of Microsoft's PHASEN
Speech-measure-SDR-SAR-STOI-PESQ
Speech quality measures (SDR, SAR, STOI, ESTOI, PESQ) in MATLAB
SpEx
Implementation of "SpEx: Multi-Scale Time Domain Speaker Extraction Network".
SpEx_Plus
SpEx+(tied) source code
VGGVox
VGGVox models for Speaker Identification and Verification trained on the VoxCeleb (1 & 2) datasets