MinSang Baek's repositories
3D-Speaker
A repository for single- and multi-modal speaker verification, speaker recognition and speaker diarization.
AudioDec
An Open-source Streaming High-fidelity Neural Audio Codec
audioseal
Localized watermarking for AI-generated speech audio, with state-of-the-art robustness and a very fast detector
BERP
The PyTorch implementation of BERP: A Blind Estimator of Room acoustic and physical Parameters
DDDM-VC
Official PyTorch implementation of "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)
ddsp
DDSP: Differentiable Digital Signal Processing
DeepWaveDOA
ICASSP 2024: Robust DOA estimation from deep acoustic imaging
ears_dataset
Expressive Anechoic Recordings of Speech (EARS)
generative-ai-for-beginners
18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
gtcrn
The official implementation of GTCRN, an ultra-lite speech enhancement model.
NOTSOFAR1-Challenge
NOTSOFAR-1 Challenge: Distant Diarization and ASR
peerRTF
Robust RTFs by GCN
penn
Pitch Estimating Neural Networks (PENN)
pykaldi
A Python wrapper for Kaldi
PySDR
PySDR.org textbook source material, feel free to post issues/PRs
se-scaling
Model configurations for scaling SE models in the paper "Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement"
SepReformer
Official repository of SepReformer for speech separation
SEtrain
A training code template for DNN-based speech enhancement.
silero-vad
Python Wrapper of Silero VAD
speech_evaluation
A toolkit dedicated to speech evaluation.
tf-locoformer
Transformer with Local Modeling by Convolution for Speech Separation and Enhancement
torchcrepe
PyTorch implementation of the CREPE pitch tracker
webMUSHRA
MUSHRA-compliant experiment software based on the Web Audio API
wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit