MinSang Baek's repositories
audiomentations
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
audiosocket
Simple bidirectional audio protocol
awesome-python-scientific-audio
Curated list of Python software and packages related to scientific research in audio
DeepSpeech
DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
FN-SSL
PyTorch implementation of "FN-SSL: Full-Band and Narrow-Band Fusion for Sound Source Localization." [INTERSPEECH 2023]
FQSE
Fully Quantized Neural Networks For Speech Enhancement
FullSubNet-plus
The official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement".
LPCNet
Efficient neural speech synthesis
ml-spatial-librispeech
A large synthetic dataset of spatial audio with multiple labels
MULTI-AUDIODEC
This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.
NeMo
NeMo: a toolkit for conversational AI
NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
nussl
A flexible source separation library in Python
open-unmix-pytorch
Open-Unmix - Music Source Separation for PyTorch
pulse
A PyTorch implementation of "Audio signal enhancement with learning from positive and unlabelled data"
pydiogment
Python library for audio augmentation
SC-Wind-Noise-Generator
Generate synthetic wind noise signals based on a wind speed profile.
speechmetrics
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
SRMRpy
Python implementation of the SRMR toolbox
TDANet
An efficient speech separation method
torch-pesq
PyTorch implementation of the Perceptual Evaluation of Speech Quality for wideband audio