MinSang Baek's repositories

License:CC-BY-SA-4.0Stargazers:0Issues:0Issues:0

wesep

Target Speaker Extraction Toolkit

Stargazers:0Issues:0Issues:0

DENSE

ICASSP 2025: Dynamic Embedding Causal Target Speech Extraction

Stargazers:0Issues:0Issues:0

Target-Conversation-Extraction

This is the code and dataset repo for the Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamics"

License:NOASSERTIONStargazers:0Issues:0Issues:0

Apollo

A music repair method that converts lossy MP3-compressed music to lossless music.

Stargazers:0Issues:0Issues:0

Stable-Hybrid-Auditory-Filterbanks

Official Implementation of Interspeech 2024 Paper "Hold Me Tight: Stable Encoder-Decoder Design for Speech Enhancement"

License:BSD-3-Clause-ClearStargazers:0Issues:0Issues:0

pykaldi

A Python wrapper for Kaldi

License:Apache-2.0Stargazers:0Issues:0Issues:0
License:MITStargazers:0Issues:0Issues:0

webMUSHRA

A MUSHRA-compliant experiment software based on the Web Audio API

License:NOASSERTIONStargazers:0Issues:0Issues:0

wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

License:Apache-2.0Stargazers:0Issues:0Issues:0

NOTSOFAR1-Challenge

NOTSOFAR-1 Challenge: Distant Diarization and ASR

License:MITStargazers:0Issues:0Issues:0

speech_evaluation

A toolkit dedicated to speech evaluation.

License:Apache-2.0Stargazers:0Issues:0Issues:0

tf-locoformer

Transformer with Local Modeling by Convolution for Speech Separation and Enhancement

License:Apache-2.0Stargazers:0Issues:0Issues:0

PySDR

PySDR.org textbook source material, feel free to post issues/PRs

License:NOASSERTIONStargazers:0Issues:0Issues:0

penn

Pitch Estimating Neural Networks (PENN)

License:MITStargazers:0Issues:0Issues:0

X-TF-GridNet

The implementation of "X-TF-GridNet: A Time-Frequency Domain Target Speaker Extraction Network with Adaptive Speaker Embedding Fusion", accepted by Information Fusion.

Stargazers:0Issues:0Issues:0

peerRTF

robust RTFs by GCN

Stargazers:0Issues:0Issues:0

ears_dataset

Expressive Anechoic Recordings of Speech (EARS)

License:NOASSERTIONStargazers:0Issues:0Issues:0

SepReformer

Official repository of SepReformer for speech separation

Stargazers:0Issues:0Issues:0

torchcrepe

PyTorch implementation of the CREPE pitch tracker

License:MITStargazers:0Issues:0Issues:0

AudioDec

An Open-source Streaming High-fidelity Neural Audio Codec

License:NOASSERTIONStargazers:0Issues:0Issues:0

se-scaling

Model configurations for scaling SE models in the paper "Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement"

Stargazers:0Issues:0Issues:0

silero-vad

Python Wrapper of Silero VAD

License:MITStargazers:0Issues:0Issues:0

SEtrain

A training code template for DNN-based speech enhancement.

Stargazers:0Issues:0Issues:0

BERP

The PyTorch implementation of BERP: A Blind Estimator of Room acoustic and physical Parameters

License:GPL-3.0Stargazers:0Issues:0Issues:0

gtcrn

The official implementation of GTCRN, an ultra-lite speech enhancement model.

License:MITStargazers:0Issues:0Issues:0

ddsp

DDSP: Differentiable Digital Signal Processing

License:Apache-2.0Stargazers:0Issues:0Issues:0