There are 60 repositories under the speech-separation topic.
A PyTorch-based Speech Toolkit
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
The PyTorch-based audio source separation toolkit for researchers
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Unofficial PyTorch implementation of Google AI's VoiceFilter system
Must-read papers on neural-network-based speech separation
A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permutation Invariant Training (PIT).
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.
This repo summarizes tutorials, datasets, papers, code, and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.
A PyTorch implementation of "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation"
A PyTorch implementation of "Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation"
Datasets for speech recognition
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Deep Recurrent Neural Networks for Source Separation
Deep-learning-based speech source separation using PyTorch
Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features, which enables more efficient separation of sources from mixtures.
Two-talker speech separation with LSTM/BLSTM using Permutation Invariant Training (PIT).
A PyTorch implementation of DNN-based source separation.
A PyTorch implementation of "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" (see recipes in aps framework https://github.com/funcwj/aps)
Speech Enhancement based on DNN (Spectral-Mapping, TF-Masking), DNN-NMF, NMF
Deep neural network (DNN) for noise reduction, removal of background music, and speech separation
Executable code based on Google articles
A framework for quick testing and comparing multi-channel speech enhancement and separation methods, such as DSB, MVDR, LCMV, GEVD beamforming and ICA, FastICA, IVA, AuxIVA, OverIVA, ILRMA, FastMNMF.
A personal toolkit for single/multi-channel speech recognition & enhancement & separation.
A PyTorch implementation of "Deep Clustering: Discriminative Embeddings for Segmentation and Separation"
A PyTorch implementation of Time-domain Audio Separation Network (TasNet) with Permutation Invariant Training (PIT) for speech separation.
Deep clustering method for single-channel speech separation
Speech separation with utterance-level PIT experiments
An unofficial PyTorch implementation of Google's VoiceFilter
A script to calculate SNR and SDR using Python
Based on funcwj's uPIT, with training code rewritten to support multi-GPU and a restructured DataLoader.
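
Several of the repositories listed above rely on utterance-level Permutation Invariant Training (uPIT) and on SNR-style metrics. The following is a minimal sketch of the idea, not the code of any listed repository: it scores every assignment of estimated sources to reference sources with a plain SNR in dB and keeps the best one. The function names `snr_db` and `upit_snr` are hypothetical, and using plain SNR (rather than SDR or SI-SNR) as the criterion is an assumption made for brevity.

```python
import itertools

import numpy as np


def snr_db(reference, estimate, eps=1e-8):
    """Plain signal-to-noise ratio in dB between a reference and an estimate."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    noise = reference - estimate
    # eps guards against division by zero and log of zero.
    return 10.0 * np.log10((np.sum(reference ** 2) + eps) / (np.sum(noise ** 2) + eps))


def upit_snr(references, estimates):
    """Return (best mean SNR, best permutation) over all source assignments.

    references, estimates: arrays of shape (n_sources, n_samples).
    best permutation[i] is the index of the estimate matched to reference i.
    The search is exhaustive, so it is only practical for a few sources.
    """
    n_sources = len(references)
    best_score, best_perm = -np.inf, None
    for perm in itertools.permutations(range(n_sources)):
        score = np.mean(
            [snr_db(references[i], estimates[p]) for i, p in enumerate(perm)]
        )
        if score > best_score:
            best_score, best_perm = score, perm
    return best_score, best_perm
```

In training, the negative of the best score would typically serve as the loss, so the network is not penalized for emitting the speakers in a different order than the labels.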