A PyTorch implementation of DNN-based source separation.
- v0.6.1: Add modules.
Module | Reference | Done |
---|---|---|
Depthwise-separable convolution | | ✔ |
Gated Linear Units | | ✔ |
FiLM (Feature-wise Linear Modulation) | FiLM: Visual Reasoning with a General Conditioning Layer | ✔ |
PoCM (Point-wise Convolutional Modulation) | LaSAFT: Latent Source Attentive Frequency Transformation for Conditioned Source Separation | ✔ |
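
As a concrete illustration of the conditioning modules above, FiLM modulates each feature channel with an affine transform (a scale and a shift) predicted from an external conditioning embedding. The following is a minimal sketch, assuming hypothetical names (`FiLM`, `num_features`, `embed_dim`); it is not this repository's implementation.

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Minimal FiLM layer: per-channel affine modulation conditioned on an
    external embedding. Illustrative sketch, not this repository's API."""
    def __init__(self, num_features, embed_dim):
        super().__init__()
        self.to_gamma = nn.Linear(embed_dim, num_features)  # predicts per-channel scale
        self.to_beta = nn.Linear(embed_dim, num_features)   # predicts per-channel shift

    def forward(self, x, condition):
        # x: (batch, channels, time), condition: (batch, embed_dim)
        gamma = self.to_gamma(condition).unsqueeze(-1)  # (batch, channels, 1)
        beta = self.to_beta(condition).unsqueeze(-1)    # (batch, channels, 1)
        return gamma * x + beta

x = torch.randn(4, 64, 100)  # feature maps
c = torch.randn(4, 16)       # conditioning embedding (e.g. a target-source query)
y = FiLM(num_features=64, embed_dim=16)(x, c)
print(y.shape)  # torch.Size([4, 64, 100])
```

PoCM follows the same conditioning idea but predicts the weights of a point-wise (1x1) convolution instead of a single scale and shift per channel.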
Method | Reference | Done |
---|---|---|
Permutation invariant training (PIT) | Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks | ✔ |
One-and-rest PIT | Recursive Speech Separation for Unknown Number of Speakers | ✔ |
Probabilistic PIT | Probabilistic Permutation Invariant Training for Speech Separation | |
Sinkhorn PIT | Towards Listening to 10 People Simultaneously: An Efficient Permutation Invariant Training of Audio Source Separation Using Sinkhorn's Algorithm | ✔ |
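
The common idea behind these methods is to resolve the output-to-target assignment ambiguity by optimizing over source permutations. Below is a minimal sketch of utterance-level PIT with a negative SI-SNR objective; the helper names (`si_snr`, `pit_negative_si_snr`) and the brute-force permutation search are illustrative assumptions, not this repository's implementation.

```python
import itertools
import torch

def si_snr(estimate, target, eps=1e-8):
    # Scale-invariant SNR in dB; inputs are (batch, n_sources, time).
    estimate = estimate - estimate.mean(dim=-1, keepdim=True)
    target = target - target.mean(dim=-1, keepdim=True)
    scale = (estimate * target).sum(dim=-1, keepdim=True) \
        / (target.pow(2).sum(dim=-1, keepdim=True) + eps)
    projection = scale * target
    noise = estimate - projection
    return 10 * torch.log10(
        projection.pow(2).sum(dim=-1) / (noise.pow(2).sum(dim=-1) + eps) + eps
    )

def pit_negative_si_snr(estimates, targets):
    # Utterance-level PIT: try every assignment of estimates to targets
    # and keep the permutation with the lowest loss for each utterance.
    n_sources = estimates.size(1)
    losses = []
    for perm in itertools.permutations(range(n_sources)):
        loss = -si_snr(estimates[:, list(perm), :], targets).mean(dim=1)  # (batch,)
        losses.append(loss)
    losses = torch.stack(losses, dim=1)  # (batch, n_permutations)
    best_loss, _ = losses.min(dim=1)
    return best_loss.mean()

estimates = torch.randn(4, 2, 16000)
targets = torch.randn(4, 2, 16000)
print(pit_negative_si_snr(estimates, targets))
```

One-and-rest PIT and Sinkhorn PIT avoid this exhaustive search, which grows factorially with the number of sources, by using recursive separation and a differentiable relaxation of the permutation search, respectively.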
LibriSpeech example using Conv-TasNet

You can check other tutorials in `<REPOSITORY_ROOT>/egs/tutorials/`.
Prepare the LibriSpeech dataset:

```sh
cd <REPOSITORY_ROOT>/egs/tutorials/common/
. ./prepare_librispeech.sh --dataset_root <DATASET_DIR> --n_sources <#SPEAKERS>
```
Train Conv-TasNet:

```sh
cd <REPOSITORY_ROOT>/egs/tutorials/conv-tasnet/
. ./train.sh --exp_dir <OUTPUT_DIR>
```
If you want to resume training:

```sh
. ./train.sh --exp_dir <OUTPUT_DIR> --continue_from <MODEL_PATH>
```
Evaluate the trained model:

```sh
cd <REPOSITORY_ROOT>/egs/tutorials/conv-tasnet/
. ./test.sh --exp_dir <OUTPUT_DIR>
```
Run the demo:

```sh
cd <REPOSITORY_ROOT>/egs/tutorials/conv-tasnet/
. ./demo.sh
```
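
For orientation, the sketch below shows the tensor convention of a Conv-TasNet-style separator (a single-channel mixture in, stacked source estimates out). It uses torchaudio's `ConvTasNet` with random weights and placeholder file paths purely for illustration; it is not this repository's demo code, and in practice the weights would come from the training step above.

```python
import torch
import torchaudio

# Randomly initialized Conv-TasNet from torchaudio, used only to show the
# expected input/output shapes; swap in your own trained separator.
model = torchaudio.models.ConvTasNet(num_sources=2)
model.eval()

# "mixture.wav" is a placeholder for a single-channel mixture file.
mixture, sample_rate = torchaudio.load("mixture.wav")  # (1, num_frames)

with torch.no_grad():
    estimates = model(mixture.unsqueeze(0))  # (1, num_sources, num_frames)

# Write each estimated source to its own file.
for idx, source in enumerate(estimates.squeeze(0)):
    torchaudio.save(f"estimated_{idx}.wav", source.unsqueeze(0), sample_rate)
```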