NSNet

This is an implementation of NSNet [1] in PyTorch and PyTorch Lightning. NSNet is a recurrent neural network for single-channel speech enhancement. It was implemented as part of my thesis for the Master in Electrical Engineering at Ghent University.

Prerequisites

torch 1.4
pytorch_lightning 0.7.6
torchaudio 0.4
soundfile 0.10.3.post1

How to train

A dataset containing both clean speech and corresponding noisy speech (i.e. clean speech with noise added) is required.

Running train_nn.py starts the training.

The train_dir variable should contain the path to a folder with two subfolders, clean and noisy, which hold the clean WAV files and the noisy WAV files respectively. Each noisy WAV file must have the same filename as its corresponding clean WAV file, optionally followed by a suffix delimited by +, e.g. clean01.wav → clean01+noise.wav.
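The naming convention above can be illustrated with a small sketch. This is not the code used by train_nn.py; the helper names clean_name_for and pair_files are hypothetical, and the sketch assumes the subfolders are named exactly clean and noisy and that files use a lowercase .wav extension.

```python
from pathlib import Path


def clean_name_for(noisy_name: str) -> str:
    """Map a noisy filename to the clean filename it should correspond to.

    Everything from the first '+' up to the extension is treated as the
    optional suffix; a name without '+' maps to itself.
    """
    path = Path(noisy_name)
    stem = path.stem.split("+", 1)[0]  # drop the optional '+suffix'
    return stem + path.suffix


def pair_files(data_dir):
    """Return sorted (clean_path, noisy_path) pairs found under data_dir.

    Assumes data_dir contains 'clean' and 'noisy' subfolders (hypothetical
    helper, not part of this repository). Noisy files without a matching
    clean file are skipped.
    """
    root = Path(data_dir)
    pairs = []
    for noisy_path in sorted((root / "noisy").glob("*.wav")):
        clean_path = root / "clean" / clean_name_for(noisy_path.name)
        if clean_path.is_file():
            pairs.append((clean_path, noisy_path))
    return pairs
```

With this sketch, several noisy versions of one recording (for example clean01+babble.wav and clean01+street.wav) would all be paired with the same clean01.wav.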

The val_dir variable follows the same convention, but points to the folder used for validation.

How to test

Running test_nn.py produces the denoised output WAV files.

testing_dir should point to a folder with the same structure as train_dir and val_dir.
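Before running either script, it may help to verify that a dataset folder follows the expected structure. The following is a minimal sketch of such a check; check_dataset_dir is a hypothetical helper that is not part of this repository, and it assumes subfolders named clean and noisy with lowercase .wav files.

```python
from pathlib import Path


def check_dataset_dir(data_dir):
    """Return a list of problems found; an empty list means the layout looks fine.

    Hypothetical helper: checks that 'clean' and 'noisy' subfolders exist and
    that every noisy file has a clean counterpart under the '+' suffix rule.
    """
    root = Path(data_dir)
    problems = []

    # Both subfolders must exist before filenames can be compared.
    for sub in ("clean", "noisy"):
        if not (root / sub).is_dir():
            problems.append(f"missing folder: {sub}")
    if problems:
        return problems

    clean_names = {p.name for p in (root / "clean").glob("*.wav")}
    for noisy in sorted((root / "noisy").glob("*.wav")):
        # Strip the optional '+suffix' to get the expected clean filename.
        expected = noisy.stem.split("+", 1)[0] + noisy.suffix
        if expected not in clean_names:
            problems.append(f"no clean file {expected} for {noisy.name}")
    return problems
```

Calling check_dataset_dir on the folders assigned to train_dir, val_dir and testing_dir and printing any returned messages gives a quick sanity check before a long run.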

Acknowledgements

[1] Y. Xia, S. Braun, C. K. A. Reddy, H. Dubey, R. Cutler, and I. Tashev, “Weighted Speech Distortion Losses for Neural-network-based Real-time Speech Enhancement,” arXiv:2001.10601 [cs, eess], Feb. 2020.
