EHNet

This is an implementation of EHNet [1] in PyTorch and PyTorch Lightning. EHNet is a convolutional-recurrent neural network for single-channel speech enhancement.

Prerequisites

torch 1.4
pytorch_lightning 0.7.6
torchaudio 1.4
soundfile 0.10.3.post1

How to train

A dataset containing both clean speech and the corresponding noisy speech (i.e. clean speech with noise added) is required. Three notebooks are included to generate this dataset from a dataset of clean speech recordings and noise recordings.
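To make "clean speech with noise added" concrete, here is a minimal sketch of mixing a noise recording into a clean recording at a chosen signal-to-noise ratio. It is not taken from the included notebooks, which may work differently; the function name `mix_at_snr`, the use of NumPy, and the choice to loop the noise to the length of the clean signal are all assumptions made for illustration.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mixture has the given SNR in dB.

    Hypothetical helper for illustration; not part of this repository.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat the noise if it is too short, then trim it to the clean length.
    repeats = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, repeats)[: len(clean)]

    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        raise ValueError("noise recording is silent")

    # Solve snr_db = 10 * log10(clean_power / (gain**2 * noise_power)) for gain.
    gain = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + gain * noise
```

Calling `mix_at_snr(clean, noise, 10.0)` on two 1-D sample arrays returns an array of the same length as `clean`. The result may exceed the range [-1, 1], so it may need rescaling before being written to a WAV file.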

Running train_nn.py starts the training.

The train_dir variable should contain the path to a folder with two subfolders, clean and noisy, which hold the clean WAV files and the noisy WAV files respectively. Each noisy WAV file must have the same filename as its corresponding clean WAV file, optionally with a suffix delimited by +, e.g. clean01.wav → clean01+noise.wav.

The val_dir follows the same convention, but this folder is used for validation.
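The naming convention above can be sketched as a small pairing routine. This is an illustration only, not the loading code used by train_nn.py; the helper names `clean_name_for` and `pair_files` are hypothetical, and the sketch assumes that everything from the first + up to the file extension is the optional suffix.

```python
from pathlib import Path


def clean_name_for(noisy_name):
    """Map a noisy filename to its clean counterpart.

    'clean01+noise.wav' -> 'clean01.wav'; a name without '+' maps to itself.
    """
    path = Path(noisy_name)
    # Drop the optional suffix that starts at the first '+'.
    base = path.stem.split("+", 1)[0]
    return base + path.suffix


def pair_files(data_dir):
    """Return (noisy_path, clean_path) pairs for a train_dir-style folder."""
    data_dir = Path(data_dir)
    clean_dir = data_dir / "clean"
    noisy_dir = data_dir / "noisy"

    pairs = []
    for noisy_path in sorted(noisy_dir.glob("*.wav")):
        clean_path = clean_dir / clean_name_for(noisy_path.name)
        if not clean_path.is_file():
            raise FileNotFoundError(
                f"no clean file {clean_path} for noisy file {noisy_path}"
            )
        pairs.append((noisy_path, clean_path))
    return pairs
```

Running `pair_files` on a dataset folder before training is a quick way to catch noisy files that have no matching clean file.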

How to test

Running test_nn.py produces the output (denoised) WAV files.

testing_dir should point to a folder with the same structure as train_dir and val_dir.

Acknowledgements

[1] H. Zhao, S. Zarar, I. Tashev, and C.-H. Lee, "Convolutional-Recurrent Neural Networks for Speech Enhancement," arXiv:1805.00579 [cs, eess], May 2018.
