YoshikiMas / spear-tools-waspaa2023

Multichannel Subband-Fullband Gated Convolutional Recurrent Neural Network For Direction-Based Speech Enhancement With Head-Mounted Microphone Arrays

This repository contains the code for reproducing the results shown in the paper "Multichannel Subband-Fullband Gated Convolutional Recurrent Neural Network For Direction-Based Speech Enhancement With Head-Mounted Microphone Arrays".

WASPAA Paper

Demo

Note

The code in the spear-tools submodule is subject to its own licenses. Where no license is given, all rights remain with the original author (Imperial College London).

The FullSubNet sub-repository is also third-party code and is licensed separately.

Installation

First, cd into ./spear-tools and set up the SPEAR paths, the symbolic links, and the conda environment as described in spear-tools/README.md.
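A rough sketch of that setup is below; the environment file name, environment name, and dataset paths are placeholders/assumptions, and spear-tools/README.md remains authoritative:

```bash
# fetch submodules so that ./spear-tools (and ./FullSubNet) are populated
git submodule update --init --recursive

cd spear-tools

# create the conda environment described in spear-tools/README.md
# (the file name below is an assumption; use whatever the submodule's README specifies)
conda env create -f environment.yml -n <your-env-name>
conda activate <your-env-name>

# link the SPEAR dataset to wherever spear-tools expects it (both paths are placeholders)
ln -s /path/to/SPEAR /path/expected/by/spear-tools
```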

Instructions

To run the models, you will need the packages defined in requirements.txt. Install them into your environment <your-env-name> using pip install -r requirements.txt.
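For example (the environment name is the placeholder from above):

```bash
conda activate <your-env-name>
pip install -r requirements.txt
```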

For inference with the MaxDirAndFullsubnet method, you need to download the FullSubNet checkpoint weights from here into ./FullSubNet.
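One possible way to place the weights, with the download URL left as a placeholder since it is only available via the link above:

```bash
# <fullsubnet-checkpoint-url> stands for the checkpoint link referenced above
mkdir -p FullSubNet
wget -P FullSubNet <fullsubnet-checkpoint-url>
```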

Model checkpoints are available here.

Please adjust paths and other variables in the scripts train.py, validate.py, process, validate_baseline_unprocessed.py, and view_metrics.py, as well as in the config files, as needed.
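A possible workflow once paths and variables are set, assuming the scripts are invoked directly without mandatory command-line arguments (this is an assumption; check each script):

```bash
python train.py                          # train the model
python validate.py                       # evaluate a trained checkpoint
python validate_baseline_unprocessed.py  # metrics for the unprocessed baseline
python view_metrics.py                   # inspect and compare the resulting metrics
# the processing script listed above is run analogously
```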

Author

This repository is authored by Benjamin Stahl and was created at the Institute of Electronic Music and Acoustics in Graz, Austria, in 2022/23.

About

License: Apache License 2.0


Languages

Python 99.8%, Makefile 0.2%, CSS 0.0%