SPEAR-tools

This repository contains the code to compute the baseline and run the metrics used in the SPEAR challenge. To download the data, go here.

Terminal commands to download the data are also provided below. Use the flag --continue-at if a download aborted too early; a resume example is shown after the download commands.

cd <your-spear-directory>

# All core Train data. Sizes of 14, 20, 20 and 24 GB compressed for D1, D2, D3 and D4 respectively.
curl https://spear2022data.blob.core.windows.net/spear-data/CoreTrainDataset1_v1.1.tar -o CoreTrainDataset1_v1.1.tar
curl https://spear2022data.blob.core.windows.net/spear-data/CoreTrainDataset2_v1.1.tar -o CoreTrainDataset2_v1.1.tar
curl https://spear2022data.blob.core.windows.net/spear-data/CoreTrainDataset3_v1.1.tar -o CoreTrainDataset3_v1.1.tar
curl https://spear2022data.blob.core.windows.net/spear-data/CoreTrainDataset4_v1.1.tar -o CoreTrainDataset4_v1.1.tar

# ATFs
curl https://spear2022data.blob.core.windows.net/spear-data/Device_ATFs.h5 --create-dirs -o Miscellaneous/Array_Transfer_Functions/Device_ATFs.h5

# All core Dev data (24 GB compressed, 35 GB uncompressed)
curl https://spear2022data.blob.core.windows.net/spear-data/CoreDev_1.1.tar -o CoreDev_v1.1.tar
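
If a transfer was interrupted, curl can pick up where it left off. A minimal sketch using the first archive above ('-' tells curl to read the resume offset from the partially downloaded file):

# Resume an aborted download, then unpack the archive once it is complete
curl --continue-at - https://spear2022data.blob.core.windows.net/spear-data/CoreTrainDataset1_v1.1.tar -o CoreTrainDataset1_v1.1.tar
tar -xf CoreTrainDataset1_v1.1.tar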

Note that the ATFs should be downloaded into a directory called Miscellaneous/Array_Transfer_Functions at the same level as the extracted Main and Extra folders for the example bash script to run. The file is already included in CoreDev but not in the CoreTrain archives.
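
For reference, a layout satisfying that constraint could look as follows (Main, Extra and the ATF path are taken from the note above; the exact contents of each folder depend on which archives you extracted):

<your-spear-directory>/
├── Main/                          # extracted from the Core*.tar archives
├── Extra/
└── Miscellaneous/
    └── Array_Transfer_Functions/
        └── Device_ATFs.h5         # placed here by the --create-dirs curl command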

Analysis tools

The provided framework consists mainly of three Python scripts (more details can be found in the PowerPoint presentation in assets):

  • spear_enhance.py computes the baseline enhancement (an MVDR beamformer) and outputs the processed audio files.
  • spear_evaluate.py computes the metrics on the processed audio files and outputs a CSV file with all metrics for all chunks.
  • spear_visualise.py plots the results from the computed metrics CSV file and saves the figures.

The resulting metrics CSV files for the passthrough and baseline enhancements are already generated and provided in the metrics folder. To compute those files yourself, a bash script run_example.sh is provided; instructions are found below.

Installation

We assume that conda is installed. If not, follow the instructions here.

# First clone the repo
git clone https://github.com/ImperialCollegeLondon/spear-tools
cd spear-tools

# Second create & activate environment with conda
conda env create -n <your-env-name> -f env-spear-metrics.yml
conda activate <your-env-name>
cd analysis

For Windows users, the Microsoft C++ Build Tools are needed when building the environment.
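
As a quick sanity check that the environment built correctly, an import test can be run inside it (the module name pystoi is an assumption based on the PYSTOI package listed in the metrics table below):

# Hypothetical smoke test: verify that one of the metric dependencies imports
python -c "import pystoi; print('environment looks OK')"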

How to Use

To run an example, use the command bash run_example.sh in the analysis folder. The following parameters can be modified:

# Setup soft links to make paths easier
ln -sf <outputs-folder-full-path> my_results  # Output path for the processed audio, metrics csv and plots
ln -sf <dataset-folder-full-path> spear_data  # Root folder of the SPEAR datasets containing the folders Main and Extra

# Define variables
SET='Dev'               # 'Train', 'Dev' or 'Eval'.
DATASET=2               # 1 to 4.
SESSION='10'            # 1 to 9 for Train, 10 to 12 for Dev, 13 to 15 for Eval. Use '' for all sessions of the current set.
MINUTE='00'             # 00 to 30 (depends on the current session). Use '' for all minutes of the current session.
PROCESSING='baseline'   # Name of desired processing. 'baseline' by default
REFERENCE='passthrough' # Name of desired reference enhancement. 'passthrough' by default.

The rest of the script computes the enhancement, the metric evaluation, and the plotting for both the passthrough reference and the chosen PROCESSING method.

python spear_enhance.py   <input-root> <audio-dir> --method_name <method-name> --list_cases <list-cases>
python spear_evaluate.py  <input-root> <audio-dir> <segments-csv> <metrics-csv> --list_cases <list-cases>
python spear_visualise.py <output-root> <metrics-csv-ref> <metrics-proc> --reference_name <reference-name> --method_name <method-name>
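
For example, with the soft links and variables defined above, a baseline-vs-passthrough run could be sketched as follows. The paths are illustrative substitutions, and <segments-csv> and <list-cases> stay as placeholders because their exact values are set in run_example.sh:

# Illustrative sketch only: file names under my_results and in the metrics folder are hypothetical
python spear_enhance.py   spear_data my_results/audio --method_name baseline --list_cases <list-cases>
python spear_evaluate.py  spear_data my_results/audio <segments-csv> my_results/metrics_baseline.csv --list_cases <list-cases>
python spear_visualise.py my_results metrics/metrics_passthrough.csv my_results/metrics_baseline.csv --reference_name passthrough --method_name baseline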

List of metrics

| Category | Metric                                   | Metric Abbreviation | Reference                | Python Package |
|----------|------------------------------------------|---------------------|--------------------------|----------------|
| SNR      | Signal to Noise Ratio                    | SNR                 | [-]                      | PySEPM         |
| SNR      | Frequency-weighted Segmental SNR         | fwSegSNR            | Hu et al. 2008 [1]       | PySEPM         |
| SI       | Short-Time Objective Intelligibility     | STOI                | Taal et al. 2011 [2]     | PYSTOI         |
| SI       | Extended STOI                            | ESTOI               | Jensen et al. 2016 [3]   | PYSTOI         |
| SI       | Modified Binaural STOI                   | MBSTOI              | Andersen et al. 2018 [4] | CLARITY        |
| SI       | Speech to Artifacts Ratio                | SAR                 | Vincent et al. 2006 [5]  | Speech Metric  |
| SI       | Image to Spatial Ratio                   | ISR                 | Vincent et al. 2006 [5]  | Speech Metric  |
| SI       | Speech to Distortion Ratio               | SDR                 | Vincent et al. 2006 [5]  | Speech Metric  |
| SI       | Scale-Invariant SDR                      | SI-SDR              | Roux et al. 2019 [6]     | Speech Metric  |
| SI       | Hearing Aid Speech Perception Index      | HASPI               | Kates et al. 2014 [7][8] | CLARITY        |
| SQ       | Perceptual Evaluation of Speech Quality  | PESQ                | Rix et al. 2001 [9]      | Speech Metric  |
| SQ       | PESQ Narrow Band                         | PESQ-NB             | Rix et al. 2001 [9]      | Speech Metric  |
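
Each chunk's scores form one row of the metrics CSV written by spear_evaluate.py. To see which metric columns a provided file actually contains, a quick header peek works (the file name is a placeholder for one of the CSVs in the metrics folder):

# Print the header row of a metrics CSV, one column name per line
head -n 1 metrics/<metrics-csv> | tr ',' '\n'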

References

  1. Y. Hu and P. C. Loizou, “Evaluation of objective quality measures for speech enhancement,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 16, no. 1, pp. 229–238, 2008.

  2. C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, “An algorithm for intelligibility prediction of time–frequency weighted noisy speech,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 7, pp. 2125–2136, 2011.

  3. J. Jensen and C. H. Taal, “An algorithm for predicting the intelligibility of speech masked by modulated noise maskers,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 11, pp. 2009–2022, 2016.

  4. A. H. Andersen, J. M. de Haan, Z.-H. Tan, and J. Jensen, “Refinement and validation of the binaural short time objective intelligibility measure for spatially diverse conditions,” Speech Communication, vol. 102, pp. 1–13, 2018.

  5. E. Vincent, R. Gribonval, and C. Févotte, “Performance measurement in blind audio source separation,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 4, pp. 1462–1469, 2006.

  6. J. Barker, S. Graetzer, and T. Cox, “Software to support the 1st Clarity Enhancement Challenge [software and data collection],” https://doi.org/10.5281/zenodo.4593856, 2021.

  7. J. M. Kates and K. H. Arehart, “The hearing-aid speech perception index (HASPI),” Speech Communication, vol. 65, pp. 75–93, 2014.

  8. J. M. Kates and K. H. Arehart, “The hearing-aid speech perception index (HASPI) version 2,” Speech Communication, vol. 131, pp. 35–46, 2021.

  9. A. Rix, J. Beerends, M. Hollier, and A. Hekstra, “Perceptual evaluation of speech quality (PESQ) - a new method for speech quality assessment of telephone networks and codecs,” in Proc. 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 2, pp. 749–752, 2001.
