qqsantaclaus / MetricGAN

MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement (ICML 2019, with Travel awards)

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement (ICML 2019)

Introduction

MetricGAN is a Generative Adversarial Networks (GAN) based black-box metric scores optimization method. By associating the discriminator (D) with the metrics of interest, MetricGAN can be treated as an iterative process between surrogate loss learning and generator learning as shown in the following figure.

This code (developed with Keras) applies MetricGAN to optimize PESQ or STOI score for Speech Enhancement. It can be easily extended to optimize other metrics.

For more details and evaluation results, please check out our paper.

teaser

Dependencies:

  • Python 2.7
  • keras=2.0.9
  • librosa=0.5.1

Note!

  1. As mentioned in the paper, the input features and activation functions used in table 2 are different from those provided here.

  2. The following codes are created by others:

  • SpectralNormalizationKeras: SpectralNormalization in Keras
  • pystoi: stoi calculatuin in python (modified by me)
  • The PESQ file can only be implemented in Linux environment.

Citation

If you find the code useful in your research, please cite:

@inproceedings{fu2019metricGAN,
  title     = {MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement},
  author    = {Fu, Szu-Wei and Liao, Chien-Feng and Tsao, Yu and Lin, Shou-De},
  booktitle = {International Conference on Machine Learning (ICML)},
  year      = {2019}
}

Contact

e-mail: jasonfu@iis.sinica.edu.tw or d04922007@ntu.edu.tw

About

MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement (ICML 2019, with Travel awards)


Languages

Language:Python 50.3%Language:MATLAB 41.7%Language:Jupyter Notebook 8.0%