hwang9u / simple-speech-enhancement

Simple Convolutional Auto-Encoder for variable-length speech enhancement

Simple CAE (Convolutional Auto-Encoder) for Variable-Length Speech Denoising

Check the notebook and listen to the examples here.


In this toy project, I built the simplest possible CAE (Convolutional Auto-Encoder) architecture for speech denoising.


Model

Encoder

  • A stack of 3 encoder blocks.
  • Since the encoder does not flatten its output, variable-length speech input can be used.
  • Encoder block: Conv2d -> BatchNorm2d -> LeakyReLU
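
The encoder described above could be sketched roughly as follows. This is a minimal illustration, not the repository's code: the kernel size, stride, padding, LeakyReLU slope, channel widths, and the `(batch, 1, freq_bins, time_frames)` input layout are all assumptions. It shows why skipping the flatten step lets the time axis vary.

```python
import torch
from torch import nn


def encoder_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # Conv2d -> BatchNorm2d -> LeakyReLU, in the order listed above.
    # kernel_size=3, stride=2, padding=1 and the 0.2 slope are assumed values.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.2),
    )


class Encoder(nn.Module):
    def __init__(self, channels=(1, 16, 32, 64)):  # channel widths are assumptions
        super().__init__()
        pairs = zip(channels[:-1], channels[1:])
        self.blocks = nn.Sequential(*[encoder_block(i, o) for i, o in pairs])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, freq_bins, time_frames). There is no flatten or linear
        # layer, so time_frames can differ from one input to the next.
        return self.blocks(x)


encoder = Encoder().eval()
with torch.no_grad():
    short = encoder(torch.randn(1, 1, 64, 80))   # 80 time frames
    long = encoder(torch.randn(1, 1, 64, 200))   # 200 time frames
print(short.shape, long.shape)
```

With these assumed settings each block halves both the frequency and time axes, so the two inputs produce feature maps of different widths from the same weights.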

Decoder

  • A stack of 3 decoder blocks.
  • Decoder block: ConvTranspose2d -> LeakyReLU
  • Last decoder block contains only ConvTranspose2d.
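
A matching decoder might look like the sketch below. The hyperparameters (`kernel_size=3`, `stride=2`, `padding=1`, `output_padding=1`, the channel widths, and the LeakyReLU slope) are assumptions chosen so that each block doubles the spatial size; the repository may use different values.

```python
import torch
from torch import nn


class Decoder(nn.Module):
    def __init__(self, channels=(64, 32, 16, 1)):  # channel widths are assumptions
        super().__init__()
        layers = []
        last = len(channels) - 2  # index of the final decoder block
        for i, (c_in, c_out) in enumerate(zip(channels[:-1], channels[1:])):
            # Assumed settings: each ConvTranspose2d doubles freq and time.
            layers.append(
                nn.ConvTranspose2d(c_in, c_out, kernel_size=3, stride=2,
                                   padding=1, output_padding=1)
            )
            if i != last:
                # Every block except the last is followed by LeakyReLU.
                layers.append(nn.LeakyReLU(0.2))
        self.blocks = nn.Sequential(*layers)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.blocks(z)


decoder = Decoder().eval()
with torch.no_grad():
    out = decoder(torch.randn(1, 64, 8, 10))
print(out.shape)
```

Leaving the activation off the last block keeps the output unbounded, which suits regression onto spectrogram values.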

+) Loss function

  • MaskedMSELoss: ignores the padded region when computing the MSE loss.
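
One way to write such a masked loss is sketched below. The `forward` signature (a `lengths` tensor holding the number of valid time frames per example) and the tensor layout are assumptions made for illustration; the repository's `MaskedMSELoss` may be defined differently.

```python
import torch
from torch import nn


class MaskedMSELoss(nn.Module):
    def forward(self, pred: torch.Tensor, target: torch.Tensor,
                lengths: torch.Tensor) -> torch.Tensor:
        # pred, target: (batch, channels, freq, time), padded along time.
        # lengths: (batch,) number of valid time frames per example (assumed).
        time = pred.size(-1)
        frame_idx = torch.arange(time, device=pred.device)
        mask = frame_idx[None, :] < lengths[:, None]      # (batch, time)
        mask = mask[:, None, None, :].to(pred.dtype)      # broadcast over channels, freq
        sq_err = (pred - target) ** 2 * mask              # zero out padded frames
        n_valid = mask.sum() * pred.size(1) * pred.size(2)
        return sq_err.sum() / n_valid.clamp(min=1)


# Toy check: the second example has only 3 valid frames out of 5.
pred = torch.zeros(2, 1, 4, 5)
target = torch.ones(2, 1, 4, 5)
target[1, :, :, 3:] = 100.0          # large values in the padded region
lengths = torch.tensor([5, 3])
loss = MaskedMSELoss()(pred, target, lengths)
print(loss.item())
```

Dividing by the number of valid elements rather than the full tensor size keeps the loss comparable across batches with different amounts of padding.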

Dataset

  • Clean speech dataset: "YesNo" dataset (torchaudio.datasets.yesno)

  • Noise signal: "noisesB" dataset (Libri Speech Noise Dataset)

  • Noisy signal: clean speech signal + noise signal (mixed at a specified SNR) <-- noisy.py
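
Mixing at a target SNR is commonly done by scaling the noise so that the ratio of clean power to noise power matches the requested value in dB. The function below is a sketch of that standard approach; `mix_at_snr` is a hypothetical name, and this is not necessarily what noisy.py does.

```python
import torch


def mix_at_snr(clean: torch.Tensor, noise: torch.Tensor, snr_db: float) -> torch.Tensor:
    # clean, noise: 1-D waveforms. Tile the noise if it is shorter than the
    # speech, then crop it to the same length.
    if noise.numel() < clean.numel():
        reps = -(-clean.numel() // noise.numel())  # ceiling division
        noise = noise.repeat(reps)
    noise = noise[: clean.numel()]

    clean_power = clean.pow(2).mean()
    noise_power = noise.pow(2).mean().clamp(min=1e-12)  # avoid division by zero

    # Solve 10 * log10(clean_power / (scale**2 * noise_power)) = snr_db for scale.
    scale = torch.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise


torch.manual_seed(0)
clean = torch.randn(16000)   # stand-in for a speech waveform
noise = torch.randn(4000)    # stand-in for a shorter noise clip
noisy = mix_at_snr(clean, noise, snr_db=5.0)
```

Lower `snr_db` values give a louder noise component; 0 dB means the speech and noise have equal power.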



Examples (on validation dataset)



License: MIT License

