Kelvinson / spec_augment

A PyTorch implementation of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

SpecAugment.py

SpecAugment is a data augmentation method that achieved state-of-the-art results in speech recognition. I could not find any code published by the paper's authors, and their implementation was in TensorFlow.

To use:

  1. Run install.sh (I recommend using a dedicated conda env for the project).
  2. Check out SpecAugment.ipynb (a Jupyter notebook) for the functions.

Augmentations

  1. Time Warp (coming soon). This augmentation relies on functionality that PyTorch does not yet provide, so I have to write it from scratch. I am working on it.

  2. Time Mask (DONE!)

  3. Frequency Mask (DONE!)
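
To make the two finished augmentations concrete, here is a minimal sketch of time and frequency masking in PyTorch. It is not the code from SpecAugment.ipynb: the function names (mask_along_axis, freq_mask, time_mask), the (n_mels, n_frames) layout, the default widths, and the zero fill value are all assumptions chosen for illustration.

```python
import torch


def mask_along_axis(spec, max_width, axis, num_masks=1, mask_value=0.0):
    """Return a copy of `spec` with random contiguous spans set to `mask_value`.

    spec: 2-D tensor, assumed here to be shaped (n_mels, n_frames).
    axis: 0 masks frequency channels, 1 masks time steps.
    """
    masked = spec.clone()  # leave the caller's tensor untouched
    size = masked.shape[axis]
    for _ in range(num_masks):
        # Span width is uniform in [0, max_width], capped at the axis length.
        width = min(torch.randint(0, max_width + 1, (1,)).item(), size)
        # Start index is drawn so that the whole span fits inside the axis.
        start = torch.randint(0, size - width + 1, (1,)).item()
        if axis == 0:
            masked[start:start + width, :] = mask_value
        else:
            masked[:, start:start + width] = mask_value
    return masked


def freq_mask(spec, F=27, num_masks=1):
    """Mask up to F consecutive frequency channels (hypothetical helper)."""
    return mask_along_axis(spec, F, axis=0, num_masks=num_masks)


def time_mask(spec, T=100, num_masks=1):
    """Mask up to T consecutive time steps (hypothetical helper)."""
    return mask_along_axis(spec, T, axis=1, num_masks=num_masks)


# Usage: random values stand in for a log-mel spectrogram (80 bins, 400 frames).
spec = torch.randn(80, 400)
augmented = time_mask(freq_mask(spec), T=100)
```

Both masks share one helper because they differ only in the axis they act on. Filling with zero is reasonable when the spectrogram has been mean-normalized, since zero then corresponds to the mean value; pass a different mask_value to mask_along_axis otherwise.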

Let's be friends! @zachcaceres zach.dev


Languages

Jupyter Notebook 99.8%, Python 0.2%, Shell 0.0%