WernerDreier / openseq2seq

Fork from original openseq2seq

OpenSeq2Seq

Forked OpenSeq2Seq

Adapted to use wav2vec features produced by the fairseq library.

Documentation and installation instructions

https://nvidia.github.io/OpenSeq2Seq/

Acknowledgments

NVIDIA OpenSeq2Seq, PyTorch fairseq

Usage

  • Use the fairseq library to train a wav2vec model.
  • Use the wav2vec model to featurize the audio files.
  • Put the wav2vec files (.h5context file extension) in a folder called 'wav2vec_files' next to a folder called 'wav_files' that contains the original audio files.
  • Adjust your OpenSeq2Seq config file as shown in the following section:
train_params = {
    "data_layer": Speech2TextDataLayer,
    "data_layer_params": {
        "cache_features": True,
        "cache_regenerate": False,
        "cache_format": "wav2vec",
        "num_audio_features": 512, #irrelevant but corrected
        ...
    },
}
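
Before training, it can help to confirm that the folder layout from the usage steps is complete. The following is a minimal sketch, not part of this fork: the function name `find_missing_features` is hypothetical, and it assumes that each feature file shares its base name with the audio file (for example `wav_files/a.wav` and `wav2vec_files/a.h5context`), which the README does not state explicitly.

```python
from pathlib import Path
import tempfile


def find_missing_features(dataset_root, audio_ext=".wav", feature_ext=".h5context"):
    """Return the names of audio files that have no matching feature file.

    Assumption (not confirmed by this fork): a feature file has the same
    base name as its audio file, only the folder and extension differ.
    """
    root = Path(dataset_root)
    wav_dir = root / "wav_files"          # original audio files
    feat_dir = root / "wav2vec_files"     # wav2vec features from fairseq
    missing = []
    for wav_path in sorted(wav_dir.glob("*" + audio_ext)):
        expected = feat_dir / (wav_path.stem + feature_ext)
        if not expected.is_file():
            missing.append(wav_path.name)
    return missing


if __name__ == "__main__":
    # Small self-contained demo on a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "wav_files").mkdir()
        (root / "wav2vec_files").mkdir()
        (root / "wav_files" / "a.wav").touch()
        (root / "wav_files" / "b.wav").touch()
        (root / "wav2vec_files" / "a.h5context").touch()
        print(find_missing_features(root))
```

An empty result means every audio file has a feature file under the assumed naming scheme. If your featurization step names files differently, adjust how `expected` is built.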


License: Apache License 2.0

