Adapting OpenSeq2Seq to use wav2vec features produced by the fairseq library
https://nvidia.github.io/OpenSeq2Seq/
Keywords: NVIDIA OpenSeq2Seq, PyTorch, fairseq
- Use the fairseq library to train a wav2vec model.
- Use the wav2vec model to featurize your audio files.
- Put the resulting wav2vec feature files ('.h5context' extension) in a folder called 'wav2vec_files', placed next to the folder containing the original audio files, called 'wav_files'.
- Adjust your OpenSeq2Seq config file as shown in the next section:
train_params = {
    "data_layer": Speech2TextDataLayer,
    "data_layer_params": {
        "cache_features": True,
        "cache_regenerate": False,
        "cache_format": "wav2vec",
        "num_audio_features": 512,  # set to the wav2vec feature dimension
        ...
    },
}
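The folder convention above implies a simple path mapping from each audio file to its feature file. As a minimal sketch (assuming the feature file shares the audio file's stem, which is not stated explicitly above), the lookup could be:

```python
from pathlib import Path

def wav2vec_path(wav_path: str) -> Path:
    """Map an audio file under 'wav_files' to the expected wav2vec
    feature file in the sibling 'wav2vec_files' folder.

    Assumption (not confirmed by the notes above): the feature file
    keeps the audio file's stem and uses the .h5context extension.
    """
    p = Path(wav_path)
    return p.parent.parent / "wav2vec_files" / (p.stem + ".h5context")

print(wav2vec_path("data/wav_files/utt001.wav"))
```

This only constructs the expected path; how the data layer actually reads the .h5context contents is determined by the `"cache_format": "wav2vec"` setting in the config.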