Adapting OpenSeq2Seq to use wav2vec features produced by the fairseq library
https://nvidia.github.io/OpenSeq2Seq/
Keywords: NVIDIA OpenSeq2Seq, PyTorch, fairseq
- Use the fairseq library to train a wav2vec model.
- Use the wav2vec model to featurize your audio files.
- Put the resulting wav2vec feature files ('.h5context' extension) in a folder called 'wav2vec_files', placed next to the folder containing the original audio files, called 'wav_files'.
- Adjust your OpenSeq2Seq config file as shown in the next section:
train_params = {
    "data_layer": Speech2TextDataLayer,
    "data_layer_params": {
        "cache_features": True,
        "cache_regenerate": False,
        "cache_format": "wav2vec",
        "num_audio_features": 512,  # set to the wav2vec feature dimension
        ...
    },
}
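The folder convention above implies a simple path mapping from each audio file to its feature file. As a minimal sketch (assuming the feature file shares the audio file's stem, which is not stated explicitly above), the lookup could be:

```python
from pathlib import Path

def wav2vec_path(wav_path: str) -> Path:
    """Map an audio file under 'wav_files' to the expected wav2vec
    feature file in the sibling 'wav2vec_files' folder.

    Assumption (not confirmed by the notes above): the feature file
    keeps the audio file's stem and uses the .h5context extension.
    """
    p = Path(wav_path)
    return p.parent.parent / "wav2vec_files" / (p.stem + ".h5context")

print(wav2vec_path("data/wav_files/utt001.wav"))
```

This only constructs the expected path; how the data layer actually reads the .h5context contents is determined by the `"cache_format": "wav2vec"` setting in the config.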