zsc / End-to-end-ASR-Transformer

An end to end ASR Transformer model training repo

END TO END ASR TRANSFORMER

  • This project is an end-to-end speech recognition system built on the basic Transformer architecture, with a 6-layer encoder and a 6-layer decoder.

Model

Instructions

  • 1. Data preparation:
    • Download the data yourself and arrange it in the following file structure:
├── data
│   ├── train
│   ├── dev
│   ├── test
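  • 2. Data preprocessing:
    • Run prepare_data.py to preprocess the data. This produces the full vocabulary, a mel-scale spectrogram for each audio sample, and the token IDs for each transcript.
  • 3. Model training:
    • Run train_transformer.py --ngpus 8 to train the Transformer network. The network takes mel-scale spectrograms as input and outputs token IDs.
  • 4. Model inference:
    • Run evlauate.py to measure accuracy on the dev/test sets.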
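
The text side of steps 2 and 4 can be sketched in a few lines of plain Python. This is a minimal illustration, not the code in prepare_data.py or the evaluation script: it assumes character-level tokens, and the special symbols and helper names (build_vocab, encode, edit_distance, token_error_rate) are hypothetical. It also assumes the reported accuracy is derived from an edit-distance error rate, which the README does not state. Audio feature extraction (the mel-scale spectrogram) is not covered here.

```python
from collections import Counter

# Special symbols are an assumption for illustration; the repo's real vocabulary may differ.
SPECIALS = ["<pad>", "<sos>", "<eos>", "<unk>"]


def build_vocab(transcripts, min_count=1):
    """Map every character seen at least `min_count` times to an integer id."""
    counts = Counter(ch for text in transcripts for ch in text if not ch.isspace())
    # Sort by descending frequency, then by character, so ids are deterministic.
    chars = sorted((c for c, n in counts.items() if n >= min_count),
                   key=lambda c: (-counts[c], c))
    return {tok: i for i, tok in enumerate(SPECIALS + chars)}


def encode(text, vocab):
    """Convert a transcript into token ids wrapped in <sos> ... <eos>."""
    unk = vocab["<unk>"]
    ids = [vocab.get(ch, unk) for ch in text if not ch.isspace()]
    return [vocab["<sos>"]] + ids + [vocab["<eos>"]]


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, keeping one rolling row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # delete a reference token
                            curr[j - 1] + 1,          # insert a hypothesis token
                            prev[j - 1] + (r != h)))  # substitute, or match at no cost
        prev = curr
    return prev[-1]


def token_error_rate(refs, hyps):
    """Total edit distance divided by total reference length."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_len = sum(len(r) for r in refs)
    return total_edits / total_len


transcripts = ["今天天气很好", "天气不好"]
vocab = build_vocab(transcripts)
ids = encode("今天很好", vocab)
print(ids)
print(token_error_rate([encode("天气不好", vocab)], [encode("天气很好", vocab)]))
```

Under these assumptions, an accuracy figure can be reported as 1 minus the token error rate. Characters missing from the vocabulary map to the <unk> id rather than raising an error.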

Acknowledgements

Reference

  • Ashish Vaswani et al. “Attention Is All You Need” (2017).
  • Abdel-rahman Mohamed et al. “Transformers with Convolutional Context for ASR” arXiv: Computation and Language (2019).
  • Albert Zeyer et al. “Improved Training of End-to-end Attention Models for Speech Recognition” Conference of the International Speech Communication Association (2018).

About

License:Apache License 2.0


Languages

Language:Python 100.0%