NeurST: Neural Speech Translation Toolkit

NeurST aims to make it easy to build and train end-to-end speech translation models, and is designed for extensibility and scalability. We believe this design makes it easier for NLP researchers to get started. In addition, NeurST allows researchers to train custom models for translation, summarization, and other tasks.

Features

Models

NeurST provides reference implementations of various models, including:

Transformer (self-attention) networks
- Attention Is All You Need (Vaswani et al., 2017)
- Pay Less Attention With Lightweight and Dynamic Convolutions (Wu et al., 2019)
coming soon...

Recipes and Benchmarks

NeurST provides several strong and reproducible benchmarks for various tasks:

Translation
- coming soon...
Speech-to-Text
- Augmented Librispeech

Additionally

multi-GPU (distributed) training on one machine or across multiple machines
- MirroredStrategy / MultiWorkerMirroredStrategy
- BytePS / Horovod
mixed precision training (trains faster with less GPU memory)
multiple search algorithms implemented:
- beam search
- sampling (unconstrained, top-k and top-p)
large mini-batch training even on a single GPU via delayed updates (gradient accumulation)
TensorFlow SavedModel export for TensorFlow Serving
TensorFlow XLA support for speeding up training
extensible: easily register new datasets, models, criterions, tasks, optimizers and learning rate schedulers
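
The top-k and top-p sampling strategies listed above can be illustrated with a small standalone sketch. It uses only the Python standard library and is not NeurST's implementation; the function names (softmax, top_k_filter, top_p_filter, sample) and the toy logits are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw scores to probabilities, subtracting the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in keep)
    return {i: probs[i] / mass for i in keep}


def top_p_filter(probs, p):
    """Keep the smallest set of most probable tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    mass = sum(probs[i] for i in keep)
    return {i: probs[i] / mass for i in keep}


def sample(dist, rng=random):
    """Draw one token id from a {token_id: probability} mapping."""
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]


# Hypothetical next-token scores over a 4-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax(logits)
print(sample(top_k_filter(probs, k=2)))
print(sample(top_p_filter(probs, p=0.9)))
```

Unconstrained sampling corresponds to calling sample on the full distribution without any filter. In a real decoder the filtering would be applied to batched tensors at every decoding step.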
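
The idea behind delayed updates (gradient accumulation) can be sketched on a toy problem. The example below is a minimal illustration in plain Python, assuming a 1-D linear model with mean squared error; it is not taken from NeurST, and grad_mse and accumulated_step are hypothetical helper names.

```python
def grad_mse(w, batch):
    """Gradient of mean squared error for the 1-D model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_step(w, micro_batches, lr):
    """Apply one optimizer step after accumulating gradients over all micro-batches."""
    total = sum(len(mb) for mb in micro_batches)
    accum = 0.0
    for mb in micro_batches:
        # Weight each micro-batch gradient by its share of the effective batch.
        accum += grad_mse(w, mb) * len(mb) / total
    return w - lr * accum  # single delayed update


# Toy (x, y) pairs; the true relation is y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w0, lr = 0.0, 0.01

large_batch = w0 - lr * grad_mse(w0, data)
delayed = accumulated_step(w0, [data[:2], data[2:]], lr)
print(abs(large_batch - delayed) < 1e-12)
```

Because the accumulated gradient matches the gradient of the full batch, one delayed update behaves like one step on a larger mini-batch while only a micro-batch has to fit in memory at a time.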

Requirements and Installation

Python version >= 3.6
TensorFlow >= 2.3.0

Install NeurST from source:

git clone https://github.com/bytedance/neurst.git
cd neurst/
pip3 install -e .

If an ImportError occurs at runtime, install the missing package manually.
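
The version requirements above can be checked before training with a short script. This is a sketch rather than part of NeurST; check_environment and parse_version are hypothetical helpers, and the only assumption about TensorFlow is that it exposes its version string as tf.__version__.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)
MIN_TENSORFLOW = (2, 3, 0)


def parse_version(text):
    """Turn '2.3.0' or '2.4.0rc1' into a tuple of leading integers, e.g. (2, 3, 0)."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def check_environment():
    """Return a list of human-readable problems; an empty list means requirements are met."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python >= 3.6 is required")
    if importlib.util.find_spec("tensorflow") is None:
        problems.append("TensorFlow is not installed")
    else:
        import tensorflow as tf
        if parse_version(tf.__version__) < MIN_TENSORFLOW:
            problems.append("TensorFlow >= 2.3.0 is required, found " + tf.__version__)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

The script prints nothing when both requirements are satisfied.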

Contact

For any questions or suggestions, please feel free to contact us: zhaochengqi.d@bytedance.com, wangmingxuan.89@bytedance.com.
