HabanaAI / Fairseq


Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. This repo is forked from Fairseq and includes changes to run models on Intel® Gaudi® AI accelerators.

We provide reference implementations of various sequence modeling papers:

List of implemented papers

Requirements and Installation

  • Follow the Gaudi Installation Guide to set up the environment. For best performance, also apply the methods described in the Optimizing Training Platform guide. Together, these guides walk through setting up a system to run the model on Gaudi.
  • To install fairseq and develop locally:
git clone https://github.com/HabanaAI/fairseq
cd fairseq
pip install --editable ./

Getting Started

List of models for which training has been tested on Gaudi devices:

To train any other model available in fairseq (beyond those listed above) on a Gaudi device, follow these instructions:

  • Pass the "--hpu" argument when invoking command-line tools such as fairseq-train, fairseq-interactive, and fairseq-generate.
  • Enable mixed-precision training by passing "--hpu-mixed-precision-mode autocast" when invoking command-line tools such as fairseq-train.
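
The two flags above can be combined in a single invocation. The following is a minimal sketch, not part of this repo: the helper `build_gaudi_command`, the dataset path `data-bin/my_dataset`, and the choice of `--arch transformer` are illustrative assumptions. Only `--hpu` and `--hpu-mixed-precision-mode autocast` come from the instructions above. The script only assembles and prints the command, so it can be inspected before anything is launched.

```python
import shlex


def build_gaudi_command(tool, data_dir, extra_args=(), mixed_precision=True):
    """Assemble a fairseq CLI invocation that includes the Gaudi flags.

    `tool` is a fairseq command-line tool such as "fairseq-train".
    `data_dir` and `extra_args` are placeholders supplied by the caller.
    """
    # "--hpu" selects the Gaudi device, as described in this README.
    cmd = [tool, data_dir, "--hpu"]
    if mixed_precision:
        # Mixed precision via autocast, as described in this README.
        cmd += ["--hpu-mixed-precision-mode", "autocast"]
    cmd += list(extra_args)
    return cmd


if __name__ == "__main__":
    # Hypothetical dataset path and architecture, for illustration only.
    cmd = build_gaudi_command(
        "fairseq-train",
        "data-bin/my_dataset",
        extra_args=["--arch", "transformer"],
    )
    # Dry run: print the shell-quoted command instead of executing it.
    print(shlex.join(cmd))
```

To actually launch the tool, the returned list could be passed to `subprocess.run(cmd, check=True)` on a machine where fairseq and the Gaudi software stack are installed. Set `mixed_precision=False` for invocations where the autocast flag is not wanted.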

License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Citation

Please cite as:

@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}
