chenshifei / Recipes

Recipes for training OpenNMT systems

Recipes

This repository contains "recipes": scripts that run the end-to-end pipeline of data preparation, preprocessing, training, and evaluation.

Requirements

  • OpenNMT is required - see here. If you clone the Recipes.git repo at the same level as OpenNMT.git on your local computer, you don't need to update the path in the scripts. Otherwise, update the line OPENNMT_PATH=../../OpenNMT
  • The evaluation scripts require the Perl XML::Twig module (perl -MCPAN -e 'install XML::Twig')
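
Both requirements can be checked before launching a recipe. The following is a minimal sketch, not part of the recipes themselves: the helper names check_opennmt_path and check_perl_module are hypothetical, and the default path simply mirrors the OPENNMT_PATH line mentioned above.

```shell
#!/bin/sh
# Assumption: default mirrors the OPENNMT_PATH line used in the recipe scripts;
# it can be overridden from the environment.
OPENNMT_PATH=${OPENNMT_PATH:-../../OpenNMT}

# Hypothetical helper: succeeds if the given directory exists.
check_opennmt_path() {
    [ -d "$1" ]
}

# Hypothetical helper: succeeds if perl can load the named module.
check_perl_module() {
    perl -M"$1" -e 1 2>/dev/null
}

check_opennmt_path "$OPENNMT_PATH" || echo "OpenNMT not found at $OPENNMT_PATH" >&2
check_perl_module XML::Twig || echo "Perl module XML::Twig is missing" >&2
```

If either message is printed, fix the path or install the module before running the evaluation scripts.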

The recipes

Baseline-1M-enfr

Trains a baseline English-French model using the case feature and OpenNMT reversible tokenization. A GPU is highly recommended: training takes 75 minutes per epoch on a single GTX 1080. Parameters: 2x500 layers, 13 epochs; see the script for details. Data: a set of 1 million parallel sentences (extracted from Europarl, News Commentary, ...). See the results file for the evaluation.

Romance Multi-way

See http://forum.opennmt.net/t/training-romance-multi-way-model/86 for a description of this model.
A GPU is highly recommended: training takes 4 1/2 hours per epoch on a single GTX 1080. Parameters: 2x500 layers, 13 epochs; see the script for details.
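
The per-epoch timings quoted for the two recipes translate into a total training budget by simple multiplication. A rough sketch, assuming all 13 epochs run at the quoted rate on a single GTX 1080 (total_training_time is a hypothetical helper, not something the recipes provide):

```shell
#!/bin/sh
# Hypothetical helper: total time from minutes per epoch and number of epochs.
total_training_time() {
    minutes_per_epoch=$1
    epochs=$2
    total=$((minutes_per_epoch * epochs))
    printf '%dh%02dm\n' $((total / 60)) $((total % 60))
}

total_training_time 75 13    # Baseline-1M-enfr: prints 16h15m
total_training_time 270 13   # Romance Multi-way (4 1/2 hours = 270 minutes): prints 58h30m
```

Actual wall-clock time will vary with the hardware and any changes to the script parameters.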

License: MIT License

