Transformer

Re-implementation of Attention Is All You Need (NIPS 2017)

Requirements

  • python=3.6.0
  • pytorch=0.4.1
  • sentencepiece=0.1.82
  • tensorboardX

Download datasets

Use the IWSLT17 fr-en dataset. The following script downloads the data and preprocesses it.

$ bash download.sh

Sentence Piece Model

Train a SentencePiece tokenizer for subword-level segmentation of the sentences.

$ python tokenizer.py --vocab_size 16000
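Subword tokenizers learn their vocabulary directly from raw text by iteratively merging frequent symbol pairs. As a rough, dependency-free illustration of the idea, here is a minimal byte-pair-encoding (BPE) sketch over a toy corpus; note that SentencePiece's default unigram-LM training differs from plain BPE, but the subword principle is the same (the corpus and merge count below are made up for illustration):

```python
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs across the corpus vocabulary."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Merge every occurrence of `pair` into a single symbol."""
    a, b = pair
    return {word.replace(f"{a} {b}", f"{a}{b}"): freq for word, freq in vocab.items()}

def learn_bpe(corpus, num_merges):
    """Learn BPE merge operations from a list of words."""
    counts = Counter(corpus)
    # Represent each word as space-separated characters plus an end-of-word marker.
    vocab = {" ".join(w) + " </w>": f for w, f in counts.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges, vocab

corpus = ["low"] * 5 + ["lower"] * 2 + ["newest"] * 6 + ["widest"] * 3
merges, vocab = learn_bpe(corpus, 10)
print(merges)
```

Frequent character sequences such as "est" end up as single vocabulary symbols, which is what lets a 16k-entry vocabulary cover an open-ended word set.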

Train model

$ MODEL_NAME="model"
$ python main.py --mode train \
                 --datapath './datasets/iwslt17.fr.en' \
                 --langpair 'fr-en' \
                 --epoch 100 \
                 --learning_rate 0.0001 \
                 --max_seq_len 50 \
                 --model_name ${MODEL_NAME}

Plot train_loss and val_loss via tensorboardX.
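The model being trained here is built around scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V, from the paper. A dependency-free sketch of a single attention head (illustrative only; the repo's actual implementation uses PyTorch tensors, and the toy Q/K/V values below are made up):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention over lists of vectors.

    Q, K, V: lists of equal-length float vectors.
    Returns one output vector per query.
    """
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Output is the attention-weighted sum of the value vectors.
        out.append([sum(w * v[i] for w, v in zip(weights, V))
                    for i in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]                      # one query
K = [[1.0, 0.0], [0.0, 1.0]]          # two keys
V = [[10.0, 0.0], [0.0, 10.0]]        # two values
out = attention(Q, K, V)
print(out)
```

The query matches the first key more strongly, so the output leans toward the first value vector while still blending in the second.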

Evaluate

Use SacreBLEU to evaluate the model's BLEU score.

MODEL_NAME="model"
INPUT_NAME="./iwslt17-fr-en.in"
OUTPUT_NAME="./iwslt17-fr-en.out"
sacrebleu -t iwslt17 -l fr-en --echo src > ${INPUT_NAME}
python main.py --mode test \
    --model_name ${MODEL_NAME} \
    --eval_input ${INPUT_NAME} \
    --eval_output ${OUTPUT_NAME}
cat ${OUTPUT_NAME} | sacrebleu -t iwslt17 -l fr-en
| checkout | dataset | decoding | BLEU |
|----------|---------|----------|------|
| 135bdabaabae068eb1319094ac4817ea4f72c934 | iwslt17 fr-en | greedy | 30.7 |
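BLEU scores a candidate translation by modified n-gram precision against a reference, combined with a brevity penalty. A simplified sentence-level, single-reference sketch of the computation (sacrebleu additionally handles tokenization, smoothing, and corpus-level aggregation, so its numbers will differ):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with one reference and no smoothing."""
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c_ngrams, r_ngrams = ngrams(cand, n), ngrams(ref, n)
        # Modified precision: clip candidate counts by reference counts.
        overlap = sum(min(c, r_ngrams[g]) for g, c in c_ngrams.items())
        total = max(sum(c_ngrams.values()), 1)
        if overlap == 0:
            return 0.0  # unsmoothed BLEU is zero if any n-gram level has no match
        log_prec += math.log(overlap / total) / max_n
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_prec)

score = bleu("the cat sat on the mat", "the cat sat on the mat")
print(score)  # a perfect match scores 1.0
```

Reported BLEU is usually this value times 100, so the 30.7 above corresponds to a score of 0.307 on this scale.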

TODO

  • Implement beam search
  • Evaluate the model on various test sets (e.g. en->fr, de<->en)
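The decoding reported above is greedy: at each step only the single most probable token is kept. Beam search instead keeps the k best partial hypotheses per step, ranked by cumulative log-probability. A generic sketch of the planned feature, driven here by a hypothetical toy next-token table standing in for the model:

```python
import math

def beam_search(step_fn, bos, eos, beam_size=3, max_len=10):
    """Generic beam search.

    step_fn(prefix) -> list of (token, prob) pairs for the next token.
    Returns the highest-scoring finished token sequence.
    """
    beams = [([bos], 0.0)]  # (tokens, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, p in step_fn(tokens):
                cand = (tokens + [tok], score + math.log(p))
                if tok == eos:
                    finished.append(cand)   # hypothesis is complete
                else:
                    candidates.append(cand)
        if not candidates:
            break
        # Keep only the top-k partial hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    finished.extend(beams)  # fall back to unfinished beams if needed
    return max(finished, key=lambda c: c[1])[0]

# Hypothetical toy "model": a fixed next-token distribution per last token.
def toy_step(prefix):
    table = {
        "<s>": [("a", 0.6), ("b", 0.4)],
        "a": [("b", 0.9), ("</s>", 0.1)],
        "b": [("</s>", 0.8), ("a", 0.2)],
    }
    return table[prefix[-1]]

best = beam_search(toy_step, "<s>", "</s>", beam_size=2, max_len=5)
print(best)
```

A real implementation would call the decoder for `step_fn` and typically add length normalization, since raw log-probability sums favor short hypotheses.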
