wenmt

This system has been tested in the following environment.

Name the dataset files according to the variables defined in wargs.py.

Validation set
Source side: val_tst_dir + val_prefix + '.' + val_src_suffix
Target side:
- One reference
  val_tst_dir + val_prefix + '.' + val_ref_suffix
- Multiple references
  val_tst_dir + val_prefix + '.' + val_ref_suffix + '0'
  val_tst_dir + val_prefix + '.' + val_ref_suffix + '1'
  ......
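
The naming rule above can be sketched in Python as follows. This is only an illustration, not code from the repository: the function name validation_paths, the num_refs argument, and the example values ('data/', 'nist02', 'src', 'ref') are hypothetical. It also assumes that val_tst_dir already ends with a path separator, since the rule is plain string concatenation.

```python
def validation_paths(val_tst_dir, val_prefix, val_src_suffix, val_ref_suffix, num_refs=1):
    """Build the source path and the reference paths for the validation set.

    Hypothetical helper: mirrors the concatenation rule described above.
    """
    base = val_tst_dir + val_prefix + '.'
    src = base + val_src_suffix
    if num_refs == 1:
        # One reference: no numeric index is appended.
        refs = [base + val_ref_suffix]
    else:
        # Multiple references: append 0, 1, ... to the reference suffix.
        refs = [base + val_ref_suffix + str(i) for i in range(num_refs)]
    return src, refs


# Example values are made up for illustration only.
src, refs = validation_paths('data/', 'nist02', 'src', 'ref', num_refs=4)
print(src)
for ref in refs:
    print(ref)
```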

Test sets (for each test_prefix in tests_prefix)
Source side: val_tst_dir + test_prefix + '.' + val_src_suffix
Target side:
- One reference
  val_tst_dir + test_prefix + '.' + val_ref_suffix
- Multiple references
  val_tst_dir + test_prefix + '.' + val_ref_suffix + '0'
  val_tst_dir + test_prefix + '.' + val_ref_suffix + '1'
  ......
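
Before training, it may help to check that every test file named by this rule actually exists. The sketch below is an assumption-laden illustration, not part of the repository: expected_test_files, missing_files, num_refs, and the example values are hypothetical, and val_tst_dir is again assumed to end with a path separator.

```python
import os


def expected_test_files(val_tst_dir, tests_prefix, val_src_suffix, val_ref_suffix, num_refs=1):
    """List the source and reference files expected for every test set.

    Hypothetical helper: follows the naming rule described above.
    """
    files = []
    for test_prefix in tests_prefix:
        base = val_tst_dir + test_prefix + '.'
        files.append(base + val_src_suffix)
        if num_refs == 1:
            # One reference: no numeric index is appended.
            files.append(base + val_ref_suffix)
        else:
            # Multiple references: append 0, 1, ... to the reference suffix.
            files.extend(base + val_ref_suffix + str(i) for i in range(num_refs))
    return files


def missing_files(paths):
    """Return the paths that do not exist as regular files."""
    return [p for p in paths if not os.path.isfile(p)]


# Example values are made up for illustration only.
files = expected_test_files('data/', ['nist03', 'nist04'], 'src', 'ref', num_refs=2)
for path in missing_files(files):
    print('missing:', path)
```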

Before training, configure the training parameters in wargs.py.
Then run sh train.sh

Assume that the trained model is named best.model.pt.
Before decoding, configure the inference parameters in wargs.py.

Translate one sentence
- run python bin/wtrans.py -m best.model.pt
Translate one file
- put the file to be translated in the directory val_tst_dir + '/'
- run sh trans.sh filename

Evaluate alignment

score-alignments.py -d path/900 -s zh -t en -g wa -i force_decoding_alignment