Update Evaluation used for V.1.1.1

jordiclive opened this issue 3 years ago

The references in /evaluation/dart_reference are not for the current version. Could you replace them with the new references and share the tokenization script that is applied to the predictions?

I am getting very different BLEU scores depending on the tokenization and on how many references I use, since a few examples have up to ~30 references.
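
To illustrate the effect, here is a minimal, self-contained sketch. It is not the repository's evaluation script: the example sentences, the two tokenizers, and the unsmoothed sentence-level BLEU below are all assumptions made for illustration. It only shows that the same prediction can score quite differently when the tokenization or the number of references changes.

```python
import math
import re
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hyp_tokens, refs_tokens, max_n=4):
    """Unsmoothed multi-reference BLEU for one sentence (illustrative only)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp_tokens, n)
        # Clip each hypothesis n-gram count by its maximum count in any reference.
        max_ref_counts = Counter()
        for ref in refs_tokens:
            for gram, count in ngrams(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        matched = sum(min(c, max_ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if matched == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty uses the reference length closest to the hypothesis length.
    hyp_len = len(hyp_tokens)
    ref_len = min((len(r) for r in refs_tokens),
                  key=lambda length: (abs(length - hyp_len), length))
    bp = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return bp * math.exp(sum(log_precisions) / max_n)


def whitespace_tokenize(text):
    """Split on whitespace only, so punctuation stays glued to words."""
    return text.split()


def punct_tokenize(text):
    """Split punctuation into separate tokens (a stand-in for a real tokenizer)."""
    return re.findall(r"\w+|[^\w\s]", text)


def score(prediction, references, tokenize):
    """Tokenize both sides with the same function, then compute BLEU."""
    return sentence_bleu(tokenize(prediction), [tokenize(r) for r in references])


# Hypothetical toy data, not taken from the DART references.
prediction = "The cat sat on the mat."
references = ["The cat sat on the mat today.", "A cat sat on the mat."]

ws_one = score(prediction, references[:1], whitespace_tokenize)
pt_one = score(prediction, references[:1], punct_tokenize)
ws_two = score(prediction, references, whitespace_tokenize)

print(f"whitespace tokens, 1 reference:  {ws_one:.4f}")
print(f"punctuation split, 1 reference:  {pt_one:.4f}")
print(f"whitespace tokens, 2 references: {ws_two:.4f}")
```

Because n-gram matches are pooled over all references, adding references can only raise the clipped match counts, which is why the reference count used for the leaderboard matters as much as the tokenizer.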

I would like to directly compare against the README leaderboard.

Xuechen Li · Answer 1 · Sun Aug 01 2021 08:01:00 GMT+0800 (China Standard Time)

Upvoting this, since I am having the same issue.

LemonQC · Answer 2 · Wed Aug 18 2021 15:59:38 GMT+0800 (China Standard Time)

How do I run the BART model? Could you provide more details about the running environment and the Python script?