max length not being used in moverscore.py
Alex-Fabbri opened this issue
For pytorch_pretrained_bert==0.6.2, tokenizer.max_len is 1000000000000 for BertTokenizer, so the maximum length is never actually enforced in moverscore.py. The max_len works as expected in moverscore_v2.py, since it uses the updated transformers repo. I put in a hard-coded length to fix it, but just wanted to point it out!
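A minimal sketch of the workaround described above: since the old tokenizer reports a huge sentinel `max_len`, cap the token sequence manually. The function name and the value 512 are assumptions here (512 is BERT-base's positional-embedding limit, minus two slots for the special tokens), not code from the repo.

```python
# Hypothetical helper: enforce a real max length, because
# pytorch_pretrained_bert's BertTokenizer reports
# tokenizer.max_len == 1000000000000 instead of the model limit.
MAX_LEN = 512  # BERT-base positional-embedding limit (assumed)

def truncate_tokens(tokens, max_len=MAX_LEN):
    # Reserve two positions for the [CLS] and [SEP] special tokens.
    return tokens[: max_len - 2]
```

A sequence of 1000 wordpieces would then be cut to 510 before adding the special tokens, keeping the input within what the model's position embeddings can handle.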
Thanks for the feedback, Alex! The repo is indeed outdated... Besides the max-length fix, I will add two more things soon: score normalization (1/(1+score)) and parallel WMD computation.
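Assuming the normalization mentioned above is the usual 1/(1+score) mapping of a non-negative distance into (0, 1], it could look like this (the function name is illustrative, not from the repo):

```python
def normalize_score(score):
    # Map a non-negative distance-style score into (0, 1]:
    # a distance of 0 becomes 1.0, and larger distances approach 0.
    return 1.0 / (1.0 + score)
```

This keeps the ordering of scores intact while bounding them, which makes values easier to compare across datasets.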
Sounds good, thanks!