About BLEU scores with default settings
ganymedetitan opened this issue · comments
ganymedetitan commented
Hi! Thank you for open-sourcing your QG work.
I ran the model with default parameters, and because I got errors when running qgevalcap, I evaluated BLEU with coco-caption instead:
https://github.com/XgDuan/coco-caption/
I got 34.79/19.04/12.29/8.44 for BLEU-1 through BLEU-4, which is a bit behind the scores reported in the ACL paper.
I am not sure whether I accidentally did something wrong, since I am not familiar with Lua code, or whether it is just a difference in the BLEU implementation/parameters (if there is one)?
Sorry for my poor English. :/
Thank you very much!
xinyadu commented
My evaluation scripts are slightly different from the coco-caption scripts: they use images as keys, whereas I use the input sentences as keys and compare each output against *multiple* gold references of the input sentence. I guess that's why you are getting lower scores.
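For illustration, here is a minimal Python sketch of that difference, using NLTK's `corpus_bleu` as a stand-in for the actual qgevalcap scorer (the sentences and the `refs_by_source` / `hyp_by_source` grouping below are made-up examples, not data from this repo). Keying by input sentence lets each hypothesis be matched against every gold question written for that sentence, which generally yields higher BLEU than single-reference matching:

```python
from nltk.translate.bleu_score import corpus_bleu

# Hypothetical grouping: each source sentence maps to ALL gold
# questions written for it (illustrative data only).
refs_by_source = {
    "the eiffel tower was built in 1889 .": [
        "when was the eiffel tower built ?",
        "in what year was the eiffel tower constructed ?",
    ],
}
hyp_by_source = {
    "the eiffel tower was built in 1889 .": "when was the eiffel tower built ?",
}

references, hypotheses = [], []
for src, hyp in hyp_by_source.items():
    # Multiple references per hypothesis: an n-gram counts as matched
    # if it appears in ANY reference, so multi-reference BLEU is
    # usually higher than single-reference BLEU.
    references.append([r.split() for r in refs_by_source[src]])
    hypotheses.append(hyp.split())

# Corpus-level BLEU-4 with multi-reference matching
bleu4 = corpus_bleu(references, hypotheses)
print(f"BLEU-4: {bleu4:.4f}")
```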
ganymedetitan commented
Thank you for the quick reply!