Reproduced F1 Score Different

Question

ashokchhetri7 opened this issue a year ago

Hi, there!
Thank you for open-sourcing this code. I attended your tutorial at ACL 2023 too, and it was great.

Getting back to the issue: I tried to reproduce the results, but my F1 score differs somewhat from the one reported in the paper. Do you have any suggestions for how I can improve the score?

Variants and Metrics | F1 | PPL | B-2 | B-4 | R-L
Strat ESConv Dataset | 24.66 | 15.92 | 8.31 | 2.51 | 17.05
Strat ESConv Reproduced | 20.80 | 15.52 | 8.62 | 2.59 | 17.68
Strat MI Dataset | 25.91 | 13.84 | 8.52 | 2.72 | 18.00
Strat MI Reproduced | 21.59 | 13.38 | 9.50 | 2.94 | 18.48

I have used the same configurations, and the GPU used is NVIDIA RTX A5000.

Also, could you provide the vanilla baseline code too? Thanks in advance.

dengyang17 · Answer 1 · Tue Aug 15 2023 00:00:49 GMT+0800 (China Standard Time)

Thanks so much.
Please check whether the number you are comparing is the "Macro" F1 score. The paper reports the Macro F1 score, but the code records three different F1 scores in its results.
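
To illustrate why the averaging mode matters, here is a minimal sketch that computes three F1 variants on the same predictions. It assumes the three scores are the micro, macro, and weighted averages, which the thread does not confirm. The function name `f1_scores` and the labels `A`, `B`, `C` are hypothetical and not taken from the repo.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return micro, macro, and weighted F1 for single-label classification."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in labels}
    support = Counter(y_true)

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: unweighted mean of per-class F1, so rare classes count equally.
    macro = sum(per_class.values()) / len(labels)
    # Weighted: mean of per-class F1 weighted by the class frequency in y_true.
    weighted = sum(per_class[c] * support[c] for c in labels) / len(y_true)
    return {"micro": micro, "macro": macro, "weighted": weighted}


# Hypothetical imbalanced example: the rare class "C" is never predicted.
y_true = ["A"] * 6 + ["B"] * 3 + ["C"]
y_pred = ["A"] * 6 + ["B", "B", "A", "A"]
scores = f1_scores(y_true, y_pred)
print(scores)
```

On imbalanced labels like these, the three values differ noticeably, so comparing one averaging mode against a paper that reports another can look like a reproduction gap.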

The vanilla baseline code is also in the repo. You may need to change the config.