princeton-nlp / LM-BFF

[ACL 2021] LM-BFF: Better Few-shot Fine-tuning of Language Models https://arxiv.org/abs/2012.15723

Clarification re: --num_sample value

nelson-liu opened this issue · comments

Hi!

I enjoyed reading your paper, and thanks for releasing this nice codebase. I had a quick question: Appendix B says "When finetuning with demonstrations, we sample 16 different sets of demonstrations for each input and average the predicted log probability for each class during inference." However, I noticed that QQP, MNLI, and SNLI seem to use --num_sample 4 by default in the run_experiment.sh script (e.g., https://github.com/princeton-nlp/LM-BFF/blob/main/run_experiment.sh#L41). If I want to faithfully reproduce the results of the paper, should I set --num_sample to 16 for these tasks?

Thanks!

Hi,

Indeed, for those large datasets we use --num_sample 4 in our experiments for efficiency reasons, since we found that using 16 does not bring a significant improvement. To faithfully reproduce the results, you should keep the script unchanged (i.e., use --num_sample 4). Thanks for noticing this; we will add more details in the appendix in our next revision.
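
For context, the averaging described in Appendix B works roughly like the sketch below. This is a simplified illustration with hypothetical names (`predict_with_demonstrations`, `demo_sets`), not the code in this repo: the model is run once per sampled demonstration set, the log probabilities of the label words at the mask position are collected, and the per-class averages are compared. --num_sample simply controls how many demonstration sets are drawn per input.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

def predict_with_demonstrations(model, tokenizer, input_text, demo_sets, label_words):
    """Average per-class log probabilities over several sampled demonstration
    sets (the role of --num_sample), then pick the highest-scoring class.
    The prompt template and label-word lookup are simplified; real code must
    also handle the tokenizer's space-prefix conventions."""
    label_ids = [tokenizer.convert_tokens_to_ids(w) for w in label_words]
    per_set_log_probs = []
    for demos in demo_sets:  # one forward pass per sampled demonstration set
        prompt = f"{input_text} It was {tokenizer.mask_token} . {demos}"
        inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
        mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero()[0].item()
        with torch.no_grad():
            logits = model(**inputs).logits[0, mask_pos]           # vocab logits at the mask
        log_probs = torch.log_softmax(logits, dim=-1)[label_ids]   # log prob of each label word
        per_set_log_probs.append(log_probs)
    # Average across demonstration sets before taking the argmax
    return torch.stack(per_set_log_probs).mean(dim=0).argmax().item()

# Hypothetical usage: demo_sets would hold 16 (or 4) sampled demonstration strings
tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")
demo_sets = ["A great movie . It was great .", "Terrible plot . It was terrible ."]
pred = predict_with_demonstrations(model, tokenizer, "A moving story .", demo_sets,
                                   label_words=["terrible", "great"])
```

With --num_sample 4, `demo_sets` would hold 4 sampled sets instead of 16, so inference needs 4 forward passes per example rather than 16.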

Thanks for the prompt response!