Reproduction Issue

Question

coderalo opened this issue 3 years ago

Hi, thanks for releasing the code, it's well-organized and easy to run!
However, when I tried to reproduce the results reported in the paper for the HuffPost and FewRel datasets by running bin/our.sh, my results differed by more than randomness alone can explain. I ran each experiment with 5 random seeds and got the following test accuracies:

HuffPost: 1-shot 41.9 (0.72), 5-shot 62.8 (0.27).
FewRel: 1-shot 64.5 (0.39), 5-shot 82.5 (0.50).

All of these are 1-3% lower than the reported numbers.
Thanks for any help in advance.
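
For anyone comparing numbers in this thread, here is a minimal sketch of how a "mean (std)" summary over several seeds could be computed. It assumes the values in parentheses above are standard deviations across seeds, which the post does not state explicitly. The function names and the per-seed accuracies are hypothetical placeholders, not results from the repository.

```python
import statistics


def summarize(accs):
    """Return (mean, sample standard deviation) of per-seed accuracies."""
    mean = statistics.mean(accs)
    # The sample std needs at least two runs; fall back to 0.0 for a single run.
    std = statistics.stdev(accs) if len(accs) > 1 else 0.0
    return mean, std


def format_result(accs):
    """Format per-seed accuracies as 'mean(std)', e.g. '42.0(0.71)'."""
    mean, std = summarize(accs)
    return f"{mean:.1f}({std:.2f})"


# Placeholder accuracies for 5 seeds (made up for illustration only).
placeholder_accs = [41.0, 42.0, 43.0, 42.0, 42.0]
print(format_result(placeholder_accs))  # prints 42.0(0.71)
```

Note that `statistics.stdev` is the sample standard deviation (divides by n-1); `statistics.pstdev` would give the population version, and the two differ noticeably with only 5 seeds.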

Yujia Bao · Answer 1 · Thu Nov 04 2021 01:52:27 GMT+0800 (China Standard Time)

Thank you for your interest in our work!

This morning I modified the code to work with the latest PyTorch and Python versions, and when I ran it on our local machine I got results similar to yours. Moreover, the results for the baselines (attached below) are also lower than the numbers reported in Table 1. The main conclusion of our paper still holds.

The experiments in Table 1 were conducted 3 years ago on IBM cloud servers. I suspect the performance variation is due to differences in infrastructure or libraries. Please let me know if there is anything I can help with.

Rep.	Alg.	HuffPost (1 shot)	FewRel (1 shot)
Avg	Proto	32.95	40.9
Idf	Proto	32.79	43.2
CNN	Proto	30.63	45.84
Avg	RR	33.33	52.09
Idf	RR	34.15	55.44
CNN	RR	35.07	55.84
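
Since the suspected cause is an infrastructure or library difference, one way to narrow it down is to record the environment on each machine and compare. The following is a minimal sketch using only the Python standard library. The package names passed in (`torch`, `numpy`) are assumptions for illustration; the actual dependency list is the one in README.md.

```python
import platform
from importlib import metadata


def environment_snapshot(packages):
    """Collect the Python version, platform string, and installed package versions."""
    snapshot = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            snapshot[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            snapshot[name] = "not installed"
    return snapshot


def diff_snapshots(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # Hypothetical package list; replace with the packages named in README.md.
    for key, value in environment_snapshot(["torch", "numpy"]).items():
        print(f"{key}: {value}")
```

Running this on both machines and passing the two dictionaries to `diff_snapshots` lists only the entries that differ. It does not capture hardware or GPU library differences, which would need to be checked separately.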

Kai-Ling Lo · Answer 2 · Thu Nov 04 2021 02:33:18 GMT+0800 (China Standard Time)

Thanks for the quick response! I actually ran the code with the same library versions listed in README.md, though, so that probably wasn't the issue.

If possible, could you share any results from your side obtained with the library versions listed in README.md? Thanks.