princeton-nlp / ALCE

[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627

Unable to reproduce Qampari Closedbook results.

raghavlite opened this issue

I ran your config "qampari_turbo_shot2_closedbook.yaml", but I am unable to reproduce the correctness numbers for ChatGPT.

The numbers I get are:
{'length': 14.978, 'str_em': 0, 'str_hit': 0, 'num_preds': 5.331, 'qampari_prec': 22.26018716946968, 'qampari_rec': 13.996595485987104, 'qampari_rec_top5': 23.6, 'qampari_f1': 15.88350010581282, 'qampari_f1_top5': 22.38301992719087, 'citation_rec': 12.659782645716858, 'citation_prec': 12.659782645716858}

Could you please help?

Hi,

Sorry for the trouble. Note that our results were obtained with the 0301 version of gpt-3.5, so please make sure you are using that version. That said, we can't predict the changes that OpenAI makes to the API, and those changes may cause performance differences.

You can also try running other configs and see how the results look. If they all differ from the reported ones, that means either it's not the same API model or OpenAI has changed settings on their end.
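
For anyone checking the version point above, here is a minimal sketch of a sanity check on the model name before launching a run. It assumes the 0301 version corresponds to the name "gpt-3.5-turbo-0301" and that dated snapshots end in a four-digit suffix; the helper names are hypothetical and not part of the ALCE code, and how the model name is read from the config is left out.

```python
import re

# Assumption: dated snapshot names end in a four-digit suffix such as "-0301",
# while a bare alias like "gpt-3.5-turbo" may point to different models over time.
SNAPSHOT_SUFFIX = re.compile(r"-\d{4}$")


def is_pinned_snapshot(model_name: str) -> bool:
    """Return True if the model name ends with a dated snapshot suffix."""
    return bool(SNAPSHOT_SUFFIX.search(model_name))


def check_model(model_name: str, expected: str = "gpt-3.5-turbo-0301") -> str:
    """Describe how the configured model relates to the expected snapshot."""
    if model_name == expected:
        return "match"
    if is_pinned_snapshot(model_name):
        return "different snapshot"
    return "unpinned alias"
```

A result of "unpinned alias" or "different snapshot" would suggest the run is not using the same API model as the reported results. A "match" only confirms the name, not that the served model is unchanged.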