tianyi-lab / Cherry_LLM

[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models


About the Direct Answer Score sθ(A)

DryPilgrim opened this issue · comments

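I would like to ask about the question below. Thank you very much for your answer :)

Why does a higher DAS mean the answer is more challenging for the model? Doesn't a higher DAS indicate that the model assigns a higher probability to the answer, i.e. that it has mastered it better? (From the paper: "A higher direct answer score may suggest that the answer is inherently more challenging or intricate for the model to generate.")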

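My understanding of how the DAS is computed autoregressively:

For the data: {"instruction": "what do you like to eat?", "answer": "I like eating apples."}
the DAS measures how difficult the answer itself is for the model to generate, and it is computed autoregressively over the following prefixes:
I
I like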
I like eating
I like eating apples
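
The prefix-by-prefix view above can be sketched in a few lines of Python. This is only an illustration: it splits on whitespace, whereas a real model works on subword tokens, and `answer_prefix_steps` is a hypothetical helper name, not something from the Cherry_LLM code.

```python
def answer_prefix_steps(answer: str) -> list[tuple[list[str], str]]:
    """Return (context, target) pairs for answer-only autoregressive scoring.

    Assumption: whitespace splitting stands in for the model's real tokenizer.
    The instruction is deliberately left out, since the DAS looks at the
    answer on its own.
    """
    tokens = answer.split()
    return [(tokens[:i], tokens[i]) for i in range(len(tokens))]


steps = answer_prefix_steps("I like eating apples.")
for context, target in steps:
    # Each step asks: given the answer so far, how likely is the next token?
    print(context, "->", target)
```

Each pair corresponds to one term of the score: the model's probability for `target` given only the earlier answer tokens in `context`.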

Thank you so much for your interest in our work!
We are sorry for the confusion: there should be minus signs in the DAS and CAS equations. With the minus signs, the logic is the same as for loss or perplexity: a higher score means the answer is harder for the model to generate.
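
For readers following along, the two scores with the minus signs restored would look roughly as follows. The notation here is an assumption for illustration ($N$ answer tokens, $w_i^A$ the $i$-th answer token, $Q$ the instruction, $\theta$ the model parameters); the exact form in the paper may differ.

```latex
s_\theta(A) = -\frac{1}{N} \sum_{i=1}^{N} \log P\left(w_i^A \mid w_1^A, \ldots, w_{i-1}^A; \theta\right)

s_\theta(A \mid Q) = -\frac{1}{N} \sum_{i=1}^{N} \log P\left(w_i^A \mid Q, w_1^A, \ldots, w_{i-1}^A; \theta\right)
```

Since $\log P \le 0$, the minus sign makes both scores non-negative, and a lower token probability gives a higher score.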
Sorry for the typos; we cannot modify our manuscript yet because of the anonymity period 😂😂

Please refer to #7 and #4.
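
The sign convention is easy to check numerically. The sketch below assumes the score is the negative mean log-probability of the answer tokens, as with a cross-entropy loss; `direct_answer_score` is a hypothetical helper, and the probabilities are made up for illustration rather than taken from any model.

```python
import math


def direct_answer_score(token_probs):
    """Negative mean log-probability of the answer tokens (illustrative helper)."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


# Made-up per-token probabilities, for illustration only.
easy_answer = [0.9, 0.8, 0.9, 0.7]   # model predicts each token confidently
hard_answer = [0.2, 0.1, 0.3, 0.05]  # model finds each token unlikely

easy_score = direct_answer_score(easy_answer)
hard_score = direct_answer_score(hard_answer)
print(easy_score, hard_score)
```

With the minus sign, the low-probability answer gets the larger score, which matches the paper's statement that a higher direct answer score suggests a more challenging answer.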

Thanks for your reply :-)