Description not found for p_extra
pg2455 opened this issue · comments
I couldn't find an explanation in the Appendix for why p_extra is accounted for in the NLL/D calculation. Could you please comment on this step? If I missed something, could you point me to the right place?
Line 121 in adefc38
I am also curious whether this function will ensure a non-negative constraint on the return values.
Thanks in advance!
Apologies for my delay in responding!
p_extra is the probability mass associated with tokens that play no role in our numerical encoding scheme (tokens that are not digits, separators, or signs). We adjust the original log probabilities by p_extra to obtain a discrete distribution over fixed-precision numbers (which correspond to bins, as described in the paper) and then a corresponding continuous density.
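As a rough illustration of this adjustment (a minimal sketch, not the code in the repository), renormalizing a token's log probability amounts to subtracting log(1 - p_extra). The function names, the example numbers, and the assumption of uniform bins of width base**(-prec) are all hypothetical choices made for this example.

```python
import math


def adjust_logprob(logprob: float, p_extra: float) -> float:
    """Renormalize one token's log probability after discarding p_extra."""
    if not 0.0 <= p_extra < 1.0:
        raise ValueError("p_extra must lie in [0, 1)")
    # log(p / (1 - p_extra)) = log p - log(1 - p_extra)
    return logprob - math.log1p(-p_extra)


def log_density(bin_logprob: float, base: int, prec: int) -> float:
    """Convert a bin's log probability into a log density.

    Assumption for this sketch: bins are uniform with width base**(-prec),
    so log density = log P(bin) - log(width) = log P(bin) + prec * log(base).
    """
    return bin_logprob + prec * math.log(base)


# Hypothetical numbers: a digit token with probability 0.6, while 0.2 of the
# probability mass sits on tokens outside the numerical encoding.
adjusted = adjust_logprob(math.log(0.6), p_extra=0.2)
adjusted_prob = math.exp(adjusted)  # 0.6 / (1 - 0.2)
```

With these numbers the renormalized probability is 0.6 / 0.8, so the adjusted value is always at least as large as the raw one.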
Unfortunately, it is not possible to filter out non-numerical tokens exactly, because the OpenAI API only returns log probabilities for the top 5 tokens beyond the sampled token. Thus the normalization values are in some cases larger than they could be (and the corresponding probabilities smaller). In our experiments with LLaMA-2 models there is no such limit, and we can perform the filtering exactly.
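To make the effect of the top-5 limit concrete, here is a small toy sketch. The token distribution, the allowed-token set, and the helper names are invented for illustration and do not come from the repository or the API. It compares p_extra computed from the full distribution with the value obtained when only the five most likely tokens are visible.

```python
import math


def estimate_p_extra(logprobs: dict, allowed_tokens: set) -> float:
    """Sum the probability of all visible tokens outside the allowed set."""
    return sum(math.exp(lp) for tok, lp in logprobs.items()
               if tok not in allowed_tokens)


def top_k(logprobs: dict, k: int) -> dict:
    """Keep only the k most likely tokens (mimics a truncated API response)."""
    ranked = sorted(logprobs.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


# Hypothetical next-token distribution; probabilities sum to 1.
probs = {"1": 0.50, "2": 0.20, ",": 0.10,
         "a": 0.08, "b": 0.05, "c": 0.04, "d": 0.03}
full_logprobs = {tok: math.log(p) for tok, p in probs.items()}
allowed = {"1", "2", ","}  # digits and separator in this toy encoding

p_extra_full = estimate_p_extra(full_logprobs, allowed)            # all tokens
p_extra_top5 = estimate_p_extra(top_k(full_logprobs, 5), allowed)  # truncated

# Renormalized probability of the token "1" under each estimate.
prob_full = probs["1"] / (1.0 - p_extra_full)
prob_top5 = probs["1"] / (1.0 - p_extra_top5)
```

In this toy case the truncated view misses the mass on "c" and "d", so p_extra is underestimated, the normalizer 1 - p_extra is larger, and the renormalized probability is smaller than it would be with full access.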
We will add a note explaining this detail to the Appendix, as originally intended. Thank you for pointing this oversight out!
Nate