ngruver / llmtime

I couldn't find an explanation in the Appendix for why p_extra is accounted for in the NLL/D calculation. Could you please comment on this step? If I missed something, could you point me to the right place?

llmtime/models/gpt.py

Line 121 in adefc38

    
           # adjust logprobs by removing extraneous and renormalizing (see appendix of paper)

I am also curious whether this function will ensure a non-negative constraint on the return values.
Thanks in advance!

Apologies for my delay in responding!

p_extra is the probability mass associated with tokens that have no role in our numerical encoding scheme (tokens that are not digits, separators, or signs). We adjust the original log probabilities by p_extra to obtain a discrete distribution over fixed-precision numbers (which correspond to bins as described in the paper), and then a corresponding continuous density.

Unfortunately, it is not possible to filter out non-numerical tokens exactly, because the OpenAI API only returns log probabilities for the top 5 tokens beyond the sampled token. As a result, the normalization values are larger in some cases than they could be (and the corresponding probabilities smaller). In our experiments with LLaMA-2 models there is no such limit, and we can perform the filtering exactly.
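
For illustration, the adjustment can be sketched roughly as follows. This is a minimal sketch and not the code in llmtime/models/gpt.py: the function name, argument names, and the toy token set are hypothetical, and it assumes the renormalization divides each probability by (1 - p_extra), where p_extra is summed over whichever top tokens are visible at that position.

```python
import math

def renormalize_logprobs(logprobs, top_logprobs, allowed_tokens):
    """Hypothetical helper: drop mass on non-numeric tokens and renormalize.

    logprobs       -- log probability of each target token
    top_logprobs   -- per position, a dict {token: logprob} of the visible top tokens
    allowed_tokens -- tokens that belong to the numeric encoding
    """
    adjusted = []
    for lp, top in zip(logprobs, top_logprobs):
        # Mass on visible tokens that are not digits, separators, or signs.
        p_extra = sum(math.exp(v) for tok, v in top.items()
                      if tok not in allowed_tokens)
        # Assumed renormalization: p / (1 - p_extra), done in log space.
        adjusted.append(lp - math.log(1.0 - p_extra))
    return adjusted

# Toy example: one position with 20% of the visible mass on a non-numeric token.
allowed = set("0123456789") | {",", " ", "-"}
top = [{"1": math.log(0.5), "2": math.log(0.3), "a": math.log(0.2)}]
adjusted = renormalize_logprobs([math.log(0.5)], top, allowed)
```

In this toy case the probability of "1" goes from 0.5 to 0.5 / 0.8 = 0.625. Since 1 - p_extra is at most 1, the adjustment can only raise a log probability. When only the top few tokens are visible, any non-numeric mass outside them is missed, so p_extra is underestimated and the adjusted probabilities come out smaller than with exact filtering.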

We will add a note explaining this detail to the Appendix, as originally intended. Thank you for pointing this oversight out!

Nate

Description not found for p_extra