about the code of top-k filtering
DavidKong96 opened this issue
Thanks for sharing.
What is the meaning of
x_exp = torch.exp(values - values[:, 0])
?
Why is it necessary to compute values - values[:, 0]?
Looking forward to your reply.
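That is normalization for numerical stability. Subtracting a constant from every input before taking the exponential leaves the softmax result unchanged, but it keeps the exponentials from overflowing when the inputs are large. See also https://discuss.pytorch.org/t/how-to-implement-the-exactly-same-softmax-as-f-softmax-by-pytorch/44263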
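
To illustrate the point, here is a minimal sketch in plain Python rather than PyTorch. It is not the repository's code: the function names `naive_softmax` and `shifted_softmax` are made up for this example, and it shifts by the maximum value, on the assumption that `values[:, 0]` in the quoted line holds the largest entry (which would be the case if `values` comes from a sorted top-k call).

```python
import math


def naive_softmax(values):
    """Softmax without any shift; overflows for large inputs."""
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def shifted_softmax(values):
    """Softmax with the maximum subtracted first (hypothetical helper).

    The largest exponent becomes exp(0) = 1, so nothing overflows.
    """
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


small = [2.0, 1.0, 0.0]
large = [1000.0, 999.0, 998.0]

# For moderate inputs, both versions agree up to rounding.
print(naive_softmax(small))
print(shifted_softmax(small))

# For large inputs, only the shifted version works.
print(shifted_softmax(large))
try:
    naive_softmax(large)
except OverflowError as err:
    # math.exp(1000.0) is outside the range of a float.
    print("naive version failed:", err)
```

In plain Python the unshifted version raises `OverflowError`. With floating-point tensors the same overflow would typically show up as `inf` in the exponentials and `nan` after the division instead of an exception.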