qwopqwop200 / GPTQ-for-LLaMa

4-bit quantization of LLaMA using GPTQ

What is the right perplexity number?

JianbangZ opened this issue

For the base FP16 model:

- `--eval` gives 5.68 PPL on wikitext2
- `--benchmark 2048` gives 6.43 PPL on wikitext2

What's the difference between these two numbers?
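
For context, the `--eval`-style number is usually the standard full-dataset perplexity: the whole wikitext2 test split is tokenized as one stream and scored in fixed windows. Below is a minimal sketch of that computation, assuming Hugging Face `transformers`/`datasets`; the checkpoint name `model_id` and the window size are illustrative and not taken from this repo's `llama.py`:

```python
# Sketch of standard full-dataset perplexity (the usual --eval-style number).
# Assumes HF transformers/datasets; model_id is a hypothetical checkpoint name.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "decapoda-research/llama-7b-hf"  # illustrative, swap in your own
seqlen = 2048

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).cuda()
model.eval()

# Tokenize the entire wikitext2 test split as one long token stream.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
enc = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids

# Score non-overlapping windows of `seqlen` tokens and average the NLL.
nlls = []
n_windows = enc.size(1) // seqlen
for i in range(n_windows):
    batch = enc[:, i * seqlen : (i + 1) * seqlen].cuda()
    with torch.no_grad():
        # labels=batch makes the model return the mean cross-entropy
        # over the window's next-token predictions (approximately seqlen
        # predictions; the first token of each window has no target).
        loss = model(batch, labels=batch).loss
    nlls.append(loss.float() * seqlen)

ppl = torch.exp(torch.stack(nlls).sum() / (n_windows * seqlen))
print(f"wikitext2 perplexity: {ppl.item():.2f}")
```

A plausible reason the two flags disagree: `--benchmark N` is primarily a latency benchmark and typically scores only the first N tokens of the stream, decoding one token at a time, so its perplexity comes from a much smaller and not necessarily representative sample than the full-test-set `--eval` pass. Check the repo's `llama.py` for the exact behavior.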