Wrong tokens / second
EugeoSynthesisThirtyTwo opened this issue
EugeoSynthesisThirtyTwo commented
The output says:
Processing Prompt [BLAS] (1676 / 1676 tokens)
Generating (78 / 387 tokens)
(EOS token triggered!)
(Special Stop Token Triggered! ID:128009)
CtxLimit: 1754/8192, Process:25.05s (14.9ms/T = 66.89T/s), Generate:59.05s (152.6ms/T = 6.55T/s), Total:84.11s (4.60T/s)
But 6.55 T/s is the speed that would have been achieved if the model had generated all 387 requested tokens (387 / 59.05 = 6.55). The model actually generated only 78 tokens before the EOS token stopped it, so the real generation speed is 78 / 59.05 = 1.32 tokens/s.
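To double-check the arithmetic, here is a small Python sketch that recomputes both figures from the numbers in the log. The function and variable names are made up for illustration and are not taken from the project's code; it only shows which token count goes in the numerator.

```python
def tokens_per_second(token_count: int, seconds: float) -> float:
    """Return throughput in tokens per second (0.0 if no time elapsed)."""
    if seconds <= 0:
        return 0.0
    return token_count / seconds


# Values copied from the log line above.
requested_tokens = 387    # the generation limit that was asked for
generated_tokens = 78     # tokens actually produced before EOS
generate_seconds = 59.05  # time spent in the "Generate" phase

# Dividing the requested limit by the elapsed time reproduces the reported figure.
reported = tokens_per_second(requested_tokens, generate_seconds)

# Dividing the tokens actually generated gives the real speed.
actual = tokens_per_second(generated_tokens, generate_seconds)

print(f"reported: {reported:.2f} T/s")
print(f"actual:   {actual:.2f} T/s")
```

The first value matches the 6.55 T/s in the log, which suggests the requested token count is being used in the calculation instead of the number of tokens generated.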
LostRuins Concedo commented
Will try to fix
LostRuins Concedo commented
Can you see if the latest version solves this issue?
EugeoSynthesisThirtyTwo commented
LostRuins Concedo commented
Don't worry about that, probably just a minor thing.