Wrong tokens / second

Question

EugeoSynthesisThirtyTwo opened this issue 4 months ago

EugeoSynthesisThirtyTwo commented 4 months ago

It says

Processing Prompt [BLAS] (1676 / 1676 tokens)
Generating (78 / 387 tokens)
(EOS token triggered!)
(Special Stop Token Triggered! ID:128009)
CtxLimit: 1754/8192, Process:25.05s (14.9ms/T = 66.89T/s), Generate:59.05s (152.6ms/T = 6.55T/s), Total:84.11s (4.60T/s)

But 6.55 T/s is the speed that would have been achieved if the model had generated all 387 tokens. The model actually generated only 78 tokens in 59.05 s, so the real generation speed is 78 / 59.05 = 1.32 tokens/s.
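
To make the arithmetic concrete, here is a minimal sketch of the two calculations. It is not the project's actual code; the function name `tokens_per_second` and the variable names are hypothetical, and the numbers are taken from the log above.

```python
def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Return throughput in tokens per second (0.0 if no time has elapsed)."""
    if elapsed_seconds <= 0:
        return 0.0
    return token_count / elapsed_seconds


# Values from the log above (variable names are illustrative only).
requested_tokens = 387    # maximum number of tokens requested
generated_tokens = 78     # tokens actually produced before the EOS token
generate_seconds = 59.05  # time spent in the generation phase

# Dividing the requested count by the elapsed time reproduces the logged figure.
reported = tokens_per_second(requested_tokens, generate_seconds)
# Dividing the count actually generated gives the real speed.
actual = tokens_per_second(generated_tokens, generate_seconds)

print(f"reported: {reported:.2f} T/s, actual: {actual:.2f} T/s")
```

The point is only that the numerator should be the number of tokens actually generated, not the requested maximum, whenever generation stops early.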

LostRuins Concedo · Answer 1 · Sun May 19 2024 10:30:43 GMT+0800 (China Standard Time)

Will try to fix

LostRuins Concedo · Answer 2 · Fri May 24 2024 18:36:36 GMT+0800 (China Standard Time)

Can you see if the latest version solves this issue?

EugeoSynthesisThirtyTwo · Answer 3 · Tue May 28 2024 16:43:01 GMT+0800 (China Standard Time)

> Can you see if the latest version solves this issue?

It's good, thank you.

However, as you can see, if I abort the generation, there is a new log line, "Generating (301 / 300 tokens)", which is wrong. I don't know if it's related. Let me know if I should open a new issue for this.

LostRuins Concedo · Answer 4 · Tue May 28 2024 18:40:10 GMT+0800 (China Standard Time)

Don't worry about that, probably just a minor thing.