mybigday / llama.rn

React Native binding of llama.cpp

Early stopping inference

JEF1056 opened this issue

Shouldn't there be a function that lets the user stop inference? It could be implemented as a callback function, just like in whisper.rn's realtimeInference().

context.stopCompletion() is a way to stop inference.

It was designed to run only one completion on a context at a time, which is why it stops whatever is currently running. Parallel decoding is now possible, so that may change in the future.
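
For anyone landing here, below is a minimal sketch of how context.stopCompletion() might be combined with the per-token callback of context.completion() to stop early on a custom condition. The CompletionContext interface is a simplified assumption about the shape of the object returned by initLlama(), and completeUntil and shouldStop are hypothetical names used only for illustration, not part of llama.rn.

```typescript
// Simplified assumption of the context methods used here. In a real app this
// would be the context returned by llama.rn's initLlama(); the actual
// parameter and result types carry more fields than shown.
interface CompletionContext {
  completion(
    params: { prompt: string; n_predict?: number },
    onToken?: (data: { token: string }) => void,
  ): Promise<{ text: string }>;
  stopCompletion(): Promise<void>;
}

// Hypothetical helper: runs one completion and requests a stop as soon as
// shouldStop(textSoFar) returns true. Returns the text collected up to then.
async function completeUntil(
  context: CompletionContext,
  prompt: string,
  shouldStop: (textSoFar: string) => boolean,
): Promise<string> {
  let textSoFar = "";
  let stopRequested = false;

  await context.completion({ prompt, n_predict: 256 }, ({ token }) => {
    // Ignore any tokens that arrive after the stop was requested.
    if (stopRequested) return;
    textSoFar += token;
    if (shouldStop(textSoFar)) {
      stopRequested = true;
      // Not awaited inside the callback; the completion promise above
      // settles once generation has actually stopped.
      context.stopCompletion().catch(() => undefined);
    }
  });

  return textSoFar;
}
```

The same pattern works for a "Stop" button: call context.stopCompletion() from the button handler instead of from the token callback. Since a context runs a single completion at a time, the call needs no argument to identify which completion to stop.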