slot unavailable + infinite
gretadolcetti opened this issue · comments
Greta Dolcetti commented
This is the code:

    import time

    from openai import OpenAI

    def run(model, task, port):
        # Client for the local llamafile server's OpenAI-compatible endpoint
        client = OpenAI(
            base_url=f"http://localhost:{port}/v1",
            api_key="sk-no-key-required"
        )
        start_time = time.time()
        try:
            answer = client.chat.completions.create(
                model=model,
                temperature=1.0,
                timeout=300,
                messages=[
                    {"role": "system", "content": "<SYS_PROMPT>"},
                    {"role": "user", "content": f"{task}. "}
                ]
            )
        except Exception:
            # Release the connection before propagating the error
            client.close()
            raise
        except KeyboardInterrupt:
            raise
        response_time = time.time() - start_time
        return answer.choices[0].message.content, response_time
I am running it inside a for loop over a list of tasks. To illustrate:

    for task in tasks:
        run(model, task, port)
Sometimes everything goes fine and I get an answer, which is later written to an output file. Other times the server prints an endless loop of

    slot 0: context shift - n_keep = 0, n_left = 510, n_discard = 255

followed by the error

    slot unavailable
I am running the server in another shell with:

    ./models/codeninja-1.0-openchat-7b.Q4_K_M-server.llamafile --port 8081 --nobrowser --threads 8 -ngl 9999
How can I solve this?
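One thing that might be worth trying, though it is not confirmed as the cause here: the repeated context shift messages suggest that a generation keeps running after the context window is full, so capping the number of generated tokens per request could stop a single completion from running indefinitely. The sketch below only builds the request arguments. `build_request` is a hypothetical helper and `max_tokens=256` is an arbitrary illustrative value. It also assumes the llamafile server honors the standard `max_tokens` field of the OpenAI chat completions API.

```python
def build_request(model, task, system_prompt, max_tokens=256):
    """Build keyword arguments for client.chat.completions.create.

    Hypothetical helper: max_tokens=256 is an arbitrary cap chosen for
    illustration, not a recommended value.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return {
        "model": model,
        "temperature": 1.0,
        "timeout": 300,
        # Upper bound on generated tokens, so one request cannot keep
        # generating (and shifting context) forever.
        "max_tokens": max_tokens,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": f"{task}. "},
        ],
    }
```

Inside `run`, the call would then become `client.chat.completions.create(**build_request(model, task, "<SYS_PROMPT>"))`. If the cap truncates answers that should be longer, the value needs tuning against the server's context size.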