li-plus / chatglm.cpp

C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4


Long context results (>450 tokens) from the server stop / return invalid JSON (stream mode)

x4080 opened this issue

Hello,

I'm using chatglm3-32k-ggml-q4_0.bin with openai_api. When a question-and-answer exchange returns more than ~450 tokens, my JavaScript frontend (stream mode) shows this error:

Uncaught SyntaxError: Unexpected end of JSON input

I tested the same prompt against the llama.cpp server and no error occurred.
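My understanding is that "Unexpected end of JSON input" means JSON.parse ran on an incomplete SSE line: a network chunk can end mid-event, so the client has to buffer until a newline before parsing. Here is a minimal sketch of the buffered parsing I mean (the URL, field names, and `[DONE]` sentinel follow the usual OpenAI-style stream format; I haven't verified the exact shape chatglm.cpp emits):

```typescript
// Buffered SSE reader for an OpenAI-style streaming endpoint.
// A fetch chunk may end in the middle of a `data: {...}` line, so we
// only JSON.parse complete lines and keep the trailing fragment
// buffered -- parsing a partial line is exactly what raises
// "Unexpected end of JSON input".
async function readStream(url: string, body: unknown): Promise<string> {
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  let text = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // Consume complete lines only; the last element may be a partial
    // line, so put it back in the buffer for the next chunk.
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";
    for (const line of lines) {
      const payload = line.replace(/^data:\s*/, "").trim();
      if (!payload || payload === "[DONE]") continue; // sentinel assumed
      const chunk = JSON.parse(payload); // safe: line is complete
      text += chunk.choices?.[0]?.delta?.content ?? "";
    }
  }
  return text;
}
```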

Should I modify an openai_api parameter to allow longer outputs? I think I have already changed max_token to 8192 and max_context_length to 8192 as well.
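For reference, this is roughly the request body I'm sending (a sketch; the `max_tokens` field and the `/v1/chat/completions` path are assumptions based on the OpenAI API shape, and I'm not sure whether they override the server's own settings):

```typescript
// Hypothetical request body -- assumes the server accepts the
// OpenAI-style `max_tokens` field; model name is a placeholder.
const request = {
  model: "default-model",
  messages: [{ role: "user", content: "long question here" }],
  stream: true,
  max_tokens: 8192, // per-response output cap (assumed field name)
};

// Used with the buffered reader sketched above:
// const answer = await readStream(
//   "http://localhost:8000/v1/chat/completions", // assumed endpoint path
//   request,
// );
```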

Thanks