Long responses (>450 tokens) from the server stop or return incorrect JSON (stream mode)
x4080 opened this issue
Hello,
I'm using chatglm3-32k-ggml-q4_0.bin with openai_api. When a question and answer returns more than ~450 tokens, my frontend JavaScript (stream mode) throws this error:
Uncaught SyntaxError: Unexpected end of JSON input
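A common cause of this exact error in stream mode is parsing each network chunk as if it were a complete JSON document: once the reply grows past a few hundred tokens, the chunks start splitting SSE `data:` lines mid-object, and `JSON.parse` on the fragment throws "Unexpected end of JSON input". Below is a minimal sketch of a buffered reader, assuming the frontend consumes the stream with `fetch` and a `ReadableStream`; the `handleDelta` callback name is hypothetical.

```js
// Minimal sketch of a buffered SSE reader. Only complete "data:" lines are
// parsed; a partial line at a chunk boundary stays in the buffer until the
// next chunk completes it.
async function readChatStream(response, handleDelta) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = ""; // holds any partial line left over from the previous chunk

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // Split on newlines; the last element may be an incomplete line, so
    // push it back into the buffer instead of parsing it.
    const lines = buffer.split("\n");
    buffer = lines.pop();

    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed.startsWith("data:")) continue;
      const payload = trimmed.slice(5).trim();
      if (payload === "[DONE]") return; // OpenAI-style end-of-stream marker
      handleDelta(JSON.parse(payload)); // payload is now a complete JSON object
    }
  }
}
```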
I tested the same prompt against the llama.cpp server and no error occurred.
Should I modify an openai_api parameter to allow longer outputs? I have already changed max_token to 8192 and max_context_length to 8192.
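For reference, this is where the output limit usually goes on the request side of an OpenAI-compatible endpoint; the URL and model name below are placeholders for my setup, and the reader uses the `readChatStream` sketch above.

```js
// Hypothetical streaming request against an OpenAI-compatible
// /v1/chat/completions endpoint; URL and model name are placeholders.
const response = await fetch("http://localhost:8000/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "chatglm3",
    messages: [{ role: "user", content: "..." }],
    max_tokens: 8192, // per-request output limit, separate from the server's context length
    stream: true,
  }),
});
await readChatStream(response, (delta) => console.log(delta));
```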
Thanks