li-plus / chatglm.cpp

C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4


Long context results (>450 tokens) from the server stop / return invalid JSON (stream mode)

x4080 opened this issue

Hello,

I'm using chatglm3-32k-ggml-q4_0.bin with openai_api. When a question-and-answer exchange returns more than ~450 tokens, my JavaScript frontend (stream mode) shows this error:

Uncaught SyntaxError: Unexpected end of JSON input

I tested the same prompt against the llama.cpp server and no error occurred.
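My understanding is that "Unexpected end of JSON input" means JSON.parse ran on an incomplete SSE line: a network chunk can end mid-event, so the client has to buffer until a newline before parsing. Here is a minimal sketch of the buffered parsing I mean (the URL, field names, and `[DONE]` sentinel follow the usual OpenAI-style stream format; I haven't verified the exact shape chatglm.cpp emits):

```typescript
// Buffered SSE reader for an OpenAI-style streaming endpoint.
// A fetch chunk may end in the middle of a `data: {...}` line, so we
// only JSON.parse complete lines and keep the trailing fragment
// buffered -- parsing a partial line is exactly what raises
// "Unexpected end of JSON input".
async function readStream(url: string, body: unknown): Promise<string> {
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  let text = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // Consume complete lines only; the last element may be a partial
    // line, so put it back in the buffer for the next chunk.
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";
    for (const line of lines) {
      const payload = line.replace(/^data:\s*/, "").trim();
      if (!payload || payload === "[DONE]") continue; // sentinel assumed
      const chunk = JSON.parse(payload); // safe: line is complete
      text += chunk.choices?.[0]?.delta?.content ?? "";
    }
  }
  return text;
}
```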

Should I modify an openai_api parameter to allow longer outputs? I think I have already changed max_token to 8192 and max_context_length to 8192 as well.
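For reference, this is roughly the request body I'm sending (a sketch; the `max_tokens` field and the `/v1/chat/completions` path are assumptions based on the OpenAI API shape, and I'm not sure whether they override the server's own settings):

```typescript
// Hypothetical request body -- assumes the server accepts the
// OpenAI-style `max_tokens` field; model name is a placeholder.
const request = {
  model: "default-model",
  messages: [{ role: "user", content: "long question here" }],
  stream: true,
  max_tokens: 8192, // per-response output cap (assumed field name)
};

// Used with the buffered reader sketched above:
// const answer = await readStream(
//   "http://localhost:8000/v1/chat/completions", // assumed endpoint path
//   request,
// );
```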

Thanks