Responses are truncated

Question

Responses are truncated

cleesmith opened this issue 10 months ago · comments

I've attached a screen capture of responses being truncated.

Also, an image of my Settings; just in case, and I am trying the:
prometheus-13b-v1.0.Q5_K_M.gguf which seems similar to GPT-4 (sort of).

Changing the System Prompt did not seem to have any affect.

While less often, the truncation did happen with the default model too.

Thoughts?

Peter Sugihara · Answer 1 · Fri Dec 01 2023 03:13:00 GMT+0800 (China Standard Time)

Hm, my guess here is that it's an issue with Prompt format. It looks like this model is trained on a very unusual prompt format meant more for using programmatically than as a chat model: https://huggingface.co/kaist-ai/prometheus-13b-v1.0#prompt-format

I'd recommend trying MythoMist, Openhermes 2.5 or any of the top ranked ones here: https://www.otherbrain.world/?columnFilters=%5B%7B%22id%22%3A%22numParameters%22%2C%22value%22%3A%5B1%2C3%2C6%2C7%2C11%2C12%2C13%5D%7D%5D

Peter Sugihara · Answer 2 · Fri Dec 01 2023 03:14:09 GMT+0800 (China Standard Time)

There is an update in main that will make it into the app store soon that fixes an issue with longer chats and may fix the truncation you're hitting occasionally with the default model.

Christopher · Answer 3 · Fri Dec 01 2023 06:53:16 GMT+0800 (China Standard Time)

Thanks I will keep an eye out, and try one of the other models.

Peter Sugihara · Answer 4 · Fri Dec 01 2023 12:19:03 GMT+0800 (China Standard Time)

Sounds good. I appreciate all the candid feedback.

Christopher · Answer 5 · Sat Dec 02 2023 00:51:08 GMT+0800 (China Standard Time)

Thanks again, switching to: openhermes-2.5-neural-chat-7b-v3-1-7b.Q5_K_M.gguf on my old iMac 3.3GHz 6-core Intel i5 with 16GB, so now it does work, slowly, but it's time to upgrade to a M3 perhaps with more memory (if that helps). So far your app works well, and look forward to the updates.

Peter Sugihara · Answer 6 · Sat Dec 02 2023 01:56:38 GMT+0800 (China Standard Time)

Beautiful. Yes, apple silicon is incredible for this stuff and I definitely recommend as much RAM as you can reasonably afford. With multi-modal models and breakthroughs at the 70B level I'm wishing I had more too :)