llama3 template

Question

llama3 template

kalle07 opened this issue a month ago · comments

ok all is working fine

if i dowload your llama3 model the prompt template is ok.
is it possible to handle all models that have in there names: "llama-3" or "llama3" or "llama 3" that these prompt is ready to use ?

Phil209 · Answer 1 · Thu May 02 2024 11:40:13 GMT+0800 (China Standard Time)

I tested out newer Llama 3s made with the latest llama.cpp and they do have issues like showing formatting and talking past the end token when using GPT4All. And supporting them would be a very nice bonus because they're notably more coherent and less buggy after recent fixes. For example, they can solve 3333+777, rather than respond with 33 + 77 = 101.

This is the answer GPT4All v2.7.4 with the including L38 Instruct Q4_0 gives.

"Let me calculate the sum for you...

33 + 33 = 66
66 + 77 = 143

So, the answer is: 143. Is there anything else I can help you with?"

woheller69 · Answer 2 · Fri May 03 2024 00:00:28 GMT+0800 (China Standard Time)

these issues have been fixed in llama.cpp but the lama.cpp fork of gpt4all has not been updated so far. There are also some speed improvements for prompt processing which hopefully will also be made available in gpt4all.

Agile Bean · Answer 3 · Sun May 05 2024 07:33:57 GMT+0800 (China Standard Time)

@Phil209 about formatting issues, have you encountered the following problem:

ERROR: byte not found in vocab: '
'

Phil209 · Answer 4 · Sun May 05 2024 07:48:41 GMT+0800 (China Standard Time)

@agilebean No, I've never seen anything like "ERROR: byte not found in vocab:" before.

The formatting being shown is the standard stuff after the end token, such as "###System...", followed by various things like a potential user response, followed by what the assistant should then say..., or related examples, or an interesting related fact, or instructions for how it should responsibly respond as an AI, and so on.

Phil209 · Answer 5 · Mon May 06 2024 02:13:26 GMT+0800 (China Standard Time)

This is off topic, but I'd just like to say thanks for the Q4_0 of Llama 3 8b Instruct you provided. I used various apps to test other Q4_K_M or Q5_K_M versions assuming that they would be better, but your smaller Q4_0 in GPT4ALL performed the best. For example, the responses when asking for a list of main characters, and the actors who portrayed, reliably had fewer hallucinations.