rustformers / llm

[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models

Home Page: https://docs.rs/llm/latest/llm/


GPT-2 load errors

pabl-o-ce opened this issue · comments

Hi guys,

Been trying to run GPT-2 on llm:

command:

llm gpt2 infer -m ~/v1/ai-models/WizardCoder-15B-1.0.ggmlv3.q4_0.bin -p "Write a Javascript function findMissingNumber(arr: number[]) that accepts an array of integers and returns the missing number."

It won't work; I get this error using WizardCoder-15B-1.0.ggmlv3.q4_0.bin:

Error:
   0: Could not load model
   1: unsupported f16_: 2002

and for the other model wizardcoder-guanaco-15b-v1.0.ggmlv1.q4_0.bin:

llm gpt2 infer -m ~/v1/ai-models/wizardcoder-guanaco-15b-v1.0.ggmlv1.q4_0.bin -p "Write a Javascript function findMissingNumber(arr: number[]) that accepts an array of integers and returns the missing number."
⣾ Loading model...Error:
   0: Could not load model
   1: multipart models are not supported

Any ideas what it could be?

Thanks for reporting this here! Something I just noticed, though - how old is your version of llm? The current version of llm uses llm infer -a gpt2 instead of llm gpt2 infer - is it possible that your version predates the changes to support the v3 quantization format?
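For context, one quick way to check which GGML container and quantization version a model file uses is to inspect its header. A minimal sketch (magic values and the ftype encoding are from the ggml/llama.cpp sources; the function names are hypothetical): in that encoding, an ftype like 2002 means quantization version 2 with base ftype 2 (Q4_0), which an older loader would reject exactly as shown above.

```python
import struct

# Known GGML container magics (from the ggml/llama.cpp sources).
GGML_MAGIC = 0x67676D6C  # 'ggml' -- oldest format, unversioned
GGMF_MAGIC = 0x67676D66  # 'ggmf' -- versioned
GGJT_MAGIC = 0x67676A74  # 'ggjt' -- versioned, mmap-friendly (ggmlv3 files use this)

QNT_VERSION_FACTOR = 1000  # the quantization version is folded into ftype


def describe_header(data: bytes) -> dict:
    """Decode the container magic (and version, if present) from a GGML file's first bytes."""
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic == GGML_MAGIC:
        return {"container": "ggml", "version": None}
    if magic in (GGMF_MAGIC, GGJT_MAGIC):
        (version,) = struct.unpack_from("<I", data, 4)
        name = "ggmf" if magic == GGMF_MAGIC else "ggjt"
        return {"container": name, "version": version}
    raise ValueError(f"not a GGML file (magic = {magic:#x})")


def split_ftype(ftype: int) -> tuple:
    """Split an encoded ftype into (quantization version, base ftype)."""
    return ftype // QNT_VERSION_FACTOR, ftype % QNT_VERSION_FACTOR
```

For example, reading the first 8 bytes of a ggmlv3 file should report container "ggjt", version 3, and splitting the ftype 2002 from the error above yields quantization version 2 and base ftype 2 (Q4_0).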

As for the wizardcoder-guanaco issue, that error implies you have another file with a very similar filename, which makes llm think it's a multipart model. I made that check stricter some time ago, so it shouldn't produce a false positive like it does here. I'd be interested in seeing if that issue still occurs when you update your llm. (It probably won't work, though - v1 is not supported.)

If you still have issues after this, I'll download the models and investigate soon-ish!

I looked at the releases; it seems it's on version 0.1.1?

What could be wrong here?
[Screenshot: Screen Shot 2023-08-06 at 6 28 28 PM]

Maybe install from source?

If you want to use v3 models, you have to install from source, as llm v0.2 isn't released yet. Then you can just use the commands listed in the README to perform inference.
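Installing from source could look something like the following (a sketch assuming a standard cargo workflow; the exact install command and model path are illustrative, so check the project README for the current instructions):

```shell
# Build and install the CLI from the repository's main branch,
# which contains the unreleased v0.2 code with ggmlv3 support.
cargo install --git https://github.com/rustformers/llm llm-cli

# Then invoke inference with the new syntax: the architecture is
# selected with -a rather than a subcommand.
llm infer -a gpt2 \
  -m ~/v1/ai-models/WizardCoder-15B-1.0.ggmlv3.q4_0.bin \
  -p "Write a Javascript function that returns the missing number."
```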

Hey, have you had a chance to try it with the latest llm?

Hi guys, I'm going to close this issue. All my attention is going to TheBloke/WizardCoder-Python-34B-V1.0-GGML (or GPTQ), which is based on Code Llama 2.

I tested it, and it's really good.