rustformers / llm

[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models

Home Page: https://docs.rs/llm/latest/llm/


GPT-2 load errors

pabl-o-ce opened this issue · comments

Hi guys,

Been trying to run GPT-2 on llm:

command:

llm gpt2 infer -m ~/v1/ai-models/WizardCoder-15B-1.0.ggmlv3.q4_0.bin -p "Write a Javascript function findMissingNumber(arr: number[]) that accepts an array of integers and returns the missing number."

It won't work; I get this error using WizardCoder-15B-1.0.ggmlv3.q4_0.bin:

Error:
   0: Could not load model
   1: unsupported f16_: 2002

and for the other model wizardcoder-guanaco-15b-v1.0.ggmlv1.q4_0.bin:

llm gpt2 infer -m ~/v1/ai-models/wizardcoder-guanaco-15b-v1.0.ggmlv1.q4_0.bin -p "Write a Javascript function findMissingNumber(arr: number[]) that accepts an array of integers and returns the missing number."
⣾ Loading model...Error:
   0: Could not load model
   1: multipart models are not supported

Any ideas what it could be?

Thanks for reporting this here! Something I just noticed, though - how old is your version of llm? The current version of llm uses llm infer -a gpt2 instead of llm gpt2 infer - is it possible that your version predates the changes to support the v3 quantization format?
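For context, one quick way to check which GGML container and quantization version a model file uses is to inspect its header. A minimal sketch (magic values and the ftype encoding are from the ggml/llama.cpp sources; the function names are hypothetical): in that encoding, an ftype like 2002 means quantization version 2 with base ftype 2 (Q4_0), which an older loader would reject exactly as shown above.

```python
import struct

# Known GGML container magics (from the ggml/llama.cpp sources).
GGML_MAGIC = 0x67676D6C  # 'ggml' -- oldest format, unversioned
GGMF_MAGIC = 0x67676D66  # 'ggmf' -- versioned
GGJT_MAGIC = 0x67676A74  # 'ggjt' -- versioned, mmap-friendly (ggmlv3 files use this)

QNT_VERSION_FACTOR = 1000  # the quantization version is folded into ftype


def describe_header(data: bytes) -> dict:
    """Decode the container magic (and version, if present) from a GGML file's first bytes."""
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic == GGML_MAGIC:
        return {"container": "ggml", "version": None}
    if magic in (GGMF_MAGIC, GGJT_MAGIC):
        (version,) = struct.unpack_from("<I", data, 4)
        name = "ggmf" if magic == GGMF_MAGIC else "ggjt"
        return {"container": name, "version": version}
    raise ValueError(f"not a GGML file (magic = {magic:#x})")


def split_ftype(ftype: int) -> tuple:
    """Split an encoded ftype into (quantization version, base ftype)."""
    return ftype // QNT_VERSION_FACTOR, ftype % QNT_VERSION_FACTOR
```

For example, reading the first 8 bytes of a ggmlv3 file should report container "ggjt", version 3, and splitting the ftype 2002 from the error above yields quantization version 2 and base ftype 2 (Q4_0).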

As for the wizardcoder-guanaco issue, that error implies you have another file with a very similar filename, which makes llm think it's a multipart model. I made that check stricter some time ago, so it shouldn't produce a false positive like it does here. I'd be interested in seeing if that issue still occurs when you update your llm. (It probably won't work, though - v1 is not supported.)

If you still have issues after this, I'll download the models and investigate soon-ish!

I looked at the releases; it seems it's on version 0.1.1?

What could be wrong here?
[Screenshot: Screen Shot 2023-08-06 at 6 28 28 PM]

Maybe install from source?

If you want to use v3 models, you have to install from source, as llm v0.2 isn't released yet. Then you can just use the commands listed in the README to perform inference.
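Installing from source could look something like the following (a sketch assuming a standard cargo workflow; the exact install command and model path are illustrative, so check the project README for the current instructions):

```shell
# Build and install the CLI from the repository's main branch,
# which contains the unreleased v0.2 code with ggmlv3 support.
cargo install --git https://github.com/rustformers/llm llm-cli

# Then invoke inference with the new syntax: the architecture is
# selected with -a rather than a subcommand.
llm infer -a gpt2 \
  -m ~/v1/ai-models/WizardCoder-15B-1.0.ggmlv3.q4_0.bin \
  -p "Write a Javascript function that returns the missing number."
```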

Hey, have you had a chance to try it with the latest llm?

Hi guys, I'm going to close this issue. All my attention is going to TheBloke/WizardCoder-Python-34B-V1.0-GGML (or GPTQ), which is based on Code Llama 2.

I tested it, and it's really good.