rustformers / llm

[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models

Home Page: https://docs.rs/llm/latest/llm/

Does it support the new GGMLv3 quantization methods?

Exotik850 opened this issue

I tried the CLI application to see how far it had come since it was llama-rs, and got an error when loading one of the newer WizardLM uncensored models that uses the GGMLv3 format:

llm llama chat --model-path .\Wizard-Vicuna-7B-Uncensored.ggmlv3.q5_1.bin
⣾ Loading model...Error:
   0: Could not load model
   1: invalid file format version 3

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Am I using it the wrong way or is it not supported yet?

Hi there! Yes, it's supported, but only on the latest version (main) - we haven't cut a new release yet. Hope to have that sorted soon!
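For anyone hitting the same error and wanting to confirm which container version a file actually uses, here is a minimal sketch that reads the magic number and version from the start of a model file. It is not code from the llm crate: the magic constants are assumed from the llama.cpp GGML-era formats (little-endian `u32` magic, followed by a `u32` version for the versioned containers), and `describe_header` and `read_u32_le` are hypothetical helper names. A "GGMLv3" file would be expected to report `ggjt` with version 3.

```rust
use std::io::{self, Read};

// Assumed magic numbers, as used by llama.cpp for GGML-era model files.
// These are not taken from the llm crate itself.
const MAGIC_GGML: u32 = 0x6767_6d6c; // "ggml": unversioned container
const MAGIC_GGMF: u32 = 0x6767_6d66; // "ggmf": versioned container
const MAGIC_GGJT: u32 = 0x6767_6a74; // "ggjt": versioned, mmap-friendly container

/// Reads one little-endian u32 from the reader.
fn read_u32_le(reader: &mut impl Read) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    reader.read_exact(&mut buf)?;
    Ok(u32::from_le_bytes(buf))
}

/// Hypothetical helper: returns the container name and, for versioned
/// containers, the format version stored right after the magic number.
fn describe_header(reader: &mut impl Read) -> io::Result<(&'static str, Option<u32>)> {
    let magic = read_u32_le(reader)?;
    match magic {
        MAGIC_GGML => Ok(("ggml", None)),
        MAGIC_GGMF => Ok(("ggmf", Some(read_u32_le(reader)?))),
        MAGIC_GGJT => Ok(("ggjt", Some(read_u32_le(reader)?))),
        other => Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("unknown magic {other:#010x}"),
        )),
    }
}
```

To use it, open the model with `std::fs::File::open` and pass `&mut file` to `describe_header`. If the reported version is newer than what the installed release understands, that would be consistent with the "invalid file format version" error above.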

My apologies, should've tried the main branch instead of just trying the release 😅

No worries - I'll keep this up for now and pin it for people's reference until we get it out the door :)

@philpax have you considered making some 0.2.0-beta.1 etc. releases on crates.io? This pattern has worked very well for some of my own projects in the past.

Hi there! Yeah, I've considered it, but the main blocker is #221 - I don't want to cut a release where the interface is going to be radically different in the next release. I'm hoping to have this all closed out within the next week or two, especially with GGUF on the horizon, but I've been quite busy.