stablelm-2-zephyr-1_6b-Q8_0.gguf does not work

Question

batrlatom opened this issue 6 months ago · comments

Hello,

I've been working on getting the stablelm-2-zephyr-1_6b-Q8_0.gguf operational (link: https://huggingface.co/spaces/stabilityai/stablelm-2-1_6b-zephyr), especially since the 3B version seems to function quite well. However, I'm encountering an issue with the 1.6B version where it fails to initialize the context. Currently, I'm using the latest version of your master branch to compile the library. Is there a straightforward modification I can make on my end to resolve this?

from logs:

01-29 22:50:05.365 3017 20732 E RNLLAMA_LOG_ANDROID: llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 340, got 268

Thank you.

Vali-98 · Answer 1 · Tue Jan 30 2024 09:51:37 GMT+0800 (China Standard Time)

I quickly tested this: it seems to be non-functional on older versions of llama.cpp, but works on the latest. I suppose you could try pulling the latest llama.cpp files to fix this. Otherwise, @jhen0409 will have to bump the llama.cpp version.

batrlatom · Answer 2 · Tue Jan 30 2024 17:09:06 GMT+0800 (China Standard Time)

I will try... thanks

batrlatom · Answer 3 · Tue Jan 30 2024 17:54:12 GMT+0800 (China Standard Time)

@Vali-98 I tried updating the submodule to the latest version, but the problem persists. You can take a look at:
https://github.com/batrlatom/llama.rn
For reference, I installed it via git using:

npm install git+https://github.com/batrlatom/llama.rn

Vali-98 · Answer 4 · Tue Jan 30 2024 20:24:49 GMT+0800 (China Standard Time)

Supposedly support was added in this commit:
ggerganov/llama.cpp@d6bd4d4

I checked the npm install command that you posted, and it seems to pull the wrong version of llama.cpp. These lines are missing:

// optional bias tensors, present in Stable LM 2 1.6B
layer.bq = ml.create_tensor(ctx_layer, tn(LLM_TENSOR_ATTN_Q,   "bias", i), {n_embd},     false);
layer.bk = ml.create_tensor(ctx_layer, tn(LLM_TENSOR_ATTN_K,   "bias", i), {n_embd_gqa}, false);
layer.bv = ml.create_tensor(ctx_layer, tn(LLM_TENSOR_ATTN_V,   "bias", i), {n_embd_gqa}, false);

Something seems to have gone wrong with your compilation. The lines are present in your repo, but they are missing from the copy you get when using npm install.
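As a sanity check, the counts in the error log line up with exactly these three tensors being skipped in every layer. The sketch below works through the arithmetic. It assumes that "expected" is the number of tensors stored in the GGUF file and "got" is the number the outdated loader created, and that the model has 24 layers (my recollection of the published config, not something stated in this thread). The function names are hypothetical and are not part of llama.cpp.

```cpp
// Sketch: relate the tensor-count gap from the log to the missing bias tensors.
// Counts are taken from the error message; the per-layer figure of 3 comes from
// the bq/bk/bv lines quoted above. All names here are illustrative only.

// Tensors present in the file but not created by the loader.
constexpr int missing_tensors(int expected, int got) {
    return expected - got;
}

// Layer count implied if every layer is short by `per_layer` tensors.
// Returns -1 when the gap does not divide evenly.
constexpr int implied_layers(int gap, int per_layer) {
    return (per_layer > 0 && gap % per_layer == 0) ? gap / per_layer : -1;
}

constexpr int kExpected = 340;     // "expected 340" in the log
constexpr int kGot = 268;          // "got 268" in the log
constexpr int kBiasPerLayer = 3;   // bq, bk, bv

// 340 - 268 = 72 missing tensors.
static_assert(missing_tensors(kExpected, kGot) == 72, "gap should be 72");
// 72 / 3 = 24, consistent with an assumed 24-layer model.
static_assert(implied_layers(missing_tensors(kExpected, kGot), kBiasPerLayer) == 24,
              "gap should correspond to 24 layers");
```

There is no main function here. The checks run at compile time, so the snippet compiling cleanly with any C++11 compiler is the result. If the gap were not a multiple of 3, the missing bias tensors alone would not explain the error.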

batrlatom · Answer 5 · Tue Jan 30 2024 20:45:13 GMT+0800 (China Standard Time)

OK, the important thing is that it works for you, so I know the problem is on my side. Thanks!

batrlatom · Answer 6 · Wed Jan 31 2024 14:01:23 GMT+0800 (China Standard Time)

Solved: I needed to run bash scripts/bootstrap.sh to update the needed files.