mlc-ai / web-llm

High-performance In-browser LLM Inference Engine

Home Page: https://webllm.mlc.ai


[model] StableLM 2 Zephyr 1.6b

flatsiedatsie opened this issue

I stumbled upon this two-week-old discussion about the StableLM 2 Zephyr 1.6b model becoming available for web-llm soon:

https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b/discussions/9

I'd really love to work with that model, as my testing so far has shown it to work surprisingly well for its size.

Is there any way to use this model already?
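For reference, this is roughly the custom appConfig route I tried. The repo and wasm URLs below are my guesses rather than confirmed artifacts, and the exact field names and method signatures may differ between web-llm versions (see src/config.ts):

```ts
import { ChatModule } from "@mlc-ai/web-llm";

// Guessed locations: neither URL is a confirmed artifact yet.
const appConfig = {
  model_list: [
    {
      // Quantized weight shards on Hugging Face (assumed repo name).
      model_url:
        "https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1/resolve/main/",
      local_id: "stablelm-2-zephyr-1_6b-q4f16_1",
      // Compiled WebGPU kernel library (assumed filename in binary-mlc-llm-libs).
      model_lib_url:
        "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/main/stablelm-2-zephyr-1_6b-q4f16_1-webgpu.wasm",
    },
  ],
};

// Load the custom model, then run a quick smoke test.
const chat = new ChatModule();
await chat.reload("stablelm-2-zephyr-1_6b-q4f16_1", undefined, appConfig);
console.log(await chat.generate("Hello!"));
```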

However, I got stuck on this error:

[FATAL] /workspace/mlc-llm/3rdparty/tvm/include/tvm/runtime/packed_func.h:1908: Function tvmjs.array.decode_storage(0: runtime.NDArray, 1: basic_string<char>, 2: basic_string<char>, 3: basic_string<char>) -> void expects 4 arguments, but 3 were provided.
put_char @ web-llm.bundle.mjs:3421

This is likely due to an old version of the web-llm npm package (if you are not building from source). If you are building from source, the repo is likely not up to date; try pulling the recent changes.

It would be fantastic if this model could become part of the default supported models.

Its multilingual ability is excellent; I'm very impressed, especially given the model's size.

Awesome, it seems the model is already available in the Hugging Face repo. The weight shards exist:

https://huggingface.co/mlc-ai

However, the .wasm files are missing from binary-mlc-llm-libs. I've created an issue about that:

mlc-ai/binary-mlc-llm-libs#111

Thanks for the request! We should be able to add the prebuilt wasm files shortly. cc @YiyanZhai

Fantastic! Thank you!

For the record, I think there are more models for which the shards are available, but the wasm files are not (yet).

  • Music
  • WizardMath
  • Gorilla
  • Gemma 7B
  • CodeLlama
  • OpenHermes

Thanks for the list! WizardMath and OpenHermes can reuse the wasm of Mistral (as shown in prebuiltAppConfig in src/config.ts), and CodeLlama should be able to reuse that of Llama-2, as long as the models share the same quantization (e.g. q4f16_1) and parameter count (e.g. 7B or 13B).
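If it helps anyone else, here is a sketch of what such a reuse entry could look like, assuming the model_list shape from src/config.ts at the time. The URLs are illustrative placeholders, not confirmed paths:

```ts
// Sketch: WizardMath weight shards paired with the existing Mistral-7B wasm.
// This reuse works only because both are 7B models with the same q4f16_1
// quantization. All URLs are placeholders; the real ones are in
// prebuiltAppConfig (src/config.ts).
const wizardMathEntry = {
  model_url:
    "https://huggingface.co/mlc-ai/WizardMath-7B-V1.1-q4f16_1/resolve/main/",
  local_id: "WizardMath-7B-V1.1-q4f16_1",
  // Reused Mistral library: same architecture, quantization, and parameter count.
  model_lib_url:
    "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/main/Mistral-7B-Instruct-v0.2-q4f16_1-webgpu.wasm",
};
```

The same pattern would presumably apply to OpenHermes (on Mistral's wasm) and CodeLlama (on Llama-2's), as long as the quantization and parameter count line up.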