LlamaEdge / LlamaEdge

The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge

Home Page: https://llamaedge.com/

Failed to run model on Windows

fankaiLiu opened this issue

Summary

I tried to follow the tutorial here to run the Baichuan model on Windows, but got an error.

Reproduction steps

Step 1: Install WasmEdge via the following command line.
Step 2: Download the GGUF file for the model.
Step 3: Download the cross-platform portable Wasm file for the chat application.
Step 4: Run wasmedge --dir .:. --nn-preload default:GGML:AUTO:Baichuan2-13B-Chat-ggml-model-q4_0.gguf llama-chat.wasm -p baichuan-2 -r 'user:' (steps 1-3 are expanded into concrete commands below).
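For reference, the setup steps above expand to roughly the following commands. The download URLs are assumptions based on the usual LlamaEdge quick-start layout (a second-state Hugging Face mirror for the GGUF, the LlamaEdge GitHub releases for the wasm app); substitute whatever sources your tutorial actually points to.

# Step 1: install WasmEdge together with the wasi_nn-ggml plugin
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml
source ~/.bashrc

# Step 2: download the GGUF model file (URL assumed)
curl -LO https://huggingface.co/second-state/Baichuan2-13B-Chat-GGUF/resolve/main/Baichuan2-13B-Chat-ggml-model-q4_0.gguf

# Step 3: download the portable chat application (URL assumed)
curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-chat.wasm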

Screenshots

D:\baichuan>wasmedge -v
wasmedge version 0.13.5

Running wasmedge --dir .:. --nn-preload default:GGML:AUTO:Baichuan2-13B-Chat-ggml-model-q4_0.gguf llama-chat.wasm -p baichuan-2 -r '用户:' gives:

unknown option: nn-preload

Running wasmedge --dir .:. baichuan2-13b-chat.Q4_0.gguf llama-chat.wasm -p baichuan-2 -r '用户:' gives:

[2024-01-11 11:01:34.566] [error] loading failed: magic header not detected, Code: 0x23
[2024-01-11 11:01:34.568] [error]     Bytecode offset: 0x00000000
[2024-01-11 11:01:34.568] [error]     At AST node: component
[2024-01-11 11:01:34.568] [error]     File name: "D:\\baichuan\\baichuan2-13b-chat.Q4_0.gguf"
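A note on this error: when the --nn-preload option is not recognized, wasmedge treats the first positional argument as the Wasm module to execute, so here it tried to load the .gguf file as WebAssembly. A GGUF file begins with the bytes "GGUF" rather than the Wasm magic header "\0asm", hence "magic header not detected". The difference is visible with a plain hexdump, e.g. xxd in WSL/Linux (on plain Windows, any hex viewer works):

xxd -l 4 baichuan2-13b-chat.Q4_0.gguf   # 4747 5546 -> "GGUF"
xxd -l 4 llama-chat.wasm                # 0061 736d -> "\0asm"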

Any logs you want to share for showing the specific issue

I'm guessing it's because the tutorial is based on macOS. Where else might I find more instructions for running the model on Windows?

Model Information

Baichuan

Operating system information

win10

ARCH

x86_64

CPU Information

i5-4460

Memory Size

16GB

GPU Information

RTX3060

VRAM Size

16GB

Hi @fankaiLiu

You're using Windows, not WSL, right?

You are right. Is WasmEdge only supported under WSL?

WasmEdge can run on Windows. However, the ggml plugin cannot. Please use WSL to execute the AI-related workloads.

I installed WSL with the Ubuntu 20.04.3 Linux distribution. Then I followed the WasmEdge instructions and ran wasmedge --dir .:. baichuan2-13b-chat.Q4_0.gguf llama-chat.wasm -p baichuan-2 -r '用户:'
I got the same error. Are there any other pre-dependencies that need to be installed for the ggml plugin?

INFO    - Downloading WasmEdge

|============================================================| 100.00 %
INFO    - Downloaded

INFO    - Installing WasmEdge

INFO    - WasmEdge Successfully installed

INFO    - Run:

source /root/.bashrc

root@DESKTOP-6608JCO:/home/root1/ai# wasmedge --dir .:. --nn-preload default:GGML:AUTO:Baichuan2-13B-Chat-ggml-model-q4_0.gguf llama-chat.wasm -p baichuan-2 -r '用户:'

unknown option: nn-preload
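A quick way to check whether the wasi_nn-ggml plugin actually got installed is to look in the WasmEdge plugin directories. A minimal sketch; the exact paths depend on the install prefix, and the shared-library name shown is the usual one but may vary between releases:

# a default install puts plugins under ~/.wasmedge; -p /usr/local puts them under /usr/local/lib/wasmedge
ls ~/.wasmedge/plugin/ /usr/local/lib/wasmedge/ 2>/dev/null
# a working ggml setup should show something like libwasmedgePluginWasiNN.so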

Please share your whole command history and logs.

OK!
1: Install WasmEdge

root@DESKTOP-6608JCO:/home/root1/ai#  curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- -p /usr/local
Using Python: /usr/bin/python3
ERROR   - Exception on process - rc= 127 output= b'' command= ['/usr/local/cuda/bin/nvcc --version 2>/dev/null']
INFO    - Compatible with current configuration
INFO    - Running Uninstaller
WARNING - SHELL variable not found. Using bash as SHELL
INFO    - shell configuration updated
INFO    - Downloading WasmEdge
|============================================================| 100.00 %
INFO    - Downloaded
INFO    - Installing WasmEdge
INFO    - WasmEdge Successfully installed
INFO    - Run:
source /root/.bashrc

2: Work paths

root@DESKTOP-6608JCO:/home/root1/ai# pwd
/home/root1/ai
root@DESKTOP-6608JCO:/home/root1/ai# ls -al
total 7802472
drwxr-xr-x 1 root  root        4096 Jan 12 13:13 .
drwxr-x--- 1 root1 root1       4096 Jan 12 12:44 ..
-rw-r--r-- 1 root  root  7987103264 Jan 11 10:35 baichuan2-13b-chat.Q4_0.gguf
-rw-r--r-- 1 root  root     2623282 Jan 11 10:31 llama-chat.wasm

3: System version

root@DESKTOP-6608JCO:/home/root1/ai# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.3 LTS
Release:        22.04
Codename:       jammy

4: Command: wasmedge --dir .:. --nn-preload baichuan2-13b-chat-ggml-model-q4_0.gguf llama-chat.wasm -p baichuan-2 -r '用户:' (full output below):

root@DESKTOP-6608JCO:/home/root1/ai# wasmedge --dir .:. --nn-preload baichuan2-13b-chat-ggml-model-q4_0.gguf llama-chat.wasm -p baichuan-2 -r '用户:'
unknown option: nn-preload

5: Command: wasmedge --dir .:. baichuan2-13b-chat.Q4_0.gguf llama-chat.wasm -p baichuan-2 -r '用户:' (full output below):

root@DESKTOP-6608JCO:/home/root1/ai# wasmedge --dir .:.  baichuan2-13b-chat.Q4_0.gguf llama-chat.wasm -p baichuan-2 -r '用户:'
[2024-01-12 13:09:28.586] [error] loading failed: magic header not detected, Code: 0x23
[2024-01-12 13:09:28.587] [error]     Bytecode offset: 0x00000000
[2024-01-12 13:09:28.587] [error]     At AST node: component
[2024-01-12 13:09:28.588] [error]     File name: "/home/root1/ai/baichuan2-13b-chat.Q4_0.gguf"

There's no obvious hint as to where the detailed logs are generated. If you know, please let me know; I'm happy to provide more detailed error messages!

You didn't install the ggml plugin.
According to the tutorial, you should run the following command:

curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml

However, in your command history, you didn't ask the installer to get the wasi_nn-ggml plugin. That's why the nn-preload option is not found.
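In other words, re-running the installer with the plugin flag and reloading the shell environment should make the option available:

# re-run the installer, this time requesting the ggml plugin
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml
source /root/.bashrc
# after this, the "unknown option: nn-preload" error should be gone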

Thank you very much for your help! Now I'm getting a new error:

[2024-01-12 14:00:07.806] [error] [WASI-NN] Preload Model's Backend or Device is Not Support.
[INFO] Model alias: default
[INFO] Prompt context size: 512
[INFO] Number of tokens to predict: 1024
[INFO] Number of layers to run on the GPU: 100
[INFO] Batch size for prompt processing: 512
[INFO] Temperature for sampling: 0.8
[INFO] Penalize repeat sequence of tokens: 1.1
[INFO] Reverse prompt: 用户:
[INFO] Use default system prompt
[INFO] Prompt template: Baichuan2
[INFO] Log prompts: false
[INFO] Log statistics: false
[INFO] Log all information: false
Error: "Fail to load model into wasi-nn: Backend Error: WASI-NN Backend Error: Not Found"

Which command did you use?
Is it this one from the tutorial:

wasmedge --dir .:. --nn-preload default:GGML:AUTO:Baichuan2-13B-Chat-ggml-model-q4_0.gguf llama-chat.wasm -p baichuan-2 -r '用户:'

or this one if you are using the server mode:

wasmedge --dir .:. --nn-preload default:GGML:AUTO:Baichuan2-13B-Chat-ggml-model-q4_0.gguf llama-api-server.wasm -p baichuan-2 -r '用户:'
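For later readers: the --nn-preload value is a colon-separated tuple of alias, backend, device, and model path (following the WasmEdge WASI-NN plugin convention), so the command breaks down like this:

wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:baichuan2-13b-chat.Q4_0.gguf \
  llama-chat.wasm -p baichuan-2 -r '用户:'
# default     -> model alias the wasm app looks up ("Model alias: default" in the log above)
# GGML        -> wasi_nn backend to use
# AUTO        -> device selection
# final field -> path to the GGUF file; it must match the on-disk name exactly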

I found the problem: file lookups on macOS are case-insensitive, but on Linux they are case-sensitive.
Running wasmedge --dir .:. --nn-preload default:GGML:AUTO:baichuan2-13b-chat.Q4_0.gguf llama-chat.wasm -p baichuan-2 -r '用户:' works!
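One way to sidestep the case-sensitivity trap entirely is to let the shell supply the exact file name. A small sketch, assuming there is exactly one .gguf file in the working directory:

MODEL=$(ls *.gguf | head -n 1)
wasmedge --dir .:. --nn-preload "default:GGML:AUTO:${MODEL}" llama-chat.wasm -p baichuan-2 -r '用户:'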

Thank you very much for your help. If possible, I will summarize how to use WasmEdge on Windows and share it. I have now typed my prompt and am waiting for a reply from the bot~~