bug: Can't interact with WizardCoder-Python-7B-V1.0 via the API server
alabulei1 opened this issue
Summary
Can't interact with WizardCoder-Python-7B-V1.0 via the API server on my Mac M2.
shasum -a 256 llama-api-server.wasm
e672175a3003038cfb97e29d3a77bc8bbc7e4eec8ec234122436df63fd582a28 llama-api-server.wasm
shasum -a 256 ggml-metal.metal
f8ac2d9ddc60232f6836a2730828d739a8892c2fa829bdf484c560f5f7fba655 ggml-metal.metal
shasum -a 256 libwasmedgePluginWasiNN.dylib
8fb6908818d9daf88ad3aa8e5fdf77635a56a157f755a6dcd75f6731e64d3cad libwasmedgePluginWasiNN.dylib
Reproduction steps
- Run the run-llm.sh script
- Choose 24) WizardCoder-Python-7B-V1.0
- Choose run with API server
- Open localhost:8080 and ask the question "Write a hello world program in Rust"; the following error appears:
Starting llama-api-server ...
+ '[' -n '' ']'
+ wasmedge --dir .:. --nn-preload default:GGML:AUTO:WizardCoder-Python-7B-V1.0-ggml-model-q4_0.gguf llama-api-server.wasm -p wizard-coder -m WizardCoder-Python-7B-V1.0
[INFO] Socket address: 0.0.0.0:8080
[INFO] Model name: WizardCoder-Python-7B-V1.0
[INFO] Model alias: default
[INFO] Prompt context size: 4096
[INFO] Number of tokens to predict: 1024
[INFO] Number of layers to run on the GPU: 100
[INFO] Batch size for prompt processing: 4096
[INFO] Prompt template: WizardCoder
[INFO] Log prompts: false
[INFO] Log statistics: false
[INFO] Log all information: false
[INFO] Starting server ...
[INFO] Listening on http://0.0.0.0:8080
GGML_ASSERT: /Users/hydai/workspace/WasmEdge/plugins/wasi_nn/thirdparty/ggml/ggml-metal.m:1459: false
/dev/fd/11: line 356: 921 Abort trap: 6 wasmedge --dir .:. --nn-preload default:GGML:AUTO:$model_file llama-api-server.wasm -p $prompt_template -m "${model}"
+ set +x
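To rule out the web UI as a factor, the same request can be sent to the server directly. The `/v1/chat/completions` route and payload shape below are assumptions based on llama-api-server advertising an OpenAI-compatible API; the local JSON sanity check runs either way:

```shell
# Hypothetical direct reproduction of the failing request, bypassing the
# localhost:8080 web UI. The /v1/chat/completions route is an assumption
# (llama-api-server advertises an OpenAI-compatible API).
REQUEST='{"model":"WizardCoder-Python-7B-V1.0","messages":[{"role":"user","content":"Write a hello world program in Rust"}]}'

# Sanity-check the payload locally before sending it.
echo "$REQUEST" | python3 -m json.tool > /dev/null && echo "payload ok"

# Sending it is what triggers the GGML_ASSERT crash on this machine:
# curl -s http://localhost:8080/v1/chat/completions \
#   -H 'Content-Type: application/json' \
#   -d "$REQUEST"
```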
Screenshots
No response
Any logs you want to share for showing the specific issue
No response
Model Information
WizardCoder-Python-7B-V1.0
Operating system information
macOS 14.1.1 (23B81)
ARCH
M2
CPU Information
M2
Memory Size
16GB
GPU Information
M2
VRAM Size
I don't know