WasmEdge / WasmEdge

WasmEdge is a lightweight, high-performance, and extensible WebAssembly runtime for cloud native, edge, and decentralized applications. It powers serverless apps, embedded functions, microservices, smart contracts, and IoT devices.

Home Page: https://WasmEdge.org

bug: `[WASI-NN] RPC client is not implemented for LoadByNameWithConfig`

AkihiroSuda opened this issue

Summary

WASI-NN RPC currently fails with the error `[WASI-NN] RPC client is not implemented for LoadByNameWithConfig`.

Something seems to have changed since:

Current State

Seems currently broken:

$  /opt/wasmedge/bin/wasmedge --nn-rpc-uri unix:///tmp/wasi_nn_rpc.sock ~/gopath/src/github.com/second-state/WasmEdge-WASINN-examples/wasmedge-ggml/llama/wasmedge-ggml-llama.wasm default
[2024-03-17 20:21:05.477] [error] [WASI-NN] RPC client is not implemented for LoadByNameWithConfig
thread 'main' panicked at 'Failed to build graph: BackendError(UnsupportedOperation)', src/main.rs:83:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[2024-03-17 20:21:05.477] [error] execution failed: unreachable, Code: 0x40a
[2024-03-17 20:21:05.477] [error]     In instruction: unreachable (0x00) , Bytecode offset: 0x000143d7
[2024-03-17 20:21:05.477] [error]     When executing function name: "_start"

Expected State

It should work

Reproduction steps

  1. Build WasmEdge v0.14.0-rc.1 with WASMEDGE_BUILD_WASI_NN_RPC=ON
  2. /opt/wasmedge/bin/wasi_nn_rpcserver --nn-rpc-uri=unix:///tmp/wasi_nn_rpc.sock --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf
  3. /opt/wasmedge/bin/wasmedge --nn-rpc-uri unix:///tmp/wasi_nn_rpc.sock ~/gopath/src/github.com/second-state/WasmEdge-WASINN-examples/wasmedge-ggml/llama/wasmedge-ggml-llama.wasm default
    (with https://github.com/second-state/WasmEdge-WASINN-examples/tree/c828a39783e2bd1389dfc0ec3ff31b56e7fc5b41 )

Any logs you want to share for showing the specific issue

No response

Components

CLI

WasmEdge Version or Commit you used

0.14.0-rc.1

Operating system information

macOS 14.4

Hardware Architecture

x86_64

Compiler flags and options

Homebrew clang version 17.0.6

export LLVM_DIR="/usr/local/opt/llvm/lib/cmake"
export CC="/usr/local/opt/llvm/bin/clang"
export CXX="/usr/local/opt/llvm/bin/clang++"
cmake -S. -B /tmp/build -GNinja \
  -DCMAKE_BUILD_TYPE=Release \
  -DWASMEDGE_PLUGIN_WASI_NN_BACKEND=ggml \
  -DLLAMA_BLAS_VENDOR=Apple \
  -DWASMEDGE_BUILD_TESTS=ON \
  -DWASMEDGE_BUILD_WASI_NN_RPC=ON
cmake --build /tmp/build
cmake --install /tmp/build --prefix=/opt/wasmedge 
mkdir -p /opt/wasmedge/plugin
cp -a /tmp/build/plugins/wasi_nn/libwasmedgePluginWasiNN.dylib /opt/wasmedge/plugin

It is weird that even e78fedf (Jan 9, [WASI-NN] Support RPC mode) with second-state/WasmEdge-WASINN-examples@c1d4dba (Jan 9) doesn't work, although I'm sure it was working fine when I submitted PR #3128.

second-state/WasmEdge-WASINN-examples@f3b00f6 (Nov 10) seems to be the last version that works; second-state/WasmEdge-WASINN-examples@476417c (Nov 10) does not.

Hi @AkihiroSuda

We extended the WASI-NN spec with a new function called LoadByNameWithConfig and made it the default in our examples.
This function exists to set the GPU configuration (e.g., the number of tensor layers to offload to the GPU) when loading the model. The original spec has no way to do this.

So, we now have two ways to fix this issue:

  1. Provide a new example that uses only the original WASI-NN spec, or fall back to LoadByName when the NNRPC client tries to call LoadByNameWithConfig, dropping all configuration (GPU settings and anything else that must be set when loading the model).
  2. Implement the LoadByNameWithConfig function in the NNRPC component.
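
For reference, the fallback half of option 1 could look roughly like the sketch below. It is only an illustration under stated assumptions: `NnBackend`, `RpcClientStub`, `Graph`, `LoadError`, and `load_with_fallback` are hypothetical names made up for this example, not the WasmEdge plugin or the `wasmedge-wasi-nn` crate API, and the config is treated as an opaque string.

```rust
/// Hypothetical error type, loosely mirroring the `UnsupportedOperation`
/// backend error that appears in the log in this issue.
#[derive(Debug, PartialEq)]
pub enum LoadError {
    UnsupportedOperation,
    NotFound(String),
}

/// Hypothetical handle for a loaded graph.
#[derive(Debug, PartialEq)]
pub struct Graph {
    pub name: String,
    /// Configuration that was actually applied; `None` if it was dropped.
    pub config: Option<String>,
}

/// Hypothetical abstraction over a WASI-NN backend (local plugin or RPC client).
pub trait NnBackend {
    fn load_by_name(&self, name: &str) -> Result<Graph, LoadError>;
    fn load_by_name_with_config(&self, name: &str, config: &str) -> Result<Graph, LoadError>;
}

/// Stand-in for an RPC client that only implements the original spec.
pub struct RpcClientStub {
    /// Model names preloaded on the (imaginary) server side.
    pub preloaded: Vec<String>,
}

impl NnBackend for RpcClientStub {
    fn load_by_name(&self, name: &str) -> Result<Graph, LoadError> {
        if self.preloaded.iter().any(|n| n == name) {
            Ok(Graph { name: name.to_string(), config: None })
        } else {
            Err(LoadError::NotFound(name.to_string()))
        }
    }

    fn load_by_name_with_config(&self, _name: &str, _config: &str) -> Result<Graph, LoadError> {
        // Mirrors the behavior reported in this issue: not implemented over RPC.
        Err(LoadError::UnsupportedOperation)
    }
}

/// Try the extended call first; if the backend reports that it is unsupported,
/// retry with the original call and silently drop the configuration.
pub fn load_with_fallback<B: NnBackend>(
    backend: &B,
    name: &str,
    config: &str,
) -> Result<Graph, LoadError> {
    match backend.load_by_name_with_config(name, config) {
        Err(LoadError::UnsupportedOperation) => backend.load_by_name(name),
        other => other,
    }
}
```

The trade-off is the one described in option 1: the model loads, but any GPU-related settings in the config are ignored, so the caller cannot tell from the result alone whether offloading was configured.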

cc @dm4, WDYT?

cc @q82419, let's hold the 0.14.0 release until we can reach a consensus on the NNRPC.

commented

I believe we should implement the LoadByNameWithConfig function in the NNRPC component. However, before proceeding with its implementation, I can provide a new example that does not utilize the LoadByNameWithConfig functionality.

> I believe we should implement the LoadByNameWithConfig function in the NNRPC component. However, before proceeding with its implementation, I can provide a new example that does not utilize the LoadByNameWithConfig functionality.

Let's do this first. Let's ensure the previous behavior works.

commented

I have added a new example, wasmedge-ggml/nnrpc, which avoids using load_by_name_with_config to ensure the current version of wasi_nn_rpcserver works properly.

Additionally, in #3296, I implemented the LoadByNameWithConfig function in the wasi_nn_rpcserver tool and added the corresponding tests.

commented

PR #3296 is merged. This issue should be fixed in the next release.

Thank you @dm4 !

@q82419
Let's have rc.3 to include this NNRPC bugfix.