intel / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.

Repository from GitHub: https://github.com/intel/ipex-llm

llama_server.exe built with llama.cpp d7cfe1f crashes when using ipex-llm to improve performance.

cjsdurj opened this issue

Environment

OS: Windows 11; CPU: Intel Core Ultra 7 155H; Compiler: Intel(R) oneAPI DPC++/C++ Compiler 2025.0.4 (2025.0.4.20241205)

Steps to reproduce

  1. Clone llama.cpp, check out d7cfe1f, and build llama-server.exe with the DPC++ Compiler.
  2. Copy the *.dll files from the ipex-llm package into llama.cpp/build/bin, overwriting the original DLLs (see the sketch after this list).
  3. Start llama-server; it crashes while loading the model.
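For reference, step 2 can be scripted. Below is a minimal sketch, assuming the ipex-llm DLLs sit in a local `C:\ipex-llm\bin` directory and the llama.cpp build output is in `C:\llama.cpp\build\bin`; both paths are placeholders for illustration and should be adjusted to your setup.

```python
# Sketch: overwrite the freshly built llama.cpp DLLs with the ones shipped by ipex-llm.
# Both directory paths are assumptions for illustration; adjust them to your machine.
import shutil
from pathlib import Path

ipex_llm_bin = Path(r"C:\ipex-llm\bin")          # assumed location of the ipex-llm *.dll files
llama_cpp_bin = Path(r"C:\llama.cpp\build\bin")  # assumed llama.cpp build output directory

for dll in ipex_llm_bin.glob("*.dll"):
    target = llama_cpp_bin / dll.name
    shutil.copy2(dll, target)                    # overwrites any original DLL of the same name
    print(f"copied {dll.name} -> {target}")
```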

Hi, we provide llama_server.exe in our nightly package; you can use it directly by following our guide (https://github.com/intel/ipex-llm/blob/main/docs/mddocs/Quickstart/llama_cpp_quickstart.md).

In my use case, I have added an OpenAI-style video & image chat API (using VL models) and some other code to llama_server, so building from source and replacing ggml.dll & llama.dll with the DLLs from the ipex-llm package is the workflow I need.
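To make the use case concrete, here is a minimal sketch of what an OpenAI-style image chat request against such a local server might look like. The endpoint path, port, model name, and exact payload shape are assumptions based on the common `/v1/chat/completions` convention; the API the issue author actually added may differ.

```python
# Sketch of an OpenAI-style image chat request against a local llama-server.
# The endpoint path, port, model name, and payload shape are assumptions for
# illustration only; the custom API described in this issue may differ.
import base64
import requests

with open("example.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "local-vl-model",  # placeholder model name
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
}

resp = requests.post("http://localhost:8080/v1/chat/completions", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```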