Is there a way to run ollama with IPEX-LLM on CPU
reeingal opened this issue
I want to run ollama with IPEX-LLM on a machine with 4 Intel Xeon CPU E7-4830 v3 processors and 256GB of memory. The operating system is Ubuntu 24.04. I followed the steps in the official tutorial as follows:
- Install Intel® oneAPI Base Toolkit:
apt-get install intel-basekit
apt-get install intel-hpckit
- Install ipex-llm-cpp:
pip install --pre --upgrade ipex-llm[cpp]
- Execute init-ollama:
init-ollama
- Run ollama:
source /opt/intel/oneapi/setvars.sh
ollama serve
- Pull the model:
ollama pull qwen:7b-chat
- Chat with ollama through open-webui.
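For reference, the steps above can be collected into a single launch script. The paths are the default apt install locations; restricting the runtime to the OpenCL CPU backend via ONEAPI_DEVICE_SELECTOR is my own guess at a workaround, not something from the tutorial:

```shell
#!/bin/bash
# Combined launch sketch for the steps above.
# ONEAPI_DEVICE_SELECTOR is the standard DPC++ runtime variable for
# restricting which SYCL devices are visible; pinning the OpenCL CPU
# backend here is an assumption on my part, not from the IPEX-LLM docs.
source /opt/intel/oneapi/setvars.sh
export ONEAPI_DEVICE_SELECTOR="opencl:cpu"
sycl-ls          # sanity check: the Xeon should appear as an OpenCL CPU device
./ollama serve
```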
But when I selected the model in open-webui and sent a question, I received a response with error code 500.
I checked the console, and the last output was as follows:
[SYCL] call ggml_init_sycl
ggml_init_sycl: GGML_SYCL_DEBUG: 0
ggml_init_sycl: GGML_SYCL_F16: no
found 2 SYCL devices:
| | | | |Max compute|Max work|Max sub| | |
|ID| Device Type| Name|Version|units |group |group |Global mem size| Driver version|
|--|------------------|---------------------------------------------|-------|-----------|--------|-------|---------------|----------------------------------|
| 0| [opencl:cpu:0]| Intel(R) Xeon(R) CPU E7-4830 v3 @ 2.10GHz| 3.0| 96| 8192| 64| 270317M|2024.17.5.0.08_160000.xmain-hotfix|
| 1| [opencl:acc:0]| Intel FPGA Emulation Device| 1.2| 96|67108864| 64| 270317M|2024.17.5.0.08_160000.xmain-hotfix|
ggml_backend_sycl_set_mul_device_mode: true
llama_model_load: error loading model: DeviceList is empty. -30 (PI_ERROR_INVALID_VALUE)
llama_load_model_from_file: exception loading model
terminate called after throwing an instance of 'sycl::_V1::invalid_parameter_error'
what(): DeviceList is empty. -30 (PI_ERROR_INVALID_VALUE)
I couldn't find a method to run ollama with IPEX-LLM on a CPU in the official documentation. I hope someone can point out the problem for me.
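In case it helps with reproducing, the error can presumably be triggered without open-webui by calling the ollama HTTP API directly (endpoint and payload per the ollama API docs; that the server is on the default port 11434 is an assumption about this setup):

```shell
# Minimal reproduction against the ollama API, bypassing open-webui.
# If the backend is the problem, this should return the same 500 error.
curl http://localhost:11434/api/generate -d '{
  "model": "qwen:7b-chat",
  "prompt": "hello",
  "stream": false
}'
```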