intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.

Is there a way to run ollama with IPEX-LLM on CPU

reeingal opened this issue · comments

I want to run ollama with IPEX-LLM on a machine with 4 Intel Xeon CPU E7-4830 v3 processors and 256GB of memory. The operating system is Ubuntu 24.04. I followed the steps in the official tutorial as follows (the full command sequence is also collected after the list):

  1. Install Intel® oneAPI Base Toolkit and HPC Toolkit:
    apt-get install intel-basekit
    apt-get install intel-hpckit
    
  2. Install ipex-llm[cpp]:
    pip install --pre --upgrade ipex-llm[cpp]
    
  3. Execute init-ollama:
    init-ollama
    
  4. Run ollama:
    source /opt/intel/oneapi/setvars.sh
    ollama serve
    
  5. Pull the model:
    ollama pull qwen:7b-chat
    
  6. Chat with ollama through open-webui.
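
For reference, here is the same sequence collected into one shell session. This is only a sketch under the tutorial's assumptions (default oneAPI install path /opt/intel/oneapi, Intel oneAPI apt repository already configured):

    # 1. Intel oneAPI Base and HPC toolkits
    sudo apt-get install -y intel-basekit intel-hpckit

    # 2. ipex-llm with the cpp extra; the quotes keep the shell from expanding [cpp]
    pip install --pre --upgrade "ipex-llm[cpp]"

    # 3. Set up the ollama launcher
    init-ollama

    # 4. Load the oneAPI environment and start the server
    source /opt/intel/oneapi/setvars.sh
    ollama serve

    # 5. In a second shell: pull the model (open-webui then talks to this server)
    ollama pull qwen:7b-chat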

But when I selected the model in open-webui and sent a question, I received a response with error code 500.
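
To check whether the 500 comes from ollama itself rather than from open-webui, the ollama HTTP API can also be queried directly on its default port 11434 (this is the standard ollama API, nothing IPEX-LLM specific); the same error should show up there:

    # Ask the running ollama server for a completion, bypassing open-webui
    curl http://localhost:11434/api/generate \
      -d '{"model": "qwen:7b-chat", "prompt": "Hello", "stream": false}'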

I checked the console, and the last output was as follows:

[SYCL] call ggml_init_sycl
ggml_init_sycl: GGML_SYCL_DEBUG: 0
ggml_init_sycl: GGML_SYCL_F16: no
found 2 SYCL devices:
|  |                  |                                             |       |Max compute|Max work|Max sub|               |                                  |
|ID|       Device Type|                                         Name|Version|units      |group   |group  |Global mem size|                    Driver version|
|--|------------------|---------------------------------------------|-------|-----------|--------|-------|---------------|----------------------------------|
| 0|    [opencl:cpu:0]|    Intel(R) Xeon(R) CPU E7-4830 v3 @ 2.10GHz|    3.0|         96|    8192|     64|        270317M|2024.17.5.0.08_160000.xmain-hotfix|
| 1|    [opencl:acc:0]|                  Intel FPGA Emulation Device|    1.2|         96|67108864|     64|        270317M|2024.17.5.0.08_160000.xmain-hotfix|
ggml_backend_sycl_set_mul_device_mode: true
llama_model_load: error loading model: DeviceList is empty. -30 (PI_ERROR_INVALID_VALUE)
llama_load_model_from_file: exception loading model
terminate called after throwing an instance of 'sycl::_V1::invalid_parameter_error'
  what(): DeviceList is empty. -30 (PI_ERROR_INVALID_VALUE)
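
Presumably the "DeviceList is empty" failure simply means the SYCL backend found no usable GPU: the table above lists only an OpenCL CPU device and the FPGA emulation device. The sycl-ls tool that ships with the oneAPI Base Toolkit reports the same information and is a quick way to confirm what the runtime can see:

    # List every SYCL platform/device visible to the oneAPI runtime
    source /opt/intel/oneapi/setvars.sh
    sycl-ls
    # On this machine only [opencl:cpu:*] and [opencl:acc:*] entries appear,
    # i.e. there is no Level Zero or OpenCL GPU device for the backend to use.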

I couldn't find a way to run ollama with IPEX-LLM on a CPU anywhere in the official documentation. I hope someone can point out what I'm doing wrong.

Hi @reeingal, ollama with IPEX-LLM does not support running on a CPU-only platform, as we haven't optimized ollama for CPU. You may switch to a supported Intel GPU to enable the IPEX-LLM optimizations.
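
For comparison, a rough sketch of the GPU path on a supported machine (Arc, Flex, Max, or a recent iGPU), reusing the setup steps from the question; ONEAPI_DEVICE_SELECTOR is a standard oneAPI runtime variable rather than an IPEX-LLM-specific switch:

    source /opt/intel/oneapi/setvars.sh

    # sycl-ls should now show a [level_zero:gpu:*] (or [opencl:gpu:*]) entry
    sycl-ls

    # Optionally pin SYCL to the first Level Zero device before starting the server
    export ONEAPI_DEVICE_SELECTOR=level_zero:0

    ollama serve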