intel / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.

Repository from GitHub: https://github.com/intel/ipex-llm

intel/ipex-llm Issues

running qwq-32b-awq with A770 * 2 is extremely slow, only 6t/s
Updated 7 months ago (7 comments)
Met segment fault while running Whisper on Arc
Closed 7 months ago (16 comments)
How to limit GPU usage
Updated 7 months ago (4 comments)
Running `intelanalytics/ipex-llm-inference-cpp-xpu` image with A770 GPU and AMD EPYC CPU
Closed 7 months ago (12 comments)
Is Ollama Portable Zip has open source code?
Updated 7 months ago (3 comments)
MTL NPU Error Output
Closed 7 months ago
support >= 4GB SYCL compute buffer size for longer context length
Updated 7 months ago (4 comments)
Unable to use ollama create to load custom gguf model
Updated 7 months ago (2 comments)
Unable to run gemma3
Closed 7 months ago (3 comments)
Unable to use GLM model
Updated 7 months ago (4 comments)
Ollama stopped working after Intel driver updates
Closed 7 months ago (2 comments)
UHD Graphics 730 run DeepSeek-R1-14B error
Closed 7 months ago (3 comments)
MiniCPM-O cannot use TTS
Updated 7 months ago (5 comments)
unable to run llama.cpp on 2 A770 cards with x99 platform
Updated 7 months ago (1 comment)
"llama runner process has terminated: exit status 2" on Ryzen 5600/Arc A770
Closed 7 months ago (2 comments)
Model loads successfully, but chatting reports an error
Updated 7 months ago (3 comments)
Ollama ipex-llm Code Generation quality issue
Updated 7 months ago (3 comments)
Documentation for running ipex-llm (at least Ollama) is incorrect and no device will be found
Updated 7 months ago (2 comments)
ollama ps shows CPU use though start-ollama.sh indicates that model is loaded into GPU?
Updated 7 months ago (7 comments)
Support for gemma3 from google
Updated 7 months ago (34 comments)
llama_load_model_from_file: failed to load model with SYCL/Level Zero on Intel Arc B580 GPU
Updated 7 months ago (9 comments)
Harness does not work properly
Updated 7 months ago (6 comments)
Run Qwen GGUF by ipex-llm transformers python
Updated 7 months ago
Fail to run gemma3 in the IPEX Ollama released in 3.19
Updated 7 months ago (11 comments)
cpu memory increased same size with GPU when run ollama with ipex_llm? is this expected?
Closed 7 months ago (2 comments)
B580 Unable to run model larger than GPU memory
Updated 7 months ago (1 comment)
vllm failure in intelanalytics/ipex-llm-serving-xpu:2.2.0-b13
Closed 7 months ago (2 comments)
llama.cpp portable gemma3 sample - getting low GPU usage
Updated 7 months ago (1 comment)
Access to Ollama from outside docker with no net:host
Closed 7 months ago (3 comments)
When using the NPU inference model, when the Prompt length exceeds a certain range, the model reports an error OSError
Updated 7 months ago
ipex-llm vllm not support glm-edge-4b-chat model
Closed 7 months ago (2 comments)
llama_server.exe build by llama.cpp d7cfe1f crashed when using ipex-llm to improve performance.
Updated 7 months ago (2 comments)
Failed to register worker to Raylet: IOError: [RayletClient]
Closed 7 months ago (2 comments)
Failed to run Qwen2-vl with ipex-llm + a760
Updated 7 months ago (1 comment)
failed to load Qwen2-VL with ipex-llm[xpu] under a760
Closed 7 months ago (1 comment)
Ollama Portable Zip SIGSEV
Updated 7 months ago (5 comments)
vllm on tensor parallel - RuntimeError: oneCCL: ze_fd_manager.cpp:144 init_device_fds: EXCEPTION: opendir failed: could not open device directory
Closed 7 months ago (2 comments)
Does ipex-llm (Ollama) have any configurable environment variables for reference?
Updated 7 months ago (1 comment)
ipex-llm[xpu] is not compatible with ipex-llm[cpp]
Updated 7 months ago (2 comments)
Arc B580 Has Faster TPS on llama.cpp Vulkan API Than IPEX-LLM Portable llama.cpp Build
Closed 7 months ago (7 comments)
Unable to fully load model into Vram using ollama zip gpu
Updated 8 months ago (5 comments)
RuntimeError: UR error with LlamaIndex
Updated 8 months ago
RuntimeErrror ipex_duplicate_import_error
Updated 8 months ago (1 comment)
Docker Image not updated
Updated 8 months ago (3 comments)
TTFT of the distill qwen model is worser than the base model, is it a expected behavior?
Updated 8 months ago (3 comments)
torch depend on cuda-libs when installed bigdl-core-cpp in linux
Updated 8 months ago (1 comment)
Gemma 3 Context Shift Causes Gibberish Output (llama.cpp IPEX build)
Closed 8 months ago (3 comments)
Unable to run gemma3:27b
Closed 8 months ago (2 comments)
EOF POST predict error
Updated 8 months ago (3 comments)
ipex-llm-2.2.0b20250312 silently crash
Updated 8 months ago (1 comment)