intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.

Error: Failed to load the llama dynamic library. Segmentation fault

eugeooi opened this issue · comments

platform: Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz
os: SUSE 13
model: mistralai/Mistral-7B-Instruct-v0.2
ipex-llm: 2.1.0b20240515
transformers: 4.37.0
ldd: 2.22
gcc/g++: 11.1.0

After loading checkpoint shards reaches 100%, it shows:
Error: Failed to load the llama dynamic library. Segmentation fault

Hi, @eugeooi !

We tried running the Mistral CPU example with the Mistral-7B-Instruct-v0.2 model, but we were not able to reproduce this error.

Could you please provide more information about this issue, e.g. the output of pip list, the full error message (including the traceback), and a minimal reproducible code example?

Please feel free to ask if you have any questions. :)

pip list:

Package                  Version
------------------------ --------------
accelerate               0.21.0
antlr4-python3-runtime   4.9.3
certifi                  2024.6.2
charset-normalizer       3.3.2
filelock                 3.14.0
fsspec                   2024.6.0
huggingface-hub          0.23.2
idna                     3.7
intel-openmp             2024.1.2
ipex-llm                 2.1.0b20240515
Jinja2                   3.1.4
MarkupSafe               2.1.5
mpmath                   1.3.0
networkx                 3.3
numpy                    1.26.4
nvidia-cublas-cu12       12.1.3.1
nvidia-cuda-cupti-cu12   12.1.105
nvidia-cuda-nvrtc-cu12   12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12        8.9.2.26
nvidia-cufft-cu12        11.0.2.54
nvidia-curand-cu12       10.3.2.106
nvidia-cusolver-cu12     11.4.5.107
nvidia-cusparse-cu12     12.1.0.106
nvidia-nccl-cu12         2.20.5
nvidia-nvjitlink-cu12    12.5.40
nvidia-nvtx-cu12         12.1.105
omegaconf                2.3.0
packaging                24.0
pandas                   2.2.2
pip                      22.3.1
protobuf                 5.27.0
psutil                   5.9.8
py-cpuinfo               9.0.0
python-dateutil          2.9.0.post0
pytz                     2024.1
PyYAML                   6.0.1
regex                    2024.5.15
requests                 2.32.3
safetensors              0.4.3
sentencepiece            0.2.0
setuptools               65.5.0
six                      1.16.0
sympy                    1.12.1
tabulate                 0.9.0
tokenizers               0.15.2
torch                    2.3.0
tqdm                     4.66.4
transformers             4.37.0
triton                   2.3.0
typing_extensions        4.12.1
tzdata                   2024.1
urllib3                  2.2.1

When I tried to run the sample script, it showed this error:

python generate.py
/user/ipex_llm/lib/python3.11/site-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
/user/ipex_llm/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Error: Failed to load the llama dynamic library.
Segmentation fault

But when I ran the same script on another Core-based system running Ubuntu 22.04, it worked fine.
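One way to narrow down a "Failed to load the llama dynamic library" error is to dlopen the native library directly with ctypes, which surfaces the loader's real complaint (for example, a missing GLIBC symbol) instead of the generic failure message. A diagnostic sketch; the path below is a placeholder, not the actual install location:

```python
import ctypes

# NOTE: placeholder path -- substitute the native library shipped inside
# your ipex-llm site-packages directory.
lib_path = "/path/to/site-packages/ipex_llm/libs/libllama.so"

try:
    ctypes.CDLL(lib_path)
    msg = "loaded OK"
except OSError as e:
    # dlopen's underlying error, e.g. a "version `GLIBC_2.28' not found"
    # symbol mismatch, which the generic message hides.
    msg = str(e)

print(msg)
```

If the message differs between the SUSE and Ubuntu systems, that points at a loader/glibc mismatch rather than a problem in the Python code.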

Hi @JinBridger, is there any update on this issue?

Hi @eugeooi, could you please provide your glibc version? :)

$ ldd --version

ldd (GNU libc) 2.22
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

Hi @eugeooi,

We recommend glibc version >= 2.28. :)
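Besides running ldd --version, the glibc version the Python process is actually linked against can be checked from Python itself. A small sketch (assumes a glibc-based Linux; gnu_get_libc_version is a glibc-specific function):

```python
import ctypes

# Query the running glibc version directly and compare it against the
# recommended minimum of 2.28.
libc = ctypes.CDLL("libc.so.6")
libc.gnu_get_libc_version.restype = ctypes.c_char_p

version = libc.gnu_get_libc_version().decode()
major, minor = (int(p) for p in version.split(".")[:2])

print(version, "OK" if (major, minor) >= (2, 28) else "too old (< 2.28)")
```

On the SUSE system above this should report 2.22, i.e. below the recommended minimum, which is consistent with the library failing to load there but working on Ubuntu 22.04 (glibc 2.35).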