intel / intel-npu-acceleration-library

Intel® NPU Acceleration Library

[Bug] Error occurs when the cache is read in the Phi-3 example.

invent00 opened this issue

Describe the bug
The first run of the following sample works fine, but the second run raises an error. Deleting the cache makes it work again.

https://github.com/intel/intel-npu-acceleration-library/blob/main/examples/phi-3.py
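
For reference, the example boils down to roughly the following (paraphrased from memory; the linked file is authoritative and its exact flags may differ). Note the trust_remote_code=True flag, which turns out to be relevant below:

from transformers import AutoTokenizer, TextStreamer
from intel_npu_acceleration_library import NPUModelForCausalLM
import torch

model_id = "microsoft/Phi-3-mini-4k-instruct"

# Compiles the model for the NPU and caches the result on disk, so a second
# run loads from the cache instead of recompiling.
model = NPUModelForCausalLM.from_pretrained(
    model_id, use_cache=True, dtype=torch.int8, trust_remote_code=True
).eval()
tokenizer = AutoTokenizer.from_pretrained(model_id)
streamer = TextStreamer(tokenizer, skip_special_tokens=True)

input_ids = tokenizer("What is an NPU?", return_tensors="pt")["input_ids"]
_ = model.generate(input_ids, streamer=streamer, max_new_tokens=128)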

To Reproduce
Steps to reproduce the behavior:

  1. Install intel-npu-acceleration-library 1.2.0 via pip.
  2. Execute the phi-3 example; it works and creates the cache file.
  3. Re-execute the phi-3 example; an error occurs.

Expected behavior
The example runs successfully on the second and subsequent executions, reusing the cache.

Screenshots

(Screenshot of the error attached in the original issue.)

Desktop (please complete the following information):

  • OS: Windows 11 Pro 23H2
  • NPU driver version: 32.0.100.2408
  • CPU: Intel Core Ultra 5 125U

Additional context

pip environment:

PS C:\sisakucode\npu_test> pip freeze
certifi==2024.6.2
charset-normalizer==3.3.2
colorama==0.4.6
contourpy==1.2.1
cycler==0.12.1
Deprecated==1.2.14
filelock==3.14.0
fonttools==4.53.0
fsspec==2024.6.0
huggingface-hub==0.23.3
idna==3.7
importlib_resources==6.4.0
intel-npu-acceleration-library==1.2.0
intel-openmp==2021.4.0
Jinja2==3.1.4
joblib==1.4.2
kiwisolver==1.4.5
MarkupSafe==2.1.5
matplotlib==3.9.0
mkl==2021.4.0
mpmath==1.3.0
networkx==3.2.1
neural_compressor==2.5.1
numpy==1.26.4
opencv-python-headless==4.10.0.82
packaging==24.0
pandas==2.2.2
pillow==10.3.0
prettytable==3.10.0
psutil==5.9.8
py-cpuinfo==9.0.0
pycocotools==2.0.7
pyparsing==3.1.2
python-dateutil==2.9.0.post0
pytz==2024.1
PyYAML==6.0.1
regex==2024.5.15
requests==2.32.3
safetensors==0.4.3
schema==0.7.7
scikit-learn==1.5.0
scipy==1.13.1
six==1.16.0
sympy==1.12.1
tbb==2021.12.0
threadpoolctl==3.5.0
tokenizers==0.19.1
torch==2.3.0
tqdm==4.66.4
transformers==4.41.2
typing_extensions==4.12.1
tzdata==2024.1
urllib3==2.2.1
wcwidth==0.2.13
wrapt==1.16.0
zipp==3.19.2

I encountered this while checking issue 33. Thank you for your support.

Very interesting. It seems to be an issue with torch.save + torch.load combined with trust_remote_code=True for transformers models: https://stackoverflow.com/questions/76090753/loading-a-pretrained-model-using-torch-load-gives-modulenotfounderror-no-modul
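
The failure mode can be reproduced in isolation (a minimal sketch, independent of this library; the file name is arbitrary). torch.save pickles the whole model object, including a reference to its class, which with trust_remote_code=True lives in a dynamically generated transformers_modules package. A fresh process never recreates that package, so unpickling fails:

import torch
from transformers import AutoModelForCausalLM

# First process: trust_remote_code=True imports the model class from a
# dynamically generated package (transformers_modules.*).
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True
)
torch.save(model, "model_cache.pt")  # pickles a reference to that package

# Second, fresh process: the dynamic package does not exist yet, so this raises
# ModuleNotFoundError: No module named 'transformers_modules'
model = torch.load("model_cache.pt")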

If you remove trust_remote_code=True, it works like a charm. I'll disable caching and trust_remote_code in the next PR. Many thanks for the feedback!
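
Until that lands, one caching pattern that sidesteps the pickling problem is to persist only the state_dict (plain tensors) and rebuild the model in each process before loading the weights. This is a sketch of the general technique, not the library's actual caching mechanism, and the cache file name is made up:

import os
import torch
from transformers import AutoModelForCausalLM

model_id = "microsoft/Phi-3-mini-4k-instruct"
cache_path = "phi3_state_dict.pt"  # hypothetical cache file

# Rebuilding the architecture here recreates the dynamic transformers_modules
# package, so only plain tensors ever need to be unpickled.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
if os.path.exists(cache_path):
    model.load_state_dict(torch.load(cache_path))
else:
    torch.save(model.state_dict(), cache_path)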