Phi-3-mini-4k-instruct (INT4) model compile error
xyang2013 opened this issue
Meteor Lake 155H laptop
Windows 11
I ran the following command to update the Python package:
pip install intel-npu-acceleration-library --upgrade
Then, when running the following cell in a Jupyter notebook:
model = intel_npu_acceleration_library.compile(model, dtype=torch.int4)
I got the following error:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[4], line 2
      1 print("Compile model for the NPU")
----> 2 model = intel_npu_acceleration_library.compile(model, dtype=torch.int4)

File c:\Users\xiaoy\anaconda3\envs\nlp\Lib\site-packages\torch\__init__.py:2003, in __getattr__(name)
   2000 import importlib
   2001 return importlib.import_module(f".{name}", __name__)
-> 2003 raise AttributeError(f"module '{__name__}' has no attribute '{name}'")

AttributeError: module 'torch' has no attribute 'int4'
Torch does not have an int4 attribute. Instead, you should use intel_npu_acceleration_library.int4. A good example is here: https://github.com/intel/intel-npu-acceleration-library/blob/main/examples/phi-3.py#L13
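For reference, the failing notebook cell could be rewritten roughly as follows. This is a minimal sketch, not the library's documented usage: it assumes the compile(model, dtype=...) signature that appears in the traceback above, which may differ in newer releases, and compile_int4 is a hypothetical helper name. The import sits inside the function so that defining the helper does not require the NPU stack.

```python
def compile_int4(model):
    """Compile an already loaded PyTorch model for the NPU with int4 quantization.

    Hypothetical helper. Assumes the compile(model, dtype=...) signature seen
    in the traceback above; newer library releases may expose a different one.
    """
    # Imported lazily, so only calling the helper needs the library installed.
    import intel_npu_acceleration_library as npu_lib

    # The int4 dtype is provided by the NPU library, not by torch,
    # which is why torch.int4 raises AttributeError.
    return npu_lib.compile(model, dtype=npu_lib.int4)
```

In the notebook, the original line would then become model = compile_int4(model).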
Also, since quantization takes a while, I implemented NPUModelForCausalLM, which caches the quantized model for convenience. I suggest you use that as well. Be sure to use the latest version of this library and the latest driver (https://www.intel.com/content/www/us/en/download/794734/intel-npu-driver-windows.html) to make full use of new features like int4 quantization:
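A rough sketch of both suggestions, under stated assumptions: load_phi3_int4 and installed_version are hypothetical helper names, the model id microsoft/Phi-3-mini-4k-instruct is inferred from the issue title, and from_pretrained is assumed to accept a dtype keyword as in the linked phi-3.py example at the time of this issue. Newer releases may configure quantization differently, so check the installed version first.

```python
from importlib import metadata


def installed_version(package="intel-npu-acceleration-library"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def load_phi3_int4(model_id="microsoft/Phi-3-mini-4k-instruct"):
    """Load a causal LM quantized to int4 through NPUModelForCausalLM.

    Assumption: from_pretrained takes a dtype keyword in the installed release.
    """
    # Imported lazily, so only calling the helper needs the library installed.
    import intel_npu_acceleration_library as npu_lib

    # Per the reply above, the quantized model is cached, so the slow
    # quantization step should not have to be repeated on every load.
    return npu_lib.NPUModelForCausalLM.from_pretrained(model_id, dtype=npu_lib.int4)
```

Calling installed_version() before load_phi3_int4() gives a quick way to confirm that the pip upgrade actually took effect in the active environment.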