Phi-3-mini-4k-instruct (INT4) model compile error

Question

xyang2013 opened this issue a month ago · comments

xyang2013 commented a month ago

Meteor Lake 155H laptop
Windows 11

I ran the following command to update the Python package:

pip install intel-npu-acceleration-library --upgrade

Then I ran the following cell in a Jupyter notebook:

model = intel_npu_acceleration_library.compile(model, dtype=torch.int4)

I got the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[4], line 2
      1 print("Compile model for the NPU")
----> 2 model = intel_npu_acceleration_library.compile(model, dtype=torch.int4)

File c:\Users\xiaoy\anaconda3\envs\nlp\Lib\site-packages\torch\__init__.py:2003, in __getattr__(name)
   2000     import importlib
   2001     return importlib.import_module(f".{name}", __name__)
-> 2003 raise AttributeError(f"module '{__name__}' has no attribute '{name}'")

AttributeError: module 'torch' has no attribute 'int4'

Alessandro Palla · Answer 1 · Thu Jun 06 2024 18:57:35 GMT+0800 (China Standard Time)

Torch does not have an int4 attribute. Instead, use intel_npu_acceleration_library.int4. A good example is here: https://github.com/intel/intel-npu-acceleration-library/blob/main/examples/phi-3.py#L13
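For reference, the corrected call from the question might look like the sketch below. It only swaps torch.int4 for intel_npu_acceleration_library.int4 and assumes the compile(model, dtype=...) signature used in the question, which may differ in other versions of the library. The wrapper name compile_int4 is hypothetical and exists only to keep the example self-contained.

```python
def compile_int4(model):
    """Compile `model` for the NPU with int4 quantization.

    Hypothetical helper for illustration; not part of the library.
    """
    # Imported inside the function so the helper can be defined even on a
    # machine where the library is not installed.
    import intel_npu_acceleration_library

    # The int4 dtype is exposed by the library itself; torch has no `int4`
    # attribute, which is what raised the AttributeError above.
    return intel_npu_acceleration_library.compile(
        model, dtype=intel_npu_acceleration_library.int4
    )
```

In the notebook cell, this would replace the failing line with model = compile_int4(model).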

Also, since quantization takes a while, I implemented NPUModelForCausalLM, which caches the quantized model for convenience. I suggest you use that as well. Be sure to use the latest version of this library and the latest driver (https://www.intel.com/content/www/us/en/download/794734/intel-npu-driver-windows.html) to make full use of new features like int4 quantization: