MiniCPM benchmark run fails with assertion error on iGPU
violet17 opened this issue
Crystal Liu commented
log:
python run.py
C:\Users\mi\miniconda3\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn(
2024-05-21 20:54:40,883 - INFO - intel_extension_for_pytorch auto imported
C:\Users\mi\miniconda3\lib\site-packages\transformers\deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
C:\Users\mi\miniconda3\lib\site-packages\torch\_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
2024-05-21 20:54:46,922 - INFO - Converting the current model to sym_int4 format......
>> loading of model costs 19.866792400000122s and 2.544921875GB
<class 'transformers_modules.MiniCPM-2B-dpo-bf16.modeling_minicpm.MiniCPMForCausalLM'>
C:\Users\mi\miniconda3\lib\site-packages\transformers\generation\configuration_utils.py:515: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.8` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
warnings.warn(
C:\Users\mi\miniconda3\lib\site-packages\transformers\generation\configuration_utils.py:520: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.8` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
warnings.warn(
2024-05-21 20:55:01,157 - WARNING - The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
2024-05-21 20:55:01,157 - WARNING - Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Assertion failed: nb % SBS == 0, file dequantize.cpp, line 23
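The failing check `nb % SBS == 0` in dequantize.cpp suggests the number of quantization blocks (`nb`) in a weight row must be a multiple of the kernel's sub-block size (`SBS`). A hedged sketch of that divisibility condition, where the constants `QK` and `SBS` are assumptions for illustration and not the kernel's actual values:

```python
# Sketch of the divisibility check behind "Assertion failed: nb % SBS == 0"
# (dequantize.cpp). QK and SBS below are assumed values, not the real
# kernel constants.

QK = 64   # hypothetical elements per sym_int4 quantization block
SBS = 8   # hypothetical sub-block size the kernel dequantizes at once

def check_weight_dim(n_elements: int) -> bool:
    """Return True if a weight row of n_elements passes the kernel's check."""
    nb = n_elements // QK          # number of quantization blocks in the row
    return nb % SBS == 0

# A power-of-two layer width passes; a width like 2304 (MiniCPM-2B's
# hidden size, per its config) would not under these assumed constants.
print(check_weight_dim(4096))  # 4096/64 = 64 blocks, 64 % 8 == 0
print(check_weight_dim(2304))  # 2304/64 = 36 blocks, 36 % 8 == 4
```

Under these assumptions, a model with a hidden size that is not a multiple of `QK * SBS` would trip the assertion on every matching layer, which is consistent with the crash happening only for this model family.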
versions:
ipex-llm 2.1.0b20240519
intel-extension-for-pytorch 2.1.20+git4849f3b
pytorch-lightning 2.2.2
pytorch-wpe 0.0.1
rotary-embedding-torch 0.5.3
torch 2.1.0a0+git7bcf7da
torch-complex 0.4.3
torchaudio 2.1.0+6ea1133
torchmetrics 1.3.2
torchvision 0.16.0a0+cxx11.abi
models:
MiniCPM-2B-dpo-fp32
MiniCPM-2B-dpo-bf16
Crystal Liu commented
Solved in the 20240523 nightly build. Thanks!
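Since the fix reportedly landed in the 20240523 nightly, a quick sanity check that the installed ipex-llm build is new enough can parse the nightly date out of the version string (format `2.1.0b20240519`, as in the log above). The `FIXED_BUILD` value below is taken from this thread, not from release notes:

```python
# Hedged sketch: check whether the installed ipex-llm nightly is at least
# the 2024-05-23 build that this issue reports as containing the fix.
from importlib.metadata import version, PackageNotFoundError

FIXED_BUILD = "2.1.0b20240523"  # first nightly reported fixed in this thread

def build_date(ver: str) -> int:
    """Extract the YYYYMMDD nightly date from a version like '2.1.0b20240519'."""
    return int(ver.rsplit("b", 1)[-1])

def is_fixed(ver: str) -> bool:
    """True if the given ipex-llm version string is at least FIXED_BUILD."""
    return build_date(ver) >= build_date(FIXED_BUILD)

try:
    installed = version("ipex-llm")
    print(installed, "ok" if is_fixed(installed) else "needs upgrade")
except PackageNotFoundError:
    print("ipex-llm is not installed")
```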