Ubuntu 22.04 MTL 165H benchmark: Aborted (core dumped)
taotao1-1 opened this issue
(llm) peiyuan@peiyuan:~/ipex-llm/python/llm/dev/benchmark/all-in-one$ python run.py
/home/peiyuan/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
/home/peiyuan/miniconda3/envs/llm/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn(
ZE_LOADER_DEBUG_TRACE:Loading Driver libze_intel_gpu.so.1
ZE_LOADER_DEBUG_TRACE:Loading Driver libze_intel_vpu.so.1
ZE_LOADER_DEBUG_TRACE:Load Library of libze_intel_vpu.so.1 failed with libze_intel_vpu.so.1: cannot open shared object file: No such file or directory
ZE_LOADER_DEBUG_TRACE:check_drivers(flags=0(ZE_INIT_ALL_DRIVER_TYPES_ENABLED))
ZE_LOADER_DEBUG_TRACE:init driver libze_intel_gpu.so.1 zeInit(0(ZE_INIT_ALL_DRIVER_TYPES_ENABLED)) returning ZE_RESULT_SUCCESS
ZE_LOADER_DEBUG_TRACE:Loading Driver libze_intel_gpu.so.1
ZE_LOADER_DEBUG_TRACE:Loading Driver libze_intel_vpu.so.1
ZE_LOADER_DEBUG_TRACE:Load Library of libze_intel_vpu.so.1 failed with libze_intel_vpu.so.1: cannot open shared object file: No such file or directory
ZE_LOADER_DEBUG_TRACE:check_drivers(flags=0(ZE_INIT_ALL_DRIVER_TYPES_ENABLED))
ZE_LOADER_DEBUG_TRACE:init driver libze_intel_gpu.so.1 zeInit(0(ZE_INIT_ALL_DRIVER_TYPES_ENABLED)) returning ZE_RESULT_SUCCESS
2024-06-07 11:29:58,545 - INFO - intel_extension_for_pytorch auto imported
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 10.09it/s]
2024-06-07 11:29:59,727 - INFO - Converting the current model to sym_int4 format......
ZE_LOADER_DEBUG_TRACE:Loading Driver libze_intel_gpu.so.1
ZE_LOADER_DEBUG_TRACE:Loading Driver libze_intel_vpu.so.1
ZE_LOADER_DEBUG_TRACE:Load Library of libze_intel_vpu.so.1 failed with libze_intel_vpu.so.1: cannot open shared object file: No such file or directory
ZE_LOADER_DEBUG_TRACE:check_drivers(flags=0(ZE_INIT_ALL_DRIVER_TYPES_ENABLED))
ZE_LOADER_DEBUG_TRACE:init driver libze_intel_gpu.so.1 zeInit(0(ZE_INIT_ALL_DRIVER_TYPES_ENABLED)) returning ZE_RESULT_SUCCESS
loading of model costs 7.7113322770019295s and 6.005859375GB
<class 'transformers_modules.Qwen-7B-Chat.modeling_qwen.QWenLMHeadModel'>
/home/peiyuan/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/generation/configuration_utils.py:394: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.8` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
warnings.warn(
/home/peiyuan/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/generation/configuration_utils.py:404: UserWarning: `do_sample` is set to `False`. However, `top_k` is set to `0` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_k`.
warnings.warn(
LLVM ERROR: Diag: aborted
LIBXSMM_VERSION: main_stable-1.17-3651 (25693763)
LIBXSMM_TARGET: adl [Intel(R) Core(TM) Ultra 7 165H]
Registry and code: 13 MB
Command: python run.py
Uptime: 17.947757 s
Aborted (core dumped)
After the crash, I found many build error files in the all-in-one folder:
_ZTSZZN3gpu5xetla4fmha32fmha_forward_causal_strided_implINS0_22fmha_policy_64x256x256EN4sycl3_V16detail9half_impl4halfELb1ELb0ELb1ELb1ELb0EEEvRNS5_5queueEPT0_SC_SC_SC_SC_PhSC_fffjjjjjjjENKUlRNS5_7handlerEE_clESF_EUlNS5_7nd_itemILi3EEEE_.errors.txt
_ZTSZZN3gpu5xetla4fmha32fmha_forward_causal_strided_implINS0_22fmha_policy_64x256x256EN4sycl3_V16detail9half_impl4halfELb1ELb1ELb0ELb0ELb0EEEvRNS5_5queueEPT0_SC_SC_SC_SC_PhSC_fffjjjjjjjENKUlRNS5_7handlerEE_clESF_EUlNS5_7nd_itemILi3EEEE_.errors.txt
_ZTSZZN3gpu5xetla4fmha32fmha_forward_causal_strided_implINS0_22fmha_policy_64x256x256EN4sycl3_V16detail9half_impl4halfELb1ELb1ELb0ELb1ELb0EEEvRNS5_5queueEPT0_SC_SC_SC_SC_PhSC_fffjjjjjjjENKUlRNS5_7handlerEE_clESF_EUlNS5_7nd_itemILi3EEEE_.errors.txt
_ZTSZZN3gpu5xetla4fmha32fmha_forward_causal_strided_implINS0_22fmha_policy_64x256x256EN4sycl3_V16detail9half_impl4halfELb1ELb1ELb1ELb1ELb0EEEvRNS5_5queueEPT0_SC_SC_SC_SC_PhSC_fffjjjjjjjENKUlRNS5_7handlerEE_clESF_EUlNS5_7nd_itemILi3EEEE_.errors.txt
The content is:
Instruction / Operand / Region Errors:
/--------------------------------------------!!!INSTRUCTION ERROR FOUND!!!---------------------------------------------\
Error in CISA routine with name: _ZTSZZN3gpu5xetla4fmha32fmha_forward_causal_strided_implINS0_22fmha_policy_64x256x256EN4sycl3_V16detail9half_impl4halfELb1ELb1ELb1ELb1ELb0EEEvRNS5_5queueEPT0_SC_SC_SC_SC_PhSC_fffjjjjjjjENKUlRNS5_7handlerEE_clESF_EUlNS5_7nd_itemILi3EEEE_
Error Message: vISA instruction not supported on this platform
Diagnostics:
Instruction variables' decls:
.decl V93 v_type=G type=b num_elts=4 align=dword
.decl V93 v_type=G type=b num_elts=4 align=dword
Violating Instruction: nbarrier.wait V93(0,0)<0;1,0> /// $83
\----------------------------------------------------------------------------------------------------------------------/
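Since several of these files are generated per run, it may help to summarize them instead of opening each one. The following is a minimal sketch, not part of ipex-llm: the helper names `parse_error_text` and `summarize_error_files` are hypothetical, and the field labels are copied from the error file shown above, assuming the other files use the same layout.

```python
import re
from pathlib import Path

# Field labels copied from the *.errors.txt content shown above.
# Assumption: every generated file uses the same labels.
FIELDS = {
    "routine": re.compile(r"Error in CISA routine with name:\s*(.+)"),
    "message": re.compile(r"Error Message:\s*(.+)"),
    "instruction": re.compile(r"Violating Instruction:\s*(.+)"),
}


def parse_error_text(text):
    """Return the first value found for each field, or None if the label is absent."""
    result = {}
    for key, pattern in FIELDS.items():
        match = pattern.search(text)
        result[key] = match.group(1).strip() if match else None
    return result


def summarize_error_files(folder="."):
    """Parse every *.errors.txt file in `folder` and return {filename: fields}."""
    return {
        path.name: parse_error_text(path.read_text(errors="replace"))
        for path in sorted(Path(folder).glob("*.errors.txt"))
    }
```

Calling `summarize_error_files()` from the all-in-one folder would return one entry per error file, which makes it easy to check whether all of them report the same unsupported instruction.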
This is caused by has_xetla returning the wrong result, so the code goes to:
![image](https://private-user-images.githubusercontent.com/4495653/337544862-89fef427-80e5-4e39-b7b0-bcd9943303e0.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE4NTk1MjUsIm5iZiI6MTcyMTg1OTIyNSwicGF0aCI6Ii80NDk1NjUzLzMzNzU0NDg2Mi04OWZlZjQyNy04MGU1LTRlMzktYjdiMC1iY2Q5OTQzMzAzZTAucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDcyNCUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA3MjRUMjIxMzQ1WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9YmIxYTcyZjBiY2U0ZmE3YzdkYTRjYmQyNmQ2ZGRhNzZiNWI1Y2FlZGE2MTM5NGIyNDE5OWRlYmViNjAwNzU5YiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.uaYkTV7Ddv8b2HsTLKEIWO07o5-WI9DZlBFOVCI8820)
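To make the failure mode concrete, here is a rough sketch of a conservative capability gate. It is not the actual `has_xetla` implementation, which is not shown in this issue: the function names, the allow-list, and the device names are all hypothetical placeholders. The idea is only that an unknown device should fall back to the non-XeTLA path rather than be sent to kernels it may not be able to compile.

```python
# Hypothetical allow-list. The real set of XeTLA-capable devices is not given
# in this issue, so a placeholder name is used here.
XETLA_CAPABLE = frozenset({"example-gpu-with-xetla"})


def has_xetla_sketch(device_name, allow_list=XETLA_CAPABLE):
    """Return True only for devices explicitly listed as XeTLA-capable."""
    return device_name in allow_list


def pick_attention_path(device_name):
    """Choose the kernel path; unknown devices get the fallback path."""
    return "xetla_fmha" if has_xetla_sketch(device_name) else "fallback"
```

With a gate like this, a device that is missing from the list never reaches the `fmha_forward_causal_strided_impl` kernels, at the cost of having to maintain the list as new hardware is validated.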