artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Home Page: https://arxiv.org/abs/2305.14314


[XPU] CUDA error when running on arc770 with Intel extension for pytorch

delock opened this issue

@abhilash1910 is XPU support for QLoRA still working? I tried to run it on a Linux Arc 770 system at home but got the following error:
$ python qlora.py --model_name_or_path facebook/opt-350m
Num processes: 1
Process index: 0
Local process index: 0
Device: xpu:0
, _n_gpu=0, __cached__setup_devices=device(type='cpu'), deepspeed_plugin=None)
loading base model facebook/opt-350m...
/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/transformers/modeling_utils.py:2193: FutureWarning: The use_auth_token argument is deprecated and will be
warnings.warn(
Traceback (most recent call last):
  File "/home/akey/machine_learning/qlora/qlora.py", line 841, in <module>
    train()
  File "/home/akey/machine_learning/qlora/qlora.py", line 704, in train
    model, tokenizer = get_accelerate_model(args, checkpoint_dir)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akey/machine_learning/qlora/qlora.py", line 311, in get_accelerate_model
    model = AutoModelForCausalLM.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 493, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3260, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/transformers/modeling_utils.py", line 725, in _load_state_dict_into_meta_model
    set_module_quantized_tensor_to_device(
  File "/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/transformers/utils/bitsandbytes.py", line 109, in set_module_quantized_tensor_to_device
    new_value = value.to(device)
                ^^^^^^^^^^^^^^^^
  File "/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/torch/cuda/__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Here is my pip list. Is there any special package needed for the XPU device?
(lora) 20:15:50|~/machine_learning/qlora$ pip list|grep torch
intel-extension-for-pytorch 2.0.110+xpu
torch 2.0.1a0+cxx11.abi
torchvision 0.15.2a0+cxx11.abi
bitsandbytes 0.40.0
transformers 4.31.0
accelerate 0.21.0
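For what it's worth, the traceback shows the quantized weights being moved with value.to(device), which ends up in torch's CUDA lazy-init path on this CUDA-less XPU build. A minimal sketch of a guarded device check (assuming, as intel-extension-for-pytorch documents, that torch.xpu is only registered after importing that extension, so the attribute is probed defensively; pick_device is a hypothetical helper, not part of qlora.py):

```python
import torch

def pick_device() -> str:
    """Return a usable device string without assuming CUDA is compiled in."""
    # torch.cuda.is_available() is safe even on CPU-only builds.
    if torch.cuda.is_available():
        return "cuda:0"
    # torch.xpu exists only once intel_extension_for_pytorch has been
    # imported, so probe the attribute instead of assuming it.
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu:0"
    return "cpu"

print(pick_device())
```

A check like this would avoid the AssertionError above by never handing a CUDA device string to .to() on a build without CUDA; the underlying issue, though, is that the bitsandbytes path in these transformers versions targets CUDA devices.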

Hi @delock, bitsandbytes quantization support for XPU is still in progress; until it is done, the quantized compute will not run on the XPU device. I will update this issue once it is completed.