Macbook "Torch not compiled with CUDA enabled" Error
LanLanBoom opened this issue
Lan Wang commented
This is my code:
from airllm import AutoModel
MAX_LENGTH = 128
model = AutoModel.from_pretrained("garage-bAInd/Platypus2-70B-instruct", compression='4bit', profiling_mode=True, delete_original=True)
input_text = [
    'What is the capital of United States?',
]
input_tokens = model.tokenizer(input_text,
                               return_tensors="pt",
                               return_attention_mask=False,
                               truncation=True,
                               max_length=MAX_LENGTH,
                               padding=False)
generation_output = model.generate(
    input_tokens['input_ids'].mps(),
    max_new_tokens=20,
    use_cache=True,
    return_dict_in_generate=True)
output = model.tokenizer.decode(generation_output.sequences[0])
print(output)
This is the result I got:
  File "/Users/xxx/miniconda3/envs/torch-playground/lib/python3.9/site-packages/airllm/airllm_llama_mlx.py", line 224, in __init__
    self.model_local_path, self.checkpoint_path = find_or_create_local_splitted_path(model_local_path_or_repo_id,
  File "/Users/xxx/miniconda3/envs/torch-playground/lib/python3.9/site-packages/airllm/utils.py", line 382, in find_or_create_local_splitted_path
    return Path(hf_cache_path), split_and_save_layers(hf_cache_path, layer_shards_saving_path,
  File "/Users/xxx/miniconda3/envs/torch-playground/lib/python3.9/site-packages/airllm/utils.py", line 303, in split_and_save_layers
    layer_state_dict = compress_layer_state_dict(layer_state_dict, compression)
  File "/Users/xxx/miniconda3/envs/torch-playground/lib/python3.9/site-packages/airllm/utils.py", line 162, in compress_layer_state_dict
    v_quant, quant_state = bnb.functional.quantize_nf4(v.cuda(), blocksize=64)
  File "/Users/xxx/miniconda3/envs/torch-playground/lib/python3.9/site-packages/torch/cuda/__init__.py", line 289, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
I don't get it. Could anyone help me?
Demon Su commented
I get the same output. This does not seem to be resolved.
Demon Su commented
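--> 162 v_quant, quant_state = bnb.functional.quantize_nf4(v.cuda(), blocksize=64)
It seems compression is not supported on Mac: the 4-bit path calls v.cuda() before quantizing with bitsandbytes, and the PyTorch build on a MacBook has no CUDA support, so the call fails.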
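A possible workaround, sketched under assumptions rather than confirmed by the maintainers: since the traceback shows the compression step calling v.cuda(), requesting compression only when CUDA is actually available should avoid the failing call on a Mac. The helper name pick_compression below is hypothetical and is not part of airllm; the commented-out from_pretrained call simply mirrors the one in the report.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()


def pick_compression(requested, has_cuda):
    """Hypothetical helper: keep the requested compression only when CUDA is present.

    The traceback above shows the compression path calling v.cuda(),
    so on a machine without CUDA we fall back to no compression.
    """
    if requested is not None and not has_cuda:
        return None
    return requested


compression = pick_compression("4bit", cuda_available())
print(f"compression={compression!r}")

# Assumed usage, mirroring the call from the report:
# from airllm import AutoModel
# model = AutoModel.from_pretrained("garage-bAInd/Platypus2-70B-instruct",
#                                   compression=compression)
```

On a MacBook this would pass compression=None, which trades the smaller 4-bit layer files for a code path that does not touch CUDA. Whether the rest of the example then runs on the MLX backend is not something this thread confirms.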