problem with loading my finetuned Llama2 model - type object 'Params4bit' has no attribute 'from_prequantized'
lujain89 opened this issue
System Info
I finetuned a model with the following training arguments:
from transformers import TrainingArguments

# Hyperparameter variables (output_dir, learning_rate, etc.) are defined elsewhere in the script.
training_arguments = TrainingArguments(
    output_dir=output_dir,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
    report_to="tensorboard",
    auto_find_batch_size=False,
)
and then pushed it to the HF Hub:
model.push_to_hub(new_model, max_shard_size='2GB')
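For context, the base model was presumably loaded in 4-bit before finetuning (that step is not shown in this issue; the model id and config values below are placeholders), which is why the pushed checkpoint ends up containing prequantized weights:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Placeholder QLoRA-style setup: load the base Llama 2 model in 4-bit so that
# finetuning (and the later push_to_hub) works on quantized weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder base model id
    quantization_config=bnb_config,
    device_map="auto",
)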
Then, when I tried to load it back from the HF Hub, I encountered this error:
/usr/local/lib/python3.10/dist-packages/transformers/quantizers/quantizer_bnb_4bit.py in create_quantized_param(self, model, param_value, param_name, target_device, state_dict, unexpected_keys)
196 unexpected_keys.remove(k)
197
--> 198 new_value = bnb.nn.Params4bit.from_prequantized(
199 data=param_value,
200 quantized_stats=quantized_stats,
AttributeError: type object 'Params4bit' has no attribute 'from_prequantized'
Reproduction
Now you can use the model and tokenizer for inference:
finetuned_model, f_tokenizer, peft_config = load_model("huggingfaceModelname")
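For reference, here is a minimal sketch of what the load_model helper might look like (its body is not shown in this issue, so the details are assumptions; adjust the PeftConfig line to your repo layout). The from_pretrained call is what reaches the failing Params4bit.from_prequantized path when the checkpoint stores 4-bit weights:

from peft import PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_model(model_name):
    # The 4-bit quantization_config saved with the pushed checkpoint is picked up
    # automatically here, which triggers bnb.nn.Params4bit.from_prequantized.
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    peft_config = PeftConfig.from_pretrained(model_name)
    return model, tokenizer, peft_config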
Expected behavior
Load the finetuned model successfully.
I got the same problem, did you find a solution?
4-bit serialization requires bitsandbytes>=0.42.0. Please verify that you have a newer version installed, or upgrade with pip install -U bitsandbytes.
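You can also check the installed version programmatically; a quick sketch (assumes the standard importlib.metadata and packaging modules are available):

import importlib.metadata

from packaging import version

# Fail early if the installed bitsandbytes is older than the 0.42.0 threshold
# required for 4-bit (de)serialization.
installed = version.parse(importlib.metadata.version("bitsandbytes"))
if installed < version.parse("0.42.0"):
    raise RuntimeError(
        f"bitsandbytes {installed} is too old for 4-bit serialization; "
        "run `pip install -U bitsandbytes` and restart the runtime."
    )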
Thank you!