TypeError: __init__() got an unexpected keyword argument 'merge_weights'
IsraelAbebe opened this issue · comments
Check before submitting issues
- Make sure to pull the latest code, as some issues and bugs have been fixed.
- Due to frequent dependency updates, please ensure you have followed the steps in our Wiki
- I have read the FAQ section AND searched for similar issues and did not find a similar problem or solution
- Third-party plugin issues - e.g., llama.cpp, text-generation-webui, LlamaChat, we recommend checking the corresponding project for solutions
- Model validity check - Be sure to check the model's SHA256.md. If the model is incorrect, we cannot guarantee its performance
Type of Issue
Model training and fine-tuning
Base Model
LLaMA-7B
Operating System
Linux
Describe your issue in detail
lr=1e-4
lora_rank=8
lora_alpha=32
lora_trainable="q_proj,v_proj,k_proj,o_proj,gate_proj,down_proj,up_proj"
modules_to_save="embed_tokens,lm_head"
lora_dropout=0.05
pretrained_model='daryl149/llama-2-7b-hf'
chinese_tokenizer_path='chinese_llama_lora_7b/'
dataset_dir='../../data/'
per_device_train_batch_size=1
per_device_eval_batch_size=1
gradient_accumulation_steps=8
output_dir=output_dir
peft_model=chinese_llama_lora_7b/
validation_file=../../data/alpaca_data_zh_51k.json
deepspeed_config_file=ds_zero2_no_offload.json
torchrun --nnodes 1 --nproc_per_node 1 run_clm_sft_with_peft.py \
--deepspeed ${deepspeed_config_file} \
--model_name_or_path ${pretrained_model} \
--tokenizer_name_or_path ${chinese_tokenizer_path} \
--dataset_dir ${dataset_dir} \
--validation_split_percentage 0.001 \
--per_device_train_batch_size ${per_device_train_batch_size} \
--per_device_eval_batch_size ${per_device_eval_batch_size} \
--do_train \
--do_eval \
--seed $RANDOM \
--fp16 \
--num_train_epochs 1 \
--lr_scheduler_type cosine \
--learning_rate ${lr} \
--warmup_ratio 0.03 \
--weight_decay 0 \
--logging_strategy steps \
--logging_steps 10 \
--save_strategy steps \
--save_total_limit 3 \
--evaluation_strategy steps \
--eval_steps 100 \
--save_steps 200 \
--gradient_accumulation_steps ${gradient_accumulation_steps} \
--preprocessing_num_workers 8 \
--max_seq_length 512 \
--output_dir ${output_dir} \
--overwrite_output_dir \
--ddp_timeout 30000 \
--logging_first_step True \
--lora_rank ${lora_rank} \
--lora_alpha ${lora_alpha} \
--trainable ${lora_trainable} \
--modules_to_save ${modules_to_save} \
--lora_dropout ${lora_dropout} \
--torch_dtype float16 \
--validation_file ${validation_file} \
--gradient_checkpointing \
--ddp_find_unused_parameters False \
--peft_path ${peft_model}
Execution logs or screenshots
model = PeftModel.from_pretrained(model, training_args.peft_path)
File "/home/azime/.local/lib/python3.9/site-packages/peft/peft_model.py", line 323, in from_pretrained
config = PEFT_TYPE_TO_CONFIG_MAPPING[
File "/home/azime/.local/lib/python3.9/site-packages/peft/config.py", line 137, in from_pretrained
config = config_cls(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'merge_weights'
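The traceback above shows `config_cls(**kwargs)` rejecting a `merge_weights` key, which suggests that the adapter config stored under `--peft_path` was written by a peft version whose config class still had that field, while the installed version does not. One possible workaround, sketched below under assumptions, is to drop the keys the installed config class does not declare before loading the adapter. The helper name `drop_unknown_config_keys` is hypothetical, and the sketch assumes the config class is a dataclass and that the adapter config is a JSON file. Installing the peft version the adapter was created with is the other route and leaves the file untouched.

```python
import dataclasses
import json
from pathlib import Path


def drop_unknown_config_keys(config_path, config_cls):
    """Remove keys from a JSON config that config_cls cannot accept.

    Hypothetical helper, not part of peft. Assumes config_cls is a
    dataclass. Returns the sorted list of keys that were dropped.
    """
    path = Path(config_path)
    raw = json.loads(path.read_text(encoding="utf-8"))

    # Only fields with init=True can be passed as keyword arguments.
    known = {f.name for f in dataclasses.fields(config_cls) if f.init}
    kept = {k: v for k, v in raw.items() if k in known}
    dropped = sorted(set(raw) - known)

    if dropped:
        # Keep a copy of the original file next to it before overwriting.
        backup = path.with_name(path.name + ".bak")
        backup.write_text(json.dumps(raw, indent=2), encoding="utf-8")
        path.write_text(json.dumps(kept, indent=2), encoding="utf-8")
    return dropped
```

With peft installed, a call might look like `drop_unknown_config_keys("chinese_llama_lora_7b/adapter_config.json", LoraConfig)`, assuming the adapter directory stores its config under that file name. Dropping a key changes nothing about the saved weights, but any other keys reported as dropped are worth checking, since they may indicate a larger version mismatch.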
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.
Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.