ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

Home Page: http://ludwig.ai

Issues fine tuning Mistral

vinven7 opened this issue

Describe the bug
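I am trying to fine-tune Mistral Instruct v0.2 and I keep getting this error:
TypeError: LoraConfig.__init__() got an unexpected keyword argument 'use_rslora'

**To Reproduce**
Steps to reproduce the behavior:

`# Imports below were presumably run in an earlier notebook cell
import logging
import yaml
from ludwig.api import LudwigModel

qlora_fine_tuning_config = yaml.safe_load(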
"""
model_type: llm
base_model: mistralai/Mistral-7B-Instruct-v0.2

input_features:
  - name: Prompt
    type: text
    preprocessing:
      max_sequence_length: 256

output_features:
  - name: Responses
    type: text
    preprocessing:
      max_sequence_length: 256

prompt:
   template: >-

     ### Prompt: {Prompt}

     ### responses : 

generation:
  temperature: 0.1
  max_new_tokens: 256

adapter:
  type: lora

quantization:
  bits: 4

preprocessing:
  split:
     probabilities:
      - 1.0
      - 0.0
      - 0.0

trainer:
  type: finetune
  # epochs: 5
  # epochs: 3
  train_steps: 5
  batch_size: 1
  eval_batch_size: 2
  gradient_accumulation_steps: 16  # effective batch size = batch size * gradient_accumulation_steps
  learning_rate: 2.0e-4
  enable_gradient_checkpointing: true
  learning_rate_scheduler:
    decay: cosine
    warmup_fraction: 0.03
    reduce_on_plateau: 0
"""

)

new_model = LudwigModel(config=qlora_fine_tuning_config, logging_level=logging.INFO)
results = new_model.train(dataset=train_df)`

Here is the full trace:
`TypeError Traceback (most recent call last)
Cell In[9], line 60
1 qlora_fine_tuning_config = yaml.safe_load(
2 """
3 model_type: llm
(...)
56 """
57 )
59 new_model = LudwigModel(config=qlora_fine_tuning_config, logging_level=logging.INFO)
---> 60 results = new_model.train(dataset=train_df)

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/api.py:647, in LudwigModel.train(self, dataset, training_set, validation_set, test_set, training_set_metadata, data_format, experiment_name, model_name, model_resume_path, skip_save_training_description, skip_save_training_statistics, skip_save_model, skip_save_progress, skip_save_log, skip_save_processed_input, output_directory, random_seed, **kwargs)
644 detected_learning_rate = get_auto_learning_rate(self.config_obj)
645 self.config_obj.trainer.learning_rate = detected_learning_rate
--> 647 with self.backend.create_trainer(
648 model=self.model,
649 config=self.config_obj.trainer,
650 resume=model_resume_path is not None,
651 skip_save_model=skip_save_model,
652 skip_save_progress=skip_save_progress,
653 skip_save_log=skip_save_log,
654 callbacks=train_callbacks,
655 random_seed=random_seed,
656 ) as trainer:
657 # auto tune batch size
658 self._tune_batch_size(trainer, training_set, random_seed=random_seed)
660 if (
661 self.config_obj.model_type == "LLM"
662 and trainer.config.type == "none"
663 and self.config_obj.adapter is not None
664 and self.config_obj.adapter.pretrained_adapter_weights is not None
665 ):

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/backend/base.py:293, in LocalBackend.create_trainer(self, config, model, **kwargs)
290 else:
291 trainer_cls = get_from_registry(model.type(), get_trainers_registry())
--> 293 return trainer_cls(config=config, model=model, **kwargs)

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/trainers/trainer_llm.py:427, in FineTuneTrainer.__init__(self, config, model, resume, skip_save_model, skip_save_progress, skip_save_log, callbacks, report_tqdm_to_ray, random_seed, distributed, device, **kwargs)
412 def __init__(
413 self,
414 config: FineTuneTrainerConfig,
(...)
425 **kwargs,
426 ):
--> 427 super().__init__(
428 config,
429 model,
430 resume,
431 skip_save_model,
432 skip_save_progress,
433 skip_save_log,
434 callbacks,
435 report_tqdm_to_ray,
436 random_seed,
437 distributed,
438 device,
439 **kwargs,
440 )

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/trainers/trainer.py:201, in Trainer.__init__(self, config, model, resume, skip_save_model, skip_save_progress, skip_save_log, callbacks, report_tqdm_to_ray, random_seed, distributed, device, **kwargs)
198 self.device = get_torch_device()
200 self.model = model
--> 201 self.model.prepare_for_training()
202 self.model = self.distributed.to_device(self.model)
203 self.model.metrics_to_device(self.device)

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/models/llm.py:212, in LLM.prepare_for_training(self)
210 if self.config_obj.quantization:
211 self.prepare_for_quantized_training()
--> 212 self.initialize_adapter()

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/models/llm.py:200, in LLM.initialize_adapter(self)
194 if self.config_obj.trainer.type != "finetune" and not self.config_obj.adapter.pretrained_adapter_weights:
195 raise ValueError(
196 "Adapter config was provided, but trainer type is not set to finetune. Either set the trainer to "
197 "finetune or remove the adapter config."
198 )
--> 200 self.model = initialize_adapter(self.model, self.config_obj)
202 logger.info("==================================================")
203 logger.info("Trainable Parameter Summary For Fine-Tuning")

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/utils/llm_utils.py:185, in initialize_adapter(model, config_obj)
182 from peft import get_peft_model, TaskType # noqa
184 # If no pretrained adapter is provided, we want to load untrained weights into the model
--> 185 peft_config = config_obj.adapter.to_config(
186 task_type=TaskType.CAUSAL_LM, tokenizer_name_or_path=config_obj.base_model
187 )
189 model = get_peft_model(model, peft_config)
191 return model

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/schema/llms/peft.py:147, in LoraConfig.to_config(self, task_type, **kwargs)
144 def to_config(self, task_type: str = None, **kwargs) -> "PeftConfig":
145 from peft import LoraConfig as _LoraConfig
--> 147 return _LoraConfig(
148 r=self.r,
149 lora_alpha=self.alpha,
150 lora_dropout=self.dropout,
151 bias=self.bias_type,
152 target_modules=self.target_modules,
153 task_type=task_type,
154 use_rslora=self.use_rslora,
155 use_dora=self.use_dora,
156 )

TypeError: LoraConfig.__init__() got an unexpected keyword argument 'use_rslora'`

Environment (please complete the following information):

  • OS: Linux, Jupyter Notebook
  • Version [e.g. 22]
  • 3.0
  • 0.10.3

The issue was resolved when I upgraded to peft version 0.10.0.
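
For anyone hitting the same error, here is a rough sketch of how to check whether the installed peft is too old before starting a training run. It inspects the `LoraConfig` constructor for the two keywords that Ludwig passes in the trace above (`use_rslora` and `use_dora`). The helper names `unsupported_kwargs` and `check_peft` are made up for illustration and are not part of Ludwig or peft.

```python
import inspect
from importlib import metadata


def unsupported_kwargs(cls, names):
    """Return the entries of `names` that cls.__init__ does not accept."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs catch-all means any keyword would be accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in names if name not in params]


def check_peft():
    """Return (installed peft version, LoRA keywords it lacks).

    Returns (None, None) when peft is not installed at all.
    """
    try:
        from peft import LoraConfig
    except ImportError:
        return None, None
    try:
        version = metadata.version("peft")
    except metadata.PackageNotFoundError:
        version = "unknown"
    # These are the two keywords Ludwig passes in the trace above.
    return version, unsupported_kwargs(LoraConfig, ["use_rslora", "use_dora"])


if __name__ == "__main__":
    version, missing = check_peft()
    if version is None:
        print("peft is not installed")
    elif missing:
        print(f"peft {version} lacks {missing}; try: pip install -U peft")
    else:
        print(f"peft {version} accepts the LoRA keywords used by Ludwig")
```

If the script reports missing keywords, upgrading peft (0.10.0 worked in my case) should clear the `TypeError`.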