Re-training PEFT model fails after loading with `Linear4bit` error
thelinuxkid opened this issue · comments
Andrés Restrepo commented
Describe the bug
When attempting to train on top of an already trained model (with new data), loading the model with Python throws the error:
Already found a `peft_config` attribute in the model. This will lead to having multiple adapters in the model. Make sure to know what you are doing!
Traceback (most recent call last):
File "train_enric_actions.py", line 115, in <module>
train(config_path, base_model, dataset, model_name, output_directory)
File "train_enric_actions.py", line 100, in train
model.train(
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/ludwig/api.py", line 619, in train
with self.backend.create_trainer(
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/ludwig/backend/base.py", line 293, in create_trainer
return trainer_cls(config=config, model=model, **kwargs)
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/ludwig/trainers/trainer_llm.py", line 418, in __init__
super().__init__(
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/ludwig/trainers/trainer.py", line 179, in __init__
self.model.prepare_for_training()
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/ludwig/models/llm.py", line 259, in prepare_for_training
self.initialize_adapter()
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/ludwig/models/llm.py", line 247, in initialize_adapter
self.model = get_peft_model(self.model, peft_config)
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/peft/mapping.py", line 133, in get_peft_model
return MODEL_TYPE_TO_PEFT_MODEL_MAPPING[peft_config.task_type](model, peft_config, adapter_name=adapter_name)
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/peft/peft_model.py", line 1041, in __init__
super().__init__(model, peft_config, adapter_name)
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/peft/peft_model.py", line 123, in __init__
self.base_model = cls(model, {adapter_name: peft_config}, adapter_name)
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/peft/tuners/lora/model.py", line 119, in __init__
super().__init__(model, config, adapter_name)
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/peft/tuners/tuners_utils.py", line 95, in __init__
self.inject_adapter(self.model, adapter_name)
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/peft/tuners/tuners_utils.py", line 252, in inject_adapter
self._create_and_replace(peft_config, adapter_name, target, target_name, parent, **optional_kwargs)
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/peft/tuners/lora/model.py", line 200, in _create_and_replace
new_module = self._create_new_module(lora_config, adapter_name, target, **kwargs)
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/peft/tuners/lora/model.py", line 286, in _create_new_module
"compute_dtype": target.compute_dtype,
File "/home/ubuntu/.local/share/virtualenvs/ludwig-JgQxVRRw/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1695, in __getattr__
raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'Linear4bit' object has no attribute 'compute_dtype'
after attempting to load the model as follows:
ludwig_model = LudwigModel(config)
ludwig_model.model = LudwigModel.create_model(config)
ludwig_model.load_weights("results/experiment_run_0/model")
To Reproduce
Steps to reproduce the behavior:
- Train a single PEFT model
- Attempt to retrain again but now loading the already trained model as above
- See error
Config:
{
"model_type": "llm",
"base_model": "{{base_model}}",
"generation": {"temperature": 0.1},
"quantization": {"bits": 4},
"adapter": {"type": "lora"},
"prompt": {
"template": "blas blash"
},
"input_features": [
{
"name": "context_and_intent",
"type": "text"
}
],
"output_features": [
{
"name": "action",
"type": "text",
"preprocessing": {
"fallback_label": "unsure"
},
"decoder": {
"type": "text_extractor",
"match": {
"unsure": {
"type": "contains",
"value": "unsure"
},
"cat1": {
"type": "contains",
"value": "cat1"
}
}
}
}
],
"preprocessing": {
"split": {
"type": "random",
"probabilities": [
0.95,
0,
0.05
]
}
},
"trainer": {
"type": "finetune",
"epochs": 13,
"early_stop": -1,
"optimizer": {
"type": "paged_adam"
},
"weight_decay": 0.1,
"batch_size": 1,
"learning_rate": 0.0002,
"eval_batch_size": 2,
"learning_rate_scheduler": {
"decay": "cosine",
"warmup_fraction": 0.03
},
"gradient_accumulation_steps": 16,
"enable_gradient_checkpointing": true
}
}
Expected behavior
The training should resume without failure. Thanks to @geoffreyangus for helping me find a workaround by adding the previous training's weights to adapter.pretrained_adapter_weights
in the config file.
Environment (please complete the following information):
- OS: Ubuntu 20.04
- Version: Cuda 12.1
- Python version: 3.8.10
- bitsandbytes 0.40.2
- ludwig 0.8.6
- peft 0.7.0
- transformers 4.35.2