ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

Home Page: http://ludwig.ai

You are calling `save_pretrained` to a 4-bit converted model, but your `bitsandbytes` version doesn't support it.

shripadk opened this issue

Describe the bug

I have enabled 4-bit quantization for fine-tuning mistralai/Mistral-7B-v0.1. Ludwig 0.10.1 appears to depend on bitsandbytes < 0.41.0, and when I run the trainer I get the following warning:

You are calling `save_pretrained` to a 4-bit converted model, but your `bitsandbytes` version doesn't support it. 
If you want to save 4-bit models, make sure to have `bitsandbytes>=0.41.3` installed.
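
To confirm which bitsandbytes build is actually installed in the failing environment, here is a quick diagnostic sketch (not part of the original report; it assumes only that bitsandbytes exposes __version__, which current releases do):

import bitsandbytes as bnb
from packaging import version

# 0.41.3 is the threshold named in the warning above
required = version.parse("0.41.3")
installed = version.parse(bnb.__version__)
print(f"bitsandbytes {installed}; supports 4-bit save_pretrained: {installed >= required}")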

To Reproduce
Steps to reproduce the behavior:

  1. Install Ludwig
pip install ludwig[full]
  2. Config file (model.yaml):
model_type: llm
base_model: mistralai/Mistral-7B-v0.1

quantization:
  bits: 4

adapter:
  type: lora

prompt:
  template: |
    ### Instruction:
    {instruction}

    ### Input:
    {input}

    ### Response:

input_features:
  - name: prompt
    type: text

output_features:
  - name: output
    type: text

generation:
  temperature: 0.1

trainer:
  type: finetune
  epochs: 3
  optimizer:
    type: paged_adam
  batch_size: 1
  eval_steps: 100
  learning_rate: 0.0002
  eval_batch_size: 2
  steps_per_checkpoint: 1000
  learning_rate_scheduler:
    decay: cosine
    warmup_fraction: 0.03
  gradient_accumulation_steps: 16
  enable_gradient_checkpointing: true

preprocessing:
  sample_ratio: 0.1
  3. Train the model:
ludwig train --config model.yaml --dataset "ludwig://alpaca"
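
For reference, the same run can be reproduced through Ludwig's Python API; this is a sketch of the equivalent programmatic invocation, assuming model.yaml is in the working directory and that the ludwig:// dataset URI is accepted by the API as it is by the CLI:

from ludwig.api import LudwigModel

# load the same config file used with the CLI above
model = LudwigModel(config="model.yaml")

# mirrors: ludwig train --config model.yaml --dataset "ludwig://alpaca"
train_stats, preprocessed_data, output_directory = model.train(dataset="ludwig://alpaca")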

Expected behavior
The warning about the bitsandbytes version not supporting save_pretrained for 4-bit quantized models should not appear.

Environment (please complete the following information):

  • OS: Linux
  • Version: 6.7.6-arch1-1
  • Python: 3.10.8
  • Ludwig: v0.10.1
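
For completeness, the versions above can be collected in one step with the standard library; a small sketch (the extra package names, transformers and peft, are assumptions based on the stack involved in this report):

import platform
import importlib.metadata as md

print("Python:", platform.python_version())
for pkg in ("ludwig", "bitsandbytes", "transformers", "peft"):
    try:
        print(f"{pkg}: {md.version(pkg)}")
    except md.PackageNotFoundError:
        print(f"{pkg}: not installed")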

@alexsherstinsky

Here is the notebook showing the run: https://colab.research.google.com/drive/1kmZhQKBzpHBJRJvvp9PEdPEUMfMu6dh7?usp=sharing. The first run asked for a RESTART; after doing that and running all the cells, the output is in the linked notebook. Just FYI: the model's output is "","", but that is most likely an issue with the base model!

@shripadk Are you still seeing this issue? A new version of Ludwig will be released next week (you may wish to try again then). Please keep an eye on the release announcement in our Discord. Thank you!

@alexsherstinsky thanks for the heads up. I'll definitely take a look at it and get back to you on this. Will surely keep an eye on the release. Thanks again 🎉