MaximeRobeyns / llm_finetuner

8-bit quantized language models with LoRA adapters

Home Page: https://maximerobeyns.github.io/llm_finetuner/


LLM Fine Tuner

Fine-tune multiple large language models in low-memory environments.

This repository provides wrappers around LLMs for 8-bit quantization and memory-efficient fine-tuning with LoRA adapters.

(Logo generated by Stable Diffusion.)
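
For intuition, here is a minimal sketch of the LoRA idea (illustrative only, not finetuna's actual implementation): the frozen base weight W is augmented with a low-rank update scaled by alpha / r, and only the two small matrices A and B receive gradients.

import torch
import torch.nn as nn

class LoRALinearSketch(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank update."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # base weights stay frozen
        # A starts small and B at zero, so training begins from the base model
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # base(x) + (alpha / r) * B A x
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)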

Install

Prerequisite: PyTorch with CUDA support (11.3 recommended, although other versions up to 11.7 will work). If using conda, run conda install -c conda-forge cudatoolkit=11.7.
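
Before installing, you can sanity-check your environment with a few lines of standard PyTorch:

import torch
print(torch.__version__)          # installed PyTorch version
print(torch.version.cuda)         # CUDA version PyTorch was built against (e.g. 11.7)
print(torch.cuda.is_available())  # True if a usable GPU is visible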

Install with

make install

If you run into issues, see the troubleshooting guide.

Example

import torch as t
import transformers
import finetuna as ft
import bitsandbytes as bnb

model_name = 'facebook/opt-125m'
base_model = transformers.AutoModelForCausalLM.from_pretrained(model_name)

# If memory constraints require it, you can manually pre-quantize a model:
ft.prepare_base_model(base_model)

# Create new finetuned models using either the base or quantized model
model_1 = ft.new_finetuned(base_model)

# Can specify granular parameters, if required
model_2 = ft.new_finetuned(
    base_model,
    adapt_layers = {'embed_tokens', 'embed_positions', 'q_proj', 'v_proj'},
    embedding_config=ft.EmbeddingAdapterConfig(r=4, alpha=1),
    linear_config={
        'q_proj': ft.LinearAdapterConfig(r=8, alpha=1, dropout=0.0, bias=False),
        'v_proj': ft.LinearAdapterConfig(r=4, alpha=1, dropout=0.1, bias=True),
    },
)
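
# (Note: in standard LoRA, `r` is the rank of the low-rank update and
# `alpha` scales its contribution by alpha / r, so a smaller `r` means
# fewer trainable parameters at some cost to adapter capacity.)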

# Be sure to use the bitsandbytes optimisers
opt = bnb.optim.AdamW(model_1.parameters())

# Fine-tune as usual
with t.cuda.amp.autocast():
    opt.zero_grad()
    loss = mse_loss(model_1(prompt), target)  # pseudo-notation
    loss.backward()
    opt.step()


# NOTE: saving not yet implemented:

# Either save the complete state, like a normal PyTorch model
t.save(model_1.state_dict(), "/save/path.pt")

# Or save only the changed state to reload from base model
t.save(ft.state_dict(model_1), "/save/path_finetuned.pt")

# Load:
model_2.load_state_dict("/save/path.pt")

# Load only adapter state with strict=False
model_2.load_state_dict("/save/path_finetuned.pt", strict=False)

Usage

Please see the usage guide in the documentation for detailed instructions.
