This repository contains scripts and instructions for fine-tuning the Llama-2-7b model on the Kaludi Customer-Support-Responses dataset. The goal is to create an automated customer support agent capable of generating relevant and coherent responses to customer queries.
For the easiest and quickest way to test the fine-tuned model, use the provided Google Drive link, which contains all the necessary files and Colab notebooks. Simply run the run_agent.ipynb notebook in Google Colab; it handles all the required installations and configuration for you.
By using the Colab notebook, you can avoid manual installation of dependencies and start testing the fine-tuned model immediately.
- Introduction
- Dataset
- Model
- Training
- Evaluation
- Usage
- Requirements
- Results
- Contributing
- License
- Additional Resources
This project fine-tunes the Llama-2-7b model using the Kaludi Customer-Support-Responses dataset. The objective is to train a model that can generate appropriate customer support responses.
The dataset used for training consists of customer queries and corresponding support responses. It can be loaded from Hugging Face:
from datasets import load_dataset
dataset = load_dataset("Kaludi/Customer-Support-Responses")
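Each record pairs a customer query with a support response. To fine-tune a Llama-2 chat model, examples are typically wrapped in its [INST] prompt template before training. A minimal sketch, assuming the dataset exposes query and response columns (check the actual column names after loading):

```python
def format_example(query: str, response: str) -> str:
    # Wrap a query/response pair in the Llama-2 chat prompt template:
    # the instruction goes inside [INST] ... [/INST], the target answer follows.
    return f"<s>[INST] {query} [/INST] {response} </s>"

# Typical use: add a "text" column holding the formatted prompt, e.g.
# dataset = dataset.map(lambda ex: {"text": format_example(ex["query"], ex["response"])})
```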
We use the NousResearch/Llama-2-7b-chat-hf as the base model and fine-tune it using LoRA (Low-Rank Adaptation) techniques to efficiently adjust the model weights.
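Since bitsandbytes appears in the requirements, the base model is most likely loaded in 4-bit precision before the LoRA adapters are attached. A hedged sketch of that loading step (the exact quantization settings in train.py may differ):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumed 4-bit quantization settings; train.py may use different values.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-chat-hf")
tokenizer.pad_token = tokenizer.eos_token  # Llama-2 defines no pad token by default
```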
The training script train.py performs the following steps:
- Load the dataset.
- Format the data for Llama-2.
- Set up the model and tokenizer.
- Configure training parameters.
- Fine-tune the model using the SFTTrainer from the trl library.
To train the model, run:
python train.py
- LoRA parameters:
  - lora_r = 64: The rank of the LoRA update matrices. Higher values may capture more information but require more computation.
  - lora_alpha = 16: Scaling factor for the LoRA layers; adjusts the importance of the LoRA modifications.
  - lora_dropout = 0.1: Dropout rate for the LoRA layers; helps prevent overfitting by randomly dropping units during training.
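With peft, the three parameters above map directly onto a LoraConfig. A sketch, assuming a standard causal-LM setup (bias handling and target modules are defaults/assumptions; adjust to what train.py actually uses):

```python
from peft import LoraConfig

peft_config = LoraConfig(
    r=64,              # rank of the LoRA update matrices
    lora_alpha=16,     # scaling factor for the LoRA layers
    lora_dropout=0.1,  # dropout applied inside the LoRA layers
    bias="none",
    task_type="CAUSAL_LM",
)
```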
- TrainingArguments parameters:
  - num_train_epochs = 5: Number of passes the model makes over the entire training dataset.
  - per_device_train_batch_size = 4: Number of samples processed per device before the model's parameters are updated.
  - gradient_accumulation_steps = 1: Number of steps over which gradients are accumulated before an optimizer update.
  - learning_rate = 2e-4: Step size at each iteration while moving toward a minimum of the loss function.
  - weight_decay = 0.001: Regularization that reduces overfitting by penalizing large weights.
  - lr_scheduler_type = "cosine": Learning rate schedule; cosine decay can help the model converge more smoothly.
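These settings correspond to a transformers TrainingArguments object passed to trl's SFTTrainer. A sketch of how the pieces fit together, assuming a standard SFT setup (the output directory and the "text" dataset column are placeholders, not confirmed names from train.py):

```python
from transformers import TrainingArguments
from trl import SFTTrainer

training_args = TrainingArguments(
    output_dir="./results",  # placeholder path
    num_train_epochs=5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    learning_rate=2e-4,
    weight_decay=0.001,
    lr_scheduler_type="cosine",
)

trainer = SFTTrainer(
    model=model,                # base model (with quantization, if used)
    args=training_args,
    train_dataset=dataset["train"],
    peft_config=peft_config,    # LoRA settings from above
    dataset_text_field="text",  # assumed column holding the formatted prompts
    tokenizer=tokenizer,
)
trainer.train()
```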
The model was trained on Google Colab using a T4 GPU. For better performance, more powerful GPUs such as V100 or A100 could be used, and the batch size could be increased. Gradient checkpointing can be disabled on machines with more memory to speed up training.
Evaluation is performed using evaluate_LLM.ipynb, which calculates the perplexity score of the fine-tuned model:
perplexity = calculate_perplexity(model, tokenizer, validation_texts)
print(f'Perplexity: {perplexity}')
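calculate_perplexity is defined in the notebook; conceptually, perplexity is the exponential of the mean per-token negative log-likelihood over the validation texts. A minimal sketch of that final step (a hypothetical helper, not the notebook's exact code):

```python
import math

def perplexity_from_losses(token_nlls):
    # Perplexity = exp(mean negative log-likelihood per token).
    # token_nlls: per-token cross-entropy losses collected over the validation set.
    mean_nll = sum(token_nlls) / len(token_nlls)
    return math.exp(mean_nll)

# A model that assigns every token probability 1/4 has perplexity ≈ 4:
# perplexity_from_losses([math.log(4)] * 10)
```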
- Perplexity: Measures how well the model predicts the next token in a sequence. Lower perplexity indicates better performance.
You can download the fine-tuned model directly from the Colab link provided in the Google Drive. This ensures that you have all the necessary files to run the model locally if preferred.
You can interact with the fine-tuned model using run_agent.py. This script allows you to input queries and get responses from the model in a loop:
python run_agent.py
Alternatively, you can run the run_agent.ipynb notebook in a Colab or Jupyter environment (tested).
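The query/response loop in run_agent.py can be sketched as below. Here generate_fn stands in for whatever generation call the script actually uses (e.g. a transformers pipeline) and is an assumption, not the script's real API:

```python
def chat_loop(generate_fn, input_fn=input, output_fn=print):
    # Repeatedly read a customer query, stop on "quit",
    # and emit the model's generated response.
    while True:
        query = input_fn("Customer query (or 'quit' to exit): ")
        if query.strip().lower() == "quit":
            break
        output_fn(generate_fn(query))

# Example wiring with a transformers text-generation pipeline (assumed setup):
# from transformers import pipeline
# generate = pipeline("text-generation", model=model, tokenizer=tokenizer)
# chat_loop(lambda q: generate(f"<s>[INST] {q} [/INST]")[0]["generated_text"])
```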
Install the required packages using requirements.txt:
pip install -r requirements.txt
accelerate==0.21.0
peft==0.4.0
bitsandbytes==0.40.2
transformers==4.31.0
trl==0.4.7
torch==2.3.0+cu121
After fine-tuning, the model achieves a perplexity score of less than 100 on the sample validation set, indicating good performance in generating coherent and relevant responses.
Contributions are welcome! Please open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
For more details and to access the trained model and related files, please visit the Google Drive link.