DrishtiShrrrma / llama-2-7b-chat-hf-dolly-15k-w-bigbench-hard-evaluation

Fine-Tuning Llama-2-7b on Databricks-Dolly-15k Dataset and Evaluating with BigBench-Hard

In this project, we fine-tune the Llama-2-7b-chat-hf model on the databricks-dolly-15k dataset using Google Colab and then evaluate the result on BigBench-Hard.

Model and Dataset Configuration

  • Model: NousResearch/llama-2-7b-chat-hf
  • Dataset: databricks/databricks-dolly-15k
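
As a minimal sketch, the tokenizer and dataset can be pulled from the Hugging Face Hub as shown below (the 4-bit model load itself is shown under Training Parameters). The pad-token handling is a common Llama 2 workaround and an assumption, not something stated in this repo.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

base_model = "NousResearch/llama-2-7b-chat-hf"

# Llama 2 ships without a pad token; reusing EOS is an assumption that makes batching work.
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# The instruction-tuning data lives on the Hugging Face Hub.
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")
print(dataset)
```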

Dataset Overview

The databricks-dolly-15k dataset, hosted on Hugging Face, consists of over 15,000 records generated by Databricks employees across various behavioral categories. The dataset, available under the Creative Commons Attribution-ShareAlike 3.0 Unported License, is primarily intended for training large language models and synthetic data generation.
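
Each record carries instruction, context, response, and category fields. The helper below sketches one way to flatten a record into a single training prompt; the template is an illustrative assumption, not necessarily the exact one used in the notebook.

```python
# Illustrative prompt template (an assumption, not the notebook's exact format).
def format_record(example):
    context = f"\n\n### Context:\n{example['context']}" if example["context"] else ""
    return {
        "text": f"### Instruction:\n{example['instruction']}{context}"
                f"\n\n### Response:\n{example['response']}"
    }

formatted = dataset.map(format_record)   # `dataset` from the loading sketch above
print(formatted[0]["text"])
```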

Training Parameters

  • LoRA attention dimension: 64
  • Alpha parameter for LoRA scaling: 16
  • Dropout probability for LoRA layers: 0.1
  • 4-bit precision base model loading: True
  • Quantization type: nf4
  • Nested quantization: False
  • Training epochs: 1
  • Training batch size per GPU: 16
  • Evaluation batch size per GPU: 16
  • Optimizer: paged_adamw_32bit
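
The parameters above map onto a QLoRA training setup roughly as sketched below. Anything not listed above (output directory, max_seq_length, and so on) is an assumption, and the SFTTrainer call follows the trl releases from mid-2023.

```python
# Sketch of the QLoRA setup implied by the hyperparameters above; unlisted
# arguments are assumptions, not values taken from the notebook.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig
from trl import SFTTrainer

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # 4-bit precision base model loading
    bnb_4bit_quant_type="nf4",            # quantization type
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=False,      # nested quantization: False
)

model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

peft_config = LoraConfig(
    r=64,               # LoRA attention dimension
    lora_alpha=16,      # alpha parameter for LoRA scaling
    lora_dropout=0.1,   # dropout probability for LoRA layers
    bias="none",
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./results",               # assumption
    num_train_epochs=1,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    optim="paged_adamw_32bit",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=formatted,              # formatted dataset from the sketch above
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=512,                   # assumption
    tokenizer=tokenizer,                  # tokenizer from the loading sketch above
    args=training_args,
)
trainer.train()
```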

Training Results

(Training results screenshot from the notebook.)

Post-Training Analysis

The fine-tuned model was queried with general questions as well as questions from the BigBench-Hard dataset. It handled the general questions well, but its answers to BigBench-Hard questions sometimes diverged from the expected responses.
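
A quick way to query the fine-tuned model is sketched below, reusing the tokenizer and trainer from the sketches above; the prompt is an illustrative stand-in, not one of the actual evaluation questions.

```python
# Query the LoRA-adapted model produced by the trainer above; the prompt is an
# illustrative example, not an actual BigBench-Hard question.
prompt = "### Instruction:\nSort these words alphabetically: pear, apple, mango\n\n### Response:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(trainer.model.device)
output_ids = trainer.model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```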

Conclusion

The Llama-2-7b-chat model was successfully fine-tuned on the databricks-dolly-15k dataset. Although its responses still show some limitations, particularly on BigBench-Hard-style reasoning questions, the fine-tuned model is a promising candidate for integration into frameworks such as LangChain as an alternative to the OpenAI API. Further work is needed to improve its accuracy.
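
As a hedged illustration of the LangChain integration mentioned above, the fine-tuned weights could be wrapped in LangChain's HuggingFacePipeline LLM. The local model path is hypothetical, and the import path follows the classic langchain 0.0.x package (newer releases moved it into langchain_community).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain.llms import HuggingFacePipeline

model_path = "./llama-2-7b-dolly-merged"   # hypothetical path to the merged fine-tuned weights

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

hf_pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=128)
llm = HuggingFacePipeline(pipeline=hf_pipe)

print(llm("### Instruction:\nSummarize what LoRA fine-tuning does.\n\n### Response:\n"))
```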

Languages

Language: Jupyter Notebook 100.0%