markredito / fine-tune-generativeAI

A collection of practice notebooks to learn fine-tuning, parameter efficient fine-tuning(PEFT), reinforcement learning via PPO.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Repository Overview

This repository contains Jupyter notebooks that detail advanced techniques in fine-tuning language models for specific tasks, including dialogue summarization and generating less toxic summaries.

1. Fine-Tuning FLAN T5 with Reinforcement Learning (PPO) and PEFT to Generate Less Toxic Summaries

This notebook showcases the process of refining the FLAN T5 model to reduce toxicity in generated summaries. Key highlights include:

  • Introduction to the FLAN T5 Model: An overview of the FLAN T5 model and the significance of adapters.
  • Reinforcement Learning (RL): A deep dive into RL, focusing on the Proximal Policy Optimization (PPO) algorithm, policies, and reward functions.
  • Reward Modeling: Using a RoBERTa-based hate speech model from Meta AI as a reward model to guide the detoxification process.
  • PEFT (Parameter Efficient Fine-Tuning): An efficient fine-tuning technique that offers the benefits of full fine-tuning without the extensive computational costs.
  • Model Evaluations: Assessing the model's performance both qualitatively (human evaluations) and quantitatively (using the ROGUE metric).

2. Fine-tuning Generative AI Model for Dialogue Summarization

This notebook serves as a guide to fine-tuning generative AI models for dialogue summarization. Major takeaways include:

  • Dialogue Summarization: Understanding the importance and challenges of summarizing dialogues using AI.
  • Generative AI for Summarization: The role of generative models in capturing the essence of dialogues.
  • Fine-tuning Process: Step-by-step instructions and insights into refining generative models to produce better dialogue summaries.

About

A collection of practice notebooks to learn fine-tuning, parameter efficient fine-tuning(PEFT), reinforcement learning via PPO.


Languages

Language:Jupyter Notebook 100.0%