Hamzenium / Reinforcement-learning-LLM

Fine-tunes Google's FLAN-T5 model to generate non-toxic dialogue summaries, using Hugging Face's Transformers library. The project sets up a toxicity detection model, uses it to evaluate and detoxify generated summaries, and trains the summarizer with a PPO-based training loop.
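
As a rough illustration of how a toxicity detector can drive a PPO-based loop, the sketch below turns classifier logits into a scalar reward and evaluates the standard PPO clipped surrogate for a single sample. It is not taken from this repository: the function names, the two-logit `[non_toxic, toxic]` layout, and the default `clip_eps=0.2` are all assumptions made for the example, and it uses only the Python standard library rather than the actual model or trainer.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toxicity_reward(logits, non_toxic_index=0):
    """Map classifier logits for one summary to (reward, toxic_probability).

    Assumption: the classifier emits two logits ordered [non_toxic, toxic].
    The raw non-toxic logit is used as the reward, so less toxic summaries
    score higher; the toxic probability is returned for evaluation.
    """
    probs = softmax(logits)
    reward = logits[non_toxic_index]
    toxic_prob = 1.0 - probs[non_toxic_index]
    return reward, toxic_prob


def ppo_clipped_objective(logp_new, logp_old, advantage, clip_eps=0.2):
    """PPO clipped surrogate objective for one sample.

    ratio = pi_new(a|s) / pi_old(a|s), computed from log-probabilities.
    The objective is the minimum of the unclipped and clipped terms, which
    removes the incentive to move the policy far from the old one.
    """
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # Hypothetical logits for one generated summary.
    reward, toxic_prob = toxicity_reward([2.0, -1.0])
    print(f"reward={reward:.2f} toxic_prob={toxic_prob:.4f}")
    # The policy doubled the probability of a positively rewarded sample.
    print(ppo_clipped_objective(math.log(2.0), 0.0, advantage=1.0))
```

In a real training loop the reward would typically be computed per generated summary and combined with a KL penalty against the original model, so the fine-tuned summarizer does not drift too far from it while reducing toxicity.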

This repository is not active


Languages

Jupyter Notebook 86.7%, Python 13.3%