There is 1 repository under the reward-model topic.
[ICLR 2024] SemiReward: A General Reward Model for Semi-supervised Learning
This repository contains the lab work for the Coursera course "Generative AI with Large Language Models".
Developing an LLM response-ranking reward model using RLHF, except the feedback comes from GPT-3.5 instead of a human.
Fine-tuning FLAN-T5 with PPO and PEFT to generate less toxic text summaries. This notebook leverages Meta AI's hate speech reward model and utilizes RLHF techniques for improved safety.
Proof-of-concept library built on TextRL for easy training and usage of models fine-tuned with RLHF, a reward model, and PPO.