pickxiguapi / Clean-Offline-RLHF

Offline RLHF codebase for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR 2024)

Home Page: https://uni-rlhf.github.io/
