There are 2 repositories under the rlaif topic.
⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers who require high-quality outputs, full data ownership, and overall efficiency.
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
ZYN: Zero-Shot Reward Models with Yes-No Questions
Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
distilled Self-Critique refines the outputs of an LLM with only synthetic data
A curated and updated list of relevant articles and repositories on Reinforcement Learning from AI Feedback (RLAIF)