PKU-Alignment's repositories
- safety-gymnasium: NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
- Safe-Policy-Optimization: NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms
- AlignmentSurvey: AI Alignment: A Comprehensive Survey
- beavertails: BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs)
- SafeDreamer: ICLR 2024: SafeDreamer: Safe Reinforcement Learning with World Models