PKU-Alignment's repositories
- safety-gymnasium: NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
- Safe-Policy-Optimization: NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms
- AlignmentSurvey: AI Alignment: A Comprehensive Survey
- beavertails: BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs)
- SafeDreamer: ICLR 2024: SafeDreamer: Safe Reinforcement Learning with World Models