PKU-Alignment

Loves Sharing and Open-Source, Making AI Safer.

Location: China

PKU-Alignment's repositories

safe-rlhf

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Language: Python | License: Apache-2.0 | Stargazers: 1260 | Watchers: 17 | Issues: 82
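
At the heart of Safe RLHF is a Lagrangian relaxation: the policy is trained against both a reward model and a cost model, with a learned multiplier that rises whenever expected cost exceeds its budget. Below is a minimal PyTorch sketch of that multiplier machinery; the names (`policy_advantage`, `update_lambda`, `cost_budget`) are illustrative, not the repository's actual API.

```python
import torch

# Illustrative sketch of the PPO-Lagrangian idea behind Safe RLHF.
# All names are hypothetical; see the repository for the real training code.
log_lambda = torch.zeros(1, requires_grad=True)    # log-parameterized multiplier
lambda_optimizer = torch.optim.SGD([log_lambda], lr=1e-2)
cost_budget = 0.0                                  # d in the constraint E[cost] <= d

def policy_advantage(reward_adv, cost_adv):
    # Combine reward and cost advantages for the PPO surrogate; dividing by
    # (1 + lambda) keeps the advantage scale roughly constant as lambda grows.
    lam = log_lambda.exp().detach()
    return (reward_adv - lam * cost_adv) / (1.0 + lam)

def update_lambda(mean_episode_cost):
    # Gradient ascent on lambda: it grows while the cost constraint is
    # violated and decays once training is back inside the budget.
    lambda_optimizer.zero_grad()
    loss = -(log_lambda.exp() * (mean_episode_cost - cost_budget)).sum()
    loss.backward()
    lambda_optimizer.step()
```

In a full pipeline, `policy_advantage` would feed the clipped PPO loss while `update_lambda` runs once per iteration on the batch's mean cost.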

omnisafe

OmniSafe is an infrastructural framework for accelerating SafeRL research.

Language: Python | License: Apache-2.0 | Stargazers: 882 | Watchers: 38 | Issues: 99
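
OmniSafe wraps algorithm selection, training, and evaluation behind a single Agent entry point. A quick-start sketch in the spirit of its README; the algorithm name `PPOLag` and task ID `SafetyPointGoal1-v0` assume a recent release and should be checked against your installed version:

```python
import omnisafe

# Train a Lagrangian PPO agent on a Safety-Gymnasium task, then evaluate it.
env_id = 'SafetyPointGoal1-v0'
agent = omnisafe.Agent('PPOLag', env_id)
agent.learn()
agent.evaluate(num_episodes=1)
```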

safety-gymnasium

NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

Language: Python | License: Apache-2.0 | Stargazers: 352 | Watchers: 9 | Issues: 24
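
Safety-Gymnasium keeps the familiar Gymnasium interface but returns a separate safety cost from every step. A minimal random-agent loop, assuming the `SafetyPointGoal1-v0` task ID:

```python
import safety_gymnasium

# Note the extra `cost` term in the step tuple, which plain Gymnasium
# environments do not return.
env = safety_gymnasium.make('SafetyPointGoal1-v0')
obs, info = env.reset(seed=0)
for _ in range(1000):
    action = env.action_space.sample()
    obs, reward, cost, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```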

Safe-Policy-Optimization

NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms

Language: Python | License: Apache-2.0 | Stargazers: 310 | Watchers: 7 | Issues: 10
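
The algorithms collected here all optimize some variant of the standard constrained-MDP objective: maximize discounted return while keeping discounted cost below a threshold d (this is the textbook formulation, not notation lifted from the repository):

```latex
\max_{\pi} \; \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r(s_t, a_t)\right]
\quad \text{s.t.} \quad
\mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} c(s_t, a_t)\right] \le d
```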

AlignmentSurvey

AI Alignment: A Comprehensive Survey

beavertails

BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).

Language: Makefile | License: Apache-2.0 | Stargazers: 89 | Watchers: 5 | Issues: 6
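
Assuming the datasets are published on the Hugging Face Hub under the PKU-Alignment organization, they load with the standard `datasets` pattern; the Hub ID below is an assumption, so check the README for the authoritative names and splits:

```python
from datasets import load_dataset

# Assumed Hub ID; verify the exact dataset name and splits in the
# BeaverTails README before relying on this.
dataset = load_dataset('PKU-Alignment/BeaverTails')
print(dataset)  # shows the available splits and their fields
```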

align-anything

Align Anything: Training Any Modality Model with Feedback

Language: Python | License: Apache-2.0 | Stargazers: 47 | Watchers: 0 | Issues: 0

ProAgent

ProAgent: Building Proactive Cooperative Agents with Large Language Models

Language: JavaScript | License: MIT | Stargazers: 45 | Watchers: 9 | Issues: 1

SafeDreamer

ICLR 2024: SafeDreamer: Safe Reinforcement Learning with World Models

Language: Python | License: Apache-2.0 | Stargazers: 35 | Watchers: 3 | Issues: 2

safe-sora

SafeSora is a human preference dataset supporting safety alignment research in text-to-video generation, aimed at improving the helpfulness and harmlessness of large vision models (LVMs).

Language: Python | Stargazers: 22 | Watchers: 3 | Issues: 0
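
Since SafeSora judges each comparison on two axes, downstream code typically keeps the helpfulness and harmlessness preferences separate. A hypothetical record layout as a reading aid; the field names are illustrative and may not match the actual schema:

```python
from dataclasses import dataclass

# Hypothetical shape of one two-dimensional preference record; the real
# SafeSora schema may differ.
@dataclass
class VideoPreferencePair:
    prompt: str               # text-to-video generation prompt
    video_a: str              # identifier of the first generated video
    video_b: str              # identifier of the second generated video
    helpfulness_winner: str   # 'a' or 'b', judged on helpfulness alone
    harmlessness_winner: str  # 'a' or 'b', judged on harmlessness alone
```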

ReDMan

ReDMan is an open-source simulation platform that provides a standardized implementation of safe RL algorithms for Reliable Dexterous Manipulation.

Language: Python | License: Apache-2.0 | Stargazers: 15 | Watchers: 3 | Issues: 0

ProgressGym

Alignment with a millennium of moral progress.

License: MIT | Stargazers: 5 | Watchers: 0 | Issues: 0

llms-resist-alignment

Repo for paper "Language Models Resist Alignment"

Language: Python | Stargazers: 2 | Watchers: 2 | Issues: 0