Harry Wang's starred repositories
alignment-handbook
Robust recipes to align language models with human and AI preferences
promptbench
A unified evaluation framework for large language models
alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
self-rewarding-lm-pytorch
Implementation of the training framework proposed in the paper "Self-Rewarding Language Models", from Meta AI
reward-bench
RewardBench: the first evaluation tool for reward models.
Finetune_LLAMA
An easy-to-understand guide to fine-tuning LLaMA.
Stable-Alignment
Multi-agent social simulation + an efficient, effective, and stable alternative to RLHF. Code for the paper "Training Socially Aligned Language Models in Simulated Human Society".
LLM-Agent-Paper-Digest
Papers related to LLM agents published at top conferences
Visual-Adversarial-Examples-Jailbreak-Large-Language-Models
Repository for the paper "Visual Adversarial Examples Jailbreak Large Language Models" (AAAI 2024, Oral)
DA-in-visualRL
Collection of papers and resources for data augmentation (DA) in visual reinforcement learning (RL).
LLM-Extrapolation
Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"
reid-strong-baseline
Bag of Tricks and A Strong Baseline for Deep Person Re-identification
segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.