Harris Wang's starred repositories
Visual-Adversarial-Examples-Jailbreak-Large-Language-Models
Repository for the Paper (AAAI 2024, Oral) --- Visual Adversarial Examples Jailbreak Large Language Models
reward-bench
RewardBench: the first evaluation tool for reward models.
alignment-handbook
Robust recipes to align language models with human and AI preferences
LLM-Extrapolation
Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"
self-rewarding-lm-pytorch
Implementation of the training framework proposed in "Self-Rewarding Language Models", from Meta AI
reid-strong-baseline
Bag of Tricks and A Strong Baseline for Deep Person Re-identification
segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Finetune_LLAMA
An easy-to-understand guide to fine-tuning LLaMA.
Stable-Alignment
Multi-agent social simulation + an efficient, effective, and stable alternative to RLHF. Code for the paper "Training Socially Aligned Language Models in Simulated Human Society".
LLM-Agent-Paper-Digest
Papers related to LLM agents published at top conferences
promptbench
A unified evaluation framework for large language models
DA-in-visualRL
Collection of papers and resources for data augmentation (DA) in visual reinforcement learning (RL).
DI-adventure
Decision Intelligence Adventure for Beginners