Yuancheng Xu's starred repositories
alignment-handbook
Robust recipes to align language models with human and AI preferences
llama2-fine-tune
Scripts for fine-tuning Llama2 via SFT and DPO.
rewardedsoups
Official implementation of Rewarded Soups
awesome-llm-human-preference-datasets
A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval.
RLHF-Reward-Modeling
Recipes for training reward models for RLHF.
lm-evaluation-harness
A framework for few-shot evaluation of language models.
language-model-arithmetic
Controlled Text Generation via Language Model Arithmetic
reward-bench
RewardBench: the first evaluation tool for reward models.
awesome-RLAIF
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
UltraFeedback
A large-scale, fine-grained, diverse preference dataset (and models).
Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
curiosity_redteam
Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizXgXU)
awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
chain-of-hindsight
Chain-of-Hindsight, a scalable RLHF method
LLMAgentPapers
Must-read Papers on LLM Agents.
LLM-Agents-Papers
A repo listing papers related to LLM-based agents
VLM-Poison.github.io
Project Website for the paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models"
VLM-Poisoning
Code for the paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models"
Academic-project-page-template
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/