Yuancheng Xu's starred repositories
alignment-handbook
Robust recipes to align language models with human and AI preferences
llama2-fine-tune
Scripts for fine-tuning Llama2 via SFT and DPO.
rewardedsoups
Official implementation of Rewarded Soups
awesome-llm-human-preference-datasets
A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval.
RLHF-Reward-Modeling
Recipes for training reward models for RLHF.
lm-evaluation-harness
A framework for few-shot evaluation of language models.
language-model-arithmetic
Controlled Text Generation via Language Model Arithmetic
reward-bench
RewardBench: the first evaluation tool for reward models.
awesome-RLAIF
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
UltraFeedback
A large-scale, fine-grained, diverse preference dataset (and models).
Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
curiosity_redteam
Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizXgXU)
awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
chain-of-hindsight
Chain-of-Hindsight, a scalable RLHF method
LLMAgentPapers
Must-read Papers on LLM Agents.
LLM-Agents-Papers
A repo listing papers related to LLM-based agents
VLM-Poison.github.io
Project Website for the paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models"
VLM-Poisoning
Code for the paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models"
Academic-project-page-template
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/