Huizhuo Angela Yuan's repositories

SPPO

The official implementation of Self-Play Preference Optimization (SPPO)

Language:PythonLicense:Apache-2.0Stargazers:1Issues:0Issues:0

alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:0Issues:0Issues:0
Language:HTMLStargazers:0Issues:0Issues:0
Language:PythonStargazers:0Issues:0Issues:0

SELM

The official implementation of Self-Exploring Language Models (SELM)

Language:PythonStargazers:0Issues:0Issues:0

SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

trl

Train transformer language models with reinforcement learning.

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

v202

Proceedings of ICML 2023

Language:TeXStargazers:0Issues:0Issues:0