RikkiXu

Xu_Ruijie's starred repositories

NCD_PC

Dual-level Adaptive Self-Labeling for Novel Class Discovery in Point Cloud Segmentation (ECCV2024)

300

RLCD

Reproduction of "RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment"

Language: Python · MIT · 6000

self-rewarding-lm-pytorch

Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI

Language: Python · MIT · 127300

DeepSeek-LLM

DeepSeek LLM: Let there be answers

Language: Makefile · MIT · 134800

ML-Papers-of-the-Week

🔥Highlighting the top ML papers every week.

945800

arena-hard-auto

Arena-Hard-Auto: An automatic LLM benchmark.

Language: Jupyter Notebook · Apache-2.0 · 34700

SELM

The official implementation of Self-Exploring Language Models (SELM)

Language: Python · 5400

SimPO

SimPO: Simple Preference Optimization with a Reference-Free Reward

Language: Python · MIT · 55000

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · Apache-2.0 · 2356800

hh-rlhf

Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"

MIT · 152000

DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

MIT · 312000

dove

Language: Python · MIT · 1100

relative-preference-optimization

Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts

Language: Python · Apache-2.0 · 1300

llama3

The official Meta Llama 3 GitHub site

Language: Python · NOASSERTION · 2395400

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python · Apache-2.0 · 426400

train-with-fsdp

Language: Python · MIT · 8900

DAMO-ConvAI

DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.

Language: Python · MIT · 109300

text-generation-inference

Large Language Model Text Generation Inference

Language: Python · Apache-2.0 · 846500

Self-Contrast

Extensive Self-Contrast Enables Feedback-Free Language Model Alignment

Language: Python · Apache-2.0 · 1600

chain-of-thought-hub

Benchmarking large language models' complex reasoning ability with chain-of-thought prompting

Language: Jupyter Notebook · MIT · 245200

test

Measuring Massive Multitask Language Understanding | ICLR 2021

Language: Python · MIT · 108400

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python · MIT · 594500

lm-human-preferences

Code for the paper Fine-Tuning Language Models from Human Preferences

Language: Python · MIT · 118000

alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Language: Jupyter Notebook · Apache-2.0 · 134100

llama-trl

LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA

Language: Python · Apache-2.0 · 16900

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python · Apache-2.0 · 2764600

trl

Train transformer language models with reinforcement learning.

Language: Python · Apache-2.0 · 886800

alpaca_farm

A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.

Language: Python · Apache-2.0 · 74300

awesome-instruction-dataset

A collection of open-source datasets to train instruction-following LLMs (ChatGPT, LLaMA, Alpaca)

104500

UltraChat

Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)

Language: Python · MIT · 218300