Xu_Ruijie (RikkiXu)

RikkiXu

Geek Repo

Company:ShanghaiTech

Location:Shanghai China

Github PK Tool:Github PK Tool

Xu_Ruijie's starred repositories

NCD_PC

Dual-level Adaptive Self-Labeling for Novel Class Discovery in Point Cloud Segmentation (ECCV2024)

Stargazers:3Issues:0Issues:0

RLCD

Reproduction of "RLCD Reinforcement Learning from Contrast Distillation for Language Model Alignment

Language:PythonLicense:MITStargazers:60Issues:0Issues:0

self-rewarding-lm-pytorch

Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI

Language:PythonLicense:MITStargazers:1273Issues:0Issues:0

DeepSeek-LLM

DeepSeek LLM: Let there be answers

Language:MakefileLicense:MITStargazers:1348Issues:0Issues:0

ML-Papers-of-the-Week

🔥Highlighting the top ML papers every week.

Stargazers:9458Issues:0Issues:0

arena-hard-auto

Arena-Hard-Auto: An automatic LLM benchmark.

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:347Issues:0Issues:0

SELM

The official implementation of Self-Exploring Language Models (SELM)

Language:PythonStargazers:54Issues:0Issues:0

SimPO

SimPO: Simple Preference Optimization with a Reference-Free Reward

Language:PythonLicense:MITStargazers:550Issues:0Issues:0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language:PythonLicense:Apache-2.0Stargazers:23568Issues:0Issues:0

hh-rlhf

Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"

License:MITStargazers:1520Issues:0Issues:0

DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

License:MITStargazers:3120Issues:0Issues:0
Language:PythonLicense:MITStargazers:11Issues:0Issues:0

relative-preference-optimization

Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts

Language:PythonLicense:Apache-2.0Stargazers:13Issues:0Issues:0

llama3

The official Meta Llama 3 GitHub site

Language:PythonLicense:NOASSERTIONStargazers:23954Issues:0Issues:0

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language:PythonLicense:Apache-2.0Stargazers:4264Issues:0Issues:0
Language:PythonLicense:MITStargazers:89Issues:0Issues:0

DAMO-ConvAI

DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.

Language:PythonLicense:MITStargazers:1093Issues:0Issues:0

text-generation-inference

Large Language Model Text Generation Inference

Language:PythonLicense:Apache-2.0Stargazers:8465Issues:0Issues:0

Self-Contrast

Extensive Self-Contrast Enables Feedback-Free Language Model Alignment

Language:PythonLicense:Apache-2.0Stargazers:16Issues:0Issues:0

chain-of-thought-hub

Benchmarking large language models' complex reasoning ability with chain-of-thought prompting

Language:Jupyter NotebookLicense:MITStargazers:2452Issues:0Issues:0

test

Measuring Massive Multitask Language Understanding | ICLR 2021

Language:PythonLicense:MITStargazers:1084Issues:0Issues:0

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language:PythonLicense:MITStargazers:5945Issues:0Issues:0

lm-human-preferences

Code for the paper Fine-Tuning Language Models from Human Preferences

Language:PythonLicense:MITStargazers:1180Issues:0Issues:0

alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:1341Issues:0Issues:0

llama-trl

LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA

Language:PythonLicense:Apache-2.0Stargazers:169Issues:0Issues:0

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language:PythonLicense:Apache-2.0Stargazers:27646Issues:0Issues:0

trl

Train transformer language models with reinforcement learning.

Language:PythonLicense:Apache-2.0Stargazers:8868Issues:0Issues:0

alpaca_farm

A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.

Language:PythonLicense:Apache-2.0Stargazers:743Issues:0Issues:0

awesome-instruction-dataset

A collection of open-source dataset to train instruction-following LLMs (ChatGPT,LLaMA,Alpaca)

Stargazers:1045Issues:0Issues:0

UltraChat

Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)

Language:PythonLicense:MITStargazers:2183Issues:0Issues:0