sweetice

Johnny He's starred repositories

llama

Inference code for Llama models

Language: Python | License: NOASSERTION | 55095 519 949

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | License: Apache-2.0 | 38487 383 1639

autogen

A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap

Language: Jupyter Notebook | License: CC-BY-4.0 | 29708 360 1560

LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Language: Python | License: GPL-3.0 | 5661 78 142

llama-dl

High-speed download of LLaMA, Facebook's 65B parameter GPT model

Language: Shell | License: GPL-3.0 | 4165 68 15

Conference-Acceptance-Rate

Acceptance rates for the major AI conferences

Language: Jupyter Notebook | License: MIT | 4064 127 28

Awesome-Pruning

A curated list of neural network pruning resources.

2304 87 28

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Language: Python | License: Apache-2.0 | 1763 21 179

bsuite

bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent

Language: Python | License: Apache-2.0 | 1492 61 31

llama2-chatbot

LLaMA v2 Chatbot

Language: Python | 1379 18 14

alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Language: Jupyter Notebook | License: Apache-2.0 | 1379 7 136

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python | License: MIT | 1053 21 31

pytorch_sac

PyTorch implementation of Soft Actor-Critic (SAC)

Language: Jupyter Notebook | License: MIT | 486 6 7

Awesome-LLM-System-Papers

451 13 1

realworldrl_suite

Real-World RL Benchmark Suite

Language: Python | License: Apache-2.0 | 342 14 4

tdmpc

Code for "Temporal Difference Learning for Model Predictive Control"

Language: Python | License: MIT | 316 6 18

Reinforcement-Learning-Papers

Related papers for reinforcement learning, including classic papers and latest papers in top conferences

License: MIT | 270 150

controlvideo

Official implementation for "ControlVideo: Adding Conditional Control for One Shot Text-to-Video Editing"

Language: Python | License: Apache-2.0 | 216 18 15

voltron-robotics

Voltron: Language-Driven Representation Learning for Robotics

Language: Python | License: MIT | 195 2 13

AAGPT

AAGPT is another experimental open-source application showcasing the capabilities of large language models, such as GPT-3.5 and GPT-4.

Language: Python | License: MIT | 154 180

Plan4MC

Reinforcement learning and planning for Minecraft.

Language: Python | License: MIT | 147 4 4

EcoAssistant

EcoAssistant: using LLM assistant more affordably and accurately

Language: Python | License: MIT | 127 1 1

TD7

Author's PyTorch implementation of TD7 for online and offline RL

Language: Python | License: MIT | 104 4 5

RLPHF

Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging

Language: Python | 83 1 2

DA-in-visualRL

Collection of papers and resources for data augmentation (DA) in visual reinforcement learning (RL).

68 30

generalized_dt

Generalized Decision Transformer for Offline Hindsight Information Matching (ICLR2022)

Language: Python | 6404

iclr2024-scores

Language: Python | License: MIT | 52 2 3

tqc

Implementation of Truncated Quantile Critics method for continuous reinforcement learning.

Language: Python | License: NOASSERTION | 18 30

reward-surfaces

Language: Python | License: MIT | 15 2 2

xihuai18.github.io

Language: HTML | License: MIT | 3 20