Johnny He (sweetice)

Location: Tuebingen, Germany

Home Page: sweetice.github.io

Johnny He's starred repositories

llama

Inference code for Llama models

Language: Python | License: NOASSERTION | Stargazers: 55095 | Issues: 519 | Issues: 949

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | License: Apache-2.0 | Stargazers: 38487 | Issues: 383 | Issues: 1639

autogen

A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap

Language: Jupyter Notebook | License: CC-BY-4.0 | Stargazers: 29708 | Issues: 360 | Issues: 1560

LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Language: Python | License: GPL-3.0 | Stargazers: 5661 | Issues: 78 | Issues: 142

llama-dl

High-speed download of LLaMA, Facebook's 65B parameter GPT model

Language: Shell | License: GPL-3.0 | Stargazers: 4165 | Issues: 68 | Issues: 15

Conference-Acceptance-Rate

Acceptance rates for the major AI conferences

Language: Jupyter Notebook | License: MIT | Stargazers: 4064 | Issues: 127 | Issues: 28

Awesome-Pruning

A curated list of neural network pruning resources.

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Language: Python | License: Apache-2.0 | Stargazers: 1763 | Issues: 21 | Issues: 179

bsuite

bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent

Language: Python | License: Apache-2.0 | Stargazers: 1492 | Issues: 61 | Issues: 31

alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1379 | Issues: 7 | Issues: 136

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton

Language: Python | License: MIT | Stargazers: 1053 | Issues: 21 | Issues: 31

pytorch_sac

PyTorch implementation of Soft Actor-Critic (SAC)

Language: Jupyter Notebook | License: MIT | Stargazers: 486 | Issues: 6 | Issues: 7

realworldrl_suite

Real-World RL Benchmark Suite

Language: Python | License: Apache-2.0 | Stargazers: 342 | Issues: 14 | Issues: 4

tdmpc

Code for "Temporal Difference Learning for Model Predictive Control"

Language: Python | License: MIT | Stargazers: 316 | Issues: 6 | Issues: 18

Reinforcement-Learning-Papers

Related papers for reinforcement learning, including classic papers and latest papers in top conferences

License: MIT | Stargazers: 270 | Issues: 15 | Issues: 0

controlvideo

Official implementation for "ControlVideo: Adding Conditional Control for One Shot Text-to-Video Editing"

Language: Python | License: Apache-2.0 | Stargazers: 216 | Issues: 18 | Issues: 15

voltron-robotics

Voltron: Language-Driven Representation Learning for Robotics

Language: Python | License: MIT | Stargazers: 195 | Issues: 2 | Issues: 13

AAGPT

AAGPT is another experimental open-source application showcasing the capabilities of large language models, such as GPT-3.5 and GPT-4.

Language: Python | License: MIT | Stargazers: 154 | Issues: 18 | Issues: 0

Plan4MC

Reinforcement learning and planning for Minecraft.

Language: Python | License: MIT | Stargazers: 147 | Issues: 4 | Issues: 4

EcoAssistant

EcoAssistant: using LLM assistant more affordably and accurately

Language: Python | License: MIT | Stargazers: 127 | Issues: 1 | Issues: 1

TD7

Author's PyTorch implementation of TD7 for online and offline RL

Language: Python | License: MIT | Stargazers: 104 | Issues: 4 | Issues: 5

RLPHF

Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging

DA-in-visualRL

Collection of papers and resources for data augmentation (DA) in visual reinforcement learning (RL).

generalized_dt

Generalized Decision Transformer for Offline Hindsight Information Matching (ICLR2022)

Language: Python | Stargazers: 64 | Issues: 0 | Issues: 4

tqc

Implementation of Truncated Quantile Critics method for continuous reinforcement learning.

Language: Python | License: NOASSERTION | Stargazers: 18 | Issues: 3 | Issues: 0

Language: HTML | License: MIT | Stargazers: 3 | Issues: 2 | Issues: 0