Johnny He (sweetice)

Location: Tuebingen, Germany

Home Page: sweetice.github.io

Johnny He's repositories

Deep-reinforcement-learning-with-pytorch

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

Language: Python | License: MIT | Stargazers: 3530 | Issues: 34 | Issues: 31

PEER-CVPR23

Authors' implementation of PEER

Language: Python | License: MIT | Stargazers: 8 | Issues: 1 | Issues: 0

ERC-ECML-23

Anonymous code for ICML submission 45

Language: Python | Stargazers: 1 | Issues: 2 | Issues: 0
Language: HTML | License: MIT | Stargazers: 1 | Issues: 2 | Issues: 0

BEER-ICLR2024

The present anonymous repository serves as a guide for reproducing the results of the "BEER" method proposed in our ICLR submission "Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation".

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

dalai_llama

The simplest way to run LLaMA on your local machine

Stargazers: 0 | Issues: 0 | Issues: 0

deep-successor-features-for-transfer

A reusable framework for successor features for transfer in deep reinforcement learning using keras.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

drqv2

DrQ-v2: Improved Data-Augmented Reinforcement Learning

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

ffn_geyang

Public Repo for the paper "Overcoming The Spectral-Bias of Neural Value Approximation"

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

gulf

GULF: GUided Learning through successive Functional gradient optimization (author implementation of DPCNN included)

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

learned-fourier-features

Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

LibMTL

A PyTorch Library for Multi-Task Learning

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

llama

Inference code for LLaMA models

License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

MEPE

Official implementation of MEPE

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

mpo

PyTorch Implementation of the Maximum a Posteriori Policy Optimisation

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 1 | Issues: 0

neural-approx-ss-lfi

Codes for ICLR 21 paper: Neural Approximate Sufficient Statistics for Implicit Models

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0

pderl

Code for "Proximal Distilled Evolutionary Reinforcement Learning", accepted at AAAI 2020

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

RWKV-LM

RWKV is a RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

sweetice.github.io

A beautiful, simple, clean, and responsive Jekyll theme for academics

Language: HTML | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0
Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

TD3_BC

Author's PyTorch implementation of TD3+BC, a simple variant of TD3 for offline RL

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

tqc_pytorch_1epo

Implementation of Truncated Quantile Critics method for continuous reinforcement learning. https://bayesgroup.github.io/tqc/

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

trl

Train transformer language models with reinforcement learning.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

voltron-robotics

Voltron: Language-Driven Representation Learning for Robotics

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0