szrlee

Yingru Li's repositories

muzero-cpp

A C++ PyTorch implementation of MuZero

Language: C++ | License: Apache-2.0 | 2 20

Distributed-Multi-Label-Continual-Learning

A distributed training framework for continual and incremental learning on multi-label, multi-class image tasks

Language: Python | 060

GPT-HyperAgent

The official code repo for HyperAgent for neural bandits and GPT-HyperAgent for content moderation.

Language: Python | 000

HyperAgent

The official code repo for the HyperAgent algorithm, published at ICML 2024.

Language: Python | License: MIT | 020

Information_Directed_Sampling

Implementation of Russo and Van Roy's work on Information Directed Sampling (2017)

Language: Python | 020

academic-kickstart

Language: Shell | License: MIT | 030

academic-website

Language: Shell | License: MIT | 000

bror

Language: Python | 030

bsuite

bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent

Language: Python | License: Apache-2.0 | 030

enn

Language: Python | License: Apache-2.0 | 010

Exploration-in-RL

Language: Jupyter Notebook | 020

graphbackup

Code release for Graph Backup: Data Efficient Backup Exploiting Markovian Transitions https://arxiv.org/abs/2205.15824

Language: Python | License: MIT | 010

hustthesis

An Unofficial Thesis Template in LaTeX for Huazhong University of Science and Technology

Language: TeX | License: LPPL-1.3c | 020

HyperFQI

Language: Python | 020

LangevinDQN

Code for the Langevin DQN agent

Language: Jupyter Notebook | License: MIT | 010

LMCTS

Language: Python | 010

logistic_bandit

Logistic Bandit experiments. Official code for the paper "Jointly Efficient and Optimal Algorithms for Logistic Bandits".

Language: Python | 010

model-based-muesli

Muesli implementation based on the MuZero implementation from JimOhman (https://github.com/JimOhman/model-based-rl)

Language: Python | License: MIT | 010

MuZero-Tensor-Batch-MCTS

An idea for implementing MCTS with tensors. This implementation can process a batch of observations on the GPU.

Language: Python | License: MIT | 010

OB2I

Code for "Principled Exploration via Optimistic Bootstrapping and Backward Induction"

Language: Python | 020

offline-rl-neurips.github.io

000

omega

A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in NetHack/MiniHack environments.

Language: Python | License: GPL-3.0 | 020

optimistic-init

Accompanying code for "Optimistic Initialization for Exploration in Continuous Control"

Language: Python | 010

rlberry

An easy-to-use reinforcement learning library for research and education.

Language: Python | License: MIT | 020

rltf

Reinforcement Learning implementations and research prototyping in TensorFlow

Language: Python | License: MIT | 020

sigmazero

Generalizing DeepMind's MuZero algorithm to stochastic environments

000

TabulaRL

Language: Python | License: MIT | 020

ts_tutorial

Language: Jupyter Notebook | License: MIT | 020

ucbmq_code

Language: Python | 020

vae-anomaly-detector

Experiments on unsupervised anomaly detection using a variational autoencoder. The variational autoencoder is implemented in PyTorch.

Language: Python | License: MIT | 000