haanvid

KAIST

Daejeon, Republic of Korea

Haanvid Lee's repositories

DSTC10-SIMMC

Repository (preliminary code) for the DSTC10 SIMMC track.

Language: Python | License: MIT

imitation-dice

Language: Python

kmifqe

Kernel Metric learning for In-sample Fitted Q Evaluation (KMIFQE)

Language: Python | License: MIT

kmis

Local kernel metric learning for importance sampling (KMIS) in off-policy evaluation (OPE)

Language: Python | License: MIT

agents

TF-Agents is a library for Reinforcement Learning in TensorFlow

Language: Python | License: Apache-2.0

alberdice

Official PyTorch implementation of AlberDICE

Language: Python

BCQ

PyTorch implementation of BCQ for "Off-Policy Deep Reinforcement Learning without Exploration"

Language: Python

continuous-policy-learning

Language: Jupyter Notebook

dice_rl

Language: Python | License: Apache-2.0

DJL

generative-models

Collection of generative models, e.g. GAN, VAE, in PyTorch and TensorFlow.

Language: Python | License: Unlicense

google-research

Google Research

License: Apache-2.0

GPT-Critic

GPT-Critic: Offline Reinforcement Learning for End-to-End Task-Oriented Dialogue Systems

haanvid.github.io

Personal website

LSPI

LSPI (Least-Squares Policy Iteration) with TensorFlow 1.5

Language: Python

MC-LAVE-RL

ICLR 2021: "Monte-Carlo Planning and Learning with Language Action Value Estimates"

License: GPL-2.0

models

Models built with TensorFlow

Language: Python | License: Apache-2.0

Nadaraya-Watson-Regression-Metric

Language: MATLAB

NeuralPipeline_DSTC8

License: MIT

palr

License: MIT

probability

Probabilistic reasoning and statistical analysis in TensorFlow

License: Apache-2.0

RepBM

Representation Balancing MDPs for Off-Policy Policy Evaluation

Language: Python | License: MIT

rllab

rllab is a framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym.

Language: Python | License: NOASSERTION

rllab-colab-tutorial

Language: Jupyter Notebook

SBV

License: GPL-3.0

slope-experiments

softlearning

Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.

License: NOASSERTION

SVGD

TensorFlow Implementation of Stein Variational Gradient Descent (SVGD)

Language: Python

tutorial-git

A quick guide to learning how to use Git.

License: BSD-3-Clause

zr-obp

Open Bandit Pipeline: a Python library for bandit algorithms and off-policy evaluation

License: Apache-2.0