linyunfeng201203 / model-free-rl-algos


Model-free Reinforcement Learning Algorithms

This repository contains the source code for the paper "Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes" by Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Hiteshi Sharma, and Rahul Jain. The paper was accepted at ICML 2020, and a version is also available on arXiv.

The paper proposes two model-free algorithms for tabular MDPs. The first, Optimistic Discounted Q-learning, achieves a regret bound of O(T^(2/3)) in weakly communicating MDPs. The second, MDP-OOMD, achieves a regret bound of O(T^(1/2)) in ergodic MDPs. Here T is the number of time steps.
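To give a feel for the first algorithm, here is a minimal sketch of an optimistic, discounted Q-learning loop on a tabular MDP. It is not the repository's implementation. It only illustrates the general ingredients: optimistic initialization, a count-based exploration bonus, and a learning rate that decays with the visit count. The function name, the `bonus_scale` parameter, the learning-rate schedule, the choice gamma = 1 - 1/H, and the toy two-state MDP are all assumptions made for illustration; consult the paper for the exact constants and parameter choices.

```python
import math
import numpy as np


def optimistic_discounted_q_learning(P, R, T, H, bonus_scale=1.0, seed=0):
    """Illustrative optimistic discounted Q-learning on a tabular MDP.

    P: transition probabilities, shape (S, A, S).
    R: rewards in [0, 1], shape (S, A).
    T: number of time steps.
    H: effective horizon; the discount is set to 1 - 1/H (assumed here).
    Returns the optimistic Q-table and the average reward collected.
    """
    rng = np.random.default_rng(seed)
    S, A = R.shape
    gamma = 1.0 - 1.0 / H

    Q = np.full((S, A), float(H))      # running estimate, optimistic start
    Q_hat = np.full((S, A), float(H))  # non-increasing estimate used to act
    V_hat = np.full(S, float(H))       # V_hat[s] = max_a Q_hat[s, a]
    n = np.zeros((S, A), dtype=int)    # visit counts

    s = 0
    total_reward = 0.0
    for _ in range(T):
        a = int(np.argmax(Q_hat[s]))            # act greedily w.r.t. Q_hat
        r = float(R[s, a])
        s_next = int(rng.choice(S, p=P[s, a]))  # sample the next state

        n[s, a] += 1
        tau = n[s, a]
        alpha = (H + 1) / (H + tau)             # assumed step-size schedule
        # Assumed bonus shape: shrinks as 1/sqrt(visit count).
        bonus = bonus_scale * math.sqrt(H * math.log(2 * T) / tau)

        target = r + gamma * V_hat[s_next] + bonus
        Q[s, a] = (1 - alpha) * Q[s, a] + alpha * target
        Q_hat[s, a] = min(Q_hat[s, a], Q[s, a])  # keep Q_hat non-increasing
        V_hat[s] = Q_hat[s].max()

        total_reward += r
        s = s_next

    return Q_hat, total_reward / T


# Hypothetical toy MDP: action 1 tends to move to state 1, which pays reward 1.
P = np.array([
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.9, 0.1], [0.1, 0.9]],
])
R = np.array([
    [0.0, 0.0],
    [1.0, 1.0],
])

if __name__ == "__main__":
    Q_hat, avg_reward = optimistic_discounted_q_learning(P, R, T=2000, H=10)
    print("average reward:", avg_reward)
    print("optimistic Q-table:\n", Q_hat)
```

Because `Q_hat` only ever decreases from its initial value of H, under-explored state-action pairs keep a high value and get tried again, which is what drives exploration in this sketch.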

The code was implemented jointly by Mehdi Jafarnia-Jahromi, Hiteshi Sharma, and Chen-Yu Wei.

Languages

Language:Python 100.0%