APO

Author's implementation of "Average-Reward Reinforcement Learning with Trust Region Methods".

Installation

This code is based on my forked version of rlpyt. To reproduce the results in the paper, run python run_exp.py. Note that this Python file contains all the hyperparameter settings I tried on 8 GPUs, so set your own hyperparameters manually before running experiments.
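As a rough illustration of what setting hyperparameters manually can look like, the sketch below pins one value in a sweep grid and expands the rest into individual run configurations. Everything in it is an assumption made for illustration: the grid keys, the values, and the helper names expand and restrict are hypothetical and are not taken from run_exp.py or rlpyt.

```python
import itertools

# Hypothetical grid: the keys and values are placeholders,
# not the settings actually used in run_exp.py.
grid = {
    "learning_rate": [3e-4, 1e-4],
    "seed": [0, 1, 2],
}


def expand(grid):
    """Return one dict per combination of the grid's values."""
    keys = list(grid)
    combos = itertools.product(*(grid[k] for k in keys))
    return [dict(zip(keys, values)) for values in combos]


def restrict(grid, **fixed):
    """Pin selected hyperparameters to a single value; the rest stay swept."""
    unknown = set(fixed) - set(grid)
    if unknown:
        raise KeyError(f"not in grid: {sorted(unknown)}")
    return {k: [fixed[k]] if k in fixed else v for k, v in grid.items()}


# Keep only the runs with one chosen learning rate.
configs = expand(restrict(grid, learning_rate=3e-4))
for cfg in configs:
    print(cfg)
```

Restricting the grid before expanding it keeps the number of launched runs small, which matters when fewer than 8 GPUs are available.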

Bibtex

@inproceedings{ma2021average-reward,
    title={Average-Reward Reinforcement Learning with Trust Region Methods},
    author={Ma, Xiaoteng and Tang, Xiaohang and Xia, Li and Yang, Jun and Zhao, Qianchuan},
    booktitle={International Joint Conferences on Artificial Intelligence},
    pages={2797--2803},
    year={2021}
}

License

MIT License
