TianQi-777 / mpo

PyTorch Implementation of the Maximum a Posteriori Policy Optimisation

MPO

PyTorch implementation of the Maximum a Posteriori Policy Optimisation (MPO) reinforcement learning algorithm (paper1, paper2) for OpenAI Gym environments.

How to Run

Tested in the following environment:

  • Windows 10
  • Python 3.7
  • PyTorch 1.8.1

INSTALL

Install PyTorch by following the instructions at https://pytorch.org/, then install the remaining dependencies:

pip install gym Box2D IPython tqdm scipy tensorboard tensorboardx
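To confirm the installation before training, a small check like the one below may help. It is a minimal sketch, not part of this repository: the helper name `find_missing` is hypothetical, and the import names in `REQUIRED_MODULES` are assumptions derived from the pip package names above (for example, `tensorboardx` is assumed to import as `tensorboardX`).

```python
import importlib.util

# Import names assumed for the packages installed above (PyTorch plus the
# pip dependencies). These are assumptions, not taken from the repository.
REQUIRED_MODULES = [
    "torch", "gym", "Box2D", "IPython",
    "tqdm", "scipy", "tensorboard", "tensorboardX",
]


def find_missing(modules):
    """Return the names in `modules` that cannot be located by the import system."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only locates the modules without importing them, so it does not verify versions or that CUDA is usable.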

Continuous Action Space

python train.py \
  --device cuda:0 \
  --env LunarLanderContinuous-v2 \
  --log log_continuous \
  --render

Discrete Action Space

python train.py \
  --device cuda:0 \
  --env LunarLander-v2 \
  --log log_discrete \
  --render
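The two commands above differ only in the environment, and the environment determines whether the action space is continuous or discrete. The sketch below shows one way to inspect this; it is illustrative only and is not how `train.py` is known to do it. The helpers `describe_action_space` and `print_action_spaces` are hypothetical names, and the classification relies on the assumption that a Gym `Discrete` space exposes an `n` attribute while a `Box` space exposes a non-empty `shape`.

```python
def describe_action_space(space):
    """Classify a Gym-style action space by duck typing (assumed attributes)."""
    # Discrete spaces are assumed to expose the number of actions as `n`.
    if hasattr(space, "n"):
        return "discrete with {} actions".format(int(space.n))
    # Box spaces are assumed to expose a non-empty `shape` tuple.
    shape = getattr(space, "shape", None)
    if shape:
        return "continuous with shape {}".format(tuple(shape))
    return "unknown"


def print_action_spaces(env_ids=("LunarLanderContinuous-v2", "LunarLander-v2")):
    """Print the action-space type of each environment (needs gym and Box2D)."""
    import gym  # imported here so the helper above works without gym installed

    for env_id in env_ids:
        env = gym.make(env_id)
        print(env_id, "->", describe_action_space(env.action_space))
        env.close()
```

With gym and Box2D installed, calling `print_action_spaces()` prints one line per environment.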

License

This repository is a clone of theogruner/rl_pro_telu, which is licensed under the GNU General Public License v3.0. See the LICENSE file for details.
