An implementation of Phasic Policy Gradient, a proposed improvement on top of Proximal Policy Optimization (PPO), in PyTorch. It will be my very first project in reinforcement learning.
$ pip install -r requirements.txt
$ python train.py --render
@misc{cobbe2020phasic,
    title={Phasic Policy Gradient},
    author={Karl Cobbe and Jacob Hilton and Oleg Klimov and John Schulman},
    year={2020},
    eprint={2009.04416},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}