Parallel PPO-PyTorch

A minimal PyTorch implementation of Proximal Policy Optimization (PPO) with a clipped objective, extended to train agents in parallel.
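
At its core, PPO optimizes a clipped surrogate objective. A minimal sketch of that loss in PyTorch is shown below; the function and argument names are illustrative and not taken from this repository's code.

```python
import torch

def clipped_surrogate_loss(new_log_probs, old_log_probs, advantages, eps_clip=0.2):
    # Probability ratio r_t(theta) = pi_theta(a_t|s_t) / pi_theta_old(a_t|s_t)
    ratios = torch.exp(new_log_probs - old_log_probs.detach())

    # Unclipped and clipped surrogate terms
    surr1 = ratios * advantages
    surr2 = torch.clamp(ratios, 1.0 - eps_clip, 1.0 + eps_clip) * advantages

    # PPO maximizes the elementwise minimum of the two terms;
    # negate the mean to obtain a loss suitable for gradient descent.
    return -torch.min(surr1, surr2).mean()
```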

Usage

  • To test a pre-trained network: run test.py
  • To train a new network: run parallel_PPO.py (a sketch of how parallel rollout collection is commonly structured follows this list)
  • All hyperparameters are set in the main function of the training script
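
For orientation, parallel rollout collection is often structured as one Gym environment per worker process, roughly as in the sketch below. This is only an illustrative pattern under the old gym API listed in Dependencies (reset returning an observation, step returning a 4-tuple); the actual parallel_PPO.py may organize its workers differently.

```python
from multiprocessing import Pool
import gym

def run_episodes(args):
    # Each worker owns its own environment instance and collects a few episodes.
    env_name, num_episodes, seed = args
    env = gym.make(env_name)
    env.seed(seed)
    returns = []
    for _ in range(num_episodes):
        state, done, total_reward = env.reset(), False, 0.0
        while not done:
            # Placeholder random policy; a real worker would sample from the PPO policy.
            state, reward, done, _ = env.step(env.action_space.sample())
            total_reward += reward
        returns.append(total_reward)
    return returns

if __name__ == "__main__":
    num_workers, episodes_per_worker = 4, 5
    jobs = [("CartPole-v1", episodes_per_worker, seed) for seed in range(num_workers)]
    with Pool(num_workers) as pool:
        episode_returns = [r for batch in pool.map(run_episodes, jobs) for r in batch]
    print(f"Collected {len(episode_returns)} episodes, "
          f"mean return = {sum(episode_returns) / len(episode_returns):.1f}")
```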

Results

Training results on CartPole-v1 and LunarLander-v2 (cartpole and lander result animations are included in the repository).

Dependencies

Trained and tested on:

  • Python 3.6
  • PyTorch 1.3
  • NumPy 1.15.3
  • gym 0.10.8
  • Pillow 5.3.0

TODO

  • Implement ConvNet-based training

Setting up Conda Environment

  • conda env export | grep -v "^prefix: " > environment.yml to export the environment to environment.yml
  • conda env create -f environment.yml to recreate the conda environment used for training

License

MIT License

