tomasspangelo / proximal-policy-optimization

Proximal Policy Optimization

An implementation of Proximal Policy Optimization (PPO), a state-of-the-art family of reinforcement learning algorithms, using normalized Generalized Advantage Estimation (GAE) and optional batch-mode training. The loss function incorporates an entropy bonus.
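
For reference, the normalized GAE step can be sketched roughly as follows. This is a minimal sketch in PyTorch; the function, variable, and hyperparameter names are illustrative and not necessarily those used in this repository:

    import torch

    def normalized_gae(rewards, values, dones, gamma=0.99, lam=0.95):
        """Compute GAE advantages for one rollout and normalize them.

        rewards, dones: 1-D tensors of length T
        values:         1-D tensor of length T + 1 (bootstrap value appended)
        """
        T = rewards.shape[0]
        advantages = torch.zeros(T)
        gae = 0.0
        for t in reversed(range(T)):
            not_done = 1.0 - dones[t]
            # TD residual: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t), cut off at episode ends
            delta = rewards[t] + gamma * values[t + 1] * not_done - values[t]
            # Recursion: A_t = delta_t + gamma * lambda * A_{t+1}
            gae = delta + gamma * lam * not_done * gae
            advantages[t] = gae
        # Normalize to zero mean and unit variance (the "normalized" part of the description)
        return (advantages - advantages.mean()) / (advantages.std() + 1e-8)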

The code is extensively commented and can be helpful for understanding both PPO and PyTorch.
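
As a reference point while reading the code, the PPO clipped surrogate loss with an entropy bonus can be sketched in PyTorch roughly as below. Hyperparameter names such as clip_eps and entropy_coef are illustrative, not necessarily the ones used in this repository:

    import torch

    def ppo_loss(new_log_probs, old_log_probs, advantages, entropy,
                 clip_eps=0.2, entropy_coef=0.01):
        """Clipped PPO surrogate loss with an entropy bonus (illustrative only)."""
        # Probability ratio r_t = pi_new(a_t|s_t) / pi_old(a_t|s_t)
        ratio = torch.exp(new_log_probs - old_log_probs)
        # Clipped surrogate objective; the minus sign turns maximization into a loss
        unclipped = ratio * advantages
        clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
        policy_loss = -torch.min(unclipped, clipped).mean()
        # Subtracting the entropy term rewards more exploratory (higher-entropy) policies
        return policy_loss - entropy_coef * entropy.mean()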

How to use

  1. Clone the repository to get the files locally on your computer (see the "Cloning an Existing Repository" section of https://git-scm.com/book/en/v2/Git-Basics-Getting-a-Git-Repository)

  2. Navigate into the root folder of the project: /ppo

  3. Install the necessary dependencies, which are listed in requirements.txt. Use your favorite package manager/installer; we recommend pip. To install the requirements, run the following command in the root folder of the project (where requirements.txt is located):

    pip install -r requirements.txt

  4. All you need is an instance of the Environment class (see the source code for its specification); two environments are already provided. You also need a Learner object. See the example in main.py; a rough, hypothetical outline is also sketched below.
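
A hypothetical outline of that setup is sketched here. The actual module, class, and method names are defined in the repository's source code, so treat main.py as the authoritative example:

    # Hypothetical outline only -- the real interfaces are defined in the
    # repository's source code; main.py is the authoritative example.
    from environment import Environment   # assumed import path
    from learner import Learner           # assumed import path

    env = Environment()      # one of the two provided environments
    learner = Learner(env)   # Learner wraps the environment and the PPO agent
    learner.learn()          # run PPO training (method name may differ)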
