HeadCrab65 / cross-entropy-cartpole-pytorch

PyTorch agent for solving the CartPole environment using the policy-gradient and cross-entropy RL methods

CartPole Cross-entropy Solver Agent

This is a PyTorch agent that uses the cross-entropy RL method to solve the CartPole environment. The idea is adapted from Maxim Lapan's book. Changes:

  • I refactored the code and introduced some new classes to make it easier to follow.
  • I adjusted some hyperparameters for faster convergence.
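
For readers new to the cross-entropy method, the core training step can be sketched as follows: a small network maps an observation to action logits and is trained, as a plain classifier, to reproduce the actions taken in the best episodes. This is a minimal sketch and not the repository's actual code. The helper names (build_policy_net, train_step), the learning rate, the HIDDEN_SIZE value and the random stand-in data are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

OBS_SIZE = 4       # a CartPole observation has 4 components
N_ACTIONS = 2      # push the cart left or right
HIDDEN_SIZE = 128  # illustrative value, not necessarily the repository's setting


def build_policy_net(obs_size, hidden_size, n_actions):
    # Hypothetical helper. The network outputs raw logits, because
    # nn.CrossEntropyLoss applies log-softmax internally.
    return nn.Sequential(
        nn.Linear(obs_size, hidden_size),
        nn.ReLU(),
        nn.Linear(hidden_size, n_actions),
    )


def train_step(net, optimizer, elite_obs, elite_actions):
    # Hypothetical helper. Actions from elite episodes act as class labels.
    loss_fn = nn.CrossEntropyLoss()
    optimizer.zero_grad()
    logits = net(elite_obs)
    loss = loss_fn(logits, elite_actions)
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
net = build_policy_net(OBS_SIZE, HIDDEN_SIZE, N_ACTIONS)
optimizer = optim.Adam(net.parameters(), lr=0.01)  # assumed learning rate

# Stand-in data: in real training these come from the best episodes of a batch.
elite_obs = torch.randn(32, OBS_SIZE)
elite_actions = torch.randint(0, N_ACTIONS, (32,))
loss_value = train_step(net, optimizer, elite_obs, elite_actions)
```

When playing episodes, the agent would typically apply a softmax to the logits and sample an action from the resulting probabilities.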

Deployment

  • Install the dependencies and run the Jupyter notebook

Note: To view the training results in TensorBoard, perform the following steps:

  • Install tensorboard and tensorboardX
  • Run the code and wait until the end of the training
  • Execute the following code in the notebook
%load_ext tensorboard
%tensorboard --logdir runs

Results:

The environment is relatively simple, and training converges very quickly even on a CPU. Here are the training results:

Loss

image

Mean Reward

image

Hyperparameters

The following hyperparameters are used and can be fine-tuned:

Parameter     Description
HIDDEN_SIZE   Number of hidden units in the linear layer
BATCH_SIZE    Batch size used when training the neural network
PERCENTILE    Percentile cutoff used to select the best-performing episodes
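
To illustrate how PERCENTILE is typically used, here is a minimal pure-Python sketch of the elite-episode filter. It assumes each episode is stored as a (total_reward, steps) pair, where steps is a list of (observation, action) tuples. The function names, the data layout and the value 70 are illustrative assumptions, not taken from the repository.

```python
import math

PERCENTILE = 70  # illustrative value, not necessarily the repository's setting


def percentile(values, q):
    # Linear-interpolation percentile, the same rule numpy.percentile
    # uses by default.
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def filter_elite(episodes, q):
    # Keep only the steps of episodes whose total reward reaches the bound.
    reward_bound = percentile([reward for reward, _ in episodes], q)
    elite_obs, elite_actions = [], []
    for reward, steps in episodes:
        if reward < reward_bound:
            continue
        for obs, action in steps:
            elite_obs.append(obs)
            elite_actions.append(action)
    return elite_obs, elite_actions, reward_bound


# A toy batch of four episodes with made-up observations and actions.
batch = [
    (10.0, [([0.0, 0.0, 0.0, 0.0], 0)]),
    (20.0, [([0.1, 0.0, 0.0, 0.0], 1)]),
    (30.0, [([0.2, 0.0, 0.0, 0.0], 0)]),
    (40.0, [([0.3, 0.0, 0.0, 0.0], 1), ([0.4, 0.0, 0.0, 0.0], 1)]),
]
elite_obs, elite_actions, bound = filter_elite(batch, PERCENTILE)
# The bound falls between the two highest rewards, so only the last
# episode's steps are kept for training.
```

A higher percentile keeps fewer, better episodes per batch, which gives a cleaner training signal but fewer samples per update.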
