CartPole Cross-entropy Solver Agent

This is a PyTorch agent that uses the cross-entropy RL method to solve the CartPole environment. The idea is adapted from Maxim Lapan's book. Changes:

Refactored the code and introduced some new classes to make it easier to follow.
Adjusted some hyperparameters for faster convergence.
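
For readers new to the cross-entropy method, its core step is to play a batch of episodes, keep only those whose total reward reaches a chosen percentile, and train the policy on the (observation, action) pairs from those "elite" episodes. The snippet below is a minimal sketch of that selection step in plain Python. The names `Episode`, `percentile`, and `filter_elite` are hypothetical and used only for illustration; they are not necessarily the classes used in this repository.

```python
from typing import List, NamedTuple, Tuple


class Episode(NamedTuple):
    """One finished episode: total reward plus its (observation, action) steps."""
    reward: float
    steps: List[Tuple[List[float], int]]


def percentile(values: List[float], q: float) -> float:
    """Percentile with linear interpolation between neighbours, q in [0, 100]."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def filter_elite(batch: List[Episode], pct: float):
    """Keep the steps of episodes whose reward is at or above the percentile bound."""
    bound = percentile([ep.reward for ep in batch], pct)
    observations, actions = [], []
    for ep in batch:
        if ep.reward < bound:
            continue  # discard episodes below the cut
        for obs, act in ep.steps:
            observations.append(obs)
            actions.append(act)
    return observations, actions, bound


# Toy batch with made-up rewards and observations (illustrative data only).
batch = [
    Episode(10.0, [([0.0, 0.0, 0.0, 0.0], 0)]),
    Episode(20.0, [([0.1, 0.0, 0.0, 0.0], 1)]),
    Episode(30.0, [([0.2, 0.0, 0.0, 0.0], 0)]),
    Episode(40.0, [([0.3, 0.0, 0.0, 0.0], 1), ([0.4, 0.0, 0.0, 0.0], 1)]),
]
elite_obs, elite_actions, reward_bound = filter_elite(batch, pct=70)
```

With this toy batch, only the highest-reward episode passes the 70th-percentile cut, so its two steps become the training data for the next update.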

Deployment

Note: To view the training results in TensorBoard, run the following commands in a notebook cell:

%load_ext tensorboard
%tensorboard --logdir runs

The environment is relatively simple, and training converges very quickly even on a CPU. Here are the training results:

The following hyperparameters are used and can be fine-tuned:

Parameter	Description
HIDDEN_SIZE	Number of hidden units in the linear layer
BATCH_SIZE	Batch size used during training of the neural network
PERCENTILE	Percentile cut to select best performing episodes
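
As a rough illustration of where these hyperparameters enter the code, the sketch below builds a small policy network and runs one training step on dummy data. The values shown (128, 16, 70) and the learning rate are assumptions for illustration, not necessarily the values used in this repository. The observation size of 4 and the two discrete actions are properties of the CartPole environment.

```python
import torch
from torch import nn

# Assumed values for illustration only; tune them as needed.
HIDDEN_SIZE = 128   # hidden units in the linear layer
BATCH_SIZE = 16     # batch size used during training
PERCENTILE = 70     # reward percentile cut for elite episodes (selection not shown here)

OBS_SIZE = 4        # CartPole observation: position, velocity, angle, angular velocity
N_ACTIONS = 2       # push cart left or right

# One hidden layer mapping an observation to action logits.
policy = nn.Sequential(
    nn.Linear(OBS_SIZE, HIDDEN_SIZE),
    nn.ReLU(),
    nn.Linear(HIDDEN_SIZE, N_ACTIONS),
)
loss_fn = nn.CrossEntropyLoss()  # expects raw logits and integer class targets
optimizer = torch.optim.Adam(policy.parameters(), lr=0.01)  # assumed learning rate

# Dummy tensors standing in for observations and actions taken from elite episodes.
observations = torch.randn(BATCH_SIZE, OBS_SIZE)
actions = torch.randint(0, N_ACTIONS, (BATCH_SIZE,))

# One supervised update: push the policy toward the actions seen in elite episodes.
optimizer.zero_grad()
logits = policy(observations)
loss = loss_fn(logits, actions)
loss.backward()
optimizer.step()
```

A larger `HIDDEN_SIZE` gives the network more capacity at the cost of slower updates, while a higher `PERCENTILE` keeps fewer but better episodes for each update.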

PyTorch agent for solving the CartPole environment using policy gradient and the cross-entropy RL method
