Policy Gradient for CartPole-v1

This is a TensorFlow implementation of a policy gradient algorithm for the CartPole-v1 environment of OpenAI Gym. In addition to the policy network, a value network is also learned in order to reduce the variance of the gradient estimates during training.
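As a rough illustration of how a learned value estimate can reduce variance, the sketch below computes discounted returns for one episode and subtracts per-step value predictions from them to obtain advantages. It assumes the value network is used as a baseline, which is the common arrangement but is not confirmed here. The function names, the discount factor `gamma`, and the example numbers are hypothetical and are not taken from `main.py`.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for each step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(rewards, values, gamma=0.99):
    """Subtract baseline value predictions from the discounted returns.

    `values` stands in for the value network's output at each step
    (an assumption for illustration; any per-step baseline works).
    """
    returns = discounted_returns(rewards, gamma)
    return [g - v for g, v in zip(returns, values)]


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]   # CartPole gives +1 per surviving step
    values = [1.0, 1.0, 1.0]    # made-up baseline predictions
    print(discounted_returns(rewards, gamma=0.5))
    print(advantages(rewards, values, gamma=0.5))
```

In a policy gradient update, the log-probability of each chosen action would be weighted by its advantage rather than by the raw return, so actions are reinforced only to the extent they did better than the baseline predicted.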

Requirements

TensorFlow 0.11
OpenAI Gym

Training

	$ python main.py
