HimGautam/BipedalWalkerDDPG

BipedalWalkerDDPG

This project is my own implementation of the DDPG paper, applied to the continuous control problem posed by the Bipedal Walker environment in OpenAI Gym.

Overview

This project attempts to learn a walking gait for the 2D Bipedal Walker in the OpenAI Gym environment. (Note: the reward function is already implemented in the Gym environment.) The original DDPG paper uses an Ornstein-Uhlenbeck process for exploration, but in my experiments I used Gaussian noise with a standard deviation of 0.2.
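To make the difference between the two noise sources concrete, here is a minimal NumPy sketch. It is illustrative only and is not taken from this repo: the class name `OUNoise`, the helper `gaussian_noise`, and the parameter values `theta=0.15`, `sigma=0.2`, and `dt=1.0` are assumptions. An Ornstein-Uhlenbeck process carries state from one step to the next, so consecutive samples are correlated, while Gaussian noise is drawn independently at every step.

```python
import numpy as np


class OUNoise:
    """Discretized Ornstein-Uhlenbeck process (hypothetical helper)."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.mu = mu
        self.theta = theta
        self.sigma = sigma
        self.dt = dt
        self.rng = np.random.default_rng(seed)
        self.state = np.full(size, mu, dtype=np.float64)

    def reset(self):
        # Restart the process at its mean, e.g. at the start of an episode.
        self.state[:] = self.mu

    def sample(self):
        # Pull the state back toward mu, then add a random kick.
        drift = self.theta * (self.mu - self.state) * self.dt
        kick = self.sigma * np.sqrt(self.dt) * self.rng.normal(size=self.state.shape)
        self.state = self.state + drift + kick
        return self.state.copy()


def gaussian_noise(size, stddev=0.2, rng=None):
    """Independent Gaussian noise, as used in this project (stddev 0.2)."""
    if rng is None:
        rng = np.random.default_rng()
    return rng.normal(0.0, stddev, size=size)
```

Either source can be added to the policy output; the Gaussian version is simpler because it has no state to reset between episodes.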

During training, the first 1000 steps of data are collected by sampling actions uniformly at random from the action space. This data serves as pure exploration. After these 1000 steps, the action is the policy network's prediction plus Gaussian noise.
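The action-selection rule described above can be sketched as follows. This is a minimal sketch, not the notebook's code: `select_action` and `policy` are hypothetical names, `policy` stands in for the trained actor network, and clipping the noisy action to the action bounds is an assumption. The action space of Bipedal Walker has 4 dimensions, each in [-1, 1].

```python
import numpy as np

WARMUP_STEPS = 1000      # steps of purely random actions
NOISE_STDDEV = 0.2       # stddev of the Gaussian exploration noise
ACTION_DIM = 4           # Bipedal Walker has 4 continuous actions
ACTION_LOW, ACTION_HIGH = -1.0, 1.0


def select_action(step, state, policy, rng):
    """Return an exploratory action for the given training step."""
    if step < WARMUP_STEPS:
        # Pure exploration: sample uniformly from the action space.
        return rng.uniform(ACTION_LOW, ACTION_HIGH, size=ACTION_DIM)
    # Afterwards: policy prediction plus Gaussian noise.
    action = np.asarray(policy(state), dtype=np.float64)
    noise = rng.normal(0.0, NOISE_STDDEV, size=action.shape)
    # Assumption: keep the noisy action inside the valid range.
    return np.clip(action + noise, ACTION_LOW, ACTION_HIGH)
```

With a Gym environment, `rng.uniform(...)` could be replaced by `env.action_space.sample()`, which does the same job.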

Another way to explore is to add noise to the neural network's parameters instead of its output. This method is discussed in the OpenAI paper "Parameter Space Noise for Exploration". It gives better results because the exploratory actions depend on the input state rather than being independent noise added at each step. Code for this method is available in the ParamNoise folder of this repo.
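The idea of parameter-space noise can be illustrated with a toy example. This is a sketch under stated assumptions, not the code in the ParamNoise folder: the single-layer policy, the helper names `make_policy` and `perturb`, and the noise scale of 0.05 are all hypothetical. The point is that the weights are perturbed once, so the perturbed policy maps the same state to the same action every time.

```python
import numpy as np


def make_policy(weights, bias):
    """Tiny one-layer policy; tanh keeps actions in [-1, 1]."""
    def policy(state):
        return np.tanh(state @ weights + bias)
    return policy


def perturb(weights, bias, stddev, rng):
    """Return noisy copies of the parameters; the originals are untouched."""
    noisy_w = weights + rng.normal(0.0, stddev, size=weights.shape)
    noisy_b = bias + rng.normal(0.0, stddev, size=bias.shape)
    return noisy_w, noisy_b


rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM = 24, 4   # Bipedal Walker observation and action sizes
weights = rng.normal(0.0, 0.1, size=(STATE_DIM, ACTION_DIM))
bias = np.zeros(ACTION_DIM)

# Perturb once (e.g. per episode), then act with the perturbed policy.
noisy_policy = make_policy(*perturb(weights, bias, stddev=0.05, rng=rng))

state = rng.normal(size=STATE_DIM)
first = noisy_policy(state)
second = noisy_policy(state)    # same state gives the same exploratory action
```

Action-space noise, by contrast, would give a different action each time the same state is visited.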

Requirements

This project requires the following dependencies:

Python
TensorFlow
NumPy
OpenAI Gym
Jupyter Notebook (optional; the code from each cell can be copied into a .py file)

Trying Out

To try it out, download the weights and the main notebook, BipedalDDPG.ipynb. Then run all the cells except the one marked with the #Training comment.
