Normalized Advantage Functions (NAF) in TensorFlow

TensorFlow implementation of Continuous Deep Q-Learning with Model-based Acceleration.

[Figure: NAF algorithm from the paper]
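
The central trick in NAF is to parameterize the Q-function so its maximizer over actions is available in closed form: Q(s, a) = V(s) + A(s, a), with a quadratic advantage A(s, a) = -1/2 (a - mu(s))^T P(s) (a - mu(s)), where P(s) = L(s) L(s)^T comes from a predicted lower-triangular matrix L(s). The NumPy sketch below only illustrates this decomposition; the names and shapes are illustrative and do not mirror this repository's TensorFlow graph.

import numpy as np

def naf_q_value(V, mu, L, action):
    """Illustrative NAF Q-value: Q(s, a) = V(s) - 0.5 * (a - mu)^T P (a - mu),
    where P = L L^T is positive semi-definite by construction."""
    P = L.dot(L.T)                        # state-dependent quadratic term
    diff = action - mu
    advantage = -0.5 * diff.dot(P).dot(diff)
    return V + advantage

# Toy 2-dimensional action space; all numbers are arbitrary.
V = 1.0
mu = np.array([0.3, -0.1])
L = np.tril(np.array([[0.8, 0.0], [0.2, 0.5]]))
print(naf_q_value(V, mu, L, mu))                    # maximum: Q = V at a = mu
print(naf_q_value(V, mu, L, np.array([1.0, 1.0])))  # any other action scores lower

Because the advantage term is never positive, the greedy action is always mu(s), which is what makes the otherwise intractable max over continuous actions cheap.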

Environments:

  • InvertedPendulum-v1
  • InvertedDoublePendulum-v1
  • Reacher-v1
  • HalfCheetah-v1
  • Swimmer-v1
  • Hopper-v1
  • Walker2d-v1
  • Ant-v1
  • HumanoidStandup-v1

Installation and Usage

The code depends on outdated software. Until it is updated to work with current versions of gym, TensorFlow, and MuJoCo, set up a dedicated virtual environment (e.g. with conda) and run setup.sh:

$ conda create --name naf python=2.7
$ source activate naf
$ ./setup.sh

To train a model for an environment with a continuous action space:

$ python main.py --env=InvertedPendulum-v1 --is_train=True
$ python main.py --env=InvertedPendulum-v1 --is_train=True --display=True

To test a trained model and record the screens with gym:

$ python main.py --env=InvertedPendulum-v1 --is_train=False
$ python main.py --env=InvertedPendulum-v1 --is_train=False --monitor=True
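
Presumably the --monitor flag wraps the environment so that gym records evaluation rollouts to disk. As a rough standalone sketch of that idea using the gym.wrappers.Monitor API (a random policy and an illustrative output directory stand in for the trained NAF agent; very old gym releases exposed this as env.monitor.start instead):

import gym
from gym import wrappers

env = gym.make("Pendulum-v0")
env = wrappers.Monitor(env, "/tmp/naf-eval", force=True)  # directory is illustrative

observation = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # stand-in for the trained NAF policy
    observation, reward, done, info = env.step(action)
env.close()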

Results

Training results for Pendulum-v0 with different hyperparameters (colors refer to the curves in the plot below):

$ python main.py --env=Pendulum-v0 # dark green
$ python main.py --env=Pendulum-v0 --action_fn=tanh # light green
$ python main.py --env=Pendulum-v0 --use_batch_norm=True # yellow
$ python main.py --env=Pendulum-v0 --use_seperate_networks=True # green

[Training curve plot: Pendulum-v0, 2016-07-15]
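
The --action_fn=tanh run presumably squashes the predicted greedy action through tanh, a common way to keep continuous actions inside the environment's bounds. A minimal sketch of that idea, independent of this repository's network code (the helper below is illustrative):

import numpy as np

def squash_to_bounds(raw_action, low, high):
    # tanh keeps each component in (-1, 1); the affine rescaling then
    # shifts it into the environment's action range [low, high].
    unit = np.tanh(raw_action)
    return low + 0.5 * (unit + 1.0) * (high - low)

# Pendulum-v0 has a single torque action bounded to [-2, 2].
low, high = np.array([-2.0]), np.array([2.0])
print(squash_to_bounds(np.array([10.0]), low, high))  # ~2.0 (saturated)
print(squash_to_bounds(np.array([0.0]), low, high))   # 0.0 (mid-range)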

Original Author

Taehoon Kim / @carpedm20 (original repository)


License

MIT License

