TensorFlow implementation of Continuous Deep Q-Learning with Model-based Acceleration (NAF). Tested environments:
- InvertedPendulum-v1
- InvertedDoublePendulum-v1
- Reacher-v1
- HalfCheetah-v1
- Swimmer-v1
- Hopper-v1
- Walker2d-v1
- Ant-v1
- HumanoidStandup-v1
The code depends on outdated software. Until it is updated to work with current versions of gym/TensorFlow/MuJoCo, set up a dedicated virtualenv (e.g., with conda) and run setup.sh:
$ conda create --name naf python=2.7
$ source activate naf
$ ./setup.sh
To train a model for an environment with a continuous action space (add --display=True to render during training):
$ python main.py --env=InvertedPendulum-v1 --is_train=True
$ python main.py --env=InvertedPendulum-v1 --is_train=True --display=True
To test a trained model (add --monitor=True to record screens with gym):
$ python main.py --env=InvertedPendulum-v1 --is_train=False
$ python main.py --env=InvertedPendulum-v1 --is_train=False --monitor=True
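For background on what is being trained: NAF makes Q-learning tractable for continuous actions by decomposing Q(s, a) = V(s) + A(s, a), where the advantage A(s, a) = -1/2 (a - mu(s))^T P(s) (a - mu(s)) is a quadratic in the action, so the greedy action is always just mu(s). A minimal NumPy sketch of that decomposition (function and variable names here are illustrative, not the repository's):

```python
import numpy as np

def naf_q_value(action, mu, L_entries, v):
    """Compute Q(s, a) = V(s) + A(s, a) as in NAF.

    action    : the action to evaluate
    mu        : greedy action mu(s) predicted by the network
    L_entries : entries of a lower-triangular matrix L(s); P = L L^T
                is positive semi-definite, so A(s, a) <= 0 everywhere
    v         : state value V(s) predicted by the network
    """
    dim = len(mu)
    L = np.zeros((dim, dim))
    L[np.tril_indices(dim)] = L_entries  # unpack lower triangle
    P = L @ L.T                          # positive semi-definite matrix
    diff = action - mu
    advantage = -0.5 * diff @ P @ diff   # quadratic advantage, max 0 at mu
    return v + advantage

# The advantage vanishes at action == mu, so Q there equals V(s):
q_at_mu = naf_q_value(np.array([0.3]), np.array([0.3]), np.array([1.5]), 2.0)
# q_at_mu == 2.0
```

Because the maximizing action is available in closed form, no inner optimization over actions is needed at either training or test time.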
Training details of Pendulum-v0 with different hyperparameters (the colors in the comments refer to the corresponding training curves):
$ python main.py --env=Pendulum-v0 # dark green
$ python main.py --env=Pendulum-v0 --action_fn=tanh # light green
$ python main.py --env=Pendulum-v0 --use_batch_norm=True # yellow
$ python main.py --env=Pendulum-v0 --use_seperate_networks=True # green
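The --action_fn=tanh variant squashes unbounded network outputs into the environment's action bounds. A hedged sketch of that idea (the helper name and the rescaling to an arbitrary range are illustrative; Pendulum-v0's actual bounds are [-2, 2]):

```python
import numpy as np

def squash_action(raw_output, low, high):
    """Map an unbounded network output into [low, high] via tanh.

    tanh maps R into (-1, 1); the affine rescaling then maps
    (-1, 1) into (low, high).
    """
    return low + (np.tanh(raw_output) + 1.0) * 0.5 * (high - low)

# A large raw output saturates near the upper bound instead of
# leaving the valid action range:
a = squash_action(np.array([10.0]), -2.0, 2.0)  # close to 2.0
```

This keeps every emitted action valid without clipping gradients to zero the way a hard clip would.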
Original repository by Taehoon Kim / @carpedm20