Rowing0914 / wolpertinger_ddpg_torch

The original repo doesn't run by default, so I added a couple of minor modifications.

Wolpertinger Training with DDPG (PyTorch, Multi-GPU/single-GPU/CPU)

Overview

PyTorch implementation of Wolpertinger Training with DDPG (paper: Deep Reinforcement Learning in Large Discrete Action Spaces).
The code supports training on multiple GPUs, a single GPU, or the CPU.
It works with both continuous and discrete control environments from OpenAI gym.
In the continuous case, I discretize the action space so that the Wolpertinger-DDPG training algorithm can be applied.
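To make the discretization step concrete, here is a minimal sketch of how a one-dimensional continuous action range could be turned into evenly spaced discrete actions. The function name discretize_action_space is hypothetical and this is not necessarily how the repo does it; the range [-2.0, 2.0] is the torque range of Pendulum-v0.

```python
def discretize_action_space(low, high, num_actions):
    """Return num_actions evenly spaced actions covering [low, high].

    Hypothetical helper for illustration; the repo may discretize differently.
    """
    if num_actions < 2:
        raise ValueError("num_actions must be at least 2")
    step = (high - low) / (num_actions - 1)
    return [low + i * step for i in range(num_actions)]


# Pendulum-v0 torques lie in [-2.0, 2.0]; a small count keeps the output readable.
actions = discretize_action_space(-2.0, 2.0, 5)
print(actions)  # [-2.0, -1.0, 0.0, 1.0, 2.0]
```

With --max-actions 200000 the same idea would produce a grid of 200000 points over the same range.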

Dependencies

  • python 3.6.8
  • torch 1.1.0
  • OpenAI gym
    • If you get a RuntimeError: NotImplementedError in ActionWrapper.step while training with gym, replace your gym/core.py file with core.py in openai/gym.
  • pyflann
    • FLANN (Muja & Lowe, 2014) is a library of approximate nearest-neighbor methods that allow lookups in logarithmic time relative to the number of actions. However, the Python binding of FLANN (pyflann) was written for Python 2 and is no longer maintained.
    • To use this package, please put the whole pyflann directory into your (virtual) Python environment.
    • Please refer to pyflann for more detailed instructions if needed.
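The role of the nearest-neighbor lookup can be sketched as follows: the actor outputs a continuous proto-action, the k closest discrete actions are retrieved, and the critic picks the best of them. This sketch deliberately swaps FLANN for a brute-force search over a 1-D action list, so it does not have the logarithmic lookup cost. The names k_nearest_actions, wolpertinger_select and toy_q are hypothetical, and the real critic would also take the state as input.

```python
import heapq


def k_nearest_actions(proto_action, actions, k):
    """Return the k discrete actions closest to proto_action (1-D, brute force)."""
    return heapq.nsmallest(k, actions, key=lambda a: abs(a - proto_action))


def wolpertinger_select(proto_action, actions, q_value, k):
    """Among the k nearest neighbors, pick the action with the highest critic value."""
    candidates = k_nearest_actions(proto_action, actions, k)
    return max(candidates, key=q_value)


actions = [-2.0, -1.0, 0.0, 1.0, 2.0]


def toy_q(action):
    # Hypothetical stand-in for the critic; it peaks at action 1.0.
    return -(action - 1.0) ** 2


chosen = wolpertinger_select(0.4, actions, toy_q, k=2)
print(chosen)  # 1.0
```

Here the two nearest neighbors of the proto-action 0.4 are 0.0 and 1.0, and the toy critic prefers 1.0 even though 0.0 is closer.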

Usage

  • In Pendulum-v0 (continuous control), discretize the continuous action space into a discrete action space with 200000 actions.
    python main.py --env 'Pendulum-v0' --max-actions 200000
  • In CartPole-v1 (discrete control), --max-actions is not needed.
    python main.py --env 'CartPole-v1'
  • To use CPU only:
    python main.py --gpu-ids -1
  • To use a single GPU only:
    python main.py --gpu-ids 0 --gpu-nums 1
  • To use multiple GPUs (e.g., use GPU-0 and GPU-1):
    python main.py --gpu-ids 0 1 --gpu-nums 2
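For readers wondering how flags like these might be parsed, below is a rough sketch using argparse. Only the flag names come from the commands above; the defaults, the build_parser and device_names helpers, and the mapping to device strings are assumptions for illustration and may not match main.py.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown above (defaults are assumptions)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--env", default="Pendulum-v0")
    parser.add_argument("--max-actions", type=int, default=200000)
    # nargs="+" accepts one or more ids, e.g. "--gpu-ids 0 1"; -1 means CPU only.
    parser.add_argument("--gpu-ids", type=int, nargs="+", default=[-1])
    parser.add_argument("--gpu-nums", type=int, default=1)
    return parser


def device_names(gpu_ids):
    """Map GPU ids to device strings; hypothetical helper, not from the repo."""
    if -1 in gpu_ids:
        return ["cpu"]
    return [f"cuda:{i}" for i in gpu_ids]


args = build_parser().parse_args(["--gpu-ids", "0", "1", "--gpu-nums", "2"])
print(device_names(args.gpu_ids))  # ['cuda:0', 'cuda:1']
```

Passing --gpu-ids -1 to this sketch yields ['cpu'].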

Result

  • Please refer to output for the trained models and training log.
    • Pendulum-v0: a gym environment with continuous action space.
    • CartPole-v1: a gym environment with discrete action space.

Project Reference
