DeepReinforcementLearningThatMatters

Accompanying code for "Deep Reinforcement Learning that Matters"

Our checkpointed version of the baselines code is found in the baselines folder. We make several modifications, mostly to allow for passing network structures as arguments to the MuJoCo-related run scripts.

Our only internal change is to the DDPG evaluation code, which we modify to allow comparison against other algorithms. In the original DDPG code, evaluation is done across N different policies, where N is the number of "epoch_cycles". We did not find this to be consistent with how other methods are evaluated, so we modify it to match the rllab version of DDPG evaluation: we run the target policy for 10 full trajectories at the end of each epoch.
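The evaluation scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code in the baselines folder: `ToyEnv`, `evaluate_policy`, and the constant policy are hypothetical names introduced here, and the environment is assumed to follow the older Gym-style `reset()` / `step()` interface that returns a 4-tuple.

```python
class ToyEnv:
    """Hypothetical stand-in for a Gym-style environment (4-tuple step API)."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self.t += 1
        reward = -abs(action)          # toy reward: penalize large actions
        done = self.t >= self.horizon  # episode ends after `horizon` steps
        return float(self.t), reward, done, {}


def evaluate_policy(env, policy, n_trajectories=10):
    """Roll out a fixed policy for n_trajectories full episodes.

    Returns the mean undiscounted return and the list of per-episode returns.
    No exploration noise is added; `policy` is assumed to be deterministic
    (for DDPG, the target policy).
    """
    returns = []
    for _ in range(n_trajectories):
        obs = env.reset()
        done = False
        total = 0.0
        while not done:  # run until the episode terminates
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
        returns.append(total)
    return sum(returns) / len(returns), returns


# Usage: evaluate a (hypothetical) constant policy at the end of an epoch.
mean_return, episode_returns = evaluate_policy(ToyEnv(), lambda obs: 1.0)
```

Because every rollout uses the same fixed policy, the averaged return refers to a single policy snapshot rather than to a mixture of policies from different points within an epoch.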

For the t-test and the Kolmogorov-Smirnov (KS) test, we use the SciPy implementations.
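As a rough sketch of how such a comparison might look, the snippet below applies `scipy.stats.ttest_ind` and `scipy.stats.ks_2samp` to two sets of final returns. Which SciPy functions and options were actually used is an assumption here; the helper `compare_returns` is a hypothetical name, and the arrays are synthetic placeholder data, not results from the paper.

```python
import numpy as np
from scipy import stats


def compare_returns(returns_a, returns_b):
    """Compare two samples of returns with a two-sample t-test and a KS test.

    Assumes each array holds one final average return per independent run
    (for example, one per random seed).
    """
    t_stat, t_p = stats.ttest_ind(returns_a, returns_b)
    ks_stat, ks_p = stats.ks_2samp(returns_a, returns_b)
    return {"t_stat": t_stat, "t_p": t_p, "ks_stat": ks_stat, "ks_p": ks_p}


# Synthetic placeholder data standing in for per-seed returns.
rng = np.random.default_rng(0)
returns_a = rng.normal(loc=1000.0, scale=100.0, size=5)
returns_b = rng.normal(loc=1200.0, scale=100.0, size=5)

result = compare_returns(returns_a, returns_b)
print(result)
```

The t-test compares the sample means, while the two-sample KS test compares the empirical distributions as a whole, so the two can disagree when the samples differ in shape rather than in mean.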

Citation

@article{hendersonRL2017,
  author  = {{Henderson}, Peter and {Islam}, Riashat and {Bachman}, Philip and {Pineau}, Joelle and {Precup}, Doina and {Meger}, David},
  title   = "{Deep Reinforcement Learning that Matters}",
  journal = {arXiv preprint arXiv:1709.06560},
  year    = 2017,
  url     = {https://arxiv.org/pdf/1709.06560.pdf}
}
