jvmncs / ParamNoise

A comparison of parameter space noise methods for exploration in deep reinforcement learning

NOTE: This project is not maintained. Reach out if you'd like to help reboot it.

Links to papers

Parameter Space Noise for Exploration: https://openreview.net/forum?id=ByBAl2eAZ&noteId=ByBAl2eAZ

Noisy Networks For Exploration: https://openreview.net/forum?id=rywHCPkAW&noteId=rywHCPkAW
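
For orientation, here is a minimal sketch of the "learned" noise idea from the Noisy Networks paper: each weight is a learned mean plus a learned scale times sampled noise, with the noise factorised into one vector per input and one per output. This is an illustrative pure-Python version, not the code in this repo; the names `noisy_linear`, `scale_noise`, `w_mu`, `w_sigma`, `b_mu`, and `b_sigma` are assumptions made for the example, and training of the mean and scale parameters is left out.

```python
import math
import random


def scale_noise(x):
    """Noise-shaping function sign(x) * sqrt(|x|) applied to each Gaussian sample."""
    return math.copysign(math.sqrt(abs(x)), x)


def noisy_linear(x, w_mu, w_sigma, b_mu, b_sigma, rng):
    """Compute y = (w_mu + w_sigma * eps_w) x + (b_mu + b_sigma * eps_b).

    w_mu and w_sigma are lists of rows (out_features x in_features).
    In a real agent both would be trained by gradient descent; here they
    are plain lists so the forward pass can be read end to end.
    """
    n_out, n_in = len(w_mu), len(x)
    # Factorised noise: one sample per input unit and one per output unit.
    eps_in = [scale_noise(rng.gauss(0.0, 1.0)) for _ in range(n_in)]
    eps_out = [scale_noise(rng.gauss(0.0, 1.0)) for _ in range(n_out)]
    y = []
    for j in range(n_out):
        # The bias noise reuses the output-side sample.
        acc = b_mu[j] + b_sigma[j] * eps_out[j]
        for i in range(n_in):
            eps_w = eps_out[j] * eps_in[i]  # weight noise is an outer product
            acc += (w_mu[j][i] + w_sigma[j][i] * eps_w) * x[i]
        y.append(acc)
    return y
```

Setting every entry of `w_sigma` and `b_sigma` to zero recovers an ordinary linear layer, which is a convenient sanity check when wiring a layer like this into a DQN.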

Resources

OpenAI Baselines for useful Atari wrappers and replay buffer
bearpaw's pytorch-classification repo for utilities, logging, training framework
ikostrikov's PPO implementation for other utilities and PPO guidance
pytorch-rl for DQN help
PyTorch DQN tutorial for PyTorch tricks
Original DQN paper, since both papers mostly reuse its hyperparameters

TODOs

Implement PPO and MuJoCo env handling
Revisit logging; make sure everything needed to reproduce the papers' results is recorded
Implement plotting (matplotlib is in Logger object; maybe try out visdom)
More tests (figure out different combinations of arguments to ensure everything's interacting well)
Begin experiments (start with MuJoCo; it's cheaper)

Atari Games to Test

Alien: Adaptive helps a lot; learned shows no improvement
Enduro: Both methods improve
Seaquest: Adaptive helps; learned performs worse than baseline
Space Invaders: Adaptive helps, but learned helps more
WizardOfWor: Adaptive is worse than baseline, but learned helps a lot
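
"Adaptive" in the notes above is read here as the adaptive noise scaling from the Parameter Space Noise paper: the noise standard deviation grows while the perturbed policy stays close to the unperturbed one and shrinks once it drifts too far. The sketch below is a hedged illustration of that rule, not this repo's implementation. The function names, the KL distance between softmax policies, and the default `alpha=1.01` are assumptions chosen for the example.

```python
import math


def softmax(values):
    """Turn Q-values into a probability distribution over actions."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def adapt_sigma(sigma, distance, threshold, alpha=1.01):
    """Grow sigma if the perturbed policy is within the threshold, else shrink it."""
    if distance <= threshold:
        return sigma * alpha
    return sigma / alpha


# Usage: compare Q-values from the unperturbed and perturbed networks
# (the numbers here are made up purely for illustration).
q_clean = [1.0, 2.0, 0.5]
q_noisy = [1.1, 1.9, 0.6]
distance = kl_divergence(softmax(q_clean), softmax(q_noisy))
sigma = adapt_sigma(0.05, distance, threshold=0.1)
```

Measuring distance in action space rather than in parameter space keeps the amount of exploration comparable as the network's weights change scale during training.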

MuJoCo Environments to Test

Hopper
Walker2d
HalfCheetah
Sparse versions of these? (from rllab)

MIT License