Creating a Policy class

Question

ADGEfficiency opened this issue 6 years ago

In RL it's possible to use a variety of different policies, the most common being epsilon-greedy. This is currently hard-coded into DQN. It would be nice to pull it out so that other methods can be used (e.g. random, softmax, Boltzmann).

See
https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-7-action-selection-strategies-for-exploration-d3a97b7cceaf

In the link below, the policy is defined as a method:
https://ewanlee.github.io/2017/07/09/Using-Tensorflow-and-Deep-Q-Network-Double-DQN-to-Play-Breakout/
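
For illustration, here is a rough sketch of what pulling the policy out of the agent could look like. This is not the energy_py implementation: the class names (`Policy`, `RandomPolicy`, `EpsilonGreedyPolicy`, `SoftmaxPolicy`), the `select_action` method and the `temperature` parameter are all hypothetical, and the Q-values are assumed to arrive as a plain list of floats. Softmax selection is what is often called Boltzmann exploration, so only one class is shown for both.

```python
import math
import random


class Policy:
    """Hypothetical base class: maps a list of Q-values to an action index."""

    def select_action(self, q_values):
        raise NotImplementedError


class RandomPolicy(Policy):
    """Ignores the Q-values and picks an action uniformly at random."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()

    def select_action(self, q_values):
        return self.rng.randrange(len(q_values))


class EpsilonGreedyPolicy(Policy):
    """Random action with probability epsilon, otherwise the greedy action."""

    def __init__(self, epsilon, rng=None):
        self.epsilon = epsilon
        self.rng = rng or random.Random()

    def select_action(self, q_values):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(q_values))
        # Greedy: index of the largest Q-value (first one wins on ties).
        return max(range(len(q_values)), key=lambda a: q_values[a])


class SoftmaxPolicy(Policy):
    """Samples actions in proportion to exp(Q / temperature)."""

    def __init__(self, temperature=1.0, rng=None):
        if temperature <= 0:
            raise ValueError("temperature must be positive")
        self.temperature = temperature
        self.rng = rng or random.Random()

    def probabilities(self, q_values):
        # Subtract the max before exponentiating for numerical stability.
        max_q = max(q_values)
        exps = [math.exp((q - max_q) / self.temperature) for q in q_values]
        total = sum(exps)
        return [e / total for e in exps]

    def select_action(self, q_values):
        probs = self.probabilities(q_values)
        return self.rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

With something like this, the agent would take a `Policy` instance in its constructor and call `policy.select_action(q_values)` instead of branching on epsilon internally, so switching exploration strategy means passing in a different object.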

Adam Green · Answer 1 · Sat Jun 09 2018 11:14:10 GMT+0800 (China Standard Time)

This has been done in the recent DQN rebuild (see energy_py/common/policies)