Creating a Policy class
ADGEfficiency opened this issue
In RL it's possible to use a variety of different policies, the most common being epsilon-greedy. This is currently hard-coded into DQN. It would be nice to pull it out so that other methods (e.g. random, softmax, Boltzmann) can be used.
In the link below, the policy is defined as a method:
https://ewanlee.github.io/2017/07/09/Using-Tensorflow-and-Deep-Q-Network-Double-DQN-to-Play-Breakout/
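For illustration, here is a rough sketch of what pulling the policy out of the agent could look like. It is not the code from the linked post or from energy_py. All class and method names (`Policy`, `select_action`, `EpsilonGreedyPolicy`, `SoftmaxPolicy`, `RandomPolicy`) are hypothetical, and it assumes the agent can hand the policy a plain list of Q-values, one per discrete action.

```python
import math
import random


class Policy:
    """Hypothetical base class: maps a list of Q-values to an action index."""

    def select_action(self, q_values):
        raise NotImplementedError


class RandomPolicy(Policy):
    """Ignores the Q-values and picks a uniformly random action."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()

    def select_action(self, q_values):
        return self.rng.randrange(len(q_values))


class EpsilonGreedyPolicy(Policy):
    """Random action with probability epsilon, otherwise the greedy action."""

    def __init__(self, epsilon, rng=None):
        self.epsilon = epsilon
        self.rng = rng or random.Random()

    def select_action(self, q_values):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(q_values))
        # Greedy: index of the largest Q-value (first one wins on ties).
        return max(range(len(q_values)), key=lambda a: q_values[a])


class SoftmaxPolicy(Policy):
    """Samples actions from a softmax over Q-values with a temperature.

    This is the scheme often called Boltzmann exploration.
    """

    def __init__(self, temperature=1.0, rng=None):
        self.temperature = temperature
        self.rng = rng or random.Random()

    def probabilities(self, q_values):
        # Subtract the max before exponentiating for numerical stability.
        top = max(q_values)
        weights = [math.exp((q - top) / self.temperature) for q in q_values]
        total = sum(weights)
        return [w / total for w in weights]

    def select_action(self, q_values):
        probs = self.probabilities(q_values)
        return self.rng.choices(range(len(q_values)), weights=probs, k=1)[0]


# Usage: the agent holds a policy object instead of hard-coding epsilon-greedy.
policy = EpsilonGreedyPolicy(epsilon=0.1)
action = policy.select_action([0.1, 0.9, 0.3])
```

With something like this, the DQN agent would only call `policy.select_action(q_values)`, so swapping exploration strategies becomes a matter of passing in a different object.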
This has been done in the recent DQN rebuild (see energy_py/common/policies)