Reward scaling

Question

ADGEfficiency opened this issue 6 years ago

BonsAI have a great video on reward function design.

The technique most relevant to energy_py is scaling by the maximum possible reward (max reward = peak capacity * max price). The drawback is that the scaled reward is no longer directly interpretable as money.

This can be dealt with by including the unscaled cost in the env info dict, so the monetary value is still available alongside the scaled reward.
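A minimal sketch of the idea, not energy_py's actual API: the function names `scale_reward` and `step_outputs` and the info key `"cost"` are hypothetical, and the sign convention of the monetary value is left to the environment.

```python
def scale_reward(raw_reward, peak_capacity, max_price):
    """Divide a monetary reward by the largest reward possible in one step."""
    max_reward = peak_capacity * max_price
    if max_reward <= 0:
        raise ValueError("peak_capacity * max_price must be positive")
    return raw_reward / max_reward


def step_outputs(raw_reward, peak_capacity, max_price):
    """Return the scaled reward for the agent and an info dict with the money value."""
    scaled = scale_reward(raw_reward, peak_capacity, max_price)
    # Hypothetical key: keeps the unscaled monetary value for logging and analysis.
    info = {"cost": raw_reward}
    return scaled, info


# Example: a step worth 50 in money terms, with peak capacity 2 and max price 100.
scaled, info = step_outputs(50.0, peak_capacity=2.0, max_price=100.0)
```

The agent learns from `scaled`, while results can still be reported in money terms from `info["cost"]`.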

Part of this work would be looking at how the distribution of rewards changes before and after scaling.
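One way to start that comparison is sketched below, using only the standard library. The rewards here are synthetic random numbers made up for illustration, not results from energy_py, and `summarize` is a hypothetical helper.

```python
import random
import statistics


def summarize(rewards):
    """Basic summary statistics for a list of rewards."""
    return {
        "min": min(rewards),
        "max": max(rewards),
        "mean": statistics.mean(rewards),
        "stdev": statistics.stdev(rewards),
    }


random.seed(0)
peak_capacity = 2.0   # assumed value for illustration
max_price = 100.0     # assumed value for illustration
max_reward = peak_capacity * max_price

# Synthetic rewards: a random power (charge or discharge) times a random price.
raw = [
    random.uniform(-peak_capacity, peak_capacity) * random.uniform(0.0, max_price)
    for _ in range(1000)
]
scaled = [r / max_reward for r in raw]

before = summarize(raw)
after = summarize(scaled)
print("before:", before)
print("after: ", after)
```

Because dividing by a constant is a linear transformation, the shape of the distribution is unchanged; only its range, mean, and spread shrink by the factor `max_reward`, which bounds the scaled rewards to [-1, 1] under these assumptions.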