awjuliani / DeepRL-Agents

A set of Deep Reinforcement Learning Agents implemented in Tensorflow.

Simple Policy Faulty Loss Function

wert23239 opened this issue

Your loss function for the simple policy doesn't really make sense:

"Loss=-Log(pi)*A"

If you have a weight of .9 and a reward of 1,
your loss is about .046 (using log base 10).

But if you have a weight of .9 and your reward is 3,
your loss increases to about .137.

So the only reason your function works at all is that you only assign a single amount of reward.
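
To check the numbers, here is a minimal sketch in plain Python. The `policy_loss` helper is hypothetical and is not code from the repo. It assumes "weight" means the probability pi of the chosen action and that the log is base 10, which is what gives roughly .045 for a weight of .9. With the natural log the values would be larger (about .105 and .316), but they scale with the reward in the same way.

```python
import math

def policy_loss(prob, advantage, log=math.log10):
    """Hypothetical helper: Loss = -log(pi) * A for one chosen action.

    `prob` is the probability (the "weight") of the chosen action,
    `advantage` is the reward/advantage A. Base-10 log is an assumption
    made to match the numbers quoted in this issue.
    """
    return -log(prob) * advantage

loss_r1 = policy_loss(0.9, 1.0)  # weight .9, reward 1
loss_r3 = policy_loss(0.9, 3.0)  # weight .9, reward 3

print(round(loss_r1, 3))  # 0.046
print(round(loss_r3, 3))  # 0.137
```

For a fixed weight, the loss is simply proportional to the reward, so tripling the reward triples the loss.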