Will dropout break the final loss of the PPO algorithm?
ppaanngggg opened this issue
ppaanngggg commented
If I add a dropout layer to the model, is that a bad idea?
Are there any experiments on this?
ppaanngggg commented
I put the model in eval mode when exploring the environment, and in train mode for the policy, the old policy, and the value model during training.
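To make the concern concrete, here is a minimal pure-Python sketch of how dropout can affect the PPO probability ratio under the setup described above. Everything in it is an assumption for illustration: a toy linear policy over two actions, inverted dropout applied to the input features, and hypothetical names (`policy_probs`, `ppo_ratio`). It only shows that, with identical weights, the ratio between the "new" and "old" policy is exactly 1 in eval mode but generally not when each forward pass in train mode draws its own dropout mask.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(obs, weights, p_drop=0.0, rng=None):
    """Action probabilities of a toy linear policy.

    p_drop > 0 mimics train mode: each input feature is zeroed with
    probability p_drop and the survivors are rescaled (inverted dropout).
    p_drop == 0 mimics eval mode, where dropout is disabled.
    """
    if p_drop > 0.0:
        keep = 1.0 - p_drop
        obs = [x / keep if rng.random() < keep else 0.0 for x in obs]
    logits = [sum(w * x for w, x in zip(row, obs)) for row in weights]
    return softmax(logits)


def ppo_ratio(obs, action, weights, p_drop, rng):
    """pi_new(a|s) / pi_old(a|s) with the SAME weights for both policies.

    Each forward pass samples an independent dropout mask, as happens
    when both the policy and the old policy run in train mode.
    """
    old = policy_probs(obs, weights, p_drop, rng)[action]
    new = policy_probs(obs, weights, p_drop, rng)[action]
    return new / old


rng = random.Random(0)
obs = [1.0, -0.5, 2.0, 0.3]                              # made-up observation
weights = [[0.2, -0.1, 0.4, 0.0], [-0.3, 0.5, 0.1, 0.2]]  # made-up weights

# Train mode for both policies: the ratio is noisy even before any update.
train_ratios = [ppo_ratio(obs, 0, weights, 0.5, rng) for _ in range(20)]

# Eval mode for both policies: identical weights give a ratio of exactly 1.
eval_ratio = ppo_ratio(obs, 0, weights, 0.0, rng)

print("train-mode ratios:", [round(r, 3) for r in train_ratios])
print("eval-mode ratio:", eval_ratio)
```

In this toy setting the noise in the ratio comes only from the dropout masks, not from any change in the policy, so the clipped surrogate objective would react to mask noise. Whether this hurts training in practice depends on the dropout rate and the model. In PyTorch, `model.eval()` and `model.train()` are the calls that switch dropout off and on.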