Chapter 14 Deterministic policy gradients results are quite noisy.
isu10503054a opened this issue
Random weight initialization makes the starting point of each training run different. Using different parallel environments might also add stochasticity.
Tue, 27 Oct 2020, 12:01 isu10503054a notifications@github.com:
…
In the results of Chapter 14, Deterministic policy gradients, in the book, why is the training noisy and not very stable? I read the content repeatedly, but I still don't understand why.
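To illustrate the point about random weight initialization, here is a minimal sketch using only the Python standard library. The `init_weights` helper and the seed values are hypothetical and are not taken from the book's source code. It only shows that different seeds give different starting weights, while a fixed seed reproduces the same ones. In a PyTorch script the analogous step would typically be a call to `torch.manual_seed`, but whether that alone makes the Chapter 14 results repeatable is an assumption, since the environments may add their own randomness.

```python
import random


def init_weights(n, seed=None):
    """Hypothetical stand-in for a network's random weight initialization."""
    rng = random.Random(seed)  # private generator, so runs do not interfere
    return [rng.uniform(-0.1, 0.1) for _ in range(n)]


# Different seeds: each run starts from a different point in weight space.
run_a = init_weights(4, seed=1)
run_b = init_weights(4, seed=2)

# Same seed: the starting point is reproduced exactly.
run_a_again = init_weights(4, seed=1)

print("different seeds give the same weights:", run_a == run_b)
print("same seed gives the same weights:", run_a == run_a_again)
```

Comparing several runs with different seeds, rather than judging from a single curve, is one way to see how much of the noise comes from the starting point.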
Is there any hyperparameter in the source code that can be modified to improve this situation?
thx