DQN: why does train iterate 10 times?
FeynmanDNA opened this issue
https://github.com/seungeunrho/minimalRL/blob/master/dqn.py
Line 63 in 7597b9a
I am wondering why the `train` method loops internally 10 times. Shouldn't the policy network be trained once per action?
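To make the question concrete, here is a rough sketch comparing how many gradient updates each schedule would perform. It assumes `train` is called once per finished episode and runs a fixed number of updates per call; that calling pattern, the helper names, and the episode lengths are all hypothetical and not taken from the linked file.

```python
# Hypothetical comparison of two update schedules; not the repository's code.

def count_updates_per_step(episode_lengths):
    """Schedule A: one gradient update after every action (environment step)."""
    return sum(episode_lengths)


def count_updates_per_episode(episode_lengths, updates_per_call=10):
    """Schedule B: one train() call per finished episode,
    each call looping a fixed number of times (assumed to be 10)."""
    return len(episode_lengths) * updates_per_call


# Made-up episode lengths, for illustration only.
episode_lengths = [12, 30, 200]

per_step = count_updates_per_step(episode_lengths)
per_episode = count_updates_per_episode(episode_lengths)

print("updates with one update per action:", per_step)
print("updates with 10 updates per episode:", per_episode)
```

Under these assumptions, the per-episode schedule performs the same number of updates regardless of episode length, while the per-action schedule scales with the number of steps taken.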