A3C_ALE ArrayOutOfBounds and other issues
phong-phuong opened this issue
Version: 1.0.0-beta7
Issues encountered:
- ArrayIndexOutOfBoundsException while running the A3C_ALE example with pong.bin and the default 8 threads
- Occasional NullPointerException in the image transform (not fatal)
- The last thread finishes much later than the others
- The program gets stuck and stops training, so it never reaches getPolicy().save()
Out of bounds:
Exception in thread "Thread-7" java.lang.ArrayIndexOutOfBoundsException: -1
    at java.util.ArrayList.elementData(ArrayList.java:422)
    at java.util.ArrayList.get(ArrayList.java:435)
    at org.deeplearning4j.rl4j.learning.async.a3c.discrete.AdvantageActorCriticUpdateAlgorithm.computeGradients(AdvantageActorCriticUpdateAlgorithm.java:63)
    at org.deeplearning4j.rl4j.learning.async.a3c.discrete.AdvantageActorCriticUpdateAlgorithm.computeGradients(AdvantageActorCriticUpdateAlgorithm.java:32)
    at org.deeplearning4j.rl4j.learning.async.AsyncThreadDiscrete.trainSubEpoch(AsyncThreadDiscrete.java:130)
    at org.deeplearning4j.rl4j.learning.async.AsyncThread.handleTraining(AsyncThread.java:192)
    at org.deeplearning4j.rl4j.learning.async.AsyncThread.run(AsyncThread.java:168)
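An index of -1 coming out of ArrayList.get usually means something asked for element size() - 1 of an empty list. The sketch below reproduces only that mechanism in plain Java. It assumes (this is not confirmed by the trace itself) that computeGradients reads the last experience entry this way; EmptyExperienceDemo, lastUnchecked and failsOnLast are hypothetical names, not RL4J code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EmptyExperienceDemo {

    // Hypothetical helper: reads the most recent entry by indexing size() - 1
    // without checking whether the list is empty (assumed failure pattern).
    static <T> T lastUnchecked(List<T> experience) {
        return experience.get(experience.size() - 1);
    }

    // Returns true when reading the last entry fails with an index error.
    static boolean failsOnLast(List<?> experience) {
        try {
            lastUnchecked(experience);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // ArrayIndexOutOfBoundsException is a subclass of IndexOutOfBoundsException,
            // so this also covers the exception type shown in the trace.
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> empty = new ArrayList<>();
        List<Integer> filled = new ArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println("empty list fails: " + failsOnLast(empty));   // true
        System.out.println("filled list fails: " + failsOnLast(filled)); // false
    }
}
```

If that reading is right, the thing to look for is why the worker thread reaches the gradient computation with no collected experience.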
I have the same problem - for some reason, it expects to have previous experience upfront.
My config:
A3CLearningConfiguration.builder()
    .numThreads(16)
    .maxEpochStep(2000000)
    .maxStep(2000000)
    .build()
val network: ActorCriticDenseNetworkConfiguration =
    ActorCriticFactorySeparateStdDense.Configuration.builder()
        .l2(0.001)
        .updater(Adam(0.0005))
        .numHiddenNodes(52)
        .numLayer(4)
        .build()
        .toNetworkConfiguration()
After debugging for a moment, I see that the default A3CLearningConfiguration will not work because it is missing an nStep value. To solve it, chain .nStep(5) within the builder:
A3CLearningConfiguration.builder()
    .numThreads(16)
    .maxEpochStep(2000000)
    .maxStep(2000000)
    .nStep(5)
    .build()
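To show why a missing nStep could matter, here is a simplified model of a sub-epoch that records one entry per step, up to nStep. It assumes an unset nStep behaves like 0 and that the rollout loop is bounded by it; neither is taken from the RL4J source. NStepRolloutDemo, collectExperience and requirePositiveNStep are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class NStepRolloutDemo {

    // Simplified stand-in for one sub-epoch: records one entry per step, up to nStep.
    // This models the assumed control flow only; it is not RL4J's actual code.
    static List<Integer> collectExperience(int nStep) {
        List<Integer> experience = new ArrayList<>();
        for (int step = 0; step < nStep; step++) {
            experience.add(step); // placeholder for a (state, action, reward) record
        }
        return experience;
    }

    // Hypothetical guard: fail early with a readable message instead of an index error later.
    static int requirePositiveNStep(int nStep) {
        if (nStep <= 0) {
            throw new IllegalArgumentException("nStep must be > 0, got " + nStep);
        }
        return nStep;
    }

    public static void main(String[] args) {
        System.out.println(collectExperience(0).size()); // 0: nothing to compute gradients from
        System.out.println(collectExperience(5).size()); // 5
        System.out.println(requirePositiveNStep(5));     // 5
    }
}
```

On affected versions, a check like requirePositiveNStep on your own side before building the configuration would surface the problem at startup rather than inside a worker thread.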
Fixed in version 1.0.0-M1-1.