MarcoMeter / episodic-transformer-memory-ppo

Clean baseline implementation of PPO using an episodic TransformerXL memory


Baseline results on GRU and LSTM on MemoryGym

subho406 opened this issue

Hi,

Thanks for the amazing implementation. I was wondering if it would be possible to release the baseline implementations, along with the hyperparameters used, for GRU and LSTM on the Memory Gym environments (https://openreview.net/pdf?id=jHc8dCx6DDr)? I am hoping to use MemoryGym for my thesis work, and this would be extremely helpful. Thanks!

Hello!

The results were produced using neroRL (develop branch).
We are currently updating our GRU baseline repository to support Memory Gym. The develop branch should be functional, but we still need to reproduce our results, which is the last step before merging it into the main branch. This should be done within the next two weeks. So feel free to use neroRL for training now; afterwards, you can use the other repository to follow our implementation concept more easily.

Also, we found better hyperparameters for MMGrid and MPGrid using Optuna, which we have just implemented in neroRL (develop):

| Hyperparameter | MM Grid | MP Grid |
| --- | --- | --- |
| Workers | 32 | 32 |
| Worker steps | 512 | 512 |
| Epochs | 3 | 3 |
| Num minibatches | 8 | 8 |
| Gamma | 0.995 | 0.995 |
| Lambda | 0.95 | 0.95 |
| Value loss coefficient | 0.5 | 0.5 |
| Advantage normalization | batch | none |
| Max grad norm | 0.25 | 0.25 |
| Clip range | 0.1 | 0.2 |
| Initial learning rate | 2.5e-4 | 2.75e-4 |
| Final learning rate | 1.0e-5 | 1.0e-5 |
| Initial entropy coefficient | 1e-4 | 1e-3 |
| Final entropy coefficient | 1e-6 | 1e-6 |
| **Recurrence** | | |
| Num layers | 1 | 1 |
| Layer type | GRU | GRU |
| Sequence length | -1 | -1 |
| Hidden state size | 512 | 512 |
| Residual | True | False |
| Updates | 5000 | 10000 |
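
For reference, here is a minimal sketch of the MM Grid column as a plain Python dict, together with a linear annealing helper for the initial/final learning rate and entropy coefficient schedules. The key names and the linear decay are illustrative assumptions only and do not reflect neroRL's actual config schema or decay function; check the neroRL develop branch for the real format.

```python
def linear_decay(initial: float, final: float, update: int, max_updates: int) -> float:
    """Linearly anneal a value from `initial` to `final` over `max_updates` updates.

    Assumption for illustration: neroRL may use a different decay schedule.
    """
    fraction = min(update / max_updates, 1.0)
    return initial + fraction * (final - initial)


# MM Grid hyperparameters restated from the table above.
# Key names are hypothetical, not neroRL's config schema.
mm_grid_config = {
    "workers": 32,
    "worker_steps": 512,
    "epochs": 3,
    "num_minibatches": 8,
    "gamma": 0.995,
    "lamda": 0.95,                        # GAE lambda
    "value_loss_coefficient": 0.5,
    "advantage_normalization": "batch",   # MP Grid: "none"
    "max_grad_norm": 0.25,
    "clip_range": 0.1,                    # MP Grid: 0.2
    "init_learning_rate": 2.5e-4,         # MP Grid: 2.75e-4
    "final_learning_rate": 1.0e-5,
    "init_entropy_coefficient": 1e-4,     # MP Grid: 1e-3
    "final_entropy_coefficient": 1e-6,
    "recurrence": {
        "num_layers": 1,
        "layer_type": "gru",
        "sequence_length": -1,            # -1 presumably means whole episodes
        "hidden_state_size": 512,
        "residual": True,                 # MP Grid: False
    },
    "updates": 5000,                      # MP Grid: 10000
}

# Example: learning rate halfway through the MM Grid schedule (update 2500).
lr = linear_decay(
    mm_grid_config["init_learning_rate"],
    mm_grid_config["final_learning_rate"],
    update=2500,
    max_updates=mm_grid_config["updates"],
)
print(f"learning rate at update 2500: {lr:.2e}")
```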