CMA-ES training shows a high average reward, but evaluating the model from the log gives rewards that are practically zero

Question

itabhiyanta opened this issue 5 years ago · comments

Thanks for posting this repo. I have a strange issue: I see a very promising training curve for my CMA-ES model, but I cannot replicate the results when I execute the following command.

python3.5 model.py log/filewiththe best stats.json

I am using a custom environment.

I also want to ask about the number of processors used for training the CMA-ES model. I used 16 processors and also 48 processors (I couldn't use 64 because I ran out of memory). Do you think reducing the number of processors for training the CMA-ES model will have an adverse effect?

Kindly advise.
Rohit

hardmaru · Answer 1 · Thu Dec 20 2018 06:54:52 GMT+0800 (China Standard Time)

Hi. You may have forgotten an extra flag (render/norender):

```
python model.py render log/carracing.cma.16.64.best.json
```

Check out the blog post: http://blog.otoro.net/2018/06/09/world-models-experiments/
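For readers who hit the same symptom, here is a minimal sketch of how a missing flag can go unnoticed. It assumes the script reads its arguments by position (flag first, model file second). The `parse_args` helper is hypothetical and written only for illustration; it is not the repo's actual code.

```python
def parse_args(argv):
    """Hypothetical positional parser (an assumption, not the repo's code).

    argv[1] is treated as the render flag and argv[2] as the model file.
    """
    render_mode = len(argv) > 1 and argv[1] == "render"
    model_path = argv[2] if len(argv) > 2 else None
    return render_mode, model_path


# Flag omitted: the JSON path lands in the flag slot, so no model file is found.
without_flag = parse_args(["model.py", "log/carracing.cma.16.64.best.json"])

# Flag present: the JSON path is picked up as the model file.
with_flag = parse_args(["model.py", "norender", "log/carracing.cma.16.64.best.json"])

print(without_flag)
print(with_flag)
```

Under that assumption, the first call never sees a model path, so a script built this way could end up evaluating untrained parameters without raising an error, which would be consistent with rewards near zero.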

itabhiyanta · Answer 2 · Thu Dec 20 2018 17:27:45 GMT+0800 (China Standard Time)

Yep, that was it. I didn't use it because I thought that, since I don't use the gym environment in general, it didn't apply to me.
Thanks

hardmaru · Answer 3 · Thu Dec 20 2018 17:50:25 GMT+0800 (China Standard Time)

Cool. I'd be interested to see any results for custom environments, and I look forward to seeing your publications in the future.