openai / procgen

Procgen Benchmark: Procedurally-Generated Game-Like Gym-Environments

Home Page: https://openai.com/blog/procgen-benchmark/

Changing when the vectorized environments return `done=True`

alversafa opened this issue

I am trying to create vectorized environments that return `done=True` only after a certain number of episodes have passed, say 2. So in coinrun, when the agent dies (or gets the coin) for the first time, the step function should not return `done=True`. It should return `done=True` only at the end of the second episode.

More specifically, I am trying to create vectorized trials, rather than episodes, as in the RL^2 paper.

(It would also be great if I could return some additional information from the environment at every step.)

What would be the easiest way to achieve this?

You should be able to write a wrapper around the baselines VecEnv interface that overrides the value of `done`. Have you tried that yet?
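The wrapper idea could look roughly like the sketch below. It is not code from this repo: `TrialDoneWrapper`, `episodes_per_trial`, and the stand-in `FixedLengthVecEnv` are hypothetical names, and the wrapper is duck-typed so it runs without baselines installed. It assumes the wrapped env auto-resets each sub-environment when its episode ends, as baselines-style VecEnvs normally do. With baselines itself, the same logic would presumably go into a `VecEnvWrapper` subclass that overrides `step_wait`.

```python
class TrialDoneWrapper:
    """Report done=True only after `episodes_per_trial` episodes have ended.

    Assumption: `venv` has reset() and step(actions) -> (obs, rews, dones, infos)
    and auto-resets each sub-environment when its episode ends.
    """

    def __init__(self, venv, episodes_per_trial=2):
        self.venv = venv
        self.episodes_per_trial = episodes_per_trial
        self.episode_counts = None  # finished episodes per env in the current trial

    def reset(self):
        obs = self.venv.reset()
        self.episode_counts = [0] * len(obs)
        return obs

    def step(self, actions):
        obs, rews, dones, infos = self.venv.step(actions)
        if self.episode_counts is None:
            self.episode_counts = [0] * len(dones)
        trial_dones = []
        for i, done in enumerate(dones):
            if done:
                self.episode_counts[i] += 1
            trial_done = self.episode_counts[i] >= self.episodes_per_trial
            if trial_done:
                self.episode_counts[i] = 0  # start counting the next trial
            trial_dones.append(trial_done)
        # A plain list is returned; convert to an array if downstream code needs one.
        return obs, rews, trial_dones, infos


class FixedLengthVecEnv:
    """Hypothetical stand-in env: every episode lasts `episode_len` steps."""

    def __init__(self, num_envs=2, episode_len=3):
        self.num_envs = num_envs
        self.episode_len = episode_len
        self.t = 0

    def reset(self):
        self.t = 0
        return [0] * self.num_envs

    def step(self, actions):
        self.t += 1
        done = self.t % self.episode_len == 0
        n = self.num_envs
        return [self.t] * n, [0.0] * n, [done] * n, [{} for _ in range(n)]


env = TrialDoneWrapper(FixedLengthVecEnv(), episodes_per_trial=2)
env.reset()
trial_dones = [env.step([0, 0])[2][0] for _ in range(6)]
print(trial_dones)
```

In this sketch the inner episode boundaries are hidden from the caller entirely. If the agent needs to see them (as an RL^2-style agent typically does), the raw `dones` could be passed along in `infos` instead of being dropped.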

The more complicated way is to change the C++ code to behave the way you want, which I would only recommend if there's no way to do this in Python.

Adding information to the info dictionary from the Python side is straightforward. Adding things from the C++ side is fairly complicated; see this comment: #32 (comment)
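For the Python side, a minimal sketch of a wrapper that adds entries to each info dictionary might look like this. `ExtraInfoWrapper` and `extra_info_fn` are hypothetical names, not part of procgen or baselines, and the wrapper assumes `step` returns `infos` as a sequence of dicts, one per sub-environment.

```python
class ExtraInfoWrapper:
    """Merge extra key/value pairs into every per-env info dict.

    Assumption: `venv.step(actions)` returns (obs, rews, dones, infos),
    where `infos` is a sequence of dicts, one per sub-environment.
    """

    def __init__(self, venv, extra_info_fn):
        self.venv = venv
        # Hypothetical callback: (env_index, wrapper_step) -> dict of extra entries
        self.extra_info_fn = extra_info_fn
        self.step_count = 0

    def reset(self):
        self.step_count = 0
        return self.venv.reset()

    def step(self, actions):
        obs, rews, dones, infos = self.venv.step(actions)
        self.step_count += 1
        new_infos = []
        for i, info in enumerate(infos):
            merged = dict(info)  # copy so the wrapped env's dict is left untouched
            merged.update(self.extra_info_fn(i, self.step_count))
            new_infos.append(merged)
        return obs, rews, dones, new_infos
```

Anything computable in Python from the observations, rewards, or the wrapper's own counters can be returned this way. Values that only exist inside the game state would still need the C++ route.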

Does that answer your question?

Closing due to lack of activity.