openai / procgen

Procgen Benchmark: Procedurally-Generated Game-Like Gym-Environments

Home Page: https://openai.com/blog/procgen-benchmark/

Using procgen with parallelized libraries (e.g. rlpyt)

bmazoure opened this issue

When running procgen with rlpyt, the library uses multiprocessing to perform environment interactions. The following snippet mirrors what the library does internally and is expected to run, but it hangs on the join:

import gym
import multiprocessing as mp

env = gym.make("procgen:procgen-coinrun-v0")
env.reset()

# Stepping the already-created env from a forked worker never returns,
# so the join below hangs.
w = mp.Process(target=env.step, args=(0,))
w.start()
w.join()

Would there be any workaround to prevent the deadlock?

This looks like procgen is not fork-safe. Many libraries are not fork-safe (TensorFlow, for example), and using fork instead of spawn can cause a lot of issues.

Here are some options:

  1. use spawn instead of fork (possible through a multiprocessing context), or keep fork but create the environment after creating the subprocess (see the first sketch after this list)
  2. use the built-in vectorization that procgen already has (this may not be worth it because the VecEnv interface is baselines-specific; see the second sketch below)
  3. add a hack to disable threads in procgen when num_envs=1, which may make it fork-safe. It sounds like Qt in general is not fork-safe, though, so that may not work.
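
For option 1, a minimal sketch might look like the following. The gym id is the same coinrun env as in the snippet above; the worker function and the use of a "spawn" context are assumptions about how the surrounding code could be restructured so the environment is constructed inside the child process rather than inherited across a fork:

import gym
import multiprocessing as mp

def worker():
    # Build the environment inside the child process so that no procgen/Qt
    # state is inherited from the parent.
    env = gym.make("procgen:procgen-coinrun-v0")
    env.reset()
    obs, reward, done, info = env.step(0)
    print("stepped OK, reward =", reward)

if __name__ == "__main__":
    ctx = mp.get_context("spawn")  # spawn avoids carrying over non-fork-safe state
    w = ctx.Process(target=worker)
    w.start()
    w.join()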

What do you think of these options?
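
For completeness, option 2 (procgen's built-in vectorization) would look roughly like the sketch below; it assumes the ProcgenEnv wrapper exposed by the procgen package and its baselines-style VecEnv step API:

import numpy as np
from procgen import ProcgenEnv

# Single process, native C++ vectorization: no Python-level multiprocessing needed.
venv = ProcgenEnv(num_envs=4, env_name="coinrun")
obs = venv.reset()                       # dict of batched observations, e.g. obs["rgb"]
actions = np.zeros(4, dtype=np.int32)    # one action per sub-environment
obs, rewards, dones, infos = venv.step(actions)
print(rewards.shape)                     # -> (4,)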

It seems like option 1 might be the best long-term solution, but it requires changing how the specific library (rlpyt) works. Option 3 seems like a good immediate solution; could you point me to somewhere in the procgen code to start with?

This has fixed the issue and the env is now compatible with rlpyt. Thanks!