Running the mnist_hogwild integration example with the official wheel fails with a multiprocessing exception; simple fix provided
ShiboXing opened this issue · comments
Context
- PyTorch version: 1.12.1 or lower
- Operating System and version: Ubuntu 20.04.4 LTS
Your Environment
- Installed using source? [yes/no]: no
- Are you planning to deploy it using docker container? [yes/no]: yes
- Is it a CPU or GPU environment?: GPU
- Which example are you using: mnist_hogwild
- Link to code or data to repro [if any]:
Expected Behavior
mnist_hogwild is expected to finish with no error
Current Behavior
The example fails at mp.set_start_method('spawn') with "RuntimeError: context has already been set".
Possible Solution
Wrap the call so that an already-set context is tolerated:
    try:
        mp.set_start_method('spawn')
    except RuntimeError:
        pass
Tested and working.
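For reference, the guard above can be sketched as a small self-contained script. This is only an illustration under assumptions: it uses the standard-library multiprocessing module rather than the mp alias from the example's main.py, and ensure_spawn is a hypothetical helper name that does not appear in the example.

```python
import multiprocessing as mp


def ensure_spawn():
    """Request the 'spawn' start method, tolerating an already-set context.

    ensure_spawn is a hypothetical helper name used only for this sketch.
    """
    try:
        mp.set_start_method('spawn')
    except RuntimeError:
        # The context was already set elsewhere; keep whatever is in place.
        pass
    return mp.get_start_method()


if __name__ == '__main__':
    ensure_spawn()
    ensure_spawn()  # a second call no longer raises
    print(mp.get_start_method())
```

Note that this guard keeps whichever start method was set first, so it does not guarantee that 'spawn' is the method actually in use.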
Steps to Reproduce
pip install torch torchvision
cd examples
bash run_python_examples.sh mnist_hogwild
...
Failure Logs [if any]
...
Traceback (most recent call last):
  File "/home/ubuntu/Playground/examples/mnist_hogwild/main.py", line 75, in <module>
    mp.set_start_method('spawn')
  File "/opt/conda/lib/python3.9/multiprocessing/context.py", line 243, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set
mnist hogwild failed
Finished mnist_hogwild, status 0
Some examples failed:mnist hogwild failed
Hi @ShiboXing, I'm not able to reproduce the issue on my side (torch==1.12.1 on Ubuntu 22.04 LTS), and CI is green as well. Here is a simpler alternative: mp.set_start_method('spawn', force=True). Please feel free to submit a PR for it.
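A minimal sketch of the force=True alternative, again assuming the standard-library multiprocessing module in place of the example's mp alias; force_spawn is a hypothetical helper name used only here.

```python
import multiprocessing as mp


def force_spawn():
    """Set the start method to 'spawn', replacing any context already set.

    force_spawn is a hypothetical helper name used only for this sketch.
    """
    mp.set_start_method('spawn', force=True)
    return mp.get_start_method()


if __name__ == '__main__':
    force_spawn()
    force_spawn()  # repeated calls do not raise
    print(mp.get_start_method())
```

Unlike the try/except guard, force=True replaces a previously chosen start method instead of keeping it, so 'spawn' is in effect after the call.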