Error with custom pipeline inside main script module (`joblib.externals.loky.process_executor.BrokenProcessPool`)
jpeoples opened this issue
I was writing a custom pipeline using the library. The file is structured like:

```python
import ...
from imgtools.pipeline import Pipeline

class MyPipeline(Pipeline):
    def __init__(self, ...):
        # set up pipeline
        super().__init__(n_jobs=-1)

    def process_one_subject(self, subject_id):
        # custom code here

def main():
    # handle command line args and run pipeline

if __name__ == "__main__":
    main()
```
This was failing on Windows 11 with a `BrokenProcessPool` error (full traceback below). The error appears to be related to pickling for multiprocessing, and does not occur with `n_jobs=1`.
The error can be worked around by separating the script being executed from the module containing the pipeline and `main` function. That is, in the above example, remove

```python
if __name__ == "__main__":
    main()
```

then create a new wrapper script:

```python
from my_pipeline_module import main

if __name__ == "__main__":
    main()
```

and execute that, rather than the module itself.
It is possible that this is a Windows-specific error (see here, for example).
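One plausible reason for the platform dependence: on Windows, `multiprocessing` defaults to the "spawn" start method, which launches a fresh interpreter for each worker and re-imports the main script, so everything handed to workers must be picklable by reference to an importable module; on Linux the default is "fork". A small stdlib sketch to inspect this on the current platform:

```python
import multiprocessing as mp

# "spawn" (the Windows default) starts each worker as a fresh
# interpreter that re-imports the main script, so all task arguments
# must be picklable.  "fork" (the Linux default) inherits the parent's
# memory and sidesteps many pickling issues -- one reason a bug like
# this can be platform-specific.
print("default start method:", mp.get_start_method())
print("available methods:   ", mp.get_all_start_methods())
```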
I'm not sure this is something that can be fixed within med-imagetools. If not, though, it may be worth documenting somewhere.
Traceback:
```
Traceback (most recent call last):
  File "F:\SimpsonLab\r01_aim2\pipeline_improvement\.venv\lib\site-packages\joblib\externals\loky\process_executor.py", line 391, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "C:\Users\jacob\AppData\Local\Programs\Python\Python38\lib\multiprocessing\queues.py", line 116, in get
    return _ForkingPickler.loads(res)
TypeError: tuple expected at most 1 argument, got 3
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\jacob\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\jacob\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "f:\simpsonlab\r01_aim2\pipeline_improvement\r01_crlm_aim2_pipeline\chi\r01_crlm_aim2\pipeline.py", line 135, in <module>
    if __name__ == "__main__": main()
  File "f:\simpsonlab\r01_aim2\pipeline_improvement\r01_crlm_aim2_pipeline\chi\r01_crlm_aim2\pipeline.py", line 133, in main
    pipeline.run()
  File "F:\SimpsonLab\r01_aim2\pipeline_improvement\.venv\lib\site-packages\imgtools\pipeline.py", line 106, in run
    Parallel(n_jobs=self.n_jobs, verbose=verbose)(
  File "F:\SimpsonLab\r01_aim2\pipeline_improvement\.venv\lib\site-packages\joblib\parallel.py", line 1098, in __call__
    self.retrieve()
  File "F:\SimpsonLab\r01_aim2\pipeline_improvement\.venv\lib\site-packages\joblib\parallel.py", line 975, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "F:\SimpsonLab\r01_aim2\pipeline_improvement\.venv\lib\site-packages\joblib\_parallel_backends.py", line 567, in wrap_future_result
    return future.result(timeout=timeout)
  File "C:\Users\jacob\AppData\Local\Programs\Python\Python38\lib\concurrent\futures\_base.py", line 439, in result
    return self.__get_result()
  File "C:\Users\jacob\AppData\Local\Programs\Python\Python38\lib\concurrent\futures\_base.py", line 388, in __get_result
    raise self._exception
joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable
```
Can you try running the custom Pipeline with `n_jobs=1`? It might be a bug with the multiprocessing backend. It may also be helpful to let users select which multiprocessing backend to use, in case of incompatibilities like this.
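For reference, joblib already exposes backend selection through its `parallel_backend` context manager. A sketch (using plain joblib, outside med-imagetools) of forcing the thread-based backend, which skips pickling entirely:

```python
from joblib import Parallel, delayed, parallel_backend

def square(x):
    return x * x

# "threading" runs tasks in threads of the current process, so nothing
# is pickled -- useful when the default process-based "loky" backend
# trips over unpicklable arguments (at the cost of GIL contention for
# CPU-bound work).
with parallel_backend("threading"):
    results = Parallel(n_jobs=2)(delayed(square)(i) for i in range(5))

print(results)  # [0, 1, 4, 9, 16]
```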
Yes, this is correct, it doesn't happen with `n_jobs=1`.
@jpeoples Does this error occur with fewer `n_jobs`? I'm wondering whether it just hit a hardware constraint or it's legitimately a bug (it looks like it could be related to pickling?)
I'm mostly surprised because joblib passes our Windows CI/CD tests, but it doesn't seem to be too happy here. Also, AutoPipeline runs properly on my personal PC, and I haven't seen this error before.
@skim2257 -- I tried `n_jobs=2` -- same error.
I now think it has something to do with the interaction between my code for this project and the multiprocessing backend, rather than a more general error. I tried making a minimal custom pipeline and couldn't reproduce the error at all, regardless of `n_jobs`.
I was originally thinking the `if __name__ == "__main__"` block was causing a problem with the pickling, but I don't think that's the case, given that the minimal custom pipeline fails to reproduce it. I'm at a loss as to what the problem is -- luckily it is easy to work around.
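For anyone trying to reproduce this, a minimal layout would look roughly like the following: a stdlib analogue of the failing structure, with hypothetical names and `ProcessPoolExecutor` standing in for the joblib/loky machinery that the real `Pipeline` base class uses.

```python
# minimal_pipeline.py -- stdlib sketch of the failing layout: a class
# whose bound method is dispatched to worker processes from the very
# file that is executed as the main script.
from concurrent.futures import ProcessPoolExecutor

class MyPipeline:
    def process_one_subject(self, subject_id):
        return f"processed {subject_id}"

def main():
    pipeline = MyPipeline()
    # Bound methods are pickled by reference to their defining module;
    # when that module is the executed script itself, a "spawn" worker
    # must re-import it as __main__ to resolve the reference.
    with ProcessPoolExecutor(max_workers=2) as pool:
        results = list(pool.map(pipeline.process_one_subject, range(3)))
    print(results)

if __name__ == "__main__":
    main()
```

If a layout like this runs cleanly on Windows, the trigger is presumably something more specific in the project code being shipped to the workers.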
I'll close for now -- if I happen to get to the bottom of it, I'll let you know.