cgarciae / pypeln

Concurrent data pipelines in Python

Home Page:https://cgarciae.github.io/pypeln

[Bug] Program hangs when input is throttled by maxsize and an exception is raised in a stage

chengjinluo opened this issue · comments

Describe the bug
Please refer to the code example.

After some investigation, I found that all process workers exit successfully.
However, the initial worker thread, which feeds the input numbers into the queue, is stuck on the blocking call multiprocessing.Queue.put, so it cannot receive the exception raised by stopit.

Manually editing the use_thread default value to False in pypeln.process.worker.start_workers works around the issue (the initial worker then starts as a process instead of a thread, so it can be terminated by signal).

It seems that pypeln.thread.Queue.IterableQueue has a different implementation: it attempts the put operation with a timeout, so the thread is able to receive the exception raised by stopit.
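To illustrate the difference, here is a minimal sketch of the timeout-based put pattern described above. The name `safe_put` and the `stop_event` parameter are illustrative, not pypeln's actual API; the point is that returning to Python bytecode between attempts gives an asynchronous exception (such as one raised by stopit) a chance to be delivered, instead of blocking forever inside `put`:

```python
import queue
import threading

def safe_put(q, item, timeout=0.1, stop_event=None):
    """Put with a timeout loop so the calling thread can observe
    asynchronous exceptions between attempts.

    Works with both queue.Queue and multiprocessing.Queue, since
    both raise queue.Full when the timeout expires.
    """
    while True:
        try:
            q.put(item, timeout=timeout)
            return
        except queue.Full:
            # Control returns to Python bytecode here, so a pending
            # async exception can be raised instead of hanging.
            # A stop flag can also be checked to bail out cleanly.
            if stop_event is not None and stop_event.is_set():
                return
```

In contrast, a plain blocking `q.put(item)` on a full queue never returns to the interpreter loop, which is why the feeder thread in the report hangs even after the workers have died.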

Minimal code to reproduce

import pypeln as pl

def proc(x):
    print(x, flush=True)
    if x == 10:
        # raise an exception inside a worker after some items were processed
        assert False

if __name__ == '__main__':
    # maxsize=4 throttles the feeder thread, which then blocks on put
    stage = pl.process.map(proc, list(range(100)), maxsize=4, workers=4)
    list(stage)

Expected behavior
Program exits with exception.

Library Info
OS: Ubuntu 22.04
python=3.10.12
pypeln=0.4.9
