Detect when nested nursery and asyncio loop end up in silent jamming

Question

touilleMan opened this issue 6 years ago

Considering the following code:

import pytest
import trio
import trio_asyncio
import triopg

@pytest.fixture
async def nursery2():
    async with trio.open_nursery() as nursery:
        yield nursery

@pytest.fixture
async def asyncio_loop2(nursery2):
    async with trio_asyncio.open_loop() as loop:
        yield loop

@pytest.mark.trio
async def test_wrong_fixture_teardown_order(
        nursery2, asyncio_loop2, postgresql_connection_specs
):

    async def db_runner(*, task_status=trio.TASK_STATUS_IGNORED):
        async with triopg.connect(**postgresql_connection_specs):
            task_status.started()

    await nursery2.start(db_runner)

Running this test hangs forever (it doesn't even react to ^C).

The trouble is that asyncio_loop2 is torn down before nursery2, since it depends on it.
But because nursery2 still contains a coroutine that calls into trio-asyncio, it ends up in a weird jammed state.

This issue also occurs when nesting the nursery and the asyncio loop in the wrong order, though there is less magic involved, so it is much easier to spot than with pytest-trio fixtures:

@pytest.mark.trio
async def test_wrong_context_manager_order(
        postgresql_connection_specs
):
    async def db_runner(*, task_status=trio.TASK_STATUS_IGNORED):
        async with triopg.connect(**postgresql_connection_specs):
            task_status.started()

    async with trio.open_nursery() as nursery:
        async with trio_asyncio.open_loop():
            await nursery.start(db_runner)

I'm wondering if we could come up with a solution to easily detect this, for instance using a context var as a canary to detect if there is no longer any trio-asyncio loop available in the current context.
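
To make the canary idea concrete, here is a minimal sketch using only the standard library. The names LoopCanary, current_canary, open_loop_canary and check_loop_available are hypothetical and are not part of trio-asyncio. It assumes a task runs with a copy of the context it was spawned in, so simply resetting the variable would not be visible to that task; the sketch therefore stores a shared object and flips a flag on it when the loop goes away.

```python
import contextvars
from contextlib import contextmanager


class LoopCanary:
    """Hypothetical stand-in for a trio-asyncio loop handle."""

    def __init__(self):
        self.closed = False


# Hypothetical context variable, not part of trio-asyncio.
current_canary = contextvars.ContextVar("current_canary", default=None)


@contextmanager
def open_loop_canary():
    canary = LoopCanary()
    token = current_canary.set(canary)
    try:
        yield canary
    finally:
        # Contexts copied while the loop was open still reference this
        # object, so they observe the flag even after the reset below.
        canary.closed = True
        current_canary.reset(token)


def check_loop_available():
    canary = current_canary.get()
    if canary is None:
        raise RuntimeError("no loop was opened in this context")
    if canary.closed:
        raise RuntimeError("the loop opened in this context has been closed")
    return canary


with open_loop_canary():
    # Simulates a task spawned while the loop was still open.
    inherited = contextvars.copy_context()
    inherited.run(check_loop_available)  # passes: the loop is open

try:
    # The "task" outlives the loop, as in the wrongly ordered teardown.
    inherited.run(check_loop_available)
except RuntimeError as exc:
    message = str(exc)
```

A check like this could run before anything is handed to the loop, turning the silent hang into an error at the call site.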

Matthias Urlichs · Answer 1 · Fri Aug 24 2018 22:22:17 GMT+0800 (China Standard Time)

The root problem is that there's still code around to handle stopped-but-not-closed loops (you need that stuff for loop.run_until_complete and friends).

I should remove that nonsense and declare a stopped loop to be closed, which will raise errors whenever somebody tries to queue something to it, which will (I hope …) cause a huge nasty traceback instead of hanging.
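
For comparison, the standard library's asyncio loop already fails loudly once it is closed: scheduling a callback raises instead of queueing it. The snippet below uses plain asyncio only, not trio-asyncio, and is meant purely as an illustration of the kind of error one would hope to see in place of the hang.

```python
import asyncio

loop = asyncio.new_event_loop()
loop.close()

raised = False
error = ""
try:
    # Queueing work on a closed loop is rejected immediately.
    loop.call_soon(print, "never runs")
except RuntimeError as exc:
    raised = True
    error = str(exc)
```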

Emmanuel Leblond · Answer 2 · Tue Aug 28 2018 03:36:54 GMT+0800 (China Standard Time)

"which will (I hope …) cause a huge nasty traceback instead of hanging."

That would be much better indeed ^^

Is there something I can do to help with this?

Matthias Urlichs · Answer 3 · Tue Aug 28 2018 21:19:45 GMT+0800 (China Standard Time)

Yes. Create a test case that hangs, ideally one that sends SIGINT to itself instead of requiring the user to press ^C.
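
A rough sketch of how a test could send SIGINT to itself, using only the standard library (signal.raise_signal needs Python 3.8 or later). The helpers interrupt_self_after, reacts_to_sigint and pretend_to_hang are hypothetical names, and pretend_to_hang is only a stand-in for the real hanging test body. It assumes the default SIGINT handler is installed and the call runs in the main thread. A hang that truly ignores ^C would not be rescued by this.

```python
import signal
import threading
import time


def interrupt_self_after(delay):
    """Arrange for this process to receive SIGINT after `delay` seconds."""
    timer = threading.Timer(delay, signal.raise_signal, args=(signal.SIGINT,))
    timer.daemon = True
    timer.start()
    return timer


def reacts_to_sigint(blocking_call, delay=0.2):
    """Return True if `blocking_call` was interrupted by the self-sent SIGINT."""
    timer = interrupt_self_after(delay)
    try:
        blocking_call()
    except KeyboardInterrupt:
        return True
    finally:
        timer.cancel()
    return False


def pretend_to_hang(seconds=5.0):
    # Stand-in for the hanging test body; it polls in short sleeps so the
    # pending interrupt is noticed quickly by the main thread.
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        time.sleep(0.01)


interrupted = reacts_to_sigint(pretend_to_hang)
```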

Emmanuel Leblond · Answer 4 · Wed Aug 29 2018 22:17:18 GMT+0800 (China Standard Time)

@smurfix I've created a PR with two tests that hang forever (not sure where to put them though, so they are in test_misc.py for the moment...)

Matthias Urlichs · Answer 5 · Sat Sep 01 2018 03:45:22 GMT+0800 (China Standard Time)

Should be solved in #38