API thoughts

Question

API thoughts

njsmith opened this issue 6 years ago · comments

Nathaniel J. Smith commented 6 years ago

This is super cool!

I want to change a bunch of the public API decisions :-).

I'll dump a list of ideas here, so we can discuss.

Starting the trio_asyncio loop

Right now, you have to use the trio_asyncio loop as the top-level entry point to your async code. I'm guessing this was forced on you by the old "parenting is a full time job" rule, but that's gone now :-). And I really think we want to switch this around.

What I'm imagining is, you start in trio, and create a loop with an async with statement, e.g.

async def main():
    async with trio_asyncio.open_loop() as loop:
        ...

trio.run(main)

Given that the main motivation for this package is using asyncio libraries from trio, this feels way more natural. It becomes possible to use an asyncio library from inside a trio library, without the trio library's ultimate end user having to know about it! It lets us drop tricky stuff like having to take over the asyncio loop during import. It lets you stop worrying about clock consistency across multiple calls to trio.run. And perhaps most interestingly, it lets us have multiple trio_asyncio loops active simultaneously and independently in different parts of the trio task tree, without interfering with each other. In particular, you can kill off part of the tree and know that any random detritus that was left running in that tree's aio loop won't "leak out" and keep running. It lets us keep the callback spaghetti in quarantine.

Detailed semantics

We should make sure that the loop is treated as the "current" event loop for asyncio. In practice, I think this means storing it in a trio.TaskLocal so we can find it again later, and then monkeypatching asyncio._get_running_loop and asyncio.events._get_running_loop so that they do something like:

def _our_get_running_loop():
    if this_is_a_trio_thread:
        loop = getattr(loop_tasklocal, "current_loop", None)
        if loop is None:
            raise RuntimeError("tried to use asyncio from within trio, without trio_asyncio.open_loop()")
    else:
        return _real_get_running_loop()

Our loop object doesn't have methods run_forever, run_until_complete, close, etc.: the loop starts running when it's created, and it stops running and is closed (cancelling anything remaining) when we leave the block. The equivalent of run_until_complete is call_asyncio (or run_asyncio, see below).

In order to support existing cases that use run_forever, we should provide a method await loop.wait_stopped(), which just pauses until someone calls loop.stop(). So instead of:

loop = ...
loop.run_until_complete(setup_fn)
try:
    loop.run_forever()
finally:
    loop.run_until_complete(teardown_fn)

you would write:

async with open_loop() as loop:
    await run_asyncio(setup_fn)
    try:
        await loop.wait_stopped()
    finally:
        await run_asyncio(teardown_fn)

(Considered alternatives: trio_asyncio.run is too close to run_asyncio; having run_asyncio start a loop on first invocation seems too magical -- loop is super super stateful so need to mark where it comes into existence. And it's nice to have creating the loop and entering asyncio as two separate actions, e.g. for those who want to create a loop at the top of their program and then do trio stuff underneath.)

loop.call_{trio,asyncio}

Two things:

Let's rename them run_trio, run_asyncio, for consistency with all of trio's other functions that run an async function and wait for it to finish.
Let's make them global functions, trio_asyncio.run_trio, trio_asyncio.run_asyncio, so you don't have to pass the loop around.

I'm not sure if wait_for is the best name... in asyncio there's a function called that, but what it actually does is impose a timeout on a future (very poorly chosen name). We can bikeshed that at some point I guess.

Synchronous functions

I think we can make it so that these "just work", in both directions, without special translation. The tricky issues for translating are (1) the different await protocols, (2) cancellation exceptions – right? And both of these only apply to async functions, right?

All the trio synchronous functions just need to be run inside trio.run, which everything here certainly is. And if we make get_event_loop() work, then I think that makes synchronous asyncio functions Just Work too.

Keyword arguments

I know it's convenient, but I think it's a mistake to support kwargs directly in all the call_* functions. It's inconsistent with both trio and asyncio's conventions, so anyone who uses this library is already going to be familiar with the partial trick. And there's a good reason for these conventions -- which you already ran into with the _scope kwarg. There are very few things in asyncio that seemed worth preserving in trio, but this was one of them :-)

Cancellation translation

Hoooboy this is tough. I'm not sure what the right thing to do here is, but the bit where it reaches "up" into trio to find a cancel scope to cancel is definitely wrong.

For an exception propagating from asyncio→trio, I don't know that there's anything better to do than just let the asyncio.CancelledError come out, and if you're using trio_asyncio then hopefully you're prepared for that, just like you're prepared for whatever other exceptions this asyncio library might raise as part of its interface.

For propagating from trio→asyncio, I think my comments in python-trio/trio#171 were pretty naive. (Maybe I didn't understand trio as well back then.) You can't even reasonably do except Cancelled:, because that will mess up on MultiErrors, and it's quite common to have a MultiError holding Cancelled exceptions. Maybe we do want to just let these exceptions ride.

This part will need more discussion for sure.

Matthias Urlichs · Answer 1 · Mon Jan 22 2018 10:19:56 GMT+0800 (China Standard Time)

OK. Food for thought (and coding).

Well, the original motivation was to have full bidirectional callability, for two reasons - (a) I wanted to be able to run the original Python asyncio tests on the trio-asyncio loop, to make sure the whole thing is correct and all that, (b) there might one day be a nice universe of trio-based libraries which you'd want to call from poor old ancient asyncio, instead of the other way 'round. ;-)
If we drop requirement (a) and basically require a Trio task to hook the loop into, the whole thing becomes a lot simpler.

I don't need to monkeypatch asyncio. Instead, I simply install my own loop policy, which is necessary anyway because auto-creation of the asyncio loop now becomes a big no-no.

As to wait_for, I'd rename it to run_coroutine and document that it also works for plain futures (and/or add run_future as an alias).

Cancelling: trio→asyncio: IMHO the decision whether to propagate an asyncio CancelledError as-is or by cancelling the enclosing scope should be left to the caller – either they explicitly pass in a scope, or not.

The scope to use now becomes the scope of the asyncio loop, if none is passed in explicitly, so that's easy to fix. ;-)

asyncio→trio: Instead of except Cancelled:, which I agree is obviously broken, the easy (and hopefully correct) way out is to create an explicit scope and cancel that when the Future gets cancelled / check whether that scope got cancelled and propagate that to the Future.

Nathaniel J. Smith · Answer 2 · Mon Jan 22 2018 16:00:25 GMT+0800 (China Standard Time)

Re: requiring that we start from Trio: oh ick, I forgot about the testing use case. (I don't care so much about the trio libraries on asyncio use case; it's irrelevant right now, and we can always revisit later.) I don't necessarily object to being able to run this as a standalone asyncio loop, but given that it adds a lot of complexity and we only really care about it for testing, I wonder if there's some ugly hack we can use to get this working just well enough for that. (And avoid making it a public API.)

Here's a gross idea that might work: We want to have everything happen inside a single call to trio.run, but we need to expose sync run_forever and run_until_complete methods. So, make a private subclass of our loop, that first thing it does is spawn a worker thread, which enters trio.run and sets up an event loop and then blocks on a synchronous Queue.get. Implement run_forever and run_until_complete as sending some message to this worker thread, then waiting for it to finish. The idea is that at any given moment exactly one of the threads is actually running and the other is blocked, so we preserve an illusion of there only being one thread of execution.

I don't need to monkeypatch asyncio. Instead, I simply install my own loop policy, which is necessary anyway because auto-creation of the asyncio loop now becomes a big no-no.

It took me a bit to remember why I had decided we needed to monkey-patch here... I think there are two reasons. First, BaseEventLoop.run_forever sets the running event loop at the beginning, and then unsets it at the end. If we copied that, it wouldn't work, because then if you had two event loops running simultaneously in different branches of the task tree, they'd collide. However, yeah, that's fine, we aren't using BaseEventLoop.run_forever, so get_event_loop will fall back on the event loop policy, which we can control.

The second reason is more real, but only matters in a pretty obscure case. In theory, someone might want to run trio in on thread and, I dunno, uvloop or something in another thread. So ideally, we'd only be taking over the event loop policy in the trio thread, not globally across the process. But there is no way to do that using the usual tools: the event loop policy gets to run arbitrary code, but is process-global; the running event loop is thread-local, but has to be a single value across the whole thread. We could potentially wait until someone complains about this to worry about it. It's also possible we might be able to convince Yury to make the running event loop a context-local (see PEP 567) in 3.7, and then we could use that b/c it would have different values in different trio tasks. (And if people who want to use uvloop and trio and asyncio_trio all together then just tell them they need to upgrade to 3.7.)

As to wait_for, I'd rename it to run_coroutine and document that it also works for plain futures (and/or add run_future as an alias).

Hmm, I think I vote we just have run_future, no aliases, and document that it also works for coroutines... but for coroutines run_asyncio is more natural so we don't have to make a big deal about it. Awaiting a Future is the main case not covered by run_asyncio.

Cancelling: trio→asyncio: IMHO the decision whether to propagate an asyncio CancelledError as-is or by cancelling the enclosing scope should be left to the caller – either they explicitly pass in a scope, or not.

I think there must be something we're thinking about differently here, because this concept just doesn't make any sense to me at all. I don't understand why cancelling the enclosing scope would ever make sense.

However, after sleeping on it, the cancellation issues seem less scary. For trio calling asyncio, if the trio side gets cancelled, it should cancel the asyncio code, wait for it to finish, and if it raises asyncio.CancelledError, then we can replace that with the trio.Cancelled that the trio cancellation machinery tried to get us to raise in the first place. Something like:

from trio.hazmat import reschedule, Result, Abort

async def run_future(self, future):
    task = trio.hazmat.current_task()
    raise_cancel = None

    def done_cb(_):
        reschedule(task, Result.capture(future.result))

    future.add_done_callback(done_cb)

    def abort_cb(raise_cancel_arg):
        # Save the cancel-raising function
        nonlocal raise_cancel
        raise_cancel = raise_cancel_arg
        # Attempt to cancel our future
        future.cancel()
        # Keep waiting
        return Abort.FAILED

    try:
        return await trio.hazmat.wait_task_rescheduled(abort_cb)
    except asyncio.CancelledError as exc:
        if raise_cancel is not None:
            try:
                raise_cancel()
            finally:
                # Try to preserve the exception chain for more detailed tracebacks
                sys.exc_info()[1].__cause__ = exc
        else:
            raise

For asyncio calling trio, your plan sounds right to me. If we want to get fancy then I guess we could even make sure that the asyncio.CancelledError gets its __cause__ set to the trio.Cancelled exceptions it catches, but that would require (a) hacking/subclassing asyncio.Future to set __cause__ appropriately whenever it raises asyncio.CancelledError, and (b) implementing python-trio/trio#285 . So I think this can wait for version 2 :-).

That leaves the more subtle cases:

The task we're cancelling in run_future refuses to stop: sometimes stuff isn't cancellable. It happens. Trio's normal semantics in this case are to wait for it. For example, you can't cancel run_sync_in_worker_thread, because synchronous code just doesn't support that. So that's what the above code does. We might want to add something like run_sync_in_worker_thread's cancellable argument, which defaults to False, but you can set it to True, which means "if this gets cancelled, just abandon the thread and let it run until it finishes". This is useful only in special cases where the thing in the thread is known to be side-effect-free, but there are important cases where that's true (like getaddrinfo).

The task we're cancelling in run_future doesn't raise CancelledError: that also happens – maybe it completed just before we tried to cancel it. Trio's normal semantics in this case are to treat that as, well, I guess it wasn't cancelled after all. So that works here too.

Unexpected asyncio.CancelledError in run_future: in the code above, if we cancelled the future and then got an asyncio.CancelledError, we convert it into a trio.Cancelled. But what if we just randomly get an asyncio.CancelledError without that? (Like: fut = Future(); fut.cancel(); await run_future(fut).) Right now trio-asyncio reaches up to cancel some random scope. I think in this case the best thing to do is just let the asyncio.CancelledError keep propagating, and let the user handle it however they want. If they don't like that, then they should stop calling fut.cancel :-). And this kind of issue is pretty normal when using asyncio, so I feel like it's just part of the price you pay here.

The trio-asyncio main loop/watchers/etc. get cancelled: This happens if someone does:

with open_cancel_scope():
    async with open_loop():
        ...

i.e., the whole thing gets cancelled. In this case, I think (maybe?) what we want to do is to shield all the internal worker tasks from cancellation, and only propagate the cancellation to the body of the async with block. Which presumably is blocked inside some call to run_future, which will be converted into a fut.cancel(), which will cause a Task to start cleaning itself up, etc. If that makes sense we can talk about how to implement it.

Nathaniel J. Smith · Answer 3 · Sat Jan 27 2018 17:57:30 GMT+0800 (China Standard Time)

Here's some untested code to illustrate my terrible idea about how to let us run, say, the aiohttp test suite against TrioEventLoop without having to generally support the run_until_complete and similar APIs:

import queue
import trio
from trio_asyncio import open_loop, run_future
import threading
from functools import partial

class HackyTestingLoop(TrioEventLoop):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.__blocking_job_queue = queue.Queue()
        self.__blocking_result_queue = queue.Queue()
        self.__thread = threading.Thread(
            target=trio.run,
            args=(self.__trio_thread_main,))

    async def __trio_thread_main(self):
        async with open_loop():
            while True:
                # This *blocks*
                async_fn = self.__blocking_job_queue.get()
                if async_fn is None:
                    self.close()
                    return
                result = await trio.hazmat.Result.acapture(async_fn)
                self.__blocking_result_queue.put(result)

    def __run_in_thread(self, async_fn, *args):
        if not self.__thread.is_alive():
            self.__thread.start()
        self.__blocking_job_queue.put(partial(async_fn, *args))
        return self.__blocking_result_queue.get().unwrap()

    def run_until_complete(self, fut):
        return self.__run_in_thread(run_future, fut)

    def run_forever(self):
        # TODO: need to reset the stopped Event, because asyncio loops support
        # multiple rounds of run_forever/stop/run_forever/stop. Maybe instead
        # of wait_stopped, we want a function that resets the event AND then
        # blocks waiting for it, similar to how BaseEventloop.run_forever
        # always resets the internal _stopped flag?
        self.__run_in_thread(self.wait_stopped)

    def close(self):
        self.__blocking_job_queue.put(None)
        self.__thread.join()

    def __enter__(self):
        return self

    def __exit__(self, _, _, _):
        self.close()

# TODO: an event loop policy that returns HackyTestingLoop objects

Matthias Urlichs · Answer 4 · Sat Jan 27 2018 23:58:58 GMT+0800 (China Standard Time)

Thanks. I'll investigate using something like this to support the run_forever/run_until_complete test cases. The problem is that we need to use the same TrioEventLoop in both threads, so your code doesn't work as-is.
I'll open another issue to investigate.
Another problem is that the asyncio test suite is neither upwards- nor downwards-compatible to the actual asyncio implementation, which is neither upwards- nor downwards-compatible to the rest of Python. Some creative use of pytest.skipif() may be required – I don't want to ship multiple versions of that test suite.

The rest is now implemented.