graphql-python / graphql-core

A Python 3.6+ port of the GraphQL.js reference implementation of GraphQL.

Improve performance of async execution

Cito opened this issue · comments

When executing a query against a schema that contains async resolvers, the executor incurs some overhead by defining and calling async functions (async def async_... inside methods in execute.py).

We should check whether pre-defining these functions as methods or static methods of the ExecutionContext could reduce this overhead and boost performance. See also the benchmarks provided in #141.
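To make the suspected overhead concrete, here is a standalone micro-benchmark sketch (the helper names are illustrative, not graphql-core code): it compares creating a coroutine from an async closure defined on every call, as execute.py currently does, against reusing a predefined coroutine function.

```python
import asyncio
import timeit

async def predefined(value):
    # Defined once, like a module-level or class-level coroutine function.
    return value

def make_closure_coroutine(value):
    # Mimics execute.py defining `async def async_...` inside a method:
    # a new function object is created on every call.
    async def async_inner():
        return value
    return async_inner()

async def run_many(factory, n=10000):
    # Await n coroutines produced by the given factory inside one loop.
    for _ in range(n):
        await factory(1)

closure_time = timeit.timeit(
    lambda: asyncio.run(run_many(make_closure_coroutine)), number=5)
predefined_time = timeit.timeit(
    lambda: asyncio.run(run_many(predefined)), number=5)
print(f"closure: {closure_time:.3f}s, predefined: {predefined_time:.3f}s")
```

The printed timings quantify how much the per-call function definition actually costs before committing to a refactor.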

@Cito would you be open to splitting the ExecutionContext class into sync and async versions? That way we can drop most of the is_awaitable checks and simplify the code significantly.

We should consider all options, but the API should stay backward compatible, so as not to cause more turmoil in the GraphQL-Python ecosystem, and the code and API should not deviate too much from GraphQL.js so that it is easier to keep up with the development there.

Currently, the ExecutionContext can deal with mixed resolvers (some sync, some async). This is similar to GraphQL.js, but it is more complicated to implement in Python because, unlike in JavaScript, you can't await something that is not awaitable.
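A minimal illustration of the difference (not graphql-core's actual code): in JavaScript you can await any value, while in Python awaiting a plain value raises a TypeError, so the executor has to branch on an awaitability check for every resolved value.

```python
import asyncio
from inspect import isawaitable

def sync_resolver(obj, info):
    return "sync value"

async def async_resolver(obj, info):
    return "async value"

async def complete_value(resolver):
    result = resolver(None, None)
    # This is the kind of check the executor must repeat everywhere;
    # `await result` on a plain value would raise a TypeError.
    if isawaitable(result):
        result = await result
    return result

print(asyncio.run(complete_value(sync_resolver)))   # sync value
print(asyncio.run(complete_value(async_resolver)))  # async value
```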

What we could do is create optimized versions (possibly subclasses) of the ExecutionContext for the two cases when all resolvers (including type resolvers) are sync or all are async. The schema could get an attribute that tells whether it is sync, async or mixed, which could be determined automatically via introspection at schema build time. If possible, the executor could then call the optimized version. This way it would be fully backward compatible and transparent to the caller, and always as performant as possible.
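The dispatch described above could be sketched roughly as follows. All names here (the execution_mode attribute, the context subclasses) are hypothetical, not part of graphql-core's API: the point is only that the executor could pick an optimized context transparently, based on an attribute determined at schema build time.

```python
class ExecutionContext:
    """Handles mixed sync/async resolvers (the current, general behavior)."""

class SyncExecutionContext(ExecutionContext):
    """Optimized: may assume every resolver is synchronous."""

class AsyncExecutionContext(ExecutionContext):
    """Optimized: may assume every resolver is asynchronous."""

# Hypothetical mapping from a schema-level attribute to a context class.
_CONTEXT_CLASSES = {
    "sync": SyncExecutionContext,
    "async": AsyncExecutionContext,
    "mixed": ExecutionContext,
}

def select_context_class(execution_mode: str) -> type:
    # Fall back to the general context when the mode is unknown,
    # which keeps the change fully backward compatible.
    return _CONTEXT_CLASSES.get(execution_mode, ExecutionContext)

print(select_context_class("sync").__name__)  # SyncExecutionContext
```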

If we do this, we should strive for as little code duplication as possible (just as much as needed to solve the performance issues).

Before experimenting with such a big (internal) change, we should first check how much we can gain by predefining the async functions as explained above. Maybe this already suffices. As a further optimization, in the case of a purely sync or purely async schema, you can also pass an is_awaitable function that always returns either False or True. This would avoid the costly real checks every time (see also #54 regarding the is_awaitable function and the optimization that has already been done there).
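The constant-predicate idea can be sketched in isolation like this (the complete helper below is hypothetical, standing in for the executor's hot path; if I recall correctly, graphql-core's execute() does accept an is_awaitable argument for exactly this kind of substitution):

```python
from inspect import isawaitable

def never_awaitable(_value) -> bool:
    # Constant predicate: safe only if *all* resolvers are sync.
    return False

def complete(value, is_awaitable=isawaitable):
    # Stand-in for the executor's hot path: this predicate runs
    # for every resolved field value.
    if is_awaitable(value):
        raise NotImplementedError("would await here")
    return value

# With a purely sync schema, the real (costly) check is skipped entirely.
print(complete(42, is_awaitable=never_awaitable))  # 42
```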

Using a profiler like Yappi should also give insights into where things can be further optimized.

We should consider all options, but the API should stay backward compatible, so as not to cause more turmoil in the GraphQL-Python ecosystem, and the code and API should not deviate too much from GraphQL.js so that it is easier to keep up with the development there.

Absolutely!

Before experimenting with such a big (internal) change, we should first check how much we can gain by predefining the async functions as explained above. Maybe this already suffices.

Agreed. I can try and put together a PR to see how much it would improve things.

As a further optimization, in the case of a purely sync or purely async schema, you can also pass an is_awaitable function that always returns either False or True.

I think this already happens on the sync execution pathway, but I don't think it's possible to always return True for async schemas because the default resolver is still sync.

I think this already happens on the sync execution pathway, but I don't think it's possible to always return True for async schemas because the default resolver is still sync.

Good points. Regarding the default resolvers, we could either use async default resolvers in this case or we could simply make a case distinction in the async execute_field method when there is no field resolver, and similarly for the type resolver in complete_abstract_value.
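The async-default-resolver option could look roughly like this (illustrative names only; sync_default_resolver stands in for graphql-core's default field resolver, and the wrapper is a hypothetical helper, not existing API):

```python
import asyncio
from inspect import iscoroutinefunction

def sync_default_resolver(source, field_name):
    # Stand-in for the sync default field resolver.
    return getattr(source, field_name, None)

def as_async_resolver(resolver):
    # Case distinction: resolvers that are already async pass through;
    # sync ones are wrapped so an all-async context can assume every
    # resolver returns an awaitable.
    if iscoroutinefunction(resolver):
        return resolver

    async def async_wrapper(*args, **kwargs):
        return resolver(*args, **kwargs)

    return async_wrapper

class Source:
    name = "GraphQL"

resolver = as_async_resolver(sync_default_resolver)
print(asyncio.run(resolver(Source(), "name")))  # GraphQL
```

With this in place, an all-async execution context could use a constant is_awaitable that always returns True, at the cost of one extra coroutine per defaulted field.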

See also #101 for performance optimization.

For the record: In #142 the locally defined async methods were moved to the class level, but this did not make things better. The next step could be to create an execution context that assumes every resolver is async and is optimized for that case, and see whether that improves performance significantly.