smol-rs / blocking

A thread pool for isolating blocking I/O in async programs

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

[Discussion] A more ergonomic way to deal with 'blocking' futures

Byron opened this issue · comments

At gitoxide, a git implementation in Rust, most code is blocking as it relies heavily on operations which aren't available in 'real' async or are CPU bound.

When implementing the server side or clients it becomes evident that due to the ecosystem one won't get around using async, and those using gitoxide for custom clients or on the server will have to integrate with async crates or framework.

To get started with this and make what's there more flexible the client side is now available in async and a prototype client program exists to test the blocking/async interoperation from start to end.

As all git repository interactions are blocking, async client code coming from the transport or protocol layers will have to call into that. The async parts must not block the executor, and parts of or the entire operation have to be unblocked.

With the current API this looks like this:

        let transport = net::connect(url.as_bytes(), protocol.unwrap_or_default().into()).await?;
        let mut delegate = CloneDelegate {
            ctx,
            directory,
            refs_directory,
            ref_filter: None,
        };
        blocking::unblock(move || {
            futures_lite::future::block_on(protocol::fetch(
                transport,
                delegate,
                protocol::credentials::helper,
                progress,
            ))
        })
        .await?;

Fully async portions remain async, but those who are at least partially blocking must be entirely unblocked. The need for futures_lite::future::block_on seems avoidable especially when looking at the underlying implementation which relies on Executor::spawn(future) receiving a future.

pub async fn unblock<T, F>(f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    Executor::spawn(async move { f() }).await
}

If there was a function that takes a future, the code above could be simplified to the following:

        blocking::spawn(protocol::fetch(
            transport,
            delegate,
            protocol::credentials::helper,
            progress,
        ))
        .await?;

I would love to hear your thoughts about this usecase, and if there is interest will be happy to contribute.

Is there anything I can do to help this along @taiki-e ?

I don't have a strong opinion on this. However, bounded thread pools are not necessarily suitable for this case (see tokio-rs/tokio#3717 (comment) for more).

Thanks for your reply, this sounds like a PR would be feasible to make using blocking with internally blocking futures more ergonomic. This can remove a redundant block_on which simplifies the machinery.

Maybe there can be extra-documentation to elaborate on the potential for deadlocks if there are complex inter-dependencies.

commented

I think this is not a good idea to direct pass a Future to the threadpool, and it is very easy to cause "bugs" in this usage.

For example:

blocking::spawn(// "maybe-blocking-future",
    async move {
         tokio::time::timeout(duration, async {}).await; // <- lost tokio context in threadpool
    }
);

Correct me if I misunstand your purpose.

I think the above could be rewritten like the following…

blocking::spawn(// "maybe-blocking-future"
         tokio::time::timeout(duration, async {}) // <- lost tokio context in threadpool
);

…wrapping in async move seems redundant and this proposal is all about removing unnecessary layers in the usage of blocking.

It may very well be that incorrect usage is possible especially with tokio due specific requirements. Does that make the initially stated less valid?

Or asked differently: Should blocking forego ergonomic and inarguably more efficient APIs to prevent possible issues with tokio specifically?

Correct me if I misunstand your purpose.

A function to spawn futures directly is what is indeed proposed here, and examples are made for how this would make blocking more ergonomic and probably more efficient for those who don't need special runtime support. It's helpful to have a counter-example that shows why blocking enforces the use of non-async functions to unblock - users of tokio would be forced to wrap it in tokio::block_on() to get runtime support, avoiding runtime panics which may be surprising.

commented

I think the above could be rewritten like the following…

blocking::spawn(// "maybe-blocking-future"
         tokio::time::timeout(duration, async {}) // <- lost tokio context in threadpool
);

…wrapping in async move seems redundant and this proposal is all about removing unnecessary layers in the usage of blocking.

It may very well be that incorrect usage is possible especially with tokio due specific requirements. Does that make the initially stated less valid?

Or asked differently: Should blocking forego ergonomic and inarguably more efficient APIs to prevent possible issues with tokio specifically?

Correct me if I misunstand your purpose.

A function to spawn futures directly is what is indeed proposed here, and examples are made for how this would make blocking more ergonomic and probably more efficient for those who don't need special runtime support. It's helpful to have a counter-example that shows why blocking enforces the use of non-async functions to unblock - users of tokio would be forced to wrap it in tokio::block_on() to get runtime support, avoiding runtime panics which may be surprising.

Emm, the sample code is just for simplifing the real logic of my app which panics due to lost the tokio context (kills my whole afternoon to find out 🤣). Tokio may crash when running in the fresh new thread of threadpool. So it is quite "easy" to cause such a bug.

I agree with the fact that this seems like an easy footgun to set off. If we're going to include this function, it should be very clearly marked as "know what you're doing".