Redesign `BufResult` to be `std::result::Result`

Question

Redesign `BufResult` to be `std::result::Result`

mzabaluev opened this issue 2 years ago · comments

BufResult being aliased to a tuple makes it less ergonomic than a standard Result could be.
In future, a std::ops::Try implementation could help, but it needs a local type to be implemented on.

I propose to make use of the standard infrastructure provided by Result and provide the buffer with a new error type:

use std::error::Error;
use std::io;

pub type BufResult<T, B> = Result<(T, B), BufError<B>>;

#[derive(Debug)]
pub struct BufError<B>(pub io::Error, pub B);

impl<B: Debug> Error for BufError<B> {
    // ...
}

Mikhail Zabaluev · Answer 1 · Sat Nov 19 2022 16:03:12 GMT+0800 (China Standard Time)

The currently promoted pattern of error handling could be enabled by an adapter trait:

pub trait LiftBuf {
    type Output;
    type Buf;
    fn lift_buf(self) -> (io::Result<Self::Output>, Self::Buf);
}

impl<T, B> LiftBuf for BufResult<T, B> {
    type Output = T;
    type Buf = B;

    fn lift_buf(self) -> (io::Result<Self::Output>, Self::Buf) {
        match self {
            Ok((out, buf)) => (Ok(out), buf),
            Err(BufError(e, buf)) => (Err(e), buf),
        }
    }
}

FrankReh · Answer 2 · Sat Nov 19 2022 21:40:26 GMT+0800 (China Standard Time)

I like it. It places the buffer as the second of an Ok tuple or an Err tuple. Returns would be

Ok(val, buf)
// or
Err(err, buf)

FrankReh · Answer 3 · Sat Nov 19 2022 23:17:54 GMT+0800 (China Standard Time)

@mzabaluev Do you have a login on rust's zulip chat? There is a thread, "IO traits" in the wg-async stream, that you should be aware of. It basically is about the long term goals of rust, in supporting async runtimes with more common features built into the std library and nrc has created a git repo for capturing their thoughts on buffer traits, and there are pieces for completion based APIs like uring requires. Your experience with slices and features like indexed buffers and returning the owned buffer and owned slice in results and errors would be very welcome I believe, and it doesn't hurt that you know the trait system so well.

Noah Kennedy · Answer 4 · Mon Nov 21 2022 02:11:47 GMT+0800 (China Standard Time)

I completely agree with this proposed change.

Mikhail Zabaluev · Answer 5 · Sat Nov 26 2022 14:45:34 GMT+0800 (China Standard Time)

Responding to points raised in a zulip thread:

I think the issue that it doesn't work like a Result is kind of necessary and using Result<(T, buf), (E, buf)> is worse than (Result<T, E>, buf). This is because in the error case the caller is going to want to deal with the buffer before propagating the error. If you just propagate the error then I think the buffer cannot be handled properly.

I find that the pattern of error handling depends a lot on the code around the call site. If the calling function does not have the buffer in its return signature, it can't be propagated anyway and you need to drop it. But I believe more often than not the buffer will need to be propagated, so a BufError will need to be destructured to its constituent parts or even returned straight in the Err.

Then there are loops that reuse the buffer. Note how the compiler forces recovery of the moved buf in the second match arm:

let mut buf = vec![0u8; 4096];
loop {
    match stream.read(buf).await {
        Ok((bytes_read, nbuf)) => {
            buf = nbuf;
            if bytes_read == 0 {
                break;
            }
            // Do something with received data
        }
        Err(e) if is_recoverable(&e.0) => {
            todo!("need to recover buf for the next iteration");
            continue;
        }
        Err(BufError(e, buf)) => {
            todo!("handle the error, deal with the buffer")
        }
    }
}

Aside from these two situations, I'd like to come up with an example where it would be easy to lose the buffer in error handling and not have some visible hint that this happens. This proposal also provides an adapter trait to convert to (std::io::Result, B) for the price of a method call.

Furthermore, I think the caller is not going to want to have (E, buf) as its error type because it would be exposing part of its internals in its API and it will be a problem for returning errors which don't come from the IO (and therefore don't have the buf to return)

We already have the distinction between functions that work with a buffer and functions that don't, and the latter return std::io::Error (io-uring conveniently uses the POSIX error codes so we get to report all failures with that). I believe struct BufError<B>(pub std::io::Error, pub B) to be sufficiently opaque and expressive, while providing a good API surface for pattern matching and the std::error::Error infrastructure.

Nick Cameron · Answer 6 · Tue Nov 29 2022 16:54:46 GMT+0800 (China Standard Time)

An alternative (which I'm not sure is better) would be to use a newtype rather than an alias. The advantage would be that you could implement lift_buf as an inherent method rather than on a trait. You could implement Deref to Result and Try for the newtype to make things ergonomic.

Nick Cameron · Answer 7 · Tue Nov 29 2022 17:03:20 GMT+0800 (China Standard Time)

But I believe more often than not the buffer will need to be propagated

This is interesting to me - could you explain why you think it?

Then there are loops that reuse the buffer. Note how the compiler forces recovery of the moved buf in the second match arm:

I'm not sure I understand the example - is this different to the current situation with (Result, Buf)?

Mikhail Zabaluev · Answer 8 · Wed Nov 30 2022 04:16:08 GMT+0800 (China Standard Time)

But I believe more often than not the buffer will need to be propagated

This is interesting to me - could you explain why you think it?

tokio-uring (or, for that matter, any completion-driven I/O crate) is a low level crate, I expect that applications would add their customized I/O APIs on top of it. If the application is not OK with consuming the allocation for each operation (or does not use a reference-counted buffer collection like FixedBufRegistry), it will use the same BufResult return value pattern.

I'm not sure I understand the example - is this different to the current situation with (Result, Buf)?

My point is that the Err match arm needs to recover the buffer if the loop iteration continues after handling the error. Which is easy to do by destructuring the BufError tuple struct, or calling .lift_buf(). But you can't just drop the buffer within the error value and not have a compiler error in this example.

OllieB · Answer 9 · Fri Jan 27 2023 02:33:38 GMT+0800 (China Standard Time)

#216 may have bearing on this, if accepted. It means the success case for read like operations can just return the buffer

FrankReh · Answer 10 · Fri Jan 27 2023 04:25:02 GMT+0800 (China Standard Time)

This also seemed to have wide spread support. Maybe we should get something in sooner rather than later for this issue.