tokio-rs / tokio-uring

An io_uring backed runtime for Rust

Generalise Buffer Types

ollie-etl opened this issue · comments

commented

Using io_uring requires ownership of the backing buffers for the duration of an operation's lifetime. This is currently enforced using the traits IoBuf and IoBufMut. This issue suggests dropping IoBuf and IoBufMut in favour of Pin. In conjunction with a slightly modified Slice struct, I believe Pin provides all the invariants we currently have, whilst allowing us to generalise over any input type (without newtyping everything).

I'd like to illustrate my proposal by considering the write_at Op:

// current
impl<T: IoBuf> Op<Write<T>> {
    pub(crate) fn write_at(fd: &SharedFd, buf: T, offset: u64) -> io::Result<Op<Write<T>>> {
        // ...
    }
}

// Pr #53
impl<T: IoBuf> Op<Write<T>> {
    pub(crate) fn write_at(fd: &SharedFd, buf: Slice<T>, offset: u64) -> io::Result<Op<Write<T>>> {
        // ...
    }
}

// Proposed
impl<T: Deref<Target = [u8]>> Op<Write<T>> {
    pub(crate) fn write_at(fd: &SharedFd, buf: Pin<Slice<T>>, offset: u64) -> io::Result<Op<Write<T>>> {
        // ...
    }
}

A PR will be along shortly to give a concrete reference point, but I wanted to put this out there for discussion.

I thought it would be a &mut to a pinned slice. So here the Op takes ownership, as before, and has to return it when the future is complete?

commented

That was my thought, yes. We know (by construction) that as long as we hold the pin, the buffer is stable, but we don't have to handle the lifetime in the Op.

commented

Although my thinking is likely to change, as I will undoubtedly hit compile errors en route.

but we don't have to handle the lifetime in the Op

I don't follow. Just being dense, I think. So we can't hold a &mut Pin<Slice> for the lifetime of the Op? The type doesn't have to be 'static? Aren't we saying the buffer's memory is 'static anyway, but that we otherwise take ownership of it for as long as the future is active?

The future isn't Send or Sync, so it seems like there should be a way to use a buffer from the stack.

commented

@FrankReh I called it a night, because it's late here, but having thought about it for 5 minutes, you are of course completely correct: &Pin<Slice<T>> is the correct thing. I'll take a look when fresh.

I guess I see the problem with futures built around the uring interface. The kernel could still be using the buffer after the app has already dropped the future, so if the buffer/slice is owned by the app again, there could be unsound behavior. I don't know that it's officially UB when the kernel is involved, but it's close. So maybe back to the owned Pin. The buffer/slice ownership has been a thorn in the side of this design.
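
For reference, this is roughly how the current owned-buffer API already side-steps that: the op takes the buffer by value and hands it back on completion. A minimal sketch following the pattern in the tokio-uring README (file name and buffer size here are arbitrary):

use std::io;

use tokio_uring::fs::File;

fn main() -> io::Result<()> {
    tokio_uring::start(async {
        let file = File::open("hello.txt").await?;

        // Ownership of the buffer moves into the op and comes back with the
        // result. If the future is dropped mid-flight, the caller no longer
        // holds the memory the kernel may still be writing to.
        let buf = vec![0u8; 4096];
        let (res, buf) = file.read_at(buf, 0).await;
        let n = res?;
        println!("read {} bytes: {:?}", n, &buf[..n]);

        Ok(())
    })
}

Whatever replaces IoBuf has to preserve that property: dropping the future must not hand the bytes back to safe code while the kernel can still touch them.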

I think that if we pinned things which were 'static, this might work.

Although what does this get us that &'static + Deref doesn't?

Thinking more here, it gets us subslicing and a less annoying API.

Never mind, we still can't have slicing, as the slices have non-'static lifetimes.

After thinking more, Deref is insufficient here. You can write a perfectly safe Deref implementation which is still invalid for our purposes. We can't actually do this with Pin either. We do need to keep IoBuf around.

For example, this compiles, but is wrong:

use std::ops::Deref;
use std::pin::Pin;

struct Wrapper {
    inner: [u8; 16],
}

impl Deref for Wrapper {
    type Target = [u8];

    fn deref(&self) -> &Self::Target {
        &self.inner
    }
}

fn main() {
    // The backing array lives on the stack of main...
    let buf = [0u8; 16];

    // ...and Pin::new will happily wrap a by-value Wrapper around it.
    let pinned = Pin::new(Wrapper { inner: buf });

    write_at(pinned);
}

fn write_at<T>(buf: Pin<T>) where T: Deref<Target = [u8]> + 'static {
    // A real op would take a raw pointer into `buf` here; nothing in the
    // signature stops the value (and the bytes with it) from being moved
    // or dropped while the kernel still uses that pointer.
    drop(buf)
}

The signature on write_at does not stop us from doing stupid crap. The core issue is that Pin is designed to protect against issues related to self-reference.

We do need an unsafe trait like IoBuf, which guarantees a stable pointer.
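
For concreteness, here is a minimal sketch of the shape that trait takes, modelled on the IoBuf already in the crate (bounds and method names are from memory and may not match the source exactly):

/// Implementors promise (unsafely) that the returned pointer stays valid and
/// stable for as long as the buffer is owned, even if the owning value moves.
pub unsafe trait IoBuf: Unpin + 'static {
    /// Pointer to the start of the buffer. A bare [u8; N] on the stack cannot
    /// uphold this, which is exactly what the Wrapper example violates.
    fn stable_ptr(&self) -> *const u8;

    /// Number of initialized bytes.
    fn bytes_init(&self) -> usize;

    /// Total capacity of the buffer.
    fn bytes_total(&self) -> usize;
}

// Vec<u8> can implement it soundly: moving the Vec only moves its
// (pointer, length, capacity) triple, never the heap allocation behind it.
unsafe impl IoBuf for Vec<u8> {
    fn stable_ptr(&self) -> *const u8 {
        self.as_ptr()
    }

    fn bytes_init(&self) -> usize {
        self.len()
    }

    fn bytes_total(&self) -> usize {
        self.capacity()
    }
}

That "stable even across moves of the owning value" guarantee is the part that neither Deref nor Pin can express on their own.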

@Noah-Kennedy maybe I'm just not getting this. How does this go wrong?

@oliverbunting so, in the code I wrote, any write op using that buffer would take a pointer to the buffer's stack location, which would become invalid and result in memory corruption. Unless we want to heap-allocate every op's data, we can't do this.

@Noah-Kennedy This is a confusing issue to be sure. Did you mean for us to extrapolate and have the write_at already return a Pin so the buffer could ostensibly be used by the caller again?

Funny how often I, for a moment, just wonder why a &mut can't be passed with no return buffer value, and then I remember the future could go out of scope but the kernel would still be operating on the buffer.

@Noah-Kennedy this won't be the first or last time I've read this:
https://doc.rust-lang.org/std/pin/struct.Pin.html#safety

My reading is that by constructing a Pin, you are guaranteeing that the memory location is stable, even if it is on the stack.

@oliverbunting except I can do this:

use std::ops::Deref;
use std::pin::Pin;

struct Wrapper {
    inner: [u8; 16],
}

impl Deref for Wrapper {
    type Target = [u8];

    fn deref(&self) -> &Self::Target {
        &self.inner
    }
}

fn main() {
    let buf = [0u8; 16];

    let pinned = Pin::new(Wrapper { inner: buf });

    move_a_pin(pinned);
}

fn move_a_pin<T>(buf: Pin<T>) where T: Deref<Target = [u8]> + 'static {
    // Pointer to the bytes while the pinned value sits in this stack frame...
    let stack = buf.as_ptr();
    // ...then the whole Pin<T> is moved into a heap allocation...
    let boxed = Box::new(buf);
    let heap = boxed.as_ptr();

    // ...and the bytes moved with it, despite the Pin.
    assert_ne!(stack, heap)
}

Pin is incredibly subtle in what it actually means. It is meant as a safeguard for pointers to self-referential objects: it makes sure that the thing behind the pointer can't be moved. But when you accept an arbitrary Pin<T>, you cannot guarantee that T is even a pointer. It is basically just there to make sure that you can operate on futures which are self-referential. It is not meant for what we are doing.

The reason pinning is like this is because async was kinda rushed out the door and this was needed to make it work. Rust doesn't have any native concept of an immovable type.
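
To make the distinction concrete, here is a small sketch (the relocate helper is mine): when the Pin wraps an actual pointer type such as Box, the pinned bytes sit behind that pointer on the heap, so moving the Pin value around only moves the pointer; contrast that with the by-value Pin<Wrapper> above, where moving the Pin moves the bytes themselves.

use std::pin::Pin;

fn main() {
    // Pin around a real pointer: the 16 bytes live on the heap.
    let pinned: Pin<Box<[u8; 16]>> = Box::pin([0u8; 16]);
    let before = pinned.as_ptr();

    // Moving the Pin<Box<..>> value around moves the Box pointer,
    // but the heap allocation it points at never moves.
    let pinned = relocate(pinned);
    let after = pinned.as_ptr();

    assert_eq!(before, after);
}

// Moves the value through a call and back; the Box pointer itself may be
// copied around, but the heap bytes it points at stay where they are.
fn relocate<T>(v: T) -> T {
    v
}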

@oliverbunting it might be easier to talk about this in the tokio discord server. Are you in it?

No, but I could be

Although it won't be tonight. Which timezone are you in? I'm in GMT.

Oh wow, it's getting pretty late for you then. I'm in US Central.