Is it sound to check whether the bytes of an `Option<&T>` are zero?

Question

joshlf opened this issue 7 months ago · comments

Joshua Liebow-Feeser commented 7 months ago

Co-authored with @jswrenn.

In zerocopy, we have a situation where we have a *const Option<&T>. We know that the referent bytes are "as initialized" as the bytes of an Option<&T>, but not necessarily that they are a bit-valid Option<&T>. By "as initialized", we mean that one of the two is true:

The referent is a bit-valid Option::<&T>::None
The referent's discriminant represents Some, and its bytes are initialized wherever &T's bytes are initialized

What we need to do is check whether the referent contains all zeroed bytes. If it does, we can soundly treat those bytes as containing an Option::<&T>::None thanks to the null pointer optimization (NPO), which guarantees the layout of this specific value.

Our problem is this: We're not sure whether it's sound to look at all of the bytes (in other words, to transmute from Option<&T> to [u8; size_of::<Option<&T>>()]) in order to check that they're all zero. Another option we considered was round-tripping via Option<NonNull<T>> and then using NonNull::addr to extract the address.

Any guidance on whether transmuting Option<&T> to either [u8; size_of::<Option<&T>>()] or Option<NonNull<T>> is sound?
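For readers following along, here is a minimal sketch of the second option (the round trip through Option<NonNull<T>>). It starts from an already valid Option<&u32>, which is a simpler situation than the one described in the question, and it assumes the transmute is sound, which is part of what the question asks. The helper name `option_ref_addr` is hypothetical, and it uses an `as usize` cast rather than NonNull::addr.

```rust
use std::mem;
use std::ptr::NonNull;

/// Hypothetical helper: view a valid `Option<&u32>` as `Option<NonNull<u32>>`
/// and return the address as an integer, with 0 standing in for `None`.
fn option_ref_addr(opt: Option<&u32>) -> usize {
    // Assumption: this transmute is sound. Both types have the same size,
    // and `None` is the all-zero pattern for both.
    let nn: Option<NonNull<u32>> = unsafe { mem::transmute(opt) };
    match nn {
        // A `NonNull` is never null, so this arm never yields 0.
        Some(p) => p.as_ptr() as usize,
        None => 0,
    }
}
```

Called with `None` this returns 0; called with `Some(&x)` it returns the address of `x`.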

Lokathor · Answer 1 · Wed Feb 07 2024 05:47:03 GMT+0800 (China Standard Time)

If it's definitely either None or Some then it's definitely fully initialized because that is the point of Rust enums.

I'm unclear why you'd do this transmute instead of matching on the value directly or using is_none

Jack Wrenn · Answer 2 · Wed Feb 07 2024 05:58:17 GMT+0800 (China Standard Time)

> I'm unclear why you'd do this transmute instead of matching on the value directly or using is_none

Our starting point is a glorified *const Option<&T>. Glorified in that:

We know it adheres to shared aliasing rules.
We know that it's non-null.
We know that the referent is "as initialized" as Option<&T>

...but we don't know the referent is a validly initialized Option<&T>, nor do we know that the referent is validly aligned. We want to check whether the referent is all-zeros, but given these gaps we can't immediately call is_none.

(If you're curious, here's our current stab at a proof, which involves several stages of somewhat-justified hoop-jumping in order to call is_none, as you suggest.)

Joshua Liebow-Feeser · Answer 3 · Wed Feb 07 2024 06:13:56 GMT+0800 (China Standard Time)

> If it's definitely either None or Some then it's definitely fully initialized because that is the point of Rust enums.

I agree, except that IIUC there's one extra step required: all bytes of a &T must always be initialized. In practice I'm sure this is true, but is it guaranteed? The reason I'm skeptical is that a) this aspect of the layout isn't documented anywhere, and b) @RalfJung has previously mentioned that pointer-to-int conversions might be UB. If they're UB, then that implies that doing &T -> [u8; N] might also be UB, since another way of writing that conversion (where t: &T) is ((t as *const T) as usize).to_ne_bytes(), which relies on this maybe-UB conversion.
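As a small illustration, the cast-based spelling from the comment above can be written as a safe function. The function name `ref_addr_bytes` is hypothetical. The sketch shows only the `as` cast route; whether a transmute or a byte-wise load of the same reference is equivalent to it is exactly what is in doubt here.

```rust
use std::mem::size_of;

/// Hypothetical helper: the address of `t` as native-endian bytes,
/// obtained through pointer-to-integer `as` casts (safe code).
fn ref_addr_bytes<T>(t: &T) -> [u8; size_of::<usize>()] {
    ((t as *const T) as usize).to_ne_bytes()
}
```

Because a reference is never null, the returned array is never all zeros.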

Ralf Jung · Answer 4 · Sat Feb 10 2024 18:46:41 GMT+0800 (China Standard Time)

> but we don't know the referent is a validly initialized Option<&T>, nor do we know that the referent is validly aligned

Alignment seems to be the key point here? I am not sure which other part of the validity invariant of Option<&T> might be missing. (Well, there is of course the issue around potentially recursive validity of references; not sure if you are referring to that.)

You are right to be cautious with loading this as a usize, as that could indeed be a ptr2int transmute. But what if you load it as *const () instead? IOW:

unsafe fn is_none<T>(ptr: *const Option<&T>) -> bool {
  ptr.cast::<*const ()>().read().is_null()
}
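Since the pointer in this thread is not known to be validly aligned, a variant of the function above could use `read_unaligned` instead of `read`. This is only a sketch: the name `is_none_unaligned` is hypothetical, and it relies on the same assumptions as the original (all bytes of the referent are initialized, and any initialized byte pattern is a valid *const ()).

```rust
/// Hypothetical variant of `is_none` that tolerates an unaligned pointer.
///
/// # Safety
/// `ptr` must be valid for reads of `size_of::<Option<&T>>()` bytes,
/// and all of those bytes must be initialized.
unsafe fn is_none_unaligned<T>(ptr: *const Option<&T>) -> bool {
    // Load at a raw-pointer type rather than an integer type, so the load
    // is not a pointer-to-integer transmute of the pointer bytes.
    unsafe { ptr.cast::<*const ()>().read_unaligned().is_null() }
}
```

For example, passing a pointer to a `None` value, or to a zeroed byte buffer at an odd offset, yields `true`, while a pointer to a `Some(&x)` value yields `false`.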

Joshua Liebow-Feeser · Answer 5 · Sun Feb 11 2024 01:08:06 GMT+0800 (China Standard Time)

Ah that hadn't occurred to me! That seems much more clearly reasonable on its surface.

I'm pretty sure this is sound today, but is it guaranteed to always be sound? IIUC, this relies on:

All bytes of an Option<&T> are initialized
All byte patterns are valid instances of *const () (looks like this is guaranteed if we interpret "layout" to include the initialized-ness of bytes)

While this isn't a soundness concern, the correctness of this function relies on the fact that there is only one bit representation for Option::<&T>::None. We know thanks to these docs that the all-zeroes pattern is one valid representation for Option::<&T>::None, but that doesn't guarantee that there aren't others. (Again, obviously this is true in practice, but I'm trying to find docs that guarantee it.)
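The documented direction of that guarantee (all-zero bytes are a valid None) can be sketched as follows. The helper name `zeroed_option_ref` is hypothetical. The sketch says nothing about the converse, that is, whether None could have any other representation, which is the gap raised above.

```rust
use std::mem;

/// Hypothetical helper: build an `Option<&T>` from all-zero bytes.
fn zeroed_option_ref<T: 'static>() -> Option<&'static T> {
    // SAFETY: the `Option` documentation states that the all-zero pattern
    // is a valid `None` for `Option<&T>` (null pointer optimization).
    unsafe { mem::zeroed() }
}
```

The result is `None`, and `Option<&T>` has the same size as `&T` for sized `T`.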

Lokathor · Answer 6 · Sun Feb 11 2024 01:23:43 GMT+0800 (China Standard Time)

So, are you thinking that, perhaps similar to f32::NAN, there could theoretically be more than one bit pattern that's equal to the literal expression None? And thus, if the bits were non-zero they could still be one of the "other" None values?

Ralf Jung · Answer 7 · Sun Feb 11 2024 01:41:00 GMT+0800 (China Standard Time)

> All bytes of an Option<&T> are initialized

Given that all bytes of &T are initialized, and Option<&T> has the same size, I don't see how there could possibly be a padding byte in Option<&T>.

Joshua Liebow-Feeser · Answer 8 · Sun Feb 11 2024 02:07:48 GMT+0800 (China Standard Time)

> So, are you thinking that, perhaps similar to f32::NAN, there could theoretically be more than one bit pattern that's equal to the literal expression None? And thus, if the bits were non-zero they could still be one of the "other" None values?

Yeah, exactly. Obviously I don't actually think that'd ever happen, but technically I don't think the docs currently rule it out.

> Given that all bytes of &T are initialized

Is your thinking that that's guaranteed by this? (Edit: as far as I can tell, that section only guarantees size and alignment, but nothing about which bytes are initialized.)

Ralf Jung · Answer 9 · Sun Feb 11 2024 03:47:06 GMT+0800 (China Standard Time)

> Is your thinking that that's guaranteed by this?

My thinking is just that this is "obviously" the case, but I don't know what exactly is stably documented where. It is guaranteed by the MiniRust representation relation, but that doesn't help you.

It's hard to be precise in a spec without fully committing to all the details.

Joshua Liebow-Feeser · Answer 10 · Sun Feb 11 2024 04:04:41 GMT+0800 (China Standard Time)

Yeah, that makes sense.

While we're on the subject, maybe you can clear something up for me. It seems inconsistent to say that we can view the bytes of a pointer (ie, &T -> [u8; N]), but we can't do ptr2int (ie, &T -> usize). We know that bytes-to-int is sound ([u8; N] -> usize), so shouldn't we be able to combine that with the first transformation to get sound ptr2int? There seems to be a contradiction here. Am I missing something?

Lokathor · Answer 11 · Sun Feb 11 2024 04:23:17 GMT+0800 (China Standard Time)

My own understanding is that ptr2int transmutes are sound, but they strip provenance and so you can't transmute back to a pointer later and get a usable pointer.

For simply comparing the int to 0 it should be sound to transmute (again, if my understanding is still up to date).

Ralf Jung · Answer 12 · Sun Feb 11 2024 04:23:34 GMT+0800 (China Standard Time)

> It seems inconsistent to say that we can view the bytes of a pointer (ie, &T -> [u8; N]), but we can't do ptr2int (ie, &T -> usize).

Correct. Viewing the bytes of a pointer also does ptr2int transmute and is hence on equally uncharted ground.

The t-opsem working consensus is what @Lokathor said, but so far we haven't felt ready to stably commit to that, and the lang team hasn't blessed this.

Joshua Liebow-Feeser · Answer 13 · Fri Feb 16 2024 02:06:34 GMT+0800 (China Standard Time)

Would it be easy to articulate what degrees of freedom there are in the design space that make this a not-yet-decided question? In other words, what could cause us to decide that ptr2int is UB in itself (rather than merely producing a pointer which is not particularly useful, and on which further operations are likely to be UB)?

In zerocopy, we have a lot of consumers who want to be able to look at the bytes of a pointer, so being able to make progress on this would be great. I'd be happy to do some of the work to move it forward if the gaps are well-known.

Lokathor · Answer 14 · Fri Feb 16 2024 03:25:40 GMT+0800 (China Standard Time)

Well, p as usize is safe code that works on Stable. So to make ptr2int itself be UB we'd have to somehow explain and justify p as usize as being something other than being a ptr2int operation. That basically wouldn't fly.

Joshua Liebow-Feeser · Answer 15 · Fri Feb 16 2024 03:28:18 GMT+0800 (China Standard Time)

Is there a possible world in which p as usize is considered ptr2int, but transmute::<_, [u8; N]>(p) is not considered ptr2int, and as a result Rust reserves the right to declare it UB? If there is no such possible world, then presumably it'd be uncontroversial to write that guarantee down somewhere?

Jack Wrenn · Answer 16 · Fri Feb 16 2024 03:33:35 GMT+0800 (China Standard Time)

> So to make ptr2int itself be UB we'd have to somehow explain and justify p as usize as being something other than being a ptr2int operation. That basically wouldn't fly.

Is it guaranteed that p as usize exposes the exact bits of p? For instance, on a hypothetical platform where pointers have uninit bits, could p as usize do something like initialize those exposed bits to 0? That doesn't strike me as a completely ridiculous hypothetical.

Lokathor · Answer 17 · Fri Feb 16 2024 03:35:48 GMT+0800 (China Standard Time)

~~The only time you can't transmute T to [u8; N] is if T contains uninit bytes. This is a general property of Rust because we don't have typed memory, only typed accesses.~~

Currently, pointers don't contain uninit bytes.

I guess there is some possible future (eg: a new arch becomes popular many years from now) where pointers somehow contain an uninit byte. That seems unlikely, but if we want to worry about the absolute limits of possibility, I suppose it's possible.

Joshua Liebow-Feeser · Answer 18 · Fri Feb 16 2024 03:40:26 GMT+0800 (China Standard Time)

> The only time you can't transmute T to [u8; N] is if T contains uninit bytes. This is a general property of Rust because we don't have typed memory, only typed accesses.
>
> Currently, pointers don't contain uninit bytes.
>
> I guess there is some possible future (eg: a new arch becomes popular many years from now) where pointers somehow contain an uninit byte. That seems unlikely, but if we want to worry about the absolute limits of possibility, I suppose it's possible.

Yeah, that's exactly our concern. Our goal with zerocopy is to only rely on properties that we know won't be walked back in the future so we can credibly claim that "if your code is sound under Rust version X, it will be sound under all Rust versions Y > X." It means we end up being very pedantic about what is actually guaranteed 😛

Ralf Jung · Answer 19 · Fri Feb 16 2024 03:46:27 GMT+0800 (China Standard Time)

> Is there a possible world in which p as usize is considered ptr2int, but transmute::<_, [u8; N]>(p) is not considered ptr2int, and as a result Rust reserves the right to declare it UB?

Yes, that is very possible. It is, in my eyes, extremely unlikely that we will consider this transmute a ptr2int cast. ptr2int casts cannot be dead-code eliminated, and every pointer load is a potential transmutation site, and I am sure that we want to be able to remove dead loads.

We might end up special-casing transmute, which would make transmute not equivalent to "just load through a differently-typed raw pointer", but I'd prefer to not do that.

Currently I consider "ptr2int transmute is the same as ptr.addr()" to be the most sensible semantics. But this entire design space is so subtle that I don't feel comfortable committing to anything here. And in terms of actually explicitly guaranteeing this: we have just reached the point where we are officially saying that we have provenance; at this pace it will take a while until we guarantee anything about how provenance works in detail.

Ralf Jung · Answer 20 · Fri Feb 16 2024 03:47:28 GMT+0800 (China Standard Time)

> The only time you can't transmute T to [u8; N] is if T contains uninit bytes. This is a general property of Rust because we don't have typed memory, only typed accesses.

No, that's not decided yet. If T contains bytes with provenance, we may also say that such a transmute is not allowed.

Lokathor · Answer 21 · Fri Feb 16 2024 03:57:57 GMT+0800 (China Standard Time)

That would be an unfortunate breaking change to a lot of existing code, but I suppose it's possible, true.

Ralf Jung · Answer 22 · Fri Feb 16 2024 03:58:39 GMT+0800 (China Standard Time)

Do we have a collection of such code?

Lokathor · Answer 23 · Fri Feb 16 2024 04:08:35 GMT+0800 (China Standard Time)

Not at hand. And "a lot" is probably overstating it. I've definitely seen people doing it before to inspect the bytes of an object, probably for the same reasons that the zerocopy users want.

Joshua Liebow-Feeser · Answer 24 · Fri Feb 16 2024 04:10:51 GMT+0800 (China Standard Time)

My guess is that this shows up primarily in places where you're communicating with another piece of code that shares access to a particular memory space. Think FFI, kernel/userland boundary, IPC with shared memory maps, etc. The most notable use case for zerocopy's users (that I'm aware of) is a userland process which emulates the Linux kernel.

Lokathor · Answer 25 · Fri Feb 16 2024 04:23:01 GMT+0800 (China Standard Time)

Even just a debug info display might read a pointer as bytes and show it.