rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do

Home Page:https://rust-lang.github.io/unsafe-code-guidelines

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Does `Box` contain any `UnsafeCell`s?

joshlf opened this issue · comments

Obviously it doesn't in practice, but is this guaranteed? In particular, is the following code guaranteed to be sound?

fn read_box_as_raw_ptr<T: Sized>(b: &Box<T>) -> *const T {
    let b_ptr: *const Box<T> = b;
    let ptr_ptr = b_ptr.cast::<*const T>();
    unsafe { core::ptr::read(ptr_ptr) }
}

In particular, this would be unsound if Box contained any UnsafeCells because it would run afoul of aliasing rules. The std::boxed docs guarantee that, for T: Sized, Box<T> is ABI-compatible with *const T, but this could in principle just be talking about value transmutations, which don't technically rule out UnsafeCell.

Specifically, my question is: Is it consistent to read the std::boxed docs as logically implying that Box does not contain any UnsafeCells? If not, I'll put up a PR 😃

I'm not even sure I understand the question.

  • Box is an owning pointer.
  • UnsafeCell is not a pointer at all.

I'm not even sure I understand the question.

  • Box is an owning pointer.
  • UnsafeCell is not a pointer at all.

Specifically, the question is whether it'd be sound to implement Freeze for Box<T> where T: Sized. Zerocopy is working on our own trait (we're calling it NoCell), and we're trying to figure out whether it's sound to implement for Box<T>.

As an example of how this could break, imagine that Box<T> is implemented as:

#[repr(C)]
struct Box<T> {
    ptr: UnsafeCell<*mut T>,
}

IIUC, this definition would be consistent with the std::boxed docs, and would make NoCell for Box<T> unsound.

Box is not a structure that has a pointer field internally. Box is the pointer, directly. So your example struct wouldn't be correct.

Also Box does already implement Freeze if you Ctrl+F to go way way down the impls list.

Box is not a structure that has a pointer field internally. Box is the pointer, directly. So your example struct wouldn't be correct.

I agree that that's true in practice, but is that fact guaranteed by the docs?

Also Box does already implement Freeze if you Ctrl+F to go way way down the impls list.

Sure, but Freeze is not a stable, public trait, so we can't treat that as a promise that it will always hold in the future.

Well it says

Box<T> values will always be fully aligned, non-null pointers.

Which seems true enough

Sure, but since UnsafeCell<T> has the same representation as T, that doesn't rule out Box<T> having the same representation as UnsafeCell<P>, where P is a pointer to T (I say "pointer to T" rather than *const T or *mut T since that's the language that std::boxed uses).

By that argument, *mut T might secretly contain an UnsafeCell too. Or *const T could secretly have one. In fact, any type, at any time, might secretly contain an UnsafeCell, if we're going to be that worried about things. I'm not aware of documentation saying that i32 definitely doesn't have an UnsafeCell inside of it.

Trying to be more helpful, looking again at your example code, I can't understand how the presence or absence of secret UnsafeCell would affect the code in any way. I don't understand what aliasing concerns you're referring to. You're basically just doing transmute_copy with extra steps to make a Box into a const pointer without dropping the box, and that's fine because it's always safe to make pointers it's just not always safe to use pointers.

EDIT: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0cd093c0cc8083a5e6a90e26fe7d5988 and you can even write your example fn without any unsafe code, just reborrowing.

I think there's no way that Box<T> (with the global allocator) can have interior mutability. But yeah we don't have an explicit list of types without interior mutability -- technically one could ask the same question about our integer, raw pointer, or reference types.

FWIW, the ability to Deref::deref a Box means that the only way which Box could possibly contain UnsafeCell and allow mutation of it while shared references are live is by the use of a global lock which locks before Deref::deref and unlocks when the deref lifetime is invalidated (i.e. requires an absolute mountain of magic). Putting the lock inside the allocation isn't even sufficient, since that would require reading the pointer to access the lock. If zerocopy is willing to implement NoCell for u8, then implementing NoCell for Box<_> is relying on the exact same area of not-strictly-guaranteed-but-effectively-unavoidable properties of the Rust memory model / type system.

There is a stable way to observe the Freeze-quality of a type: the lack of feature(const_refs_to_cell) means that you are allowed to take a reference to Box<i32> but not to Cell<i32> in a const context. This cannot be used to derive a future-resistant static assertion that a type is Freeze, but it does mean that removing the Freeze autotrait impl from a type is, strictly speaking, API breaking, despite not showing up in documentation in any way.

const fn works(x: Box<i32>) -> Box<i32> {
    &x; // no error
    x
}
const fn error(x: Cell<i32>) -> Cell<i32> {
    &x; //~ error[E0658]: cannot borrow here, since the borrowed element may contain interior mutability
    x
}

I've offhandedly suggested unhiding the Freeze trait before, citing this visibility of semantics for doing so. But AIUI, general temperature seems to be that documenting the Freeze property isn't super helpful until we lock down what that property actually means to the memory/borrow model. Again, if zerocopy is willing to rely on the currently as of current nonstandard memory model proposals, it should be fine with relying on more self-evidently correct properties like "Box<T> is at least as restricted as *const T." Additionally, even if Box were to contain UnsafeCell but not utilize the shared mutability, ignoring that fact given the acquisition of &Box<T> is sound, as the other code must tolerate you doing anything that would be valid to do with &Box<T>, which includes Deref::deref (which in this scenario does no synchronization).

If we were to add anything for this into the std docs, I think it would be wording along the lines of "doesn't do any synchronization" rather than anything about containing UnsafeCell. The former is at least justifiably useful information. The latter is meaningless except in the context of additional nonnormative details about the runtime borrow model.