rust-secure-code / safety-dance

Auditing crates for unsafe code which can be safely replaced

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

static mut code in blog post

troublescooter opened this issue · comments

Here I pointed out that the use of static mut might be ill-advised, since there seemed to be a lot of fine-print involved in using it correctly (linked in the issue).
With the introduction of const-generic indexing I thought to revisit the issue and was asked to provide stronger arguments for my concerns.

I admit to not being sufficiently proficient to make a strong case, and upon rereading the linked posts in the issue some confusion remains:

In unsafe statics withoutboats states

Actually taking a mutable reference is unsafe. However, this feature is incredibly footgunny and almost impossible to use correctly, since creating a mutable reference to shared memory is undefined behavior, even if you never dereference that reference.

Is &mut on static mut considered shared, even if not used in a multithreaded context? This answer from eddyb suggests that is the case, although the comment is not crystal-clear in a way that would be helpful to me. Concurrency being feasible in other contexts than the multithreaded case does not mean all cases of &mut on static mut is UB (*edit: outside a Cell or RefCell). I would think removing such an example from a blog post aimed at beginners would be worthwhile nonetheless, but I have been asked to prove the trivial code snippet is indeed UB, and MIRI is silent on the issue.

The code in question is

static mut MUT_BYTES: [u8; 3] = [1, 2, 3];
fn main() {
    unsafe {
        MUT_BYTES[0] = 99;
    }
}

which with const-generic indexing seems equivalent to

static mut MUT_BYTE: u8 = 1;
fn main() {
    unsafe {
        let ptr = &mut MUT_BYTE;
        *ptr = 99;
    }
}

Is this UB?

That pattern isn't UB per se, although the global static mut makes it error-prone in general. In that example, I'd prefer if, given:

macro_rules! once {( $($code:tt)* ) => (
    {
        use ::std::sync::Once;
        static O: Once = Once::new();
        &O
    }.call_once(|| { $($code)* })
)}
  • Note: this is actually a at_most_once kind of macro; it must evaluate to (). There can be an interesting equivalent of "exactly once or else panic" kind of macro that would be allowed to return a value:

    macro_rules! once {( $($code:tt)* ) => ({
        if {
            use ::std::sync::atomic::{self, AtomicBool};
            static ENTERED: AtomicBool = AtomicBool::new(false);
            ENTERED.swap(true, atomic::Ordering::SeqCst)
        }
        {
            enum _Diverged {}
            let _: _Diverged = ::std::panic!(
                "[{file}:{line}:{col}] Attempted to execute `once!`-guarded code more than once!",
                file = ::std::file!(),
                line = ::std::line!(),
                col = ::std::column!(),
            );
        } else {
            $($code)*
        }
    })}

it was:

fn main ()
{
    once! {
        let ptr = unsafe { static mut MUT_BYTE: u8 = 1; &mut MUT_BYTE };
        *ptr = 99;
    }
}

That way we do have a snippet that is robust to getting code added onto it.


The issue of static mut THING: T = …, vs. using its less error-prone static THING: RacyCell<T> = RacyCell::new(…); alternative, is conceptually similar to using mem::unitialized() vs. using MaybeUninit::uninit()-followed by-.assume_init() later on: in both cases, with the latter pattern, we avoid making the strong assumption too early to be correct. In the case of a static mut and a static RacyCell, the latter allows aliasing pointers with interleaved mutations whereas the former does not:

once! {
  static THING: RacyCell<T> = RacyCell::new();

  let ptr1: *mut T = THING.get(); // SB: shared read-write
  let ptr2: *mut T = THING.get(); // SB: shared read-write
  unsafe {
      (&mut *ptr1).mutate(); // SB: temporarily upgrade to unique
      (&mut *ptr2).mutate(); // ditto
      (&mut *ptr1).mutate();
      (&mut *ptr2).mutate();
  }
}

whereas the following is UB:

once! {
  static mut THING: T = …;
  unsafe {
      let ptr1: *mut T = &mut THING; // SB: shared read-write *derived from unique* (OK)
      let ptr2: *mut T = &mut THING; // SB: shared read-write *derived from unique* (Invalidates ptr1)
      (&mut *ptr1).mutate(); // UB: usage of invalidated pointer}
}

Notice how getting a raw mutable (SB: shared read-write) pointer to the global is not unsafe and thus not unsound with a static RacyCell, whereas even just forging such a raw pointer with a static mut is unsafe and thus UB-prone, since it does require / assert unicity of the access to the global (whereas with RacyCell we only make the unicity assertion on actual dereference of that pointer).


By the way, using an array is a good example of the problems of static mut:

once! { unsafe {
  static mut ARR: [u8; 2] = [0; 2];

  let at_fst = &mut ARR[0];
  let at_snd = &mut ARR[1]; // Probably invalidates `at_fst` if the `&mut` borrow occurs over the whole `ARR`.
  *at_fst += 42; // Usage of probably invalidated pointer}}

Granted, one can use the borrow checker to make sure these errors don't occur multiple times:

once! {
  // statically-allocated `Box::leak` equivalent
  let arr: &'static mut [u8; 2] = unsafe { static mut ARR: [u8; 2] = [0; 2]; &mut ARR };

  // Whatever we do here without using `unsafe` will be sound:
  let (at_fst, tail) = arr.split_at_first().unwrap();
  let at_snd = &mut tail[0];
  *at_fst += 42;
  …
}

Thanks for the detailed elaboration!