EmbarkStudios / texture-synthesis

🎨 Example-based texture synthesis written in Rust 🦀

Home Page: http://embark.rs

Undefined Behavior in Generator::update

ralfbiedert opened this issue

This is an awesome tool, thanks for making this available for studying! One minor implementation thing:

In Generator::update you have:

// A little cheat to avoid taking excessive locks.
//
// Access to `coord_map` and `color_map` is governed by values in `self.resolved`,
// in such a way that any values in the former will not be accessed until the latter is updated.
// Since `coord_map` and `color_map` also contain 'plain old data', we can set them directly
// by getting the raw pointers. The subsequent access to `self.resolved` goes through a lock,
// and ensures correct memory ordering.
#[allow(clippy::cast_ref_to_mut)]
unsafe {
    *(self.coord_map.as_ptr() as *mut (Coord2D, MapId)).add(flat_coord.0 as usize) =
        (example_coord, example_map_id);

    *(self.id_map.as_ptr() as *mut (PatchId, MapId)).add(flat_coord.0 as usize) = island_id;

    *(self.color_map.get_pixel(update_coord.x, update_coord.y) as *const image::Rgba<u8>
        as *mut image::Rgba<u8>) = *example_maps[example_map_id.0 as usize]
        .get_pixel(example_coord.x, example_coord.y);
}

Unfortunately, I believe this is undefined behavior. The documentation for Vec::as_ptr states:

"The caller must also ensure that the memory the pointer (non-transitively) points to is never written to (except inside an UnsafeCell) using this pointer or any pointer derived from it."

The problem with UB here is not how access is governed at runtime. It is a compile-time property of the code: a "contract" is formed between the programmer and the compiler about what may be done with that pointer.

Although this might work just fine today, any future compiler version (or code change, or different LLVM target backend) could rely on this contract and exploit the guarantee that, e.g., coord_map's backing heap storage "could not" have been changed via the as_ptr() pointer.

When that happens, the program might be 'optimized' in arbitrary ways, basically anything goes.
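For contrast, here is a minimal sketch of a raw-pointer write that stays within the contract. It goes through Vec::as_mut_ptr, which requires &mut access to the vector, and that is exactly what Generator::update does not have with &self. The function name set_via_mut_ptr is hypothetical and only serves the illustration.

```rust
/// Hypothetical helper: writes `value` at `index` through a raw pointer.
/// The pointer comes from `as_mut_ptr`, so it carries permission to mutate.
fn set_via_mut_ptr(values: &mut Vec<u32>, index: usize, value: u32) {
    // Bounds check first: `add` past the end of the buffer would itself be UB.
    assert!(index < values.len());
    // SAFETY: `index` is in bounds and we hold exclusive (`&mut`) access.
    unsafe {
        *values.as_mut_ptr().add(index) = value;
    }
}
```

The same write through values.as_ptr() cast to *mut u32 would compile as well, but it would break the documented requirement quoted above.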

Possible Solutions

Just some random brainstorming:

  • Put content of the maps into UnsafeCells (might look a bit ugly)
  • Add locks like for other fields (overhead)
  • Take &self as &mut self (probably interferes with the crossbeam_utils::thread::scope closure and needs more rearchitecting)

Misc

Bonus quote from rkruppe since I was unsure about as_ptr myself:

[...] The precise rules for when mutation is allowed and through what pointers are still undecided, but the current best proposals (Ralf Jung's stacked borrows in its various iterations) are not primarily about what references are "floating around" somewhere but about permission to read/write for each part of memory (each of which applies to a subset of pointers and the permissions change as references and pointers get created and used). The code in as_ptr never creates any permission to mutate the Vec's buffer, so writing to it is UB [...]

So sweeping undefined behavior under the rug doesn't automatically make it well-defined? Fine, I guess I'll fix it 😛

Thanks for the report and elaborate description! This was in the bucket of "I know, Clippy, but I can't be bothered right now; what's the worst that could happen?", but you're of course right. I did want to keep the performance benefits, so adding locks didn't seem that attractive. Threading of course got in the way with mutable refs, and it also didn't like UnsafeCells, as they are not Sync. Wrapping them up in specialized structs seems to do the trick!
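A rough sketch of what such a wrapper might look like follows. This is not the code that landed in the crate: SharedSlots, its methods, and fill_in_parallel are hypothetical names, and std::thread::scope stands in for crossbeam_utils::thread::scope to keep the example self-contained. The idea is that UnsafeCell grants permission to write through a shared reference, and a manual Sync impl moves the "no two threads touch the same slot at once" rule into a documented safety contract.

```rust
use std::cell::UnsafeCell;
use std::thread;

/// Hypothetical wrapper around a buffer of plain-old-data slots.
struct SharedSlots<T> {
    slots: Vec<UnsafeCell<T>>,
}

// SAFETY: sound only while callers uphold the contract on `write`:
// no slot is accessed by another thread while it is being written.
unsafe impl<T: Send + Sync> Sync for SharedSlots<T> {}

impl<T: Copy + Send + Sync> SharedSlots<T> {
    fn new(len: usize, fill: T) -> Self {
        Self {
            slots: (0..len).map(|_| UnsafeCell::new(fill)).collect(),
        }
    }

    /// Writes `value` into slot `index` through a shared reference.
    ///
    /// # Safety
    /// No other thread may read or write slot `index` concurrently.
    unsafe fn write(&self, index: usize, value: T) {
        unsafe { *self.slots[index].get() = value };
    }

    /// Consumes the wrapper and returns the plain values.
    fn into_vec(self) -> Vec<T> {
        self.slots.into_iter().map(UnsafeCell::into_inner).collect()
    }
}

/// Each worker writes only the indices congruent to its id modulo `workers`,
/// so the slots touched by different threads are disjoint.
/// `workers` must be non-zero (`step_by(0)` panics).
fn fill_in_parallel(len: usize, workers: usize) -> Vec<usize> {
    let slots = SharedSlots::new(len, 0usize);
    thread::scope(|s| {
        for w in 0..workers {
            let slots = &slots;
            s.spawn(move || {
                for i in (w..len).step_by(workers) {
                    // SAFETY: index `i` belongs to this worker alone.
                    unsafe { slots.write(i, i * 10) };
                }
            });
        }
    });
    // The scope has joined every worker, so all writes are visible here.
    slots.into_vec()
}
```

In the real generator the disjointness would presumably come from the `self.resolved` lock described in the original comment rather than from a static index partition, so treat the partitioning here as an assumption made for the demo.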