jonhoo / haphazard

Hazard pointers in Rust.

Musings on marked atomic pointers

anko opened this issue

This is probably not even in scope for this library. But I started thinking about it and writing down notes, and by the time I decided this is a bad idea, I had all these notes. So in the spirit of "also publish negative results", I'll dump them here for posterity and close the issue.


What is a marked pointer?

A marked pointer (a.k.a. stateful pointer, or tagged pointer) is a pointer to something with an alignment of at least 2, with extra metadata packed into the pointer's unused low bits. At such alignments, some number of the lowest bits of the pointer will "always" be zero (see Caveats). For example, a struct with alignment 2 will "always" be located at an even-numbered address, meaning the lowest bit of a pointer to such a struct will "always" be 0.

This means the low bits can be repurposed as markers that are atomically tied to the pointer value, because they are "carried along for the ride" by atomic operations on the AtomicPtr. As long as those marker bits are masked away before the pointer is dereferenced, the pointer remains logically identical, and we effectively get CAS on a (pointer, metadata) tuple.
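As a minimal sketch (not part of haphazard's API, and assuming a platform where the address-as-integer caveats below hold and the pointee's alignment is at least 2), packing and unpacking a single marker bit could look roughly like this:

use std::sync::atomic::{AtomicPtr, Ordering};

const MARK: usize = 0b1; // hypothetical marker bit; only free if align_of::<T>() >= 2

/// Pack a marker into the low bit. The result may not be a dereferenceable pointer.
fn pack<T>(ptr: *mut T, marked: bool) -> *mut T {
    let mark = if marked { MARK } else { 0 };
    (ptr as usize | mark) as *mut T
}

/// Recover the real pointer and the marker. Only the unpacked pointer may be dereferenced.
fn unpack<T>(raw: *mut T) -> (*mut T, bool) {
    ((raw as usize & !MARK) as *mut T, (raw as usize & MARK) != 0)
}

fn try_mark(slot: &AtomicPtr<u64>) {
    let raw = slot.load(Ordering::Acquire);
    let (ptr, marked) = unpack(raw);
    if !marked {
        // The CAS succeeds only if neither the pointer nor the marker changed in between.
        let _ = slot.compare_exchange(raw, pack(ptr, true), Ordering::AcqRel, Ordering::Acquire);
    }
}

(The plain pointer/usize casts here gloss over the provenance concerns that come up again at the end of this issue.)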

Why?

Such marked atomic pointers are useful, for example, for implementing deletion in non-blocking linked lists with two CAS operations:

  1. Mark the next-node-pointer of the to-be-deleted node with a "deleted" marker bit. This means concurrent insertions after the to-be-deleted node will fail CAS (because the full marked pointer is no longer equal to the unmarked version) and so lost insertions can't happen, while concurrent reads are still able to traverse the pointer, because the marker bits don't impede its pointer functionality.
  2. Swing the previous node's next pointer to the node after the to-be-deleted node, unlinking it from the list.

The deleted node is then reclaimed as the readers and writers release their hazard pointer guards.

If the deletion-marker and next-node pointer were separate, concurrent insertions and deletions would race, and some insertions would be silently lost.
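A rough sketch of those two steps (illustrative names like Node, DELETED, and delete are mine, not anything haphazard provides; traversal, retry loops, and the hazard-pointer-based reclamation are all left out):

use std::sync::atomic::{AtomicPtr, Ordering};

const DELETED: usize = 0b1; // hypothetical "deleted" marker in the low bit

struct Node<T> {
    value: T,
    next: AtomicPtr<Node<T>>, // the low bit doubles as the deletion mark
}

fn marked<T>(p: *mut Node<T>) -> *mut Node<T> {
    (p as usize | DELETED) as *mut Node<T>
}
fn unmarked<T>(p: *mut Node<T>) -> *mut Node<T> {
    (p as usize & !DELETED) as *mut Node<T>
}
fn is_marked<T>(p: *mut Node<T>) -> bool {
    (p as usize & DELETED) != 0
}

/// Try to delete `victim`, which is assumed to directly follow `prev`.
///
/// # Safety
/// `victim` must point to a live node that the caller protects from
/// reclamation (e.g. via a hazard pointer) for the duration of the call.
unsafe fn delete<T>(prev: &Node<T>, victim: *mut Node<T>) -> bool {
    // Step 1: logically delete by marking the victim's next pointer. A
    // concurrent insertion after `victim` CASes against the unmarked value,
    // so it now fails instead of being silently lost.
    let succ = unsafe { &(*victim).next }.load(Ordering::Acquire);
    if is_marked(succ)
        || unsafe { &(*victim).next }
            .compare_exchange(succ, marked(succ), Ordering::AcqRel, Ordering::Acquire)
            .is_err()
    {
        return false; // someone else is deleting, or a node was just inserted
    }

    // Step 2: physically unlink by swinging `prev.next` past the victim.
    // Readers that still hold `victim` keep traversing via `unmarked(succ)`.
    prev.next
        .compare_exchange(victim, unmarked(succ), Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
    // Retiring `victim` for eventual reclamation is omitted here.
}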

Alternatives

The Correct Solution would instead be for every platform to support bigger-than-usize atomics, like AtomicU128 (via cmpxchg16b or something), which would allow CAS on combined (pointer, metadata) tuples without bit-packing hacks like this. But that future is probably still some decades away, or it might never come at all; to my understanding, cmpxchg16b even on the x86_64 hardware that supports it is a hack that performs poorly enough that GCC doesn't want to call it lock-free. This kind of thing is why std::sync::atomic::AtomicU128 doesn't exist.

Non-blocking-list deletion can also be implemented without marked pointers, but it's less efficient and harder to reason about.

Or just give up and use a lock.

Existing work

Caveats

As noted in the https://github.com/HDembinski/stateful_pointer README:

Platform dependence: The library relies on two platform-dependent aspects of memory handling.

  • Memory addresses map trivially to consecutive integral numbers.
  • The address of aligned memory, in its representation as an integral number, is an exact multiple of the alignment value.

The C++ standard does not guarantee these properties, as explained on StackOverflow. Nevertheless, common platforms in use today seem to support this simple memory addressing scheme. There is no guarantee, of course, that future platforms will do the same, so use this library with caution. Many thanks go to the knowledgable Redditors who pointed all this out.

Or, as put more colourfully 😄 in the linked Reddit thread by /u/wrosecrans:

Oh god, please nobody use tagged pointers except as a gag. Somebody like me is going to wind up having to fix what you did ten years later after it explodes and catches fire on some compiler/OS/hardware that didn't exist when you wrote it.

Which are valid points. Maybe in the year 2200, u16s will live at odd addresses. This whole idea is wildly unsafe.

API?

Anyway, the idea I'm having regarding adding this horribleness to this delightful library is: Create a MarkedPointer trait representing a pointer from which you can extract both the "raw" *mut T (the marked, and therefore possibly invalid, pointer to be used in atomic operations) and the "cooked" (*mut T, M) tuple (the actually-valid pointer plus the extracted marker, as the user-facing API).

Here's roughly that, and an automatic implementation for any type that implements haphazard::raw::Pointer, with an empty marker (()).

pub unsafe trait MarkedPointer<T, M, P: Pointer<T>> {
    /// Extract a pointer that corresponds to a valid `&mut T`, and the associated marker.
    fn into_cooked(self) -> (*mut T, M);

    /// Reconstruct this marked pointer from the given valid pointer and mark.
    unsafe fn from_cooked(ptr: *mut T, mark: M) -> Self;

    /// Encode the valid pointer and mark into a (possibly invalid) pointer value from which the
    /// mark and the actual pointer value can later be reconstructed.
    fn into_raw(self) -> *mut T;

    /// Load the (possibly invalid) pointer value, extracting a valid pointer and mark from it.
    unsafe fn from_raw(ptr: *mut T) -> Self;
}

/// Any `Pointer<T>` trivially implements `MarkedPointer<T, (), P>`. It reports itself as
/// always being marked `()`, and for everything else simply calls the existing methods of
/// `Pointer<T>`, passing their results through without modification.
unsafe impl<T, P: Pointer<T>> MarkedPointer<T, (), P> for P {
    fn into_cooked(self) -> (*mut T, ()) {
        // `into_raw` exists on both `Pointer<T>` and `MarkedPointer`, so use the
        // trait-qualified form to disambiguate.
        (Pointer::into_raw(self), ())
    }

    unsafe fn from_cooked(ptr: *mut T, _mark: ()) -> Self {
        unsafe { Pointer::from_raw(ptr) }
    }

    fn into_raw(self) -> *mut T {
        Pointer::into_raw(self)
    }

    unsafe fn from_raw(ptr: *mut T) -> Self {
        unsafe { Pointer::from_raw(ptr) }
    }
}

That way the library could use MarkedPointer internally, using into_raw/from_raw to interact with the underlying std::sync::atomic::AtomicPtr, and into_cooked/from_cooked to interact with the user-facing API. Since every Pointer<T> is also trivially a MarkedPointer<T, ()> (by just passing the pointer through unmodified both ways, as above), the existing API could go unchanged.
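For a marker that actually carries information, a hypothetical implementation might look roughly like this (purely a sketch; it assumes `Box<T>` implements `haphazard::raw::Pointer<T>` and that `T`'s alignment leaves the low bit free):

/// Hypothetical: a `Box<T>` paired with a `bool` marker stored in the pointer's low bit.
unsafe impl<T> MarkedPointer<T, bool, Box<T>> for (Box<T>, bool) {
    fn into_cooked(self) -> (*mut T, bool) {
        (Box::into_raw(self.0), self.1)
    }

    unsafe fn from_cooked(ptr: *mut T, mark: bool) -> Self {
        (unsafe { Box::from_raw(ptr) }, mark)
    }

    fn into_raw(self) -> *mut T {
        assert!(std::mem::align_of::<T>() >= 2, "no low bit to borrow");
        let (ptr, mark) = self.into_cooked();
        // Pack the marker; the result may no longer be a valid `*mut T`.
        (ptr as usize | mark as usize) as *mut T
    }

    unsafe fn from_raw(ptr: *mut T) -> Self {
        let mark = (ptr as usize & 0b1) != 0;
        let ptr = (ptr as usize & !0b1) as *mut T;
        unsafe { Self::from_cooked(ptr, mark) }
    }
}

A hypothetical MarkedAtomicPtr could then keep the packed value in a plain AtomicPtr internally and hand the (pointer, marker) pair back out through into_cooked/from_cooked.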

The additions would be struct MarkedAtomicPtr and trait MarkedPointer, for people who like to live dangerously, that is, who know exactly what architecture they'll be deploying on for the lifetime of their application.


As mentioned, I now think this is a bad idea. I went into this thinking it was safer than it actually turns out to be, and I don't want to complicate the internals of an already complex library for a niche use-case that nobody should really be doing in the first place.

Have a nice day! ✨

Thanks for sharing! For what it's worth, we actually already make use of the low bit, just like folly does:

haphazard/src/domain.rs

Lines 839 to 850 in 55f1078

// Helpers to set and unset the lock bit on a `*mut HazPtrRecord` without losing pointer
// provenance. See https://github.com/rust-lang/miri/issues/1993 for details.
fn with_lock_bit(ptr: *mut HazPtrRecord) -> *mut HazPtrRecord {
    int_to_ptr_with_provenance(ptr as usize | LOCK_BIT, ptr)
}
fn without_lock_bit(ptr: *mut HazPtrRecord) -> *mut HazPtrRecord {
    int_to_ptr_with_provenance(ptr as usize & !LOCK_BIT, ptr)
}
fn int_to_ptr_with_provenance<T>(addr: usize, prov: *mut T) -> *mut T {
    let ptr = prov.cast::<u8>();
    ptr.wrapping_add(addr.wrapping_sub(ptr as usize)).cast()
}