Unsoundness in safe code, `AtomicPtr::compare_exchange_*`
TroyNeubauer opened this issue · comments
I have been working on a concurrent hash map with the help of haphazard, and seem to have come across undefined behavior in safe code.
Here is a reproducing example:
use haphazard::AtomicPtr;

#[non_exhaustive]
struct Family;

fn main() {
    // SAFETY:
    //
    // p is null
    let ptr: AtomicPtr<u32, Family> = unsafe { AtomicPtr::new(std::ptr::null_mut()) };
    let new = Box::new(0u32);
    let old = Box::into_raw(Box::new(0u32));
    let _ = ptr.compare_exchange_weak(old, new);
}
Running this with `cargo miri run` produces:
982 | Box(unsafe { Unique::new_unchecked(raw) }, alloc)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^ type validation failed: encountered 0, but expected something greater or equal to 1
...
note: inside `main` at src/main.rs:14:13
--> src/main.rs:14:13
|
14 | let _ = ptr.compare_exchange_weak(old, new);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The safety comment for `AtomicPtr::new()` says that `p` must be a valid reference to `T` or null. However, when the compare-exchange fails inside `compare_exchange_weak` (lines 612 to 616 in e0e18f6), the null stored in the pointer ends up in the `Err` variant, which is then passed to `Pointer::from_raw` (lines 32 to 41 in e0e18f6), which doesn't allow nulls.
Allowing nulls in `AtomicPtr` is non-negotiable for building concurrent data structures, so how do we fix this API?
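The underlying behavior can be seen without haphazard at all. The following is a minimal sketch using only `std::sync::atomic::AtomicPtr`; the helper name `failed_cas_value` is made up for illustration. It shows that a failed `compare_exchange` returns the value the atomic actually held (here null), not the `current` argument that was passed in:

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical helper: runs a compare-exchange that is guaranteed to fail
/// (the slot holds null, `current` is non-null) and returns the pointer
/// carried in the `Err` variant.
fn failed_cas_value() -> *mut u32 {
    let slot: AtomicPtr<u32> = AtomicPtr::new(ptr::null_mut());
    let current = Box::into_raw(Box::new(0u32));
    let new = Box::into_raw(Box::new(1u32));

    // The strong variant cannot fail spuriously, so this fails only because
    // the stored value (null) differs from `current`.
    let result = slot.compare_exchange(current, new, Ordering::SeqCst, Ordering::SeqCst);

    // SAFETY: both pointers came from `Box::into_raw` above, and the failed
    // exchange never stored either of them in `slot`.
    unsafe {
        drop(Box::from_raw(current));
        drop(Box::from_raw(new));
    }

    result.expect_err("slot held null, so the exchange must fail")
}

fn main() {
    let actual = failed_cas_value();
    // The error value is what the atomic held, which is null here.
    assert!(actual.is_null());
}
```

Any wrapper that converts this error value into a non-null owning pointer type, as `Box::from_raw` would require, runs into the validity failure Miri reports.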
Yes, you're totally right, that's just straight up wrong. The bug is in `compare_exchange_weak` (and I suspect also `compare_exchange`) assuming that the `_ptr` version returns `current` on failure, when in reality it returns the value stored in the `std::sync::atomic::AtomicPtr` that wasn't `current`. The solution here is luckily simple: we just need to change lines 613 to 616 in e0e18f6 to
r.map_err(move |_| {
    // Safety: `new` was never shared, and was a valid `P`.
    unsafe { P::from_raw(new) }
})
Which should work since `new` is just a `*mut T`, and that is `Copy`. Want to file a PR for this and the non-weak variant? We may also want to update the documentation wording on these methods to clarify what "previous" value means, for example at line 680 in e0e18f6.
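To make the shape of that fix concrete, here is a rough, self-contained sketch. It assumes a hypothetical `Slot<T>` wrapper around `std::sync::atomic::AtomicPtr` with `Box<T>` standing in for `P`; it is not haphazard's actual type or signature. On failure the error value from the inner exchange is ignored and the owned pointer is rebuilt from `new`:

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical stand-in for an owning atomic pointer (not haphazard's type).
struct Slot<T>(AtomicPtr<T>);

impl<T> Slot<T> {
    fn null() -> Self {
        Slot(AtomicPtr::new(ptr::null_mut()))
    }

    /// On success, returns the previously stored raw pointer (possibly null).
    /// On failure, hands `new` back to the caller.
    fn compare_exchange(&self, current: *mut T, new: Box<T>) -> Result<*mut T, Box<T>> {
        let new = Box::into_raw(new);
        self.0
            .compare_exchange(current, new, Ordering::AcqRel, Ordering::Acquire)
            .map_err(move |_actual| {
                // `_actual` is whatever the slot held, which may be null, so
                // it must not be converted into a `Box`.
                // SAFETY: `new` came from `Box::into_raw` above and was never
                // stored in the slot, because the exchange failed.
                unsafe { Box::from_raw(new) }
            })
    }
}

impl<T> Drop for Slot<T> {
    fn drop(&mut self) {
        let p = *self.0.get_mut();
        if !p.is_null() {
            // SAFETY: non-null values only enter the slot via `Box::into_raw`.
            unsafe { drop(Box::from_raw(p)) };
        }
    }
}

fn main() {
    let slot: Slot<u32> = Slot::null();
    let stale = Box::into_raw(Box::new(0u32));

    // The slot holds null, not `stale`, so this fails and returns the box.
    let returned = slot.compare_exchange(stale, Box::new(7)).unwrap_err();
    assert_eq!(*returned, 7);

    // With the correct `current` (null) the exchange succeeds.
    let previous = slot.compare_exchange(ptr::null_mut(), returned).unwrap();
    assert!(previous.is_null());

    // SAFETY: `stale` came from `Box::into_raw` and was never stored.
    unsafe { drop(Box::from_raw(stale)) };
}
```

The closure can capture `new` after it was passed to the inner exchange because a raw pointer is `Copy`, which is the property the suggested fix relies on.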
Sure, I'll knock this out after the stream haha.
Closed in #38