Use proper CAS instructions

Question

apellegr opened this issue 7 months ago

In include/atomics.h, shouldn't this:

        asm volatile(
        "       mov     %[old], %[exp]\n"
        "       cas   %[old], %[val], %[ptr]\n"
        : [old] "=&r" (old), [ptr] "+Q" (*(unsigned long *)ptr)
        : [exp] "Lr" (exp), [val] "r" (val)
        : );

Be:

        asm volatile(
        "       mov     %[old], %[exp]\n"
        "       casal   %[old], %[val], %[ptr]\n"
        : [old] "=&r" (old), [ptr] "+Q" (*(unsigned long *)ptr)
        : [exp] "Lr" (exp), [val] "r" (val)
        : );

Lucas Crowthers · Answer 1 · Fri Dec 22 2023 12:17:52 GMT+0800 (China Standard Time)

cas64_acquire_release already uses the casal version of cas. Is there a reason not to use it wherever the al version is needed?
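
For illustration, here is a minimal sketch of the two flavours using C11 atomics rather than the inline asm from include/atomics.h. The function names are hypothetical and are not the ones in the repository. When built for AArch64 with LSE atomics enabled (e.g. -march=armv8.1-a), compilers typically lower the relaxed form to cas and the acquire-release form to casal, but that depends on the compiler and flags.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: compare-and-swap with no ordering guarantees.
 * Returns the value observed in *ptr; the swap happened iff it equals exp. */
static inline uint64_t cas64_relaxed_sketch(_Atomic uint64_t *ptr,
                                            uint64_t exp, uint64_t val)
{
    uint64_t old = exp;
    /* On failure, old is overwritten with the current contents of *ptr. */
    atomic_compare_exchange_strong_explicit(ptr, &old, val,
                                            memory_order_relaxed,
                                            memory_order_relaxed);
    return old;
}

/* Hypothetical helper: same operation with acquire and release semantics. */
static inline uint64_t cas64_acq_rel_sketch(_Atomic uint64_t *ptr,
                                            uint64_t exp, uint64_t val)
{
    uint64_t old = exp;
    atomic_compare_exchange_strong_explicit(ptr, &old, val,
                                            memory_order_acq_rel,
                                            memory_order_acquire);
    return old;
}
```

Both helpers return the old value, mirroring the [old] output operand of the asm above; only the memory ordering differs.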

apellegr · Answer 2 · Fri Dec 22 2023 12:30:31 GMT+0800 (China Standard Time)

Fair enough.
So why are we not using cas64_acquire_release in the cas_lockref code (e.g. lines 46 and 64), instead of the CAS without acquire and release semantics?

        val = cas64(lock, val, old);

Lucas Crowthers · Answer 3 · Fri Dec 22 2023 12:59:55 GMT+0800 (China Standard Time)

cas_lockref is a simplified representation of Linux kernel lockrefs, which use cmpxchg64_relaxed; that boils down to a barrierless cas on AArch64 (I believe).

A lockref is a lockable refcount that is used mostly as a plain refcount and is only infrequently locked, so it doesn’t usually need barriers for correct behavior. When locking is actually performed, the code falls back on spinlock-wrapped modifications to ensure correct ordering. cas_lockref is only trying to increment and decrement the lockable refcount, not lock it.
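
The fast path described above can be sketched roughly as follows, again with C11 atomics. This is not the cas_lockref code or the kernel implementation: the packed layout (lock word in the low 32 bits, count in the high 32 bits), the retry bound, and all names are assumptions made for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: low 32 bits = lock word (0 means unlocked),
 * high 32 bits = reference count. */
#define LOCKREF_LOCK_MASK  UINT64_C(0xffffffff)
#define LOCKREF_COUNT_ONE  (UINT64_C(1) << 32)

/* Try to take a reference without touching the lock.
 * Returns false if the lock is held or the retry budget runs out;
 * the caller would then fall back to the spinlock-protected path. */
static inline bool lockref_get_fast(_Atomic uint64_t *lock_count)
{
    uint64_t old = atomic_load_explicit(lock_count, memory_order_relaxed);

    for (int retry = 0; retry < 100; retry++) {   /* arbitrary bound */
        if (old & LOCKREF_LOCK_MASK)
            return false;                         /* locked: use slow path */

        uint64_t desired = old + LOCKREF_COUNT_ONE;
        /* Relaxed CAS: the count is only a number, so no barrier is needed.
         * On failure, old is refreshed with the current value. */
        if (atomic_compare_exchange_strong_explicit(lock_count, &old, desired,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed))
            return true;
    }
    return false;
}

/* Read the current reference count from the packed word. */
static inline uint32_t lockref_count(_Atomic uint64_t *lock_count)
{
    return (uint32_t)(atomic_load_explicit(lock_count,
                                           memory_order_relaxed) >> 32);
}
```

Because the CAS compares the whole 64-bit word, it can only succeed while the lock half still reads as unlocked, which is why the increment itself can go without barriers.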