ARM-software / synchronization-benchmarks

Collection of synchronization micro-benchmarks and traces from infrastructure applications

Use proper CAS instructions

apellegr opened this issue

In include/atomics.h, shouldn't this:

        asm volatile(
        "       mov     %[old], %[exp]\n"
        "       cas   %[old], %[val], %[ptr]\n"
        : [old] "=&r" (old), [ptr] "+Q" (*(unsigned long *)ptr)
        : [exp] "Lr" (exp), [val] "r" (val)
        : );

Be:

        asm volatile(
        "       mov     %[old], %[exp]\n"
        "       casal   %[old], %[val], %[ptr]\n"
        : [old] "=&r" (old), [ptr] "+Q" (*(unsigned long *)ptr)
        : [exp] "Lr" (exp), [val] "r" (val)
        : );

cas64_acquire_release already uses the casal version. Is there a reason not to call that wherever the AL variant is needed?
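For comparison, the same split between a barrierless CAS and an acquire-release CAS can be written with C11 atomics. This is only a sketch: the `_sketch` function names are hypothetical and are not the helpers in include/atomics.h. On AArch64 with LSE enabled, compilers typically lower the relaxed form to `cas` and the acq_rel form to `casal`, but that depends on the compiler and target flags.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: CAS with no ordering guarantees.
 * Returns the value observed at *ptr (equal to exp on success). */
static inline uint64_t cas64_relaxed_sketch(_Atomic uint64_t *ptr,
                                            uint64_t val, uint64_t exp)
{
    uint64_t old = exp;
    atomic_compare_exchange_strong_explicit(ptr, &old, val,
                                            memory_order_relaxed,
                                            memory_order_relaxed);
    return old;
}

/* Hypothetical helper: the same CAS with acquire-release ordering
 * on success and acquire ordering on failure. */
static inline uint64_t cas64_acq_rel_sketch(_Atomic uint64_t *ptr,
                                            uint64_t val, uint64_t exp)
{
    uint64_t old = exp;
    atomic_compare_exchange_strong_explicit(ptr, &old, val,
                                            memory_order_acq_rel,
                                            memory_order_acquire);
    return old;
}

/* Usage: both variants report the value they saw at *ptr. */
static inline void cas_sketch_demo(void)
{
    _Atomic uint64_t word = 5;
    uint64_t seen;

    seen = cas64_relaxed_sketch(&word, 6, 5);   /* match: 5 becomes 6 */
    assert(seen == 5);
    seen = cas64_acq_rel_sketch(&word, 9, 5);   /* mismatch: word stays 6 */
    assert(seen == 6);
    assert(atomic_load(&word) == 6);
}
```

The two helpers differ only in the memory-order arguments, which is the same difference as between the `cas` and `casal` mnemonics in the inline assembly above.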

Fair enough.
So why are we not using cas64_acquire_release in the cas_lockref code (e.g. lines 46 and 64), instead of the CAS without acquire and release semantics?

        val = cas64(lock, val, old);

cas_lockref is a simplified representation of Linux kernel lockrefs, which use cmpxchg64_relaxed. That boils down to a barrierless cas on AArch64 (I believe).

A lockref is a lockable refcount that is mostly used as a plain refcount and only infrequently locked, so it doesn't usually need barriers for correct behavior. When locking is actually performed, the code falls back on spinlock-wrapped modifications to ensure correct ordering. cas_lockref only tries to increment and decrement the lockable refcount, not lock it.
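The lockless increment described above can be sketched with C11 atomics. This is an illustration under stated assumptions, not the benchmark's or the kernel's code: the 64-bit layout (lock word in the low 32 bits, count in the high 32 bits), the `lockref_sketch` names, and the `retries` parameter are all hypothetical, and the spinlock fallback itself is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: low 32 bits = lock word (0 means unlocked),
 * high 32 bits = reference count. */
#define SKETCH_LOCK_MASK 0xffffffffULL
#define SKETCH_COUNT_ONE (1ULL << 32)

typedef struct {
    _Atomic uint64_t lock_count;
} lockref_sketch;

static inline uint32_t lockref_sketch_count(lockref_sketch *ref)
{
    return (uint32_t)(atomic_load_explicit(&ref->lock_count,
                                           memory_order_relaxed) >> 32);
}

/* Fast path: bump the count with a relaxed CAS while the lock bits are clear.
 * Returns false if the lock is held or the retry budget runs out, in which
 * case the caller is expected to take the spinlock-protected slow path. */
static inline bool lockref_sketch_get_fast(lockref_sketch *ref, int retries)
{
    uint64_t old = atomic_load_explicit(&ref->lock_count,
                                        memory_order_relaxed);

    while (retries-- > 0 && (old & SKETCH_LOCK_MASK) == 0) {
        uint64_t desired = old + SKETCH_COUNT_ONE;
        if (atomic_compare_exchange_strong_explicit(&ref->lock_count,
                                                    &old, desired,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed))
            return true;
        /* On failure, old now holds the current value; re-check the lock. */
    }
    return false;
}

/* Usage: the fast path succeeds when unlocked and backs off when locked. */
static inline void lockref_sketch_demo(void)
{
    lockref_sketch ref = { 0 };
    bool ok = lockref_sketch_get_fast(&ref, 4);
    assert(ok);
    assert(lockref_sketch_count(&ref) == 1);
}
```

Because the CAS compares the whole 64-bit word, it fails whenever another thread holds the lock or changes the count in between, so the counter update needs no ordering of its own in this sketch.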