go4org / go4

go4 hosts the go4.org packages.

Home Page: https://go4.org/

Once question

johnrs opened this issue

Hi. I was wondering about one thing I noticed in Once's Do method. I understand the use of atomic.LoadUint32, but I don't understand why atomic.StoreUint32 was used instead of a simple store. I believe that o.done is protected by the o.m mutex, which is locked when the store takes place.

I did not benchmark it, but I'm guessing that perhaps the simple store would be a bit faster because it wouldn't involve a memory barrier operation.

Just wondering.... :)

Hi. I'm guessing atomic.Store is used precisely because atomic.Load is used outside the mutex. Since atomic.Load happens before the mutex is taken, there could be an atomic.Load happening at the same time we're trying to store. So the store has to be atomic too, to exclude the load from happening "in the middle" of a store.
Does that make any sense?
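
For reference, the pattern being discussed can be sketched roughly as follows. This is a minimal sketch reconstructed from the identifiers mentioned in this thread (o.done, o.m, err = f()), not a copy of the go4 source. The retry-on-error behaviour, the errTransient value, and the main function are assumptions added for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// errTransient is a hypothetical error used only for this illustration.
var errTransient = errors.New("transient failure")

// Once is a sketch of a sync.Once-like type whose function may fail.
type Once struct {
	m    sync.Mutex
	done uint32
}

// Do runs f until it succeeds once; later calls return nil without running f.
func (o *Once) Do(f func() error) error {
	// Fast path: this load happens without holding o.m, so it can run
	// concurrently with the store below. Both sides therefore use sync/atomic.
	if atomic.LoadUint32(&o.done) == 1 {
		return nil
	}
	// Slow path: serialize callers on the mutex.
	o.m.Lock()
	defer o.m.Unlock()
	var err error
	if o.done == 0 { // plain read is fine here: writes only happen under o.m
		err = f()
		if err == nil {
			atomic.StoreUint32(&o.done, 1)
		}
	}
	return err
}

func main() {
	var o Once
	attempts := 0
	setup := func() error {
		attempts++
		if attempts == 1 {
			return errTransient // first attempt fails, so done stays 0
		}
		return nil
	}
	fmt.Println(o.Do(setup)) // transient failure
	fmt.Println(o.Do(setup)) // <nil>
	fmt.Println(o.Do(setup)) // <nil> (fast path, setup is not called)
	fmt.Println("attempts:", attempts) // attempts: 2
}
```

The only access to o.done outside the mutex is the fast-path load, which is why the matching store is the one that needs to be atomic.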

I erred when I said that o.done was protected by the mutex. The atomic.LoadUint32 call isn't.

I understand what you are saying, but I believe that a normal load/store of a 32-bit integer would be atomic anyway, just not protected by a memory barrier. So the worst case of doing just a normal store would be that another goroutine sees a stale o.done value and takes the slow path. It would then block until you unlocked, and then proceed and see the correct value.

But I do see a different problem: execution order. If a normal store were used, then the "err = f()" above it might be reordered so that its effects are not visible until after the store. This would be bad, since another goroutine could then think that Once had completed before it really did. The atomic.StoreUint32 prevents this.
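
The ordering concern above can be illustrated with a small publish/observe sketch. It assumes the guarantee in the current Go memory model that a goroutine which observes an atomic store also observes the writes made before that store. The box type and its methods are hypothetical names, not part of go4.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// box is a hypothetical holder for a value published through an atomic flag.
type box struct {
	payload string
	ready   uint32
}

// publish writes the payload first, then sets the flag atomically.
// The atomic store is what orders the payload write before the flag.
func (b *box) publish(v string) {
	b.payload = v
	atomic.StoreUint32(&b.ready, 1)
}

// observe waits until the flag is set, then reads the payload.
// Seeing ready == 1 through an atomic load means the payload write is visible.
func (b *box) observe() string {
	for atomic.LoadUint32(&b.ready) == 0 {
		runtime.Gosched() // yield so the publisher can run
	}
	return b.payload
}

func main() {
	b := &box{}
	go b.publish("initialized")
	fmt.Println(b.observe())
}
```

With a plain store to ready, nothing would tie the payload write to the flag, which is the situation described above where another goroutine believes the work is finished before it really is.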

Does this sound right? :)

P.S. Here's an idea (not really worth doing since slow-path is going to be very rare): The store could actually take place after releasing the lock. That guarantees order of execution for f(). Actually, now that I'm thinking about it, the atomic.LoadUint32 (which is important since it's fast-path) could be a regular load. Worst case is that it takes the slow-path route in the rare case that it reads the stale value which someone just changed. This isn't fatal, just makes him go slow-path.

So it would be a tradeoff between much faster execution almost all of the time, versus a rare delay through slow-path. Interesting... Yes, some (not many) software race conditions aren't fatal. :)