"locking" broken
tejacques opened this issue · comments
From my testing, using locking over localStorage as a mechanism is fundamentally broken due to the way the localStorage data propagation model works. An example is this:
- open two windows on the same domain and start a busy loop in each.
- attempt to acquire the lock in window 1 about 1 second after both windows are open and hold it for 2 seconds before releasing it; attempt to acquire the lock in window 2 after ~2 seconds and also hold it for 2 seconds before releasing it.
- both windows will hold the 'lock' at the same time, from ~2 seconds in to ~3 seconds in.
- This happens because localStorage propagates changes with events, and if the event loop is not yielded to, localStorage does not get updated. This may not be true of ALL browsers, but it's true of most.
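A rough simulation of the failure mode described above, for illustration only. It does not touch real localStorage: it assumes the propagation model from the last bullet (a window only sees other windows' writes after yielding to its event loop), which may not match every browser. SharedStore, FakeWindow, tryAcquire and yieldToEventLoop are hypothetical names.

```javascript
// Simulation only: each window sees a private snapshot of the shared store,
// refreshed only when the window yields to its event loop (an assumption
// taken from the description above, not a statement about any browser).
class SharedStore {
  constructor() {
    this.data = {};
  }
}

class FakeWindow {
  constructor(name, store) {
    this.name = name;
    this.store = store;
    this.snapshot = {}; // what this window currently "sees"
  }

  // Yielding is the only point where writes from other windows become visible.
  yieldToEventLoop() {
    this.snapshot = { ...this.store.data };
  }

  // Naive check-then-set lock: look at the snapshot, write if it looks free.
  tryAcquire(key) {
    if (this.snapshot[key] !== undefined) return false;
    this.snapshot[key] = this.name;
    this.store.data[key] = this.name;
    return true;
  }
}

const store = new SharedStore();
const w1 = new FakeWindow('window1', store);
const w2 = new FakeWindow('window2', store);

// Both windows are in a busy loop, so neither yields between the attempts.
const first = w1.tryAcquire('lock');
const second = w2.tryAcquire('lock');

// A window that does yield before checking sees the lock as taken.
const w3 = new FakeWindow('window3', store);
w3.yieldToEventLoop();
const third = w3.tryAcquire('lock');

console.log(first, second, third); // true true false
```

In this model both busy windows believe they hold the lock, which is the overlap described in the third bullet.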
Edit: It looks as though this library does yield after calling setItem. However I did manage to break it (see my reply below).
The following code snippet helps explain @tejacques's concerns.
// Run this in the 1st tab
localStorage['foo'] = '1';
var startDate = new Date().valueOf();
while (new Date().valueOf() - startDate < 10000) {}
console.log(localStorage['foo']);
// Within 10 seconds, run this code in the 2nd tab.
localStorage['foo'] = '2';
You'll observe that:
- in the context of the first script, localStorage['foo'] is still observed to be '1'
- even though localStorage['foo'] = '1' is run earlier than localStorage['foo'] = '2', it doesn't appear to ever be persisted into Local Storage
I managed to break it using: https://gist.github.com/andrewwakeling/430db7720a4393c4324b
It appears to fail eventually when run in several tabs in Chrome 44.
It looks as though 2 tabs manage to get hold of the lock simultaneously. Please let me know if something looks wrong with my test.
For Chrome 44, in my example, it appears that the value is now written to local storage immediately (i.e. if you go to a 2nd tab, evaluating localStorage['foo'] will give '1').
Hi, I think I've found what's wrong with the locking logic.
First, I have to say this is a fantastic idea and work. I'm wondering if the reason for it not being more popular is this bug, which causes really weird behavior. I started using IWC-SignalR, and with livereload in my dev environment and several tabs open, this bug is not that uncommon. I used the following to reproduce it and troubleshoot everything: I added a bunch of console.log lines with exact times to your library, opened 4 tabs and looked at what happens every time I refresh a tab which is a SignalR connection owner.
I quickly determined that the problem is not in IWC-SignalR, but in IWC. I was afraid that it was somehow related to the base of the locking logic, which is your InterlockedCall with its complex timer based synchronization, but that proved to be rock solid in my testing. Simply, the interlocked calls among different tabs were always sequential, so it was very reassuring to see the foundation of it all being healthy. The problem turned out to be related to the clearJunkLocks call, its logic and also it not being synchronized with the lock obtaining code. There are 2 separate problems I detected:
1. The first problem I noticed was that the refreshed tab's clearJunkLocks clears a valid lock established by another open tab a moment before. The reason for that is the logic in the WindowMonitor updateDataFromStorage. It updates the openWindows variable with the new state from local storage AFTER firing the onWindowsChanged event, which triggers clearJunkLocks, which checks whether a found lock belongs to a closed window and relies on the openWindows variable for that. This causes 2 issues:
  - a) the clearJunkLocks fired immediately in all tabs upon detection of the current tab unload/reload doesn't do anything, as the unloaded tab is still reported to be open
  - b) the clearJunkLocks fired in the reloaded tab treats a lock obtained by another tab as a junk lock, since that tab is not yet present in the reloaded tab's openWindows and hence is reported as closed
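To illustrate the ordering issue, here is a minimal sketch. It is not the library's code: createMonitor, isWindowOpen and the updateBeforeNotify flag are hypothetical, and only the names openWindows, onWindowsChanged and updateDataFromStorage are borrowed from the description above. The flag switches between notifying listeners before or after the cached window list is refreshed.

```javascript
// Hypothetical stand-in for a window monitor; not the library's actual code.
function createMonitor(readWindowsFromStorage, updateBeforeNotify) {
  let openWindows = {};
  const listeners = [];
  return {
    onWindowsChanged(fn) {
      listeners.push(fn);
    },
    isWindowOpen(id) {
      return Object.prototype.hasOwnProperty.call(openWindows, id);
    },
    updateDataFromStorage() {
      const latest = readWindowsFromStorage();
      if (updateBeforeNotify) openWindows = latest;  // refresh, then notify
      listeners.forEach((fn) => fn());
      if (!updateBeforeNotify) openWindows = latest; // notify, then refresh
    },
  };
}

// Returns what a listener sees when one tab closes and another one opens.
function observe(updateBeforeNotify) {
  let stored = { closedTab: true };
  const monitor = createMonitor(() => ({ ...stored }), updateBeforeNotify);
  monitor.updateDataFromStorage(); // initial load, no listeners yet

  let seen = null;
  monitor.onWindowsChanged(() => {
    seen = {
      closedTab: monitor.isWindowOpen('closedTab'),
      newTab: monitor.isWindowOpen('newTab'),
    };
  });

  stored = { newTab: true }; // closedTab went away, newTab appeared
  monitor.updateDataFromStorage();
  return seen;
}

const stale = observe(false); // listener still sees closedTab, misses newTab
const fresh = observe(true);  // listener sees the current state
console.log(stale, fresh);
```

With notify-then-refresh, a junk check run from the listener would treat the closed tab as open (issue a) and the new tab as closed (issue b).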
I fixed this by moving the code which updates openWindows before the code that fires the onWindowsChanged event. It fixes 1-b, but 1-a causes a new issue, which I describe below:
2. After 1) is fixed, a tab reload causes clearJunkLocks to be fired immediately in the other tabs, but that doesn't mean it happens at exactly the same time in all of them. One tab may have done clearJunkLocks and proceeded with obtaining a now free lock, while another one does a clearJunkLocks right about that same time, sees the lock still belonging to the unloaded tab and clears it. There's no guarantee about which sequence of clearJunkLocks and lock obtaining calls will play out. The only way I figured out to fix this is to rewrite clearJunkLocks so that it is also interlocked with the same id as the lock obtaining code. After that, I didn't notice any more duplicate SignalR connections opened. Though it's always possible that my manual testing didn't cover all the possible scenarios, I tested it probably a hundred times, and it never happened after the fix.
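The race in 2) can be sketched roughly as follows. This is a single-process model under stated assumptions, not the library's code: each operation is split into a read step and a write step so the schedule between two tabs can be chosen by hand, and makeState, acquireSteps, clearJunkSteps and the 'signalr' lock name are all hypothetical.

```javascript
// Model only: two tabs share one lock table; a closed tab left a lock behind.
function makeState() {
  return {
    locks: { signalr: 'closedTab' },
    openWindows: new Set(['tabA', 'tabB']),
  };
}

// Lock acquisition: read the holder, then write if the lock is free or junk.
function acquireSteps(state, name, owner) {
  let free = false;
  return [
    () => {
      const holder = state.locks[name];
      free = holder === undefined || !state.openWindows.has(holder);
    },
    () => {
      if (free) state.locks[name] = owner;
    },
  ];
}

// Junk cleanup: read which locks belong to closed windows, then delete them.
function clearJunkSteps(state) {
  let junk = [];
  return [
    () => {
      junk = Object.keys(state.locks).filter(
        (key) => !state.openWindows.has(state.locks[key])
      );
    },
    () => {
      junk.forEach((key) => {
        delete state.locks[key];
      });
    },
  ];
}

// Unsynchronized: tabA acquires while tabB cleans up, and the steps interleave.
function runInterleaved() {
  const state = makeState();
  const acquire = acquireSteps(state, 'signalr', 'tabA');
  const clear = clearJunkSteps(state);
  acquire[0](); clear[0](); acquire[1](); clear[1]();
  return state.locks.signalr;
}

// Interlocked under one id: each operation finishes before the next starts.
function runInterlocked() {
  const state = makeState();
  const acquire = acquireSteps(state, 'signalr', 'tabA');
  const clear = clearJunkSteps(state);
  acquire[0](); acquire[1](); clear[0](); clear[1]();
  return state.locks.signalr;
}

const interleavedOwner = runInterleaved(); // undefined: tabA's lock was wiped
const interlockedOwner = runInterlocked(); // 'tabA'
console.log(interleavedOwner, interlockedOwner);
```

In the interleaved run tabA believes it holds the lock even though the table no longer records it, so another tab could take it as well. Serializing both operations under the same id removes that window in this model.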
I made a pull request (#8) with everything above; I leave it to you to decide whether to use it or find a better way to handle all this.
Finally, having said all this, the additional clearJunkLocks interlocked calls do make everything slightly slower. I'm not sure why clearJunkLocks is even needed; I assume possibly for some complex dynamically triggered locking scenarios. For my purposes, I intend to only use a single lock, the one for SignalR, and it happens on every tab load in my case. So, I changed your code for my purposes so that I simply don't call clearJunkLocks ever, as the junk detection is already embedded in the lock obtaining code. That also works, I haven't noticed any duplicate connections and it's faster. For people who decide to do this, the only thing related to clearJunkLocks that must not be commented out is the setLocksInitialized code, which always needs to be called at the library's start.
@jasmh Thank you for your investigation. The purpose of clearJunkLocks is to avoid pollution of localStorage (size is limited to several megabytes). But you are right that in most cases this is not a problem. If you want to eliminate clearJunkLocks, be aware that setLocksInitialized must be called after WindowMonitor is ready.
Hi, could someone confirm whether this issue is still valid or can be closed thanks to pull request (#8)?