potential race conditions in Chapter 8 example
First of all, thank you very much for writing such an excellent book. I really enjoy it.
I'm wondering whether there are some race conditions in the Chapter 8 example.
- Suppose userA quickly follows two users, userB and userC.
  `follow_user(conn, userA, userB)` runs first, up to `following, followers, status_and_score = pipeline.execute()[-3:]`. Suppose the value of the variable `following` is 1 at this point.
  Execution then switches to `follow_user(conn, userA, userC)`, which runs to the end and sets `following` in `user:userA` to 2.
  Execution then switches back to `follow_user(conn, userA, userB)`, which sets `following` in `user:userA` to 1 again.
  I think either a watch or a lock is required here, since `following` and some other variables are retrieved in one `execute()` call and used in another.
def follow_user(conn, uid, other_uid):
    fkey1 = 'following:%s'%uid #A
    fkey2 = 'followers:%s'%other_uid #A
    if conn.zscore(fkey1, other_uid): #B
        return None #B
    now = time.time()
    pipeline = conn.pipeline(True)
    pipeline.zadd(fkey1, other_uid, now) #C
    pipeline.zadd(fkey2, uid, now) #C
    pipeline.zcard(fkey1) #D
    pipeline.zcard(fkey2) #D
    pipeline.zrevrange('profile:%s'%other_uid, #E
        0, HOME_TIMELINE_SIZE-1, withscores=True) #E
    following, followers, status_and_score = pipeline.execute()[-3:]
    pipeline.hset('user:%s'%uid, 'following', following) #F
    pipeline.hset('user:%s'%other_uid, 'followers', followers) #F
    if status_and_score:
        pipeline.zadd('home:%s'%uid, **dict(status_and_score)) #G
    pipeline.zremrangebyrank('home:%s'%uid, 0, -HOME_TIMELINE_SIZE-1)#G
    pipeline.execute()
    return True
- Similar situations may happen in the `unfollow_user` function, too.
- It's worse if `follow_user` and `unfollow_user` overlap.
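To make the interleaving in the first bullet concrete, here is a minimal replay in plain Python. It is only a sketch: the two dictionaries and the helpers `first_half` and `second_half` are hypothetical stand-ins for the Redis keys and for the two `execute()` calls, and no Redis server is involved.

```python
# In-memory stand-ins for the Redis keys (assumption: no real server here).
following_zset = {}            # plays the role of following:userA
user_hash = {"following": 0}   # plays the role of user:userA


def first_half(other_uid, now):
    """Mimic ZADD + ZCARD from the first execute(): return the count seen then."""
    following_zset[other_uid] = now
    return len(following_zset)


def second_half(count):
    """Mimic HSET from the second execute(): store the earlier count as-is."""
    user_hash["following"] = count


# follow_user(conn, userA, userB) finishes its first execute(), then is paused.
count_b = first_half("userB", 1.0)   # sees 1

# follow_user(conn, userA, userC) runs to completion in the gap.
count_c = first_half("userC", 2.0)   # sees 2
second_half(count_c)                 # user:userA following is now 2

# follow_user(conn, userA, userB) resumes and writes its stale value.
second_half(count_b)                 # user:userA following is back to 1

print(len(following_zset), user_hash["following"])  # prints: 2 1
```

The sorted set holds two members while the stored counter says 1, which is the inconsistency described above.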
Please help correct me if I am wrong. Thank you very much.
I'm glad you are enjoying the book :)
You are 100% correct that there is a race condition in both `follow_user()` and `unfollow_user()`. This was sort-of on purpose, as I intended to revisit this example in chapter 11 as an exercise for readers to turn this into a Lua script. But by the time I got to chapter 11, I realized that the return value of `ZREVRANGE` in the server-side Lua scripting makes the Lua script mostly about rewriting arguments and less about scripting, and I had already spent time in chapter 10 handling sharding, so I omitted them... leaving the chapter 8 examples with the race conditions.
The reason I wanted to push this to Lua scripting is that locking two keys sequentially can get you into a deadlock, and introducing a multi-lock without Lua was likely to be confusing. With Lua scripting, you can either write a function that acquires a variable number of locks simultaneously on an all-or-nothing basis (preventing the deadlock), or you can just write the function directly in Lua and not need locks.
But here we are with race conditions. Thankfully, the particular race condition is not that bad - you may get incorrect followers/following counts, but it is unlikely to be substantially different from reality unless you have users that are seeing a lot of follow/unfollow activity. Further, this particular error doesn't necessarily get worse over time, as we take the exact count of followers/following and assign it every time through the function. This should leave us with a race condition window of a Redis round trip.
But on the upside, there is a solution that does stay reasonably consistent with our later updates to `follow_user()` and `unfollow_user()` in section 10.3.3 in chapter 10 - use the result of the `ZADD`/`ZREM` calls as part of an `HINCRBY` call instead of `HSET`. I'll update the `follow_user()` and `unfollow_user()` functions to be more like the versions in 10.3.3, and will close this bug (as well as issue #13 and PR #14) when I've got the errata ready to send to the publisher.
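For readers following along, here is a rough sketch of what the `ZADD` result plus `HINCRBY` idea could look like. It is not the errata version and not the code from section 10.3.3. It assumes the mapping form of `zadd()` used by redis-py 3.0 and later, it leaves out the home timeline update, and `FakeRedis` is a hypothetical in-memory stand-in included only so the example runs without a server.

```python
import time


def follow_user(conn, uid, other_uid):
    fkey1 = 'following:%s' % uid
    fkey2 = 'followers:%s' % other_uid
    if conn.zscore(fkey1, other_uid) is not None:
        return None
    now = time.time()
    pipeline = conn.pipeline(True)
    # ZADD returns how many members were newly added (0 if already present).
    pipeline.zadd(fkey1, {other_uid: now})
    pipeline.zadd(fkey2, {uid: now})
    following, followers = pipeline.execute()
    # Apply relative increments rather than overwriting with an earlier count.
    pipeline.hincrby('user:%s' % uid, 'following', int(following))
    pipeline.hincrby('user:%s' % other_uid, 'followers', int(followers))
    pipeline.execute()
    return True


class FakeRedis:
    """Hypothetical stand-in: queues zadd/hincrby until execute() is called."""

    def __init__(self):
        self.zsets, self.hashes, self.queued = {}, {}, []

    def zscore(self, key, member):
        return self.zsets.get(key, {}).get(member)

    def pipeline(self, transaction=True):
        return self  # the same object doubles as the pipeline

    def zadd(self, key, mapping):
        self.queued.append(lambda: self._zadd(key, mapping))

    def hincrby(self, key, field, amount=1):
        self.queued.append(lambda: self._hincrby(key, field, amount))

    def execute(self):
        results = [command() for command in self.queued]
        self.queued = []
        return results

    def _zadd(self, key, mapping):
        zset = self.zsets.setdefault(key, {})
        added = sum(1 for member in mapping if member not in zset)
        zset.update(mapping)
        return added

    def _hincrby(self, key, field, amount):
        fields = self.hashes.setdefault(key, {})
        fields[field] = fields.get(field, 0) + amount
        return fields[field]


conn = FakeRedis()
follow_user(conn, 'userA', 'userB')
follow_user(conn, 'userA', 'userC')
repeat = follow_user(conn, 'userA', 'userB')    # already following
print(conn.hashes['user:userA']['following'])   # prints: 2
```

Because each call adds only what its own `ZADD` actually inserted, two overlapping calls cannot overwrite each other's counts, and a duplicate follow that slips past the `zscore()` check contributes an increment of 0.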
Thank you for the quick and detailed response.
I will read Chapter 10 and 11.