Timeout error causing a locked resource

Question

amanshary opened this issue 5 years ago

Hi, we are using redlock 0.2.2 with redis 3.3.3.
I checked the release notes of redlock 1.0 and of newer redis-rb versions, and none of them seem to fix the issue (see my analysis below).

We configured a short 'redis_timeout' (0.02 seconds) in one particular flow, and since then we have seen odd behaviour from Redis:
After a Redis::TimeoutError ('rescue in _read_from_socket'), the resource appears to be already locked, even though there is no indication that any other flow tried to lock the same resource.
The problem clears only once max_lock_time has been reached.

A deeper dive into both redis-rb and Redlock led us to believe the issue lies in Redis/client.rb:123 - call(command):
We suspect the timeout occurred in the middle of the 'read', after the resource had already been locked by the 'write' command that preceded it.

We cannot fix it on our side because we do not have the 'lock handle'.
It could be fixed inside redlock by catching TimeoutError in redlock/client.rb:127 and then calling unlock, to make sure the resource does not stay locked unexpectedly.

Is there a known workaround to solve this?

Malte Rohde · Answer 1 · Wed May 08 2019 23:53:30 GMT+0800 (China Standard Time)

Hi @amanshary,

unfortunately, there's no way to tell inside Redlock whether the Timeout occurred within the command processing (the write) or the result reading. So, in my opinion, issuing an unconditional unlock after a timeout has occurred is not an ideal solution, more like cracking a nut with a sledgehammer. Also, the unlock is likely to suffer from the same timeout errors as well, isn't it? Assuming the issue is temporary congestion or other network latency problems.

There are of course various ways to solve this (one of which would be returning the lock_info regardless of whether locking succeeded, plus a boolean indicating success), but none of these seem particularly charming, to be honest. And I don't see an easy solution in redlock without breaking backwards compatibility, so your best option might be to monkey patch redlock to do what you want. Sorry.
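
For illustration only, the "lock_info plus a boolean" idea could look roughly like the sketch below. This is not redlock's API: LockAttempt, FakeTimeout and try_lock are hypothetical names, the hash keys are illustrative, and set_nx is an assumed callable that performs the actual SET and either returns true/false or raises on timeout.

```ruby
require 'securerandom'

# Hypothetical result type; redlock does not return this.
LockAttempt = Struct.new(:acquired, :lock_info)

# Stand-in for the Redis client's timeout error.
class FakeTimeout < StandardError; end

# Always returns the lock_info, together with a flag saying whether the
# lock is known to be held. `set_nx` is an assumed callable taking
# (resource, value, ttl_ms).
def try_lock(resource, ttl_ms, set_nx)
  lock_info = { resource: resource, value: SecureRandom.uuid, validity: ttl_ms }
  acquired = begin
    set_nx.call(resource, lock_info[:value], ttl_ms)
  rescue FakeTimeout
    false # outcome unknown: the key may or may not have been written
  end
  LockAttempt.new(acquired, lock_info)
end

# A callable that applies the write and then times out on the reply.
server = {}
flaky_set = lambda do |resource, value, _ttl_ms|
  server[resource] = value
  raise FakeTimeout, 'reply not received in time'
end

attempt = try_lock('resource', 2000, flaky_set)
puts attempt.acquired                                        # false
puts server['resource'] == attempt.lock_info[:value]         # true
```

Because the caller receives the lock_info even when acquired is false, it could decide for itself whether to attempt an unlock. The price is a changed return value, which is the backwards-compatibility concern mentioned above.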

I'm also wondering whether you might need a different tool, if your requirements say you need a 20ms timeout on the redis client, plus you can't accept a "wasted" lock period (i.e. when the lock is "held" by your client, but the client doesn't know about it).

Best,
malte

Malte Rohde · Answer 2 · Thu May 09 2019 00:00:01 GMT+0800 (China Standard Time)

Maybe to clarify why I'm unsure about your suggested solution of calling unlock when a timeout is raised from Redis: Unlocking in Redlock is already only a best effort approach, see here. So even when we implement lock as you suggest, you don't really get a guarantee that a lock is really either a) held by a client who knows about it, or b) free. Redlock only guarantees mutual exclusion, not 100% resource usage.

How long is your ttl, if I may ask? I would assume that it is really short, given you have 20ms redis timeouts.

Asaf Manshary · Answer 3 · Thu May 09 2019 00:02:23 GMT+0800 (China Standard Time)

Hi,
Regarding the 20ms timeout, what would a reasonable timeout be? Our idea was to base it on the 95th percentile of the connection time, in order to avoid network slowness.
As you mentioned it is indeed too low (:
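
Reading "the 95%" as the 95th percentile of observed latencies, the calculation can be sketched as follows. The percentile helper is hypothetical (nearest-rank method) and the samples are made up for the example. Note that a timeout set exactly at the 95th percentile would, by construction, cut off roughly 5% of calls if the latency distribution stays the same.

```ruby
# Nearest-rank percentile over a list of samples (hypothetical helper).
def percentile(samples, pct)
  raise ArgumentError, 'samples must not be empty' if samples.empty?

  sorted = samples.sort
  rank = (pct / 100.0 * sorted.length).ceil # 1-based rank
  sorted[[rank, 1].max - 1]
end

latencies_ms = [2, 3, 3, 4, 4, 5, 5, 6, 8, 40] # made-up samples
puts percentile(latencies_ms, 50)
puts percentile(latencies_ms, 95)
```

In this made-up data the median is far below the 95th percentile, which shows how much a single slow call can move the tail.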

We use a 2000ms ttl.
Thank you for your quick response!
Asaf

Malte Rohde · Answer 4 · Thu May 09 2019 03:31:00 GMT+0800 (China Standard Time)

I don't know, sorry. The default in redis-rb is 5 seconds, but I guess it depends on what you're trying to achieve, i.e. what latency requirements you have, and of course on your infrastructure (i.e. same host / same physical network allows for lower timeouts). But I have very little knowledge about these things, unfortunately.