tobymao / saq

Simple Async Queues

Home Page: https://saq-py.readthedocs.io/en/latest/


Why are you using a Semaphore instead of a BlockingConnectionPool?

paul-finary opened this issue · comments

Hi,

I migrated from ARQ to SAQ (again: great job with this library, I find it way cleaner), and I faced some issues while migrating from aioredis 1.3.1 (used by ARQ) to redis-py 4.2.0 (used by SAQ) in other parts of my project.

During my investigation, I found out that you implemented a way to throttle the number of connections to Redis using a semaphore (_op_sem) instead of redis-py's BlockingConnectionPool (which, with the max_connections set to your max_concurrent_ops and timeout set to None, would have the same behaviour: wait for a connection to be available before executing a command).

So I wonder: is there a reason you chose to implement what seems to be your own BlockingConnectionPool instead of using redis-py's?
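
For reference, this is roughly the configuration I have in mind (a sketch using redis-py's async client; the URL and limit are placeholders, not SAQ settings):

```python
from redis.asyncio import BlockingConnectionPool, Redis

pool = BlockingConnectionPool.from_url(
    "redis://localhost:6379",
    max_connections=20,  # analogous to max_concurrent_ops
    timeout=None,        # wait indefinitely for a free connection instead of raising
)
redis = Redis(connection_pool=pool)
```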

commented

We actually were using the BlockingConnectionPool at one point: #11 (comment)

tldr - we switched to semaphores to make it harder for applications to configure Redis/SAQ in a way that can introduce deadlocks.

The SAQ worker doesn't poll - it uses brpoplpush, which consumes a connection. That means workers can quickly consume lots of connections, just waiting for jobs.
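
In redis-py terms, the blocking dequeue looks roughly like this (key names are illustrative, not SAQ's actual keys); the call holds a pool connection the entire time it waits:

```python
from redis.asyncio import Redis

redis = Redis.from_url("redis://localhost:6379")

async def dequeue_forever() -> None:
    while True:
        # Blocks until a job id arrives, occupying one connection the whole time.
        job_id = await redis.brpoplpush("queue:pending", "queue:active", timeout=0)
        ...  # hand job_id off to a worker task
```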

This gets complex when you consider subjobs - that is, jobs that enqueue and wait for more jobs (an example). If the workers have sucked up all the connections, there won't be any left to enqueue a subjob. And if you're using a BlockingConnectionPool, you'll just be deadlocked, waiting to enqueue.
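
A minimal sketch of that pattern, using SAQ's Queue.apply to enqueue a child job and wait for its result (the task names and URL here are made up):

```python
from saq import Queue

queue = Queue.from_url("redis://localhost:6379")

async def child(ctx, *, n):
    return n * 2

async def parent(ctx, *, n):
    # The parent keeps its worker slot while it enqueues the child and waits
    # for the result. If the pool were capped at the worker concurrency,
    # there would be no connection left for this, and it would deadlock.
    return await queue.apply("child", n=n)
```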

One solution is to make sure the pool has plenty of connection headroom above the worker concurrency. At the time, I think we deemed that hard to configure/enforce. You could easily set worker concurrency at or below the pool's max connections and introduce a deadlock possibility.

Instead, we have two semaphores: one for worker concurrency, and one for enqueuing/listening for subjobs. As long as there are enough connections in the Redis pool to go around for both (which is the case by default, since the default ConnectionPool is unbounded, essentially limited only by the OS/Redis itself), there won't be any problems. We figured this was harder for applications to configure incorrectly.
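
In spirit, it looks something like this (an illustrative sketch of the two-semaphore idea, not SAQ's actual internals; names and limits are made up):

```python
import asyncio
from redis.asyncio import Redis

class Throttled:
    """Illustrative sketch of the two-semaphore idea, not SAQ's internals."""

    def __init__(self, redis: Redis, max_concurrent_ops: int = 20) -> None:
        self.redis = redis
        # One semaphore caps ordinary worker operations; the other caps subjob
        # enqueue/listen traffic, so neither can starve the other as long as
        # the underlying connection pool is unbounded.
        self._op_sem = asyncio.Semaphore(max_concurrent_ops)
        self._enqueue_sem = asyncio.Semaphore(max_concurrent_ops)

    async def process(self, key: str):
        async with self._op_sem:
            return await self.redis.get(key)

    async def enqueue(self, key: str, value: str):
        async with self._enqueue_sem:
            return await self.redis.lpush(key, value)
```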

Thanks for the response @barakalon

Thanks for taking the time to answer :)