malthe / pq

A PostgreSQL job queueing system

Getting from an empty queue slower than checking its length

suchow opened this issue

Thanks for creating pq, I've found it very useful in my work.

In my application, I observed that calling get() on an empty queue took over 1 second, whereas checking the length of an empty queue was fast. My queue table had roughly 100 to 1000 rows for this test. Replacing every instance of:

item = queue.get()

with

if len(queue):
    item = queue.get()

improved my response times dramatically.

I'm not sure if this behavior is specific to my setup or a general issue, but I thought I would bring it to your attention in case it's helpful.

Thanks again for the excellent project.

commented

@suchow this is the expected behaviour.

Please take a look at the docs for the full explanation:
https://github.com/malthe/pq#methods

But the basic idea is that:

Items are pulled out of the queue using get(block=True). The default behaviour is to block until an item is available with a default timeout of one second after which a value of None is returned.

By checking len() first, you only call get() when the queue is not empty, so you skip the timeout enforced by the listen/notify functionality. I would not be surprised if your CPU is hitting 100%, since your approach turns into a busy loop whenever the queue stays empty (assuming, of course, that you call get() inside a loop).
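To make the difference concrete, here is a minimal sketch of the behaviour described above. It does not use pq or PostgreSQL at all: StandInQueue is a hypothetical class built on the standard library's queue.Queue, and it only imitates "block up to a timeout, then return None". The 0.2 second timeout is chosen just to keep the demo short.

```python
import queue as stdlib_queue
import time


class StandInQueue:
    """Hypothetical stand-in that imitates the get()/len() behaviour above.

    This is NOT pq's implementation; it wraps the stdlib queue.Queue.
    """

    def __init__(self):
        self._items = stdlib_queue.Queue()

    def put(self, item):
        self._items.put(item)

    def get(self, block=True, timeout=1.0):
        # Wait for up to `timeout` seconds, then give up and return None.
        try:
            return self._items.get(block=block, timeout=timeout)
        except stdlib_queue.Empty:
            return None

    def __len__(self):
        return self._items.qsize()


def timed(fn):
    """Run fn() and return (result, elapsed_seconds)."""
    start = time.monotonic()
    result = fn()
    return result, time.monotonic() - start


q = StandInQueue()
item, get_seconds = timed(lambda: q.get(timeout=0.2))  # waits out the timeout
length, len_seconds = timed(lambda: len(q))            # returns immediately
print(item, round(get_seconds, 2), length, round(len_seconds, 4))
```

On an empty queue the get() call costs the full timeout, while len() costs almost nothing, which matches what was reported. The flip side is that a loop built only on len() never waits, so it needs its own sleep to avoid spinning.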

Could you post a code example so that I can understand and reproduce the issue? Just in case I misunderstood your use case.

Thanks, your explanation (and the documentation, now that I've put two and two together) perfectly explains what I observed.

To answer your question, I have a function that must check several queues to see if any of them have items that can be taken off, returning the first found item across the queues. And so I have a loop over the relevant queues:

item = None  # stays None if every queue is empty
for queue in queues_to_check:
    if len(queue):
        item = queue.get()
        if item:
            break

I suppose I could replace that with:

for queue in queues_to_check:
    item = queue.get(timeout=TIMEOUT)
    if item:
        break

where TIMEOUT is some time << 1 second.
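For what it's worth, here is a rough sketch of that second loop wrapped in a function. The FakeQueue class, the first_available name, and the 0.05 second TIMEOUT are all made up for illustration. FakeQueue just sleeps to imitate a blocking get() on an empty queue, so this runs without pq or a database.

```python
import time

TIMEOUT = 0.05  # hypothetical value, well under the one-second default


def first_available(queues_to_check, timeout=TIMEOUT):
    """Return the first item found across the queues, or None if all are empty."""
    for queue in queues_to_check:
        item = queue.get(timeout=timeout)
        if item is not None:
            return item
    return None


class FakeQueue:
    """Hypothetical stand-in for a queue; sleeps to imitate a blocking get()."""

    def __init__(self, items=()):
        self.items = list(items)

    def get(self, block=True, timeout=1.0):
        if self.items:
            return self.items.pop(0)
        time.sleep(timeout)  # imitate waiting out the timeout when empty
        return None


queues_to_check = [FakeQueue(), FakeQueue(), FakeQueue(["job-1"])]
start = time.monotonic()
found = first_available(queues_to_check)
elapsed = time.monotonic() - start
print(found, round(elapsed, 2))
```

The cost is easy to reason about: each empty queue checked before the hit adds one TIMEOUT, so the worst case is roughly the number of queues times TIMEOUT.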

commented

Please take a look at the tasks API, it's a simple and good starting point:
https://github.com/malthe/pq#tasks

Basically, if you wrap the first example in a function and feed it to a thread/process, your job is done. Pass the queue name as the argument of this function and it should solve your problem. Just play a bit with it :)
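As a rough illustration of the "one worker per queue" idea, here is a sketch that uses only the Python standard library. The queue names, the QUEUES dict, and the worker function are hypothetical, and queue.Queue stands in for a pq queue. With pq itself, the worker body would be replaced by whatever the first example in the tasks docs does for the given queue name.

```python
import queue
import threading

# Hypothetical stand-ins: in a real setup each name would map to a pq queue.
QUEUES = {name: queue.Queue() for name in ("emails", "reports")}
results = queue.Queue()
stop = threading.Event()


def worker(queue_name):
    """Block on one queue only, so an empty queue never delays the others."""
    source = QUEUES[queue_name]
    while not stop.is_set():
        try:
            item = source.get(timeout=0.1)
        except queue.Empty:
            continue  # nothing yet; check the stop flag and wait again
        results.put((queue_name, item))


threads = [
    threading.Thread(target=worker, args=(name,), daemon=True)
    for name in QUEUES
]
for thread in threads:
    thread.start()

QUEUES["reports"].put("quarterly")
first = results.get(timeout=2)  # first item found across all queues
print(first)

stop.set()
for thread in threads:
    thread.join()
```

Because each thread waits on its own queue, the blocking timeout stops being a cost: an empty queue just keeps its worker idle while the others carry on.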