Unable to retrieve results across workers

Question

Unable to retrieve results across workers

sdrabblescripta opened this issue 5 months ago · comments

I have two queues although only one is actively processing tasks, call them A and B. Each is deployed in an individual docker deployment consisting of an app container and a celery worker. Both queues share one rabbit MQ running in its own container.

I can successfully start tasks in B.celery from A.app. I can watch the task in B's log and see it completes. In B.app I am able to retrieve the task's state and result.

If, however, I try to retrieve the task's result/ state in A.app, those are always empty and PENDING, and get() / wait() just hang.

This is problematic because A is where, for the most part, all tasks are started, and A needs to send the same task to each of B, C, D, ... then wait for all tasks to complete before moving on. I can't use chained tasks or similar because the tasks all need to run in parallel.

Is what I'm doing just not possible? If not, how would I go about obtaining the results in A for all tasks run in other queues?

Eri A. · Answer 1 · Wed Jan 10 2024 01:53:27 GMT+0800 (China Standard Time)

Hi @sdrabblescripta,

Could you provide environment/OS configuration to help replicate the issue you are facing?

sdrabblescripta · Answer 2 · Wed Jan 10 2024 04:12:18 GMT+0800 (China Standard Time)

Python: 3.8
celery: 5.3.4 (emerald-rush)
d-c-r: 2.4.0

Please let me know what else you need.

sdrabblescripta · Answer 3 · Fri Jan 19 2024 03:13:06 GMT+0800 (China Standard Time)

@50-Course any update?

Eri A. · Answer 4 · Fri Jan 19 2024 20:02:51 GMT+0800 (China Standard Time)

Hi @sdrabblescripta,

Unfortunately no. It's been a stretched week for me. However, I should be able to take a look at this over the weekend.

sdrabblescripta · Answer 5 · Fri Jan 19 2024 23:32:50 GMT+0800 (China Standard Time)

No worries, any attention you can give the matter would be great!

Eri A. · Answer 6 · Sun Jan 21 2024 03:44:29 GMT+0800 (China Standard Time)

@sdrabblescripta, Could you please provide a simplified information about your architecture? I would like to clarify if:

You do have two separate Django projects with their own Celery workers (queues A and B) and and a single shared RabbitMQ instance
Or could it be a single django application instance, with two celery clusters (A, B) and a single shared MQ instance?

Eri A. · Answer 7 · Sun Jan 21 2024 04:09:57 GMT+0800 (China Standard Time)

That aside, chained tasks run sequentially, respecting the order of arrangement. To run tasks in parrallel, you would have to explictly call the group function, which returns a GroupResult instance that you may call your .get() on.

The difference here is in apply_async instead of the conventional wait() method.

from celery import group

# Assuming you have celery_worker_B and celery_worker_C configured
your_task_group = group(
    your_task.s(*args, **kwargs).set(queue='B'),
    your_task.s(*args, **kwargs).set(queue='C'),
    ...
)

# Apply the group of tasks asynchronously
result_group = your_task_group.apply_async()

Again if your have tried the above and still won't work out, please provide the above requirements to help diagnose the issue.

You may track the progress here upon providing the above: https://github.com/50-Course/dj-celery-results-multiple-worker-failure

sdrabblescripta · Answer 8 · Mon Jan 22 2024 23:26:27 GMT+0800 (China Standard Time)

Hi @50-Course ,

I have two apps, A and B - they share some code but models are mostly NOT shared. There's a single A with its own celery worker, multiple Bs each with their own celery worker, and a single MQ.

A has to start a task in all B workers, which is why I thought group couldn't be used - the task is not importable in A, so I use send_task. If this is in error please let me know and I'll try the group approach.