documentcloud / cloud-crowd

Parallel Processing for the Rest of Us

Home Page: https://github.com/documentcloud/cloud-crowd/wiki

With a large number of jobs/work_units the status command begins to take a very long time

jgeiger opened this issue

I've got 1900+ jobs (which eventually will be a lot more) and the status command is taking 10 seconds to run. I tried to fix this by increasing the time between polls, but it seems that the server is blocking and taking a long time to complete the request. I'm planning on looking into why, but the removal of automatic polling/updates should help the situation.

In our setup, we run a small number of very expensive jobs, so this hasn't been a problem for us. I'd love to help you out this week, but we're launching DocumentCloud on Wednesday, so things are rather busy. My apologies in advance.

The control panel needs to stop displaying each individual job, as it currently does, and query the server only for aggregate count information -- enough to draw the graphs, and perhaps show some statistics where the job list is now. I'm not going to be able to tackle it this week, but if you'd like to give it a shot, it seems like you already know where to look.

In server.rb in the /status block, you'll want to remove the JSON serialization of Job.incomplete -- which is what's taking so long. Instead, return Job.incomplete.count, and perhaps add a couple more bits of useful information.
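To illustrate the difference, here is a rough sketch in plain Ruby, not the actual server.rb code. It assumes a hypothetical in-memory `Job` struct standing in for the real model, and the names `full_status`, `light_status`, and the `pending_work_units` field are made up for the example. The point is only that the first payload grows with the number of incomplete jobs, while the second stays a fixed size.

```ruby
require 'json'

# Hypothetical stand-in for CloudCrowd's Job model (assumption: the real
# model is database-backed; this struct only mimics a few fields).
Job = Struct.new(:id, :status, :work_units)

# Heavy approach: serialize every incomplete job, so the response
# grows linearly with the size of the queue.
def full_status(jobs)
  incomplete = jobs.reject { |job| job.status == :complete }
  rows = incomplete.map do |job|
    { 'id' => job.id, 'status' => job.status.to_s, 'work_units' => job.work_units }
  end
  { 'jobs' => rows }.to_json
end

# Light approach: return aggregate counts only, enough to draw graphs.
# 'pending_work_units' is an illustrative extra statistic, not a real field.
def light_status(jobs)
  incomplete = jobs.reject { |job| job.status == :complete }
  {
    'incomplete_jobs'    => incomplete.size,
    'pending_work_units' => incomplete.sum(&:work_units)
  }.to_json
end

# 1900 fake jobs: even ids are still processing, odd ids are complete.
jobs = (1..1900).map { |i| Job.new(i, i.even? ? :processing : :complete, 3) }

puts "full payload bytes:  #{full_status(jobs).bytesize}"
puts "light payload bytes: #{light_status(jobs).bytesize}"
```

In the real server the count would presumably come from the database (as with the Job.incomplete.count suggested above) rather than from filtering an array in memory, so the saving covers both the query and the serialization.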

I don't need it solved now. It's something I need to look into as well. I just wanted to get a record of it posted.

Based on my needs, I may have to find a different solution. While the control and ability to see what's happening is really nice, it's also causing issues with speed and blocking. As you said, it was built for your 'small number of expensive' jobs, whereas I have a large number of very fast jobs.

I've now removed the full job display on master. The next (0.4.0) version of CloudCrowd will have an Admin UI that looks more like this:

http://i.imgur.com/p9MdN.png

Since the /status hit is now very light, I've increased the polling speed back to 6 seconds.