documentcloud / cloud-crowd

Parallel Processing for the Rest of Us

Home Page: https://github.com/documentcloud/cloud-crowd/wiki

Workers waiting an arbitrary amount of time prior to execution

jrobhsi opened this issue

I've run into a problem that I can't diagnose: workers seem to wait an arbitrary amount of time before beginning execution. The job itself usually completes in milliseconds (it's a simple image rotation), but it is responsible for rotating four separate images (differently sized versions of one image), each in its own worker.

I only noticed this because our UI issues a command to CloudCrowd to begin the image rotation for the collection, and then waits to refresh the image until all of the images have been rotated. Sometimes this happens almost instantaneously, but I have waited up to 5 minutes for all of the workers for a given job to complete.

Does anyone know why worker execution could get delayed like this, or if there is any way to diagnose the problem?

Nope -- that sounds like trouble. I'd use the usual Ruby profiling tools (ruby-prof and friends) to find out where your process is spending all its time.

Just to make sure I have stated the problem correctly:

A job starts, and some of the workers start instantly, execute, and complete in milliseconds. Other workers do not even start (no friendly 'I started' message, none of the messages that I log as the first step in the call to 'process') until a few minutes have passed.

Are you thinking this is a performance issue in the action I have written, or are you suggesting that I profile the CloudCrowd server and node processes to determine what the holdup is?

The latter.
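Before reaching for a full profiler, it may help to confirm where the time goes by recording when each unit of work was handed off and when it actually began running. The sketch below is only an illustration of that idea in plain Ruby: `TimedUnit` and its methods are hypothetical names, not part of CloudCrowd, and it assumes both timestamps are taken in the same process.

```ruby
# Hypothetical helper (not a CloudCrowd API): separates the time a unit of
# work spent waiting to start from the time it spent actually running.
class TimedUnit
  attr_reader :wait_seconds, :run_seconds

  # The clock is injectable so the helper can be exercised with fixed values.
  # By default it reads the monotonic clock, which is unaffected by
  # wall-clock adjustments.
  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @queued_at = @clock.call # moment the work was handed off
  end

  # Runs the given block, recording how long it waited and how long it ran.
  # Returns whatever the block returns.
  def run
    started_at = @clock.call
    @wait_seconds = started_at - @queued_at
    result = yield
    @run_seconds = @clock.call - started_at
    result
  end
end

# Usage: create the unit when the work is queued, call run when it starts.
unit = TimedUnit.new
unit.run { :rotated } # stand-in for the real image rotation
puts format('waited %.3fs, ran %.3fs', unit.wait_seconds, unit.run_seconds)
```

If the wait dominates while the run stays in milliseconds, the delay is in dispatch rather than in the action itself. Across separate server and node processes the monotonic clock is not comparable, so wall-clock timestamps (and reasonably synchronized clocks) would be needed instead.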