documentcloud / cloud-crowd

Parallel Processing for the Rest of Us

Home Page: https://github.com/documentcloud/cloud-crowd/wiki

SQL error during execution

jrobhsi opened this issue

!! Unexpected error while processing request: Mysql::Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ')) LIMIT 25' at line 1: UPDATE work_units SET reservation = 1343908200 WHERE (reservation is null and node_record_id is null and status in (1,4,5) and action in ()) LIMIT 25

We're trying to trace the cause of the above error, and we're wondering if anyone else has seen/diagnosed/fixed that error before.

Certainly -- there should be a better error message for it on master ... but it means that you're running a Node that doesn't have any configured actions to run.

Looking at the node record in MySQL, I see the following, with the two actions we have set up (sorry for the formatting):

mysql> select * from node_records;
+----+--------------------+------------+------+----------------------------+------+------+-------------+---------------------+---------------------+
| id | host | ip_address | port | enabled_actions | busy | tag | max_workers | created_at | updated_at |
+----+--------------------+------------+------+----------------------------+------+------+-------------+---------------------+---------------------+
| 1 | hsi-longbeard:9063 | 127.0.0.1 | 9063 | rotate_images,process_pdfs | 0 | | 1 | 2012-08-01 16:51:39 | 2012-08-01 17:56:43 |
+----+--------------------+------------+------+----------------------------+------+------+-------------+---------------------+---------------------+
1 row in set (0.00 sec)

Well, then perhaps it's the server that is unable to locate the actions. The particular error in the SQL here is:

and action in ()
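To make the failure mode concrete, here is a minimal Ruby sketch of how an empty list of actions ends up as `action in ()`, and how a guard could avoid sending that statement at all. The helper name `reservation_sql` and its parameters are hypothetical and written only for illustration; this is not the code in the gem, just a reconstruction of the statement shown in the error message.

```ruby
# Hypothetical helper (NOT CloudCrowd's actual code): rebuilds the UPDATE
# statement from the error message for a given list of action names.
def reservation_sql(reservation, actions, limit = 25)
  # An empty list would be rendered as "action in ()", which MySQL rejects
  # as a syntax error, so return nil and let the caller skip the query.
  return nil if actions.empty?

  # Quote each action name, doubling any embedded single quotes.
  quoted = actions.map { |a| "'#{a.gsub("'", "''")}'" }.join(',')

  "UPDATE work_units SET reservation = #{Integer(reservation)} " \
    "WHERE (reservation is null and node_record_id is null " \
    "and status in (1,4,5) and action in (#{quoted})) LIMIT #{Integer(limit)}"
end

# With no actions, nothing is generated instead of invalid SQL.
p reservation_sql(1343908200, [])

# With the two actions from the node record, the IN list is populated.
puts reservation_sql(1343908200, %w[rotate_images process_pdfs])
```

If the real query builder receives an empty array at this point, the thing to track down is why the list of actions is empty when the statement is built, rather than the SQL itself.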

We had traced it back to the line in the gem that is responsible for creating that piece of SQL; we just aren't seeing why it's being generated like that. This information is coming from development, so our server and node are running on the same machine, and there is only one instance of each running. In addition, some parts of the job will complete, while other parts will sit idle for a while (yesterday I saw workers hang for about 5 minutes before beginning processing; I have half a job that has been waiting since yesterday to kick off).

Our hypothesis was that the call to available nodes was returning an empty set from NodeRecord because we had surpassed our max worker count, but we 1) ran only one job which spawns 4 workers with the default max_workers of 5 (which occasionally caused the question I posted yesterday), and 2) increased the max_workers to 100, running the same job.
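The hypothesis above can be sketched in a few lines of Ruby. Everything here is an assumption made for illustration: the `NodeRecord` struct is a stand-in for a row of node_records (not the gem's model), `available_actions` is a hypothetical helper, and whether the server actually derives the action list this way is exactly what is unconfirmed. The sketch only shows that if the set of available nodes is empty, the list of actions derived from it is empty as well.

```ruby
# Hypothetical stand-in for a row of node_records; not the gem's model.
# busy is modelled as a boolean here (the table shows it as 0/1).
NodeRecord = Struct.new(:host, :enabled_actions, :busy)

# Collect the distinct actions offered by nodes that are not busy.
# enabled_actions is a comma-separated string, as in the table above.
def available_actions(nodes)
  nodes.reject(&:busy)
       .flat_map { |n| n.enabled_actions.to_s.split(',') }
       .map(&:strip)
       .uniq
end

node = NodeRecord.new('hsi-longbeard:9063', 'rotate_images,process_pdfs', false)
p available_actions([node])  # => ["rotate_images", "process_pdfs"]

# Once the only node counts as busy, no actions are left to query for.
node.busy = true
p available_actions([node])  # => []
```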

Thanks for the pointers. I may have to just introduce another piece of the system in order to keep these more time-sensitive jobs flowing until I can nail this down!