RHInception / re-worker

Release Engine - Worker

[hardening] workers explode if they can't write logs out

tbielawa opened this issue

Kicked off a test job a few minutes ago and this happened:

2014-05-14 17:41:57,842 - Juicer - INFO - Connection and channel open.
2014-05-14 17:41:57,842 - Juicer - INFO - Consuming on queue worker.juicer
Traceback (most recent call last):
  File "./replugin/juicer/__init__.py", line 119, in <module>
    worker.run_forever()
  File "/usr/lib/python2.6/site-packages/reworker/worker.py", line 221, in run_forever
    self._connection.ioloop.start()
  File "/usr/lib/python2.6/site-packages/pika/adapters/select_connection.py", line 136, in start
    self.poller.start()
  File "/usr/lib/python2.6/site-packages/pika/adapters/select_connection.py", line 424, in start
    self.poll()
  File "/usr/lib/python2.6/site-packages/pika/adapters/select_connection.py", line 479, in poll
    self._handler(fileno, event, write_only=write_only)
  File "/usr/lib/python2.6/site-packages/pika/adapters/base_connection.py", line 302, in _handle_events
    self._handle_read()
  File "/usr/lib/python2.6/site-packages/pika/adapters/base_connection.py", line 326, in _handle_read
    self._on_data_available(data)
  File "/usr/lib/python2.6/site-packages/pika/connection.py", line 1271, in _on_data_available
    self._process_frame(frame_value)
  File "/usr/lib/python2.6/site-packages/pika/connection.py", line 1351, in _process_frame
    self._deliver_frame_to_channel(frame_value)
  File "/usr/lib/python2.6/site-packages/pika/connection.py", line 963, in _deliver_frame_to_channel
    return self._channels[value.channel_number]._handle_content_frame(value)
  File "/usr/lib/python2.6/site-packages/pika/channel.py", line 791, in _handle_content_frame
    self._on_deliver(*response)
  File "/usr/lib/python2.6/site-packages/pika/channel.py", line 886, in _on_deliver
    body)
  File "/usr/lib/python2.6/site-packages/reworker/worker.py", line 172, in _process
    self._output_dir, corr_id + ".log"]))
  File "/usr/lib64/python2.6/logging/__init__.py", line 827, in __init__
    StreamHandler.__init__(self, self._open())
  File "/usr/lib64/python2.6/logging/__init__.py", line 846, in _open
    stream = open(self.baseFilename, self.mode)
IOError: [Errno 13] Permission denied: '/tmp/logs/ID_HERE.log'

I usually view exceptions like this as unacceptable and try to catch them, but I think we should discuss what to do about this together.

Since we are already logging everything that passes through the queue anyway, maybe we should catch this kind of error so that an entire release isn't derailed just because we couldn't write one job's log file.
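Here is a rough sketch of what catching it could look like. This is not the current re-worker code: `make_job_log_handler` is a hypothetical helper, its `output_dir` and `corr_id` arguments just mirror the names visible in the traceback, and falling back to stderr is only one possible policy.

```python
import logging
import os
import sys


def make_job_log_handler(output_dir, corr_id, fallback_stream=None):
    """Return a handler for one job's log without raising on I/O failure.

    Hypothetical helper, not part of re-worker. `output_dir` and `corr_id`
    mirror the names seen in the traceback.
    """
    path = os.path.join(output_dir, corr_id + ".log")
    try:
        # FileHandler opens the file right away, which is where the
        # IOError in the traceback comes from.
        return logging.FileHandler(path)
    except (IOError, OSError) as err:
        # Could not open the per-job log (e.g. Errno 13). Fall back to a
        # stream handler so the job keeps running and the failure is visible.
        if fallback_stream is None:
            fallback_stream = sys.stderr
        logging.getLogger(__name__).warning(
            "Could not open %s (%s); using fallback stream instead",
            path, err)
        return logging.StreamHandler(fallback_stream)
```

With something like this, `_process` would attach whatever handler comes back instead of constructing `logging.FileHandler` directly, so a bad log directory degrades logging for that job rather than killing the worker's ioloop.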

Thoughts?