ruffus sometimes throws exceptions in RethrownJobError
jbarlow83 opened this issue · comments
It appears that in some error paths the arguments of a RethrownJobError are set to a list of five strings, rather than the expected list of tuples of five strings. That causes the exception below:
--- Logging error ---
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/logging/__init__.py", line 980, in emit
msg = self.format(record)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/logging/__init__.py", line 830, in format
return fmt.format(record)
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/logging/__init__.py", line 567, in format
record.message = record.getMessage()
File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/logging/__init__.py", line 328, in getMessage
msg = str(self.msg)
File "/Users/jb/Documents/src/OCRmyPDF-dev/venv-3.5/lib/python3.5/site-packages/ruffus/ruffus_exceptions.py", line 127, in __str__
message += self.get_nth_exception_str (ii)
File "/Users/jb/Documents/src/OCRmyPDF-dev/venv-3.5/lib/python3.5/site-packages/ruffus/ruffus_exceptions.py", line 116, in get_nth_exception_str
task_name, job_name, exception_name, exception_value, exception_stack = self.args[nn]
ValueError: too many values to unpack (expected 5)
Here len(self.args) == 5 and self.args = ['task_name', 'job_name', ...], so that self.args[nn] == self.args[0] == 'task_name'. Unpacking that single string into five names causes the ValueError.
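The failure mode described above can be reproduced in isolation. This is only a minimal sketch: the five placeholder strings stand in for the real values, and a plain Exception is used instead of ruffus's RethrownJobError, so it illustrates the unpacking error rather than the actual ruffus code path.

```python
# Placeholder values standing in for one job's error details (assumed, not real data).
five_strings = ["task_name", "job_name", "exception_name",
                "exception_value", "exception_stack"]

err = Exception()
# Assigning the flat list stores five separate strings in args,
# not one entry that holds five strings.
err.args = five_strings

try:
    # Mirrors the unpacking in get_nth_exception_str: args[0] is the
    # string "task_name", so Python tries to unpack its 9 characters.
    task_name, job_name, exception_name, exception_value, exception_stack = err.args[0]
except ValueError as exc:
    message = str(exc)
```

With the arguments stored this way, the unpacking operates on the characters of the first string rather than on a five-item entry.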
Thanks for flagging this. This is indeed a serious problem. The last thing you need in the middle of throwing is the library itself messing up and losing all your errors.
Is this a reproducible problem? I am afraid that, just eyeballing the code, I can't immediately see the offending bug. I see that I use a tuple of 5 arguments in some places and a list of 5 arguments in others (my bad), and that RethrownJobError.append tries to concatenate two tuples rather than extend a list. However, I can't see where I am missing a set of parentheses.
Some help needed :(
I also looked at the code before submitting and couldn't see anything obvious. I did some runtime tests and figured it out: Exception.args is a property, not a plain variable, and the property setter converts anything assigned to Exception.args into a tuple.
In [1]: ex = Exception()
In [2]: ex.args
Out[2]: ()
In [3]: ex.args = ['list', 'of', 'things']
In [4]: ex.args
Out[4]: ('list', 'of', 'things')
In [5]: Exception.args
Out[5]: <attribute 'args' of 'BaseException' objects>
You can't do self.args = tuple(list(job_exceptions)) because tuple() will iterate through the list. The syntax self.args = (list(job_exceptions),) does work, along with self.args[0].append(job_exception) to append to the list.
But perhaps it's best to create a different variable to track the list of exceptions, one that isn't managed by a base class.
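Both options can be sketched side by side. The class names WrappedArgsError and SeparateListError and the attribute job_exceptions are hypothetical, chosen for illustration; neither class is the actual ruffus implementation or its eventual fix.

```python
class WrappedArgsError(Exception):
    """Option 1: keep the list as the single element of args."""

    def __init__(self, job_exceptions=()):
        super().__init__()
        # The one-element tuple survives the args setter unchanged,
        # so args[0] is still a mutable list.
        self.args = (list(job_exceptions),)

    def append(self, job_exception):
        self.args[0].append(job_exception)


class SeparateListError(Exception):
    """Option 2: track the entries in an attribute the base class does not manage."""

    def __init__(self, job_exceptions=()):
        super().__init__()
        self.job_exceptions = list(job_exceptions)

    def append(self, job_exception):
        self.job_exceptions.append(job_exception)


# One five-item entry, with placeholder values.
entry = ("task_name", "job_name", "exception_name",
         "exception_value", "exception_stack")

wrapped = WrappedArgsError([entry])
wrapped.append(entry)

separate = SeparateListError()
separate.append(entry)

# Unpacking now sees a five-item tuple, not the characters of a string.
task_name, job_name, exception_name, exception_value, exception_stack = wrapped.args[0][0]
```

The second option avoids relying on how BaseException stores args, at the cost of having to format the entries explicitly in __str__.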
I can consistently reproduce it in my program (https://github.com/jbarlow83/OCRmyPDF) with a certain file as input that causes an unrelated exception. I'm not sure why this rather ordinary looking AttributeError causes trouble for ruffus while other exceptions in my test suite don't.
@jbarlow83 I'm consistently getting the RethrownJobError in the latest version (4.0.7) on debian:stretch. I can't seem to get any OCR to function. Is there anything I can pitch in on in fixing this?
@chriscohoat A possible workaround is here: ocrmypdf/OCRmyPDF#61. If that doesn't do it in your case I will investigate further.
Sorry about that.
Will try and get a patched release out Monday or Tuesday.
Given that someone else has done all the heavy lifting tracking down the
bug (thanks!), this should be relatively straightforward.
Thanks
Leo
I have a patched release where RethrownJobError no longer inherits its
implementation details from Exception.
However, I am very loath to release any new code without a unit test.
If you have a better idea of what caused the error, would it be possible to
create a minimal test case? I am still having serious difficulty
understanding what triggers this bug: Python 3.4?
Threading problems? etc.
Leo
Dr. Leo Goodstadt
University of Oxford
United Kingdom