timeout option

Question

timeout option

DanielRuf opened this issue 4 years ago · comments

Hi,

pdfx is very helpful for us to analyze a few things. Thanks for creating pdfx.

But we have a small problem. When a pdf file contains much text pdfx / python only fails after the "too many recursions" error is thrown.

It would be helpful to have a max-timeout option to prevent that pdfx tries to parse files for 45 minutes and more (in our case).

And another small question: how could we scan / check many files at once in the best way? So far we run single pdfx commands from a bash script and wait until every command has finished. Using the & trick would cause some issues with the job scheduler of the OS and that the whole OS freezes.

Chris Hager · Answer 1 · Tue Apr 13 2021 03:00:26 GMT+0800 (China Standard Time)

Could you post the full stack trace, and perhaps an example PDF? Please reopen the issue with those, thanks 🙏

Daniel Ruf · Answer 2 · Tue Apr 13 2021 04:39:01 GMT+0800 (China Standard Time)

Please reopen the issue with those, thanks

Only you can reopen the issue ;-)

Here is an example file:

54013162437.pdf

This stacktrace is produced:

pdfx.log