Some PDFs can't be read
tylerdq opened this issue · comments
Tyler Quiring commented
PyPDF2 (the library pdfda relies on to read PDFs) cannot parse certain PDFs, and there doesn't seem to be a way to predict which files will fail.
Refer to: https://stackoverflow.com/questions/30272269/python-text-extraction-does-not-work-on-some-pdfs
This may be a font issue, in which case it might be fixable within PyPDF2. Otherwise it may require switching to a different PDF-reading library, if a suitable one exists.
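
Until the cause is known, one way to at least detect affected files is to check which pages come back with no extracted text. The snippet below is only a rough sketch, not code from pdfda: the function names `find_unreadable_pages` and `extract_page_texts` and the `min_chars` threshold are made up for illustration, and it assumes a PyPDF2 release that provides `PdfReader` and `extract_text()` (older releases use `PdfFileReader` and `extractText()` instead).

```python
from typing import Iterable, List, Optional


def find_unreadable_pages(
    page_texts: Iterable[Optional[str]], min_chars: int = 1
) -> List[int]:
    """Return 0-based indices of pages whose extracted text has fewer
    than min_chars non-whitespace-trimmed characters.

    Hypothetical helper: an empty result does not prove the text is
    correct, only that something was extracted from every page.
    """
    return [
        index
        for index, text in enumerate(page_texts)
        if len((text or "").strip()) < min_chars
    ]


def extract_page_texts(path: str) -> List[str]:
    """Extract text page by page with PyPDF2.

    Assumes a PyPDF2 version that exposes PdfReader; the import is done
    here so the helper above can be used without PyPDF2 installed.
    """
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    # Normalize a missing result to an empty string for the check above.
    return [page.extract_text() or "" for page in reader.pages]
```

Calling `find_unreadable_pages(extract_page_texts("example.pdf"))` (with `example.pdf` standing in for any local file) would list the pages PyPDF2 returned nothing for, which could be used to warn the user instead of silently analyzing empty text. It would not catch pages where the extracted text is present but garbled.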