0xabu / pdfannots

Extracts and formats text annotations from a PDF file

Bug: Assertion error resulting in the abortion of extraction

chrisgrieser opened this issue

This looks like a similar case to issue #48.

Once again, I have a PDF file that is a book scan, and I get a warning about popup annotations not being supported. Instead of continuing, the annotation extraction then aborts with an AssertionError.

This time, I couldn't narrow it down to a particular page; the issue seems to occur regardless of which page I try. I have therefore attached a sample of 10 pages and the log I get.

sample.pdf

WARNING: Unsupported annotation subtype: /'Popup'
WARNING: Unsupported annotation subtype: /'Popup'
Traceback (most recent call last):
  File "/opt/homebrew/bin/pdfannots", line 8, in <module>
    sys.exit(main())
  File "/opt/homebrew/lib/python3.9/site-packages/pdfannots/cli.py", line 141, in main
    doc = process_file(
  File "/opt/homebrew/lib/python3.9/site-packages/pdfannots/__init__.py", line 472, in process_file
    page.annots.sort()
  File "/opt/homebrew/lib/python3.9/site-packages/pdfannots/types.py", line 226, in __lt__
    return self.pos < other.pos
  File "/opt/homebrew/lib/python3.9/site-packages/pdfannots/types.py", line 182, in __lt__
    assert self._pageseq != 0
AssertionError

Yes, that looks exactly like #48. Which version of pdfannots are you using? I notice you've installed it from a package (e.g. from PyPI), but the warning about Popup annotations was removed in the latest version, so I wonder whether you are missing the fix I made for that issue.

Indeed, your sample works fine for me. Please reopen if you can repro on the latest version.
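
For anyone hitting the same traceback, a quick way to confirm which version is actually installed is to ask Python's standard `importlib.metadata` module (available since Python 3.8). The following is only a minimal sketch: the helper names `installed_version` and `numeric_prefix` are made up for illustration and are not part of pdfannots, and the version comparison is deliberately naive.

```python
from importlib import metadata
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; not part of pdfannots.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def numeric_prefix(version: str) -> Tuple[int, ...]:
    """Turn '0.3' or '0.3.1' into a tuple of ints for a rough comparison.

    Stops at the first component that is not purely numeric, so
    pre-release suffixes such as 'rc1' are ignored (an assumption that
    is good enough for a quick sanity check, not for real version logic).
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


if __name__ == "__main__":
    found = installed_version("pdfannots")
    if found is None:
        print("pdfannots is not installed in this environment")
    else:
        print(f"pdfannots {found} is installed")
```

If the reported version is older than the latest release and the package was installed with pip, `pip install --upgrade pdfannots` should bring it up to date. Note that the check only reflects the Python environment it is run in, which may differ from the one the `pdfannots` command uses.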

ah yes indeed, I was still on 0.2. After updating to 0.3, everything works.

thanks and sorry for the waste of time!