0xabu / pdfannots

Extracts and formats text annotations from a PDF file

Is it possible to detect shapes like Lines and Arrows in pdf

balandongiv opened this issue

commented

Great work, really appreciate it.

Just curious: is it possible to detect shapes such as rectangles, lines, and arrows using this code, or could that be added as a future improvement?

Thanks. I think those probably show up in the annotations, yes -- we filter for only a few types of annotations:

ANNOT_SUBTYPES = frozenset({'Text', 'Highlight', 'Squiggly', 'StrikeOut', 'Underline'})

... but I'm not sure how you would usefully represent those in a text output format?

commented

Thanks for pointing out the code snippet. Apparently, no text output is generated even when we draw an arrow line below some text.
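
For illustration, here is a minimal sketch of one way shape annotations could be rendered as text. It is not pdfannots code: `SHAPE_SUBTYPES`, `ARROW_ENDINGS`, and `describe_shape` are hypothetical names, and the annotation is assumed to be a plain dict that uses the PDF annotation dictionary keys `Subtype`, `Rect`, `L` (line endpoints), and `LE` (line ending styles). In the PDF specification an arrow is a `Line` annotation with an arrow-style line ending, which is why the filter quoted above produces no output for it.

```python
# Subtypes quoted in the thread (text-markup annotations).
ANNOT_SUBTYPES = frozenset({'Text', 'Highlight', 'Squiggly', 'StrikeOut', 'Underline'})

# Shape subtypes from the PDF specification. Handling them is an assumption
# made for this sketch, not existing pdfannots behaviour.
SHAPE_SUBTYPES = frozenset({'Line', 'Square', 'Circle', 'Polygon', 'PolyLine'})

# Line ending styles that draw an arrowhead.
ARROW_ENDINGS = frozenset({'OpenArrow', 'ClosedArrow', 'ROpenArrow', 'RClosedArrow'})


def describe_shape(annot):
    """Return a one-line text description of a shape annotation.

    `annot` is a hypothetical plain dict keyed like a PDF annotation
    dictionary. Returns None for anything that is not a shape.
    """
    subtype = annot.get('Subtype')
    if subtype not in SHAPE_SUBTYPES:
        return None
    if subtype == 'Line':
        # /L holds the two endpoints; /LE holds the style of each ending.
        x1, y1, x2, y2 = annot['L']
        endings = annot.get('LE', ('None', 'None'))
        kind = 'Arrow' if any(e in ARROW_ENDINGS for e in endings) else 'Line'
        return f"{kind} from ({x1:g}, {y1:g}) to ({x2:g}, {y2:g})"
    # Other shapes are summarised by their bounding rectangle.
    x0, y0, x1, y1 = annot['Rect']
    return f"{subtype} in box ({x0:g}, {y0:g}, {x1:g}, {y1:g})"


arrow = {'Subtype': 'Line', 'L': [10, 20, 110, 20], 'LE': ['None', 'OpenArrow']}
print(describe_shape(arrow))
```

A description like this only reports coordinates. Relating an arrow to the text it sits under would need an extra step that compares those coordinates with the positions of the extracted text, which this sketch does not attempt.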