[Question] How to get bounding box information for text on a page?
arifd opened this issue · comments
Hello, I'm not very familiar with the PDF spec, so perhaps this already has me at a disadvantage.
But I can't work out how to get at all the text on a page and find its coordinates.
Basically, I want to extract the positions of all the text on a page of a document.
I too would like to know if this is achievable currently, or otherwise support this as a feature request. Something similar to the PDF.js API would be great: it exposes the text content, transform, and width/height on its TextItem, amongst some other fields.
lopdf doesn't provide a method to do this, but all the information you need is inside the PDF. You will need another crate to parse the font and give you the font metrics, though. I might try to make an example later.
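
In the meantime, here is a rough sketch of the arithmetic involved once you have pulled the pieces out of the content stream and the font. It does not call lopdf at all. `Matrix`, `BBox`, and `text_bbox` are hypothetical names made up for illustration, and it assumes you already have the text matrix combined with the CTM, the font size from `Tf`, and the glyph widths, ascent, and descent in 1/1000 text space units (as for simple fonts). It ignores character spacing, word spacing, horizontal scaling, and text rise, so treat it as a starting point rather than a complete implementation.

```rust
/// A PDF-style affine matrix [a b c d e f]. It maps (x, y) to
/// (a*x + c*y + e, b*x + d*y + f).
#[derive(Clone, Copy, Debug)]
struct Matrix {
    a: f64,
    b: f64,
    c: f64,
    d: f64,
    e: f64,
    f: f64,
}

impl Matrix {
    /// Transform a point from text space into page space.
    fn apply(&self, x: f64, y: f64) -> (f64, f64) {
        (
            self.a * x + self.c * y + self.e,
            self.b * x + self.d * y + self.f,
        )
    }
}

/// Axis-aligned bounding box in page coordinates (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct BBox {
    x_min: f64,
    y_min: f64,
    x_max: f64,
    y_max: f64,
}

/// Estimate the bounding box of one run of text.
///
/// Assumptions (not taken from lopdf): `tm` is the text matrix already
/// multiplied by the CTM; `glyph_widths`, `ascent`, and `descent` are in
/// 1/1000 text space units; spacing, horizontal scaling, and rise are ignored.
fn text_bbox(
    tm: Matrix,
    font_size: f64,
    glyph_widths: &[f64],
    ascent: f64,
    descent: f64,
) -> BBox {
    // Advance width of the whole run, scaled by the font size.
    let width = glyph_widths.iter().sum::<f64>() / 1000.0 * font_size;
    let top = ascent / 1000.0 * font_size;
    let bottom = descent / 1000.0 * font_size;

    // Transform all four corners so rotated or skewed text is handled too.
    let corners = [
        tm.apply(0.0, bottom),
        tm.apply(width, bottom),
        tm.apply(0.0, top),
        tm.apply(width, top),
    ];

    let mut bbox = BBox {
        x_min: f64::INFINITY,
        y_min: f64::INFINITY,
        x_max: f64::NEG_INFINITY,
        y_max: f64::NEG_INFINITY,
    };
    for (x, y) in corners {
        bbox.x_min = bbox.x_min.min(x);
        bbox.y_min = bbox.y_min.min(y);
        bbox.x_max = bbox.x_max.max(x);
        bbox.y_max = bbox.y_max.max(y);
    }
    bbox
}
```

For example, a 12 pt run of three glyphs with widths 500, 500, and 250, placed by a pure translation to (72, 700), is 15 units wide; with an ascent of 750 and a descent of -250 the box spans 72..87 horizontally and 697..709 vertically. The glyph widths and ascent/descent are the part that has to come from the font, either from the font dictionary or from a separate font-parsing crate.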