dbashford / textract

node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf and more!

OCR for PDFs

boazl-cyera opened this issue

Hi,
It would be great if you could support text extraction for non-textual PDFs (for example, scanned documents) using OCR, in the same way you already do for images.
Thanks,
Boaz