databricks-industry-solutions / digitization-documents

Using Apache Tika and Tesseract to extract text from any document

Home Page: https://databricks-industry-solutions.github.io/digitization-documents/

databricks-industry-solutions/digitization-documents Stargazers