There is 1 repository under the extraction-engine topic.
Extract tables from PDF files (port of tabula-java)
A C# library to extract tabular data from PDFs (port of camelot Python version using PdfPig).
ICDAR 2015 competition on robust reading :smile:
All five assignments and the final group project done in the class CSCI5408 (Data Management, Warehousing and Analytics), Summer 2021, of MACS at Dalhousie University.
Simple, extendable HTML and XML data extraction engine using YAML configurations and, sometimes, Pythonic functions.
A Python utility to extract and transform data from a TestStand SQL database schema into flat CSV files.