Greg Gaba's starred repositories
invoice2data
Extract structured data from PDF invoices
elastic-thought
Scalable REST API for the Caffe deep learning framework
LogReaderProgram
This application parses verbose SQL logs produced by the OnBase Thick Client and prints long-running queries.
SampleUnityScriptTestHarness
Develop and test OnBase Unity scripts outside of OnBase using Visual Studio.
OneEFormBrowser
External EForm Browser for OnBase E-Forms (not affiliated with Hyland Software)
DataflowEx
A .NET dataflow and ETL framework built upon the Microsoft TPL Dataflow library
SmartWatcher
A simple Windows service designed to watch specific directories and take specific actions on file Create, Change, Rename, and Delete events in those directories.
DUP-ocropy
Python-based tools for document analysis and OCR
rabbitmq-dotnet-client
RabbitMQ .NET client for .NET Standard 2.0+ and .NET 4.6.2+
linux-mint-nemo-actions
Some useful Nemo Actions and Shell Scripts with zenity GUIs: 1. Sandwich PDF Maker (OCR, Text Layer, searchable pdf, tesseract, scantailor); 2. PDF Page Rotator; 3. PDF Metadata Editor; 4. PDF Document Downsizer; 5. doc(x), odt, txt to pdf Converter; 6. doc(x) to odt Converter
docker-tess4j
Oracle Java 8 with all Tesseract dependencies installed, the perfect base for Tess4J apps.
tesseract-ocr-compilation
Tesseract 4 OCR Compilation - Docker Container
AzureBatchTesseractSample
A sample showing how to use Azure Batch, based on the Tesseract open source OCR software.
tesseract4win64
charlesw/tesseract 4.0 build for x64 Windows using C++ run-time 141.
tesseract-web-service
An implementation of a RESTful web service for Tesseract OCR using Tornado
tesseract-hocr
A simple wrapper for the Tesseract OCR package for node.js
hocr-tools
Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.
hocr-parser-hadoopjob
Hadoop job for parsing HTML book page layout and text content files (hOCR)
hOCR-to-ALTO
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets