Tim Allison's starred repositories
elasticsearch
Free and Open, Distributed, RESTful Search Engine
markdown-here
Google Chrome, Firefox, and Thunderbird extension that lets you write email in Markdown and render it before sending.
OpenRefine
OpenRefine is a free, open source power tool for working with messy data and improving it
lucene-solr
Apache Lucene and Solr open-source search software
sqlite-jdbc
SQLite JDBC Driver
open-semantic-search
Open source research tool to search, browse, analyze, and explore large document collections, combining a semantic search engine with an open source text mining and text analytics platform (integrates ETL for document processing, OCR for images and PDFs, named entity recognition for persons, organizations, and locations, metadata management via thesauri and ontologies, and a search user interface and search apps for full-text search, faceted search, and a knowledge graph)
action-automatic-releases
READONLY: Auto-generated mirror for https://github.com/marvinpinto/actions/tree/master/packages/automatic-releases
juniversalchardet
Originally exported from code.google.com/p/juniversalchardet
rated-ranking-evaluator
Search Quality Evaluation Tool for Apache Solr & Elasticsearch search-based infrastructures
tika-docker
Convenience Docker images for Apache Tika Server
arlington-pdf-model
A vendor- and implementation-independent, specification-derived, machine-readable model of PDF.
ocrevalUAtion
OCR evaluation brought to you by University of Alicante
htmlparser
The Validator.nu HTML parser https://about.validator.nu/htmlparser/
solr-ocrpayload-plugin
Efficient indexing and retrieval of OCR bounding boxes in Solr
file-tests
File-tests is a test suite for the File tool. Previous home: https://fedorahosted.org/file-tests/
dropwizard-tika-server
A DropWizard wrapper around Apache Tika.
tika-gui-v2
Unofficial user interface for Apache Tika