Tim Allison's repositories
lucene-addons
Standalone versions of LUCENE_5205 and other patches: SpanQueryParser, Concordance and Co-occurrence stats
file-observatory
Single server/laptop grade file-observatory
tika-gui-v2
Unofficial user interface for Apache Tika
SimpleCommonCrawlExtractor
Simple wrapper around IIPC Web Commons to take a literal warc.gz and extract standalone binaries
hodgepodge
One-off dev repo, very experimental
tika-addons
Addons not part of the official Tika release
AGPL
Repo of AGPL-licensed code -- nothing in here is connected/related to anything outside of this repo
james-mime4j
Mirror of Apache James Mime4j
java-bplist
A Java library for reading Apple bplists, based on the work of
logging-log4j2
Apache Log4j 2 is an upgrade to Log4j that provides significant improvements over its predecessor, Log4j 1.x, and provides many of the improvements available in Logback while fixing some inherent problems in Logback's architecture.
lucene-solr
Mirror of Apache Lucene + Solr
metadata-extractor
Extracts Exif, IPTC, XMP, ICC and other metadata from image files
opennlp
Mirror of Apache OpenNLP
parso
Lightweight Java library designed to read SAS7BDAT datasets
tabula-java
Extract tables from PDF files
tika-docker
Convenience Docker images for Apache Tika Server
xmpcore-shaded
Shaded version of Adobe's xmpcore to remove *.internal.* part of namespace
yalder
Yet another language detector