TokenMill's repositories
clojure-graalvm-aws-lambda-template
Leiningen template for running Clojure projects compiled to GraalVM native images on an AWS Lambda custom runtime.
crawling-framework
Easily crawl news portals or blog sites using Storm Crawler.
ltlangpack
Tools for Lithuanian language processing.
fast-url-access-checker
Easily run HTTP GET requests against a list of URLs to check their HTTP status.
docx-utils
Easily work with .docx files from Clojure (a wrapper around the Apache POI library).
dictionary-annotator
Fast and configurable UIMA dictionary annotator.
common-crawl-utils
Various Common Crawl utilities in Clojure.
docker-images
Official source for Docker configurations, images, and examples of Dockerfiles for TokenMill products and projects.
crawling-framework-example
Demonstration of how to use the Crawling Framework to set up a simple science news crawler and store results in Elasticsearch. Use this configuration to set up your own crawler.
beagle-performance-benchmarks
Performance benchmarks for the Beagle library, and comparisons with other stored-query solutions.
metadata-detector
Library for detecting metadata in HTML files.
gf-wordnet
A WordNet in GF
unsupervised-keyphrase-extraction
EmbedRank: Unsupervised Keyphrase Extraction using Sentence Embeddings (official implementation)