Hannibal's repositories
blacklight
Blacklight provides a discovery interface for any Solr (http://lucene.apache.org/solr) index.
blacklight-marc
MARC enhancements for Blacklight
cc-quick-scripts
Useful scripts for attacking the CommonCrawl dataset and WARC/WET/WAT files
CSVInputFormat
Hadoop input format that can read multiline CSVs
incubator-zeppelin
Mirror of Apache Zeppelin (Incubating)
json-wikipedia
JSON Wikipedia: code to convert the Wikipedia XML dump into a JSON dump
learning-spark-examples
Examples for learning Spark
lib-lucene-sugar
Add some sugar to your Lucene
QuantSoftwareToolkit
QuantSoftwareToolkit
samza-luwak
Integration of Samza and Luwak
simple-scala-rest-example
Example of a simple REST service in Scala
spark-notebook
Use Apache Spark straight from the browser
spark-solr
Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ.
webarchive-indexing
Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.
word2vec-lucene
This tool extracts word vectors from a Lucene index.