Alex Holmes's repositories
hadoop-book
Source code to accompany the book "Hadoop in Practice", published by Manning.
vagrant-hadoop-spark-hive
Vagrant project to spin up a single virtual machine running current versions of Hadoop, Hive, and Spark.
hdfs-file-slurper
Utility to easily copy files into HDFS.
json-mapreduce
InputFormat that can split multi-line JSON.
avro-maven
A simple example of how to use the Avro Maven plugin to generate Avro sources.
hadoop-utils
A set of Hadoop utilities to make working with Hadoop a little easier.
avro-sorting
Examples of built-in and customizable sorting in Avro and Hadoop.
java-external-sort
Sort large files in Java.
storm-trending-words
Quick and dirty trending words example on Storm.
hdfscompact
An HDFS file compacter.
mleap
MLeap: Deploy ML Pipelines to Production
spark
Apache Spark - A unified analytics engine for large-scale data processing