bloomberg
Work projects for Hadoop and HBase monitoring / testing.
projects
- jmx-client: client for accessing HBase / HDFS and exporting RESTful JSON JMX metrics.
- jmx-metrics: library for background JMX Hadoop metrics2 logging.
- mapreduce: a standard MapReduce job. The first one I have done properly...
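To give a feel for what "exporting JSON JMX metrics" means, here is a minimal sketch that reads one MBean attribute from the local JVM and prints it as JSON. It is not the code in jmx-client: the class name `JmxJsonSketch`, the method `heapUsageJson`, and the JSON layout are assumptions made up for illustration, and it only uses the JDK's own `javax.management` API against the standard `java.lang:type=Memory` bean rather than any Hadoop or HBase bean.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

// Hypothetical illustration class, not part of this repository.
public class JmxJsonSketch {

    // Reads the heap usage of the current JVM over JMX and renders it as JSON.
    static String heapUsageJson() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName memory = new ObjectName("java.lang:type=Memory");

            // MXBean attributes come back from the MBeanServer as open types,
            // so HeapMemoryUsage is a CompositeData with used/committed/init/max.
            CompositeData heap =
                (CompositeData) server.getAttribute(memory, "HeapMemoryUsage");
            long used = (Long) heap.get("used");
            long committed = (Long) heap.get("committed");

            // Hand-built JSON keeps the sketch dependency-free (assumed layout).
            return "{\"name\":\"" + memory.getCanonicalName() + "\","
                + "\"HeapMemoryUsage\":{\"used\":" + used
                + ",\"committed\":" + committed + "}}";
        } catch (JMException e) {
            throw new IllegalStateException("could not read JMX attribute", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(heapUsageJson());
    }
}
```

A real client would more likely connect to a remote process and use a JSON library; the hand-built string here just avoids extra dependencies.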
reading list
Very important things to read.
- The Google File System (2003):
  Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung
  [https://research.google.com/archive/gfs.html]
- MapReduce: Simplified Data Processing on Large Clusters (2004):
  Jeffrey Dean and Sanjay Ghemawat
  [https://research.google.com/archive/mapreduce.html]
- Web Search for a Planet: The Google Cluster Architecture (2003):
  Luiz André Barroso, Jeffrey Dean, and Urs Hölzle
  [https://static.googleusercontent.com/media/research.google.com/en//archive/googlecluster-ieee.pdf]
- Experiences with MapReduce, an Abstraction for Large-Scale Computation (2006):
  Jeffrey Dean
  [https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/32721.pdf]
- The Anatomy of a Large-Scale Hypertextual Web Search Engine (1998):
  Sergey Brin and Lawrence Page
  [https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/334.pdf]
- Impossibility of Distributed Consensus with One Faulty Process (1985):
  Michael Fischer, Nancy Lynch, and Michael Paterson
  [https://groups.csail.mit.edu/tds/papers/Lynch/jacm85.pdf]
- Paxos Made Simple (2001):
  Leslie Lamport
  [http://lamport.azurewebsites.net/pubs/paxos-simple.pdf]