macmaster / bloomberg

Distributed computing scratchpad for my job.

bloomberg

Work projects for Hadoop and HBase monitoring / testing.

projects

  • jmx-client: client for accessing HBase / HDFS and exporting JMX metrics as RESTful JSON.
  • jmx-metrics: library for logging JMX metrics in the background via Hadoop metrics2.
  • mapreduce: a standard MapReduce job, the first one I have done properly.
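
To give a feel for what jmx-client describes, here is a minimal sketch, not the project's actual code. It reads one attribute from the local JVM's platform MBean server (standing in for a remote HBase / HDFS daemon) and renders it as a small JSON object. The class name `JmxJsonSketch`, its method names, and the JSON layout are all hypothetical and chosen only for illustration.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical illustration only; not taken from the jmx-client project.
public class JmxJsonSketch {

    // Escape backslashes and double quotes so the value fits in a JSON string.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Render one MBean attribute as a JSON object with a "name" field
    // and a field named after the attribute. Numbers and booleans are
    // emitted bare; everything else is emitted as a quoted string.
    static String toJson(String beanName, String attribute, Object value) {
        String rendered = (value instanceof Number || value instanceof Boolean)
                ? value.toString()
                : "\"" + escape(String.valueOf(value)) + "\"";
        return "{\"name\":\"" + escape(beanName) + "\",\""
                + escape(attribute) + "\":" + rendered + "}";
    }

    // Read a single attribute from the local JVM's platform MBean server.
    // Assumption: a real client would connect to a remote daemon instead.
    static Object readAttribute(String beanName, String attribute) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        return server.getAttribute(new ObjectName(beanName), attribute);
    }

    public static void main(String[] args) throws Exception {
        // java.lang:type=Runtime is a standard JVM MBean with an Uptime attribute.
        String bean = "java.lang:type=Runtime";
        System.out.println(toJson(bean, "Uptime", readAttribute(bean, "Uptime")));
    }
}
```

Running `main` prints a single JSON object whose `Uptime` value depends on how long the JVM has been up. Hand-rolled string building keeps the sketch free of dependencies; a real exporter would more likely use a JSON library.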

reading list

very important things to read.

  1. The Google File System (2003):
    Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung
    [https://research.google.com/archive/gfs.html]

  2. MapReduce: Simplified Data Processing on Large Clusters (2004):
    Jeffrey Dean and Sanjay Ghemawat
    [https://research.google.com/archive/mapreduce.html]

  3. Web Search for a Planet: The Google Cluster Architecture (2003):
    Luiz André Barroso, Jeffrey Dean, and Urs Hölzle
    [https://static.googleusercontent.com/media/research.google.com/en//archive/googlecluster-ieee.pdf]

  4. Experiences with MapReduce, an Abstraction for Large-Scale Computation (2006):
    Jeffrey Dean
    [https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/32721.pdf]

  5. The Anatomy of a Large-Scale Hypertextual Web Search Engine (1998):
    Sergey Brin and Lawrence Page
    [https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/334.pdf]

  6. Impossibility of Distributed Consensus with One Faulty Process (1985):
    Michael Fischer, Nancy Lynch, and Michael Paterson
    [https://groups.csail.mit.edu/tds/papers/Lynch/jacm85.pdf]

  7. Paxos Made Simple (2001):
    Leslie Lamport
    [http://lamport.azurewebsites.net/pubs/paxos-simple.pdf]

Languages

Java 95.4%, Ruby 1.6%, Shell 1.5%, HTML 1.0%, Python 0.6%