he9lin / Impatient-Cascalog

Cascalog for the Impatient

Welcome to Cascalog for the Impatient, a series of tutorials and Cascalog code examples to get you started. This series is a fork of Cascading for the Impatient.

This set of progressive coding examples starts with a simple file copy and builds up to a MapReduce implementation of the TF-IDF algorithm.
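
To give a feel for the starting point, here is a minimal sketch of what a Cascalog file copy can look like. It is an illustration, not necessarily the exact code used in the tutorial: the namespace `impatient.core`, the field names `?doc` and `?line`, and the assumption that the input is a two-column, tab-delimited file with a header row are all hypothetical. It also assumes the `cascalog-more-taps` library, which provides `hfs-delimited`, is on the classpath.

```clojure
(ns impatient.core                                   ; hypothetical namespace
  (:use [cascalog.api]
        [cascalog.more-taps :only (hfs-delimited)])  ; assumes cascalog-more-taps is a dependency
  (:gen-class))

(defn -main
  "Copy a delimited text file from path `in` to path `out`."
  [in out & args]
  ;; ?<- defines a query and executes it immediately.
  (?<- (hfs-delimited out)                           ; sink: write delimited text to `out`
       [?doc ?line]                                  ; output fields (names are illustrative)
       ;; source: read two fields per row, skipping the assumed header line
       ((hfs-delimited in :skip-header? true) ?doc ?line)))
```

The query has one generator and no other predicates, so every input row is passed through to the sink unchanged. Later steps of a pipeline like TF-IDF would add operations and aggregators between the source and the sink.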

Getting Started

Clone this repository, then head over to the Wiki to follow the six-part tutorial.

Prerequisites

Install the following:

  1. Hadoop; see Apache's instructions for setting up a local node
  2. Leiningen build tool for Clojure

Some basic knowledge of Clojure and Leiningen would be helpful.
