killpackb / CascadingSpatial

Cascading workflow with spatial binning function

CascadingSpatial

Cascading workflow with spatial binning function.

This Hadoop-based Cascading workflow enables me to take zip code locations in the continental US (not very big BTW, this is just a PoC :-)

[Figure: CascadingBefore]

overlay them with a set of hexagon cells in an Albers equal-area conic projection

[Figure: CascadingBetween]

to produce a set of spatial density bins.

[Figure: CascadingAfter]
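To make the binning idea concrete, here is a minimal sketch in plain Java. It is not the code in this repo: it assumes the points are already projected into the same coordinate system as the cells, it uses `java.awt.geom.Path2D` from the JDK instead of the Esri geometry API, and the class name `HexBinSketch`, the cell IDs, and the sample coordinates are all made up for illustration. It also scans every cell for every point, which is fine for a toy example but not for real data.

```java
import java.awt.geom.Path2D;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of point-in-polygon binning; not the repo's implementation.
public class HexBinSketch {

    /** Builds a regular hexagon centered at (cx, cy) with circumradius r. */
    static Path2D.Double hexagon(double cx, double cy, double r) {
        Path2D.Double path = new Path2D.Double();
        for (int i = 0; i < 6; i++) {
            double angle = Math.toRadians(60.0 * i);
            double x = cx + r * Math.cos(angle);
            double y = cy + r * Math.sin(angle);
            if (i == 0) {
                path.moveTo(x, y);
            } else {
                path.lineTo(x, y);
            }
        }
        path.closePath();
        return path;
    }

    /** Counts, for each cell ID, how many points fall inside that cell. */
    static Map<Integer, Integer> countByCell(Map<Integer, Path2D.Double> cells,
                                             double[][] points) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (double[] p : points) {
            // Brute-force scan: test the point against each cell in turn.
            for (Map.Entry<Integer, Path2D.Double> cell : cells.entrySet()) {
                if (cell.getValue().contains(p[0], p[1])) {
                    counts.merge(cell.getKey(), 1, Integer::sum);
                    break; // cells are assumed not to overlap
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two adjacent hexagons with made-up IDs 1 and 2.
        Map<Integer, Path2D.Double> cells = new LinkedHashMap<>();
        cells.put(1, hexagon(0.0, 0.0, 1.0));
        cells.put(2, hexagon(1.5, Math.sqrt(3) / 2, 1.0));

        // Made-up points; the last one lies outside every cell and is dropped.
        double[][] points = { {0.1, 0.1}, {-0.2, 0.3}, {1.5, 0.9}, {10.0, 10.0} };

        System.out.println("ORIGID,POPULATION");
        countByCell(cells, points)
            .forEach((id, n) -> System.out.println(id + "," + n));
    }
}
```

Cells that contain no points do not appear in the result at all, and a point that falls outside every cell is simply ignored.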

Dependencies

$ git clone https://github.com/Esri/geometry-api-java.git
$ cd geometry-api-java
$ mvn install
$ cd ..
$ git clone https://github.com/mraad/Shapefile.git
$ cd Shapefile
$ mvn install
$ cd ..

Data Preparation

I've placed some sample data in the data folder. I'm assuming that you have a Hadoop cluster. If you do not have one, you can download the Cloudera Quick Start VM.

$ hadoop fs -put data/zipcodes.tsv zipcodes.tsv
$ hadoop fs -put data/hexalbers.shp hexalbers.shp

Build and Run

$ mvn package
$ hadoop jar target/CascadingSpatial-1.0-job.jar zipcodes.tsv hexalbers.shp output

View Output

$ hadoop fs -cat output/part* | more
ORIGID,POPULATION
136,3
137,1
188,17
189,13
213,1
214,2
263,2
264,8
265,7
266,3
...

Save the output to a local file

$ hadoop fs -cat output/part* > density.csv

In ArcGIS for Desktop, add density.csv as a table and join it with the hexalbers layer to symbolize on the POPULATION field.

[Figure: CascadingJoin]
