echeipesh / geotrellis-hrpd

GeoTrellis Jobs on Facebook HRPD dataset

Facebook HRPD Summary

This project performs a polygonal summary over the HRPD dataset for each US county polygon and aggregates the results into a CSV file.
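To illustrate what a polygonal summary computes, here is a minimal, dependency-free sketch. It is not the project's GeoTrellis code: `Grid`, `Point`, `contains`, `polygonalSum`, and `toCsv` are hypothetical names introduced for illustration, and the sketch assumes a single-ring polygon (no holes) and counts a cell when its center falls inside the ring.

```scala
// Minimal sketch of a polygonal sum in plain Scala.
// All names here are hypothetical; the real job uses GeoTrellis on Spark.
object PolygonalSumSketch {
  final case class Point(x: Double, y: Double)

  // Stand-in for a raster tile: row-major cells on a regular grid
  // whose top-left corner is (xMin, yMax).
  final case class Grid(xMin: Double, yMax: Double, cellSize: Double,
                        cols: Int, rows: Int, cells: Array[Double]) {
    // Map coordinates of the center of cell (col, row).
    def center(col: Int, row: Int): Point =
      Point(xMin + (col + 0.5) * cellSize, yMax - (row + 0.5) * cellSize)
  }

  // Even-odd ray casting test against a single ring (holes not handled).
  def contains(ring: Vector[Point], p: Point): Boolean = {
    var inside = false
    var j = ring.length - 1
    for (i <- ring.indices) {
      val a = ring(i)
      val b = ring(j)
      // The first condition guarantees a.y != b.y, so the division is safe.
      if ((a.y > p.y) != (b.y > p.y) &&
          p.x < (b.x - a.x) * (p.y - a.y) / (b.y - a.y) + a.x) inside = !inside
      j = i
    }
    inside
  }

  // Sum every non-NaN cell whose center lies inside the ring.
  def polygonalSum(grid: Grid, ring: Vector[Point]): Double = {
    var sum = 0.0
    for (row <- 0 until grid.rows; col <- 0 until grid.cols) {
      val v = grid.cells(row * grid.cols + col)
      if (!v.isNaN && contains(ring, grid.center(col, row))) sum += v
    }
    sum
  }

  // Aggregate (county id, population) pairs into CSV text with a header row.
  def toCsv(rows: Seq[(String, Double)]): String =
    ("county,population" +: rows.map { case (id, pop) => s"$id,$pop" }).mkString("\n")
}
```

In the actual job the same idea runs per county over distributed raster tiles, with the per-tile partial sums combined before the CSV is written.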

Data

The US counties GeoJSON is included in this project at src/main/resources/us_counties.geojson

Output is verified using the following gist

Running on EMR

Update and verify the EMR configuration values in build.sbt, then run:

sbt
sbt:geotrellis-hrpd> sparkCreateCluster
sbt:geotrellis-hrpd> sparkSubmitMain geotrellis.jobs.hrpd.PopSummaryApp --output s3://geotrellis-test/eac/county-hrdp-pop-v1 --partitions 2000

Running Locally

This job is very heavy to run locally against the full dataset. For development and iteration, trim the input set first.

sbt
sbt:geotrellis-hrpd> test:runMain geotrellis.jobs.hrpd.PopSummaryApp --output file:/tmp/county.csv --partitions 2000
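
One way to trim the input for local runs is to keep only the counties of a few states. The sketch below is an assumption, not code from this project: `County`, `geoid`, and `trimToStates` are hypothetical names, and it assumes each county record carries a five-digit FIPS code whose first two digits identify the state. Whether us_counties.geojson exposes such a property has not been verified here.

```scala
// Hypothetical sketch: restrict the county list to a few states for local runs.
object TrimCountiesSketch {
  // Assumed shape of a county record; geoid is a five-digit county FIPS code.
  final case class County(geoid: String, name: String)

  // Keep only counties whose two-digit state FIPS prefix is in stateFips.
  def trimToStates(counties: Seq[County], stateFips: Set[String]): Seq[County] =
    counties.filter(c => stateFips.contains(c.geoid.take(2)))
}
```

For example, `trimToStates(counties, Set("42"))` would keep only Pennsylvania counties. The raster input would still need to be limited to the same area for the run to be quick.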

License:Other


Languages

Language:Scala 100.0%