isabella232 / dataflow-precipitation-pipeline

Dataflow Precipitation Sample

About

Google Dataflow and Google BigQuery can each be tricky to use, and combining the two is harder still. This example pipeline uses both technologies extensively and implements several features, both simple and complicated, that you may need in your own project. Feel free to explore it as you like. We hope it helps in some small way with your own endeavours.

Data Source

All data comes from NOAA and can be found at http://water.weather.gov/precip/download.php. The BigQuery US Precipitation dataset is continuously updated by this pipeline and can be found at https://bigquery.cloud.google.com/table/publicdata:samples.us_precipitation

Usage

Credentials

If you have not already set up Google Cloud credentials, install gcloud and run the command:

$ gcloud auth login

This pipeline uses the default Google credentials. You can find more about setting up Google credentials at https://developers.google.com/identity/protocols/application-default-credentials

Execution

To run this pipeline, run the command:

$ java -jar PrecipPipe.jar --startDate=20150101 --endDate=20150709 --project=myProject --table=myProject:weather.precipitation --bucket=myBucket

For a full list of options, run:

$ java -jar PrecipPipe.jar --help

Include the "--help=PrecipitationOptions" flag for a list of pipeline-specific options.
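
The --startDate and --endDate values in the command above are written as yyyyMMdd. As a rough illustration of how such a range could be validated and expanded into individual days, here is a minimal sketch using only java.time. The class DateRangeSketch and the method daysBetween are hypothetical names and are not part of PrecipPipe.jar. Treating the end date as inclusive is also an assumption, since this README does not say how the pipeline interprets the range.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not taken from the pipeline source.
final class DateRangeSketch {
    // BASIC_ISO_DATE reads and writes the compact yyyyMMdd form, e.g. 20150101.
    private static final DateTimeFormatter FMT = DateTimeFormatter.BASIC_ISO_DATE;

    // Returns every day from startDate to endDate as yyyyMMdd strings.
    // Assumption: both endpoints are included in the range.
    static List<String> daysBetween(String startDate, String endDate) {
        LocalDate start = LocalDate.parse(startDate, FMT);
        LocalDate end = LocalDate.parse(endDate, FMT);
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("endDate precedes startDate");
        }
        List<String> days = new ArrayList<>();
        for (LocalDate d = start; !d.isAfter(end); d = d.plusDays(1)) {
            days.add(d.format(FMT));
        }
        return days;
    }

    public static void main(String[] args) {
        // Same dates as the example command in this README.
        List<String> days = daysBetween("20150101", "20150709");
        System.out.println(days.size() + " days, from " + days.get(0)
                + " to " + days.get(days.size() - 1));
    }
}
```

Checking the range before submitting a job this way would surface a malformed or reversed date locally, since LocalDate.parse throws a DateTimeParseException on input that is not a valid yyyyMMdd date.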

License

This library is licensed under Apache 2.0. Full license text is available in LICENSE.

Contributing

See CONTRIBUTING.

Support

For support on BigQuery and Dataflow, please submit questions tagged with google-bigquery and google-cloud-dataflow on StackOverflow.

To report a problem with this pipeline, please open an issue on this project page.
