emlaver / notebooks

Spark notebooks for working with Cloudant and dashDB data

notebooks

Refer to the LICENSE for information about the license under which this code is made available.

This repository includes Spark notebooks for working with Cloudant data.

Import to Cloudant: This notebook is intended for Python 2 with Spark 2.0. It imports SparkSession from pyspark, loads a CSV file stored in Bluemix object storage into a dataframe, filters the data, and then uses the spark-cloudant connector to write the filtered rows to a previously created Cloudant database. The example CSV file lists child care providers in Massachusetts and was downloaded from https://data.mass.gov/Education/Program-list-for-Child-Care-Search-1-15-2015/cb6m-ccic
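The load, filter, and write steps described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names (`cloudant_options`, `import_csv_to_cloudant`, `run_example`), the object storage path, the credentials, the database name, and the `City` column with the value `Boston` are all hypothetical placeholders. It assumes the spark-cloudant connector is available on the Spark classpath under the `com.cloudant.spark` format name and reads its connection settings from the `cloudant.host`, `cloudant.username`, and `cloudant.password` options.

```python
def cloudant_options(host, username, password):
    """Collect the connection options passed to the spark-cloudant connector."""
    return {
        "cloudant.host": host,
        "cloudant.username": username,
        "cloudant.password": password,
    }


def import_csv_to_cloudant(spark, csv_path, database, column, value, options):
    """Load a CSV file, keep rows where `column` equals `value`, write them to Cloudant."""
    # Read the CSV into a dataframe, using the first row as column names.
    df = spark.read.csv(csv_path, header=True)
    # Keep only the rows that match the filter.
    filtered = df.filter(df[column] == value)
    # Write through the connector; the target database must already exist.
    filtered.write.format("com.cloudant.spark").options(**options).save(database)
    return filtered


def run_example():
    """Hypothetical end-to-end run; every literal below is a placeholder."""
    # Imported here so the helpers above can be loaded without pyspark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("import-to-cloudant").getOrCreate()
    options = cloudant_options("ACCOUNT.cloudant.com", "USERNAME", "PASSWORD")
    import_csv_to_cloudant(
        spark,
        "swift://CONTAINER.keystone/providers.csv",  # assumed object storage path
        "childcare_providers",                       # assumed, pre-created database
        "City",                                      # assumed column name
        "Boston",                                    # assumed filter value
        options,
    )
```

In a notebook cell, calling `run_example()` after substituting real values would run the whole pipeline. Keeping the connection options in a separate helper makes it easier to avoid hard-coding credentials next to the dataframe logic.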

About

License:Apache License 2.0


Languages

Language:Jupyter Notebook 100.0%