pangeo-data / pangeo-datastore

Pangeo Cloud Datastore

Home Page: https://catalog.pangeo.io

Browseable Online Website: https://pangeo-data.github.io/pangeo-datastore/

This repository hosts Pangeo's official cloud data catalog, which is an Intake catalog. Most of the data is stored in Zarr format and is meant to be opened with Xarray.

The master Intake catalog URL is:

https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/master.yaml

Requirements

Using this catalog requires package versions that were recent as of April 2019.
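To see which versions are installed before opening the catalog, a small check like the one below may help. It is only a sketch: this README does not name the packages it depends on, so the list in `ASSUMED_PACKAGES` is an assumption based on the tools mentioned here (Intake, Xarray, Zarr) plus common companions, and no minimum versions are implied.

```python
from importlib import metadata

# Assumed package list: the README does not say which packages are required.
ASSUMED_PACKAGES = ["intake", "intake-xarray", "xarray", "zarr", "gcsfs", "dask"]


def installed_versions(packages=ASSUMED_PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If any entry prints as "not installed" or looks much older than April 2019, updating that package is a reasonable first step when the catalog fails to open.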

Examples

To open the catalog and load a dataset from Python, run the following code:

import intake
cat_url = 'https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/master.yaml'
cat = intake.open_catalog(cat_url)
ds = cat.atmosphere.gmet_v1.to_dask()

To explore the whole catalog, you can try:

cat.walk(depth=5)
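The output of `walk` can be long. The sketch below groups the entry names by their top-level sub-catalog so the result is easier to scan. It assumes that `walk` returns a mapping keyed by dotted entry names (such as `atmosphere.gmet_v1`), which is the behaviour of the Intake releases this README targets; `group_by_subcatalog` and `summarize_catalog` are hypothetical helpers, not part of Intake.

```python
def group_by_subcatalog(names):
    """Group dotted entry names by their first component.

    "atmosphere.gmet_v1" is filed under "atmosphere" as "gmet_v1"; a bare
    name such as "atmosphere" only makes sure the group exists.
    """
    groups = {}
    for name in sorted(names):
        top, _, rest = name.partition(".")
        members = groups.setdefault(top, [])
        if rest:
            members.append(rest)
    return groups


def summarize_catalog(cat, depth=5):
    """Walk a catalog and group its entries by top-level sub-catalog.

    Assumes cat.walk(depth=...) returns a mapping keyed by dotted names.
    """
    return group_by_subcatalog(cat.walk(depth=depth))
```

With `cat` opened as in the example above, `summarize_catalog(cat)` would return a dictionary such as `{"atmosphere": ["gmet_v1", ...], ...}`, with the actual contents depending on the current state of the catalog.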

Accessing requester pays data

Several of the datasets within the cloud data catalog are contained in requester pays storage buckets. This means that a user requesting data must provide their own billing project (created and authenticated through Google Cloud Platform) to be billed for the charges associated with accessing a dataset. To set up a GCP billing project and use it for authentication in applications:

  • Install the Google Cloud SDK, for example with conda:
conda install -c conda-forge google-cloud-sdk
  • Initialize the gcloud command line interface, logging into the account used to create the billing project and selecting that project as the default; this will allow the project to be used for requester pays access through the command line:
gcloud auth login
gcloud init
  • Finally, use gcloud to establish application default credentials; this will allow the project to be used for requester pays access through applications:
gcloud auth application-default login
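After running the commands above, it can be useful to confirm from Python that application default credentials are in place. The sketch below only checks for the credentials file. It assumes the default gcloud location on Linux and macOS (`~/.config/gcloud/application_default_credentials.json`) and honours the `GOOGLE_APPLICATION_CREDENTIALS` environment variable when it is set; Windows and custom gcloud configuration directories are not handled. `adc_file` and `has_adc` are hypothetical helper names.

```python
import os
from pathlib import Path


def adc_file(env=None, home=None):
    """Return the path where application default credentials are expected.

    Assumes the Linux/macOS default location used by gcloud unless
    GOOGLE_APPLICATION_CREDENTIALS points somewhere else.
    """
    env = os.environ if env is None else env
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return Path(explicit)
    home = Path.home() if home is None else Path(home)
    return home / ".config" / "gcloud" / "application_default_credentials.json"


def has_adc(env=None, home=None):
    """True if the expected credentials file exists on disk."""
    return adc_file(env=env, home=home).is_file()
```

If `has_adc()` returns `False`, rerunning `gcloud auth application-default login` is the likely fix. A `True` result only shows that the file exists, not that the credentials are still valid or that the project has billing enabled.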

Adding Datasets

To suggest adding a new dataset, please open an issue.
