stac-extensions / xarray-assets

This extension helps users open STAC Assets with xarray. It gives a place for catalog maintainers to specify various required or recommended options.

xarray Assets Extension Specification

This document explains the xarray Assets Extension to the SpatioTemporal Asset Catalog (STAC) specification.

This extension helps users open STAC Assets with xarray. It gives a place for catalog maintainers to specify various required or recommended options. Without this extension, users would somehow need to know which options are required in order to load the dataset. See Python Example for an example of how consumers of this extension can use it to simplify data loading.

Item Properties and Collection Fields

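| Field Name             | Type             | Description                                          |
| ---------------------- | ---------------- | ---------------------------------------------------- |
| xarray:open_kwargs     | Map<string, Any> | Keyword arguments to provide to the xarray opener    |
| xarray:storage_options | Map<string, Any> | Additional keywords to provide to fsspec.filesystem  |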
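As a minimal sketch of how a consumer might read these two fields from an asset's JSON, the snippet below falls back to empty maps when a field is absent. The helper name `xarray_fields`, the href, and the option values are hypothetical and only serve the illustration; tolerating missing fields is an assumption about what a client would want, not something this extension mandates.

```python
import json

# Hypothetical asset JSON; the href and option values are made up for illustration.
ASSET_JSON = """
{
  "href": "abfs://container/path/to/data.zarr",
  "type": "application/vnd+zarr",
  "xarray:open_kwargs": {"consolidated": true},
  "xarray:storage_options": {"account_name": "exampleaccount"}
}
"""


def xarray_fields(asset: dict) -> tuple:
    """Return (open_kwargs, storage_options), defaulting each to an empty dict."""
    open_kwargs = dict(asset.get("xarray:open_kwargs", {}))
    storage_options = dict(asset.get("xarray:storage_options", {}))
    return open_kwargs, storage_options


asset = json.loads(ASSET_JSON)
open_kwargs, storage_options = xarray_fields(asset)
print(open_kwargs)
print(storage_options)
```

Copying each map with `dict(...)` lets the caller adjust the options without mutating the parsed asset.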

Additional Field Information

xarray:open_kwargs

Keyword arguments to provide to the xarray opener, for example xarray.open_zarr. The opener should be determined by the media type of the asset.

These are in addition to the positional argument, for example the store, which is obtained from the Asset's href. For example, to specify consolidated metadata:

{
  "xarray:open_kwargs": {
    "consolidated": true
  }
}
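One way a client might choose the opener from the asset's media type and then pass along `xarray:open_kwargs` is sketched here. The `OPENERS` table and both helper functions are hypothetical; only the Zarr entry mirrors this document, and the NetCDF entry is an assumption added for illustration.

```python
# Hypothetical mapping from an asset's media type to the name of an xarray opener.
# Only the Zarr entry mirrors this document; the NetCDF entry is an assumption.
OPENERS = {
    "application/vnd+zarr": "open_zarr",
    "application/x-netcdf": "open_dataset",
}


def opener_name(media_type: str) -> str:
    """Look up the xarray function name to use for a media type."""
    try:
        return OPENERS[media_type]
    except KeyError:
        raise ValueError(f"no xarray opener known for media type {media_type!r}") from None


def open_asset(store, media_type: str, open_kwargs: dict):
    """Open `store` with the opener matching `media_type`, forwarding open_kwargs."""
    import xarray  # imported lazily so the lookup above works without xarray installed

    opener = getattr(xarray, opener_name(media_type))
    # The store is the positional argument; the extension's kwargs follow it.
    return opener(store, **open_kwargs)
```

With the JSON above, `open_asset(store, "application/vnd+zarr", {"consolidated": True})` would amount to `xarray.open_zarr(store, consolidated=True)`.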

xarray:storage_options

fsspec.filesystem enables opening a filesystem from a URI (e.g. abfs://path/to/blob, https://path/to/file, s3://path/to/file). The various filesystems support and require backend-specific keyword arguments, which can be provided as **storage_options.
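The forwarding of `xarray:storage_options` can be sketched as follows. The helpers `protocol_of` and `filesystem_for` are hypothetical, and taking the protocol from the URI scheme is a simplification of fsspec's own protocol handling, assumed here only to keep the example short.

```python
from urllib.parse import urlsplit


def protocol_of(href: str) -> str:
    """Return the URI scheme to use as the fsspec protocol, or 'file' if there is none."""
    return urlsplit(href).scheme or "file"


def filesystem_for(href: str, storage_options: dict):
    """Create the fsspec filesystem for `href`, forwarding the backend-specific options."""
    import fsspec  # imported lazily so protocol_of works without fsspec installed

    return fsspec.filesystem(protocol_of(href), **storage_options)


print(protocol_of("abfs://path/to/blob"))
print(protocol_of("s3://path/to/file"))
```

For an `s3://` href, for instance, a catalog might record `{"anon": true}` so that `filesystem_for` requests anonymous access from the S3 backend.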

Python Example

This example demonstrates how consumers of this extension can use the data to simplify the process of loading an asset from STAC into an xarray Dataset.

>>> import fsspec, xarray, pystac
>>> collection = pystac.read_file("examples/collection.json")
>>> asset = collection.assets["example"]
>>> asset.media_type
'application/vnd+zarr'
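>>> # pystac 1.0 and later expose these fields as asset.extra_fields (earlier releases used asset.properties)
>>> store = fsspec.get_mapper(asset.href, **asset.extra_fields["xarray:storage_options"])
>>> ds = xarray.open_zarr(store, **asset.extra_fields["xarray:open_kwargs"])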
>>> ds
<xarray.Dataset>
Dimensions:                 (crs: 1, lat: 4320, lon: 8640, time: 744)
Coordinates:
  * crs                     (crs) int16 3
  * lat                     (lat) float64 89.98 89.94 89.9 ... -89.94 -89.98
  * lon                     (lon) float64 -180.0 -179.9 -179.9 ... 179.9 180.0
  * time                    (time) datetime64[ns] 1958-01-01 ... 2019-12-01
Data variables: (12/18)
    aet                     (time, lat, lon) float32 dask.array<chunksize=(12, 1440, 1440), meta=np.ndarray>
    def                     (time, lat, lon) float32 dask.array<chunksize=(12, 1440, 1440), meta=np.ndarray>
    pdsi                    (time, lat, lon) float32 dask.array<chunksize=(12, 1440, 1440), meta=np.ndarray>
    pet                     (time, lat, lon) float32 dask.array<chunksize=(12, 1440, 1440), meta=np.ndarray>
    ppt                     (time, lat, lon) float32 dask.array<chunksize=(12, 1440, 1440), meta=np.ndarray>
    ppt_station_influence   (time, lat, lon) float32 dask.array<chunksize=(12, 1440, 1440), meta=np.ndarray>
    ...                      ...
    tmin                    (time, lat, lon) float32 dask.array<chunksize=(12, 1440, 1440), meta=np.ndarray>
    tmin_station_influence  (time, lat, lon) float32 dask.array<chunksize=(12, 1440, 1440), meta=np.ndarray>
    vap                     (time, lat, lon) float32 dask.array<chunksize=(12, 1440, 1440), meta=np.ndarray>
    vap_station_influence   (time, lat, lon) float32 dask.array<chunksize=(12, 1440, 1440), meta=np.ndarray>
    vpd                     (time, lat, lon) float32 dask.array<chunksize=(12, 1440, 1440), meta=np.ndarray>
    ws                      (time, lat, lon) float32 dask.array<chunksize=(12, 1440, 1440), meta=np.ndarray>

Contributing

All contributions are subject to the STAC Specification Code of Conduct. For contributions, please follow the STAC specification contributing guide. Instructions for running tests are copied here for convenience.

Running tests

The same checks that run on pull requests are part of the repository and can be run locally to verify that changes are valid. To run tests locally, you'll need npm, which is a standard part of any Node.js installation.

First you'll need to install everything with npm once. Just navigate to the root of this repository and on your command line run:

npm install

Then to check markdown formatting and test the examples against the JSON schema, you can run:

npm test

This prints the same output that you see online, so you can then fix your markdown or examples.

If the tests reveal formatting problems with the examples, you can fix them with:

npm run format-examples

License: Apache License 2.0