pangeo-forge / pangeo-forge-recipes

Python library for building Pangeo Forge recipes.

Home Page: https://pangeo-forge.readthedocs.io/

Nest `CombineReferences` inside `WriteCombinedReference`

cisaacstern opened this issue

Working on #610 has me thinking about how to easily communicate common recipe styles. Over there, I'm centering on the idea that the basic version of every common pipeline style should be expressible as `FilePattern | Opener | Writer`. `StoreToZarr` fits this style, but our kerchunk recipes do not, because they need a separate combine stage between the opener and the writer:

    | OpenWithKerchunk(
        file_type=pattern.file_type,
        remote_protocol=remote_protocol,
        storage_options=storage_options,
        kerchunk_open_kwargs={"filter": grib_filters},
    )
    | CombineReferences(
        concat_dims=pattern.concat_dims,
        identical_dims=identical_dims,
        precombine_inputs=True,
    )
    | WriteCombinedReference(
        store_name="hrrr-concat-step",
    )

To align kerchunk recipes with this style, we can nest `CombineReferences` inside `WriteCombinedReference`, making the basic kerchunk recipe expressible as:

recipe = (
    beam.Create(pattern.items())
    | OpenWithKerchunk(...)
    | WriteCombinedReference(...)
)
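
To make the nesting idea concrete, here is a minimal sketch in plain Python. It does not use Apache Beam or the real pangeo-forge-recipes classes: `Stage`, `ToyCombineReferences`, and `ToyWriteCombinedReference` are hypothetical stand-ins, and the `update`-based merge is only a placeholder for whatever the real combine step does. The point is just that the writer applies the combine stage internally, so the caller only chains an opener and a writer.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class Stage:
    """Toy stand-in for a pipeline transform: `items | stage` calls stage.expand(items)."""

    def expand(self, items: List[dict]) -> list:
        raise NotImplementedError

    def __ror__(self, items: List[dict]) -> list:
        # Lists do not implement `|`, so Python falls back to this method.
        return self.expand(items)


@dataclass
class ToyCombineReferences(Stage):
    """Hypothetical combine stage: merges per-file reference dicts into one."""

    concat_dims: List[str]

    def expand(self, items: List[dict]) -> list:
        combined: Dict[str, str] = {}
        for ref in items:
            combined.update(ref)  # placeholder for the real combine logic
        return [combined]


@dataclass
class ToyWriteCombinedReference(Stage):
    """Hypothetical writer that runs the combine stage internally before writing."""

    store_name: str
    concat_dims: List[str]
    store: Dict[str, dict] = field(default_factory=dict)  # in-memory stand-in for a target

    def expand(self, items: List[dict]) -> list:
        # The nested step: callers never see the combine stage.
        combined = items | ToyCombineReferences(concat_dims=self.concat_dims)
        self.store[self.store_name] = combined[0]
        return [self.store_name]


# Made-up per-file references, as an opener stage might emit them.
per_file_refs = [{"t0/var": "file0.grib2"}, {"t1/var": "file1.grib2"}]
writer = ToyWriteCombinedReference(store_name="hrrr-concat-step", concat_dims=["step"])
result = per_file_refs | writer
```

With this shape, any combine-specific arguments (here only `concat_dims`) have to be accepted by the writer and forwarded to the nested stage, which is the main design cost of the simpler top-level pipeline.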

This seems like a good call!

Closed by #635