ecmwf / cfgrib

A Python interface to map GRIB files to the NetCDF Common Data Model following the CF Convention using ecCodes

Select dataset/group while opening grib file with xarray

vinodkatmos opened this issue · comments

Is your feature request related to a problem? Please describe.

Currently, if I open a heterogeneous GRIB file using xarray, I have to select the appropriate dataset by providing suitable filter_by_keys. However, this does not always work. In my case, data = xr.open_dataset(filein, engine='cfgrib', backend_kwargs={'indexpath': 'temp/{short_hash}.idx', 'filter_by_keys':{'typeOfLevel':'hybrid'}}) raised an error:

cfgrib.dataset.DatasetBuildError: key present and new value is different: key='hybrid' value=Variable(dimensions=('hybrid',), data=array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12., 13.,
14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26.,
27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38., 39.,
40., 41., 42., 43., 44., 45., 46., 47., 48., 49., 50., 51., 52.,
53., 54., 55., 56., 57., 58., 59., 60.])) new_value=Variable(dimensions=(), data=1.0)

This happened because my GRIB file has two datasets on hybrid levels: one with 60 hybrid levels and another with a single hybrid level, so filtering on typeOfLevel alone cannot tell them apart.
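A possible workaround with the existing API, sketched under assumptions: narrowing the filter with a second key such as shortName may isolate one of the two groups (for example the single-level lnsp field shown in the listing further down). The helper name hybrid_backend_kwargs is hypothetical, and whether shortName is enough to separate the groups depends on the file.

```python
from typing import Any, Dict, Optional


def hybrid_backend_kwargs(
    short_name: Optional[str] = None,
    indexpath: str = "temp/{short_hash}.idx",
) -> Dict[str, Any]:
    """Build backend_kwargs for xr.open_dataset(..., engine='cfgrib').

    Hypothetical helper: it always filters on typeOfLevel='hybrid' and,
    if short_name is given, additionally on shortName.
    """
    filter_by_keys: Dict[str, Any] = {"typeOfLevel": "hybrid"}
    if short_name is not None:
        # Assumption: shortName is unique to the group we want to load.
        filter_by_keys["shortName"] = short_name
    return {"indexpath": indexpath, "filter_by_keys": filter_by_keys}


# Intended use (needs xarray, cfgrib and a GRIB file, so it is left commented out):
# import xarray as xr
# lnsp = xr.open_dataset(filein, engine="cfgrib",
#                        backend_kwargs=hybrid_backend_kwargs("lnsp"))
```

This only helps when the wanted group can be named by a key; it does not replace selecting a group by position.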

Describe the solution you'd like

If I open the same data with data = cfgrib.open_datasets(filein, indexpath='temp/{short_hash}.idx'), it returns a list of xarray datasets, nicely grouped.

data[0]
<xarray.Dataset>
Dimensions: (hybrid: 60, values: 348528)
Coordinates:
time datetime64[ns] 2007-09-12
step timedelta64[ns] 06:00:00
hybrid (hybrid) float64 1.0 2.0 3.0 4.0 5.0 ... 57.0 58.0 59.0 60.0
latitude (values) float64 89.73 89.73 89.73 ... -89.73 -89.73 -89.73
longitude (values) float64 0.0 20.0 40.0 60.0 ... 280.0 300.0 320.0 340.0
valid_time datetime64[ns] 2007-09-12T06:00:00
Dimensions without coordinates: values
Data variables: (12/19)
t (hybrid, values) float32 ...
q (hybrid, values) float32 ...
aermr01 (hybrid, values) float32 ...
aermr02 (hybrid, values) float32 ...
aermr03 (hybrid, values) float32 ...
aermr04 (hybrid, values) float32 ...
... ...
no2 (hybrid, values) float32 ...
so2 (hybrid, values) float32 ...
co (hybrid, values) float32 ...
hcho (hybrid, values) float32 ...
go3 (hybrid, values) float32 ...
aerext355 (hybrid, values) float32 ...
Attributes:
GRIB_edition: 2
GRIB_centre: ecmf
GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts
GRIB_subCentre: 0
Conventions: CF-1.7
institution: European Centre for Medium-Range Weather Forecasts

data[1]
<xarray.Dataset>
Dimensions: (values: 348528)
Coordinates:
time datetime64[ns] 2007-09-12
step timedelta64[ns] 06:00:00
hybrid float64 1.0
latitude (values) float64 ...
longitude (values) float64 ...
valid_time datetime64[ns] ...
Dimensions without coordinates: values
Data variables:
lnsp (values) float32 ...
Attributes:
GRIB_edition: 2
GRIB_centre: ecmf
GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts
GRIB_subCentre: 0
Conventions: CF-1.7
institution: European Centre for Medium-Range Weather Forecasts

Can we implement similar behaviour in xarray, so that the user can select which data group to load just by providing an index, analogous to the index into the list returned by cfgrib.open_datasets?
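To make the request concrete, here is a minimal sketch of the behaviour as a user-side wrapper around cfgrib.open_datasets. The names select_group and open_grib_group, and the opener parameter, are hypothetical and not part of cfgrib or xarray. The sketch also assumes that the order of the returned list is stable between calls, which I have not verified.

```python
from typing import Any, Callable, Optional, Sequence


def select_group(datasets: Sequence[Any], index: int) -> Any:
    """Return datasets[index], raising a descriptive IndexError if out of range."""
    if not -len(datasets) <= index < len(datasets):
        raise IndexError(
            f"group index {index} out of range: file yields {len(datasets)} dataset(s)"
        )
    return datasets[index]


def open_grib_group(
    path: str,
    index: int = 0,
    opener: Optional[Callable[..., Sequence[Any]]] = None,
    **kwargs: Any,
) -> Any:
    """Open every group in a GRIB file and return only the one at `index`.

    `opener` defaults to cfgrib.open_datasets; it is a parameter only so the
    selection logic can be exercised without cfgrib or a GRIB file.
    """
    if opener is None:
        import cfgrib  # imported lazily so the module loads without cfgrib

        opener = cfgrib.open_datasets
    return select_group(opener(path, **kwargs), index)
```

With the file from this issue, open_grib_group(filein, index=1, indexpath='temp/{short_hash}.idx') would be expected to return the lnsp dataset shown as data[1] above. Note that this still builds every group before picking one; a built-in option could avoid that.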

Describe alternatives you've considered

No response

Additional context

No response

Organisation

EUMETSAT