ecmwf / cfgrib

A Python interface to map GRIB files to the NetCDF Common Data Model following the CF Convention using ecCodes

Inconsistency between "grib_errors" and "errors" as argument name

Metamess opened this issue

What happened?

The open_datasets() function accepts a backend_kwargs dict, and one of its more useful keys is "grib_errors". open_variable_datasets() uses this keyword argument to set the errors parameter of FileStream.__init__(). (The default value is "warn", but it can be set to "raise" so that an error is raised if there is a problem opening the GRIB file.) The dataset.py::open_file() function also expects this argument as grib_errors.

However, the CfGribBackend.open_dataset() function expects this argument to be called errors, as does FileStream.__init__() and several other functions throughout the codebase.

Because of this inconsistent naming, and because the key "grib_errors" is propagated unchanged, supplying "grib_errors" in backend_kwargs to open_datasets() causes a TypeError once the call reaches CfGribBackend.open_dataset(). That function does not accept **kwargs, and it names this parameter errors rather than grib_errors.
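The failure mode described above can be sketched without cfgrib installed. The two functions below are hypothetical stand-ins, not cfgrib's real code: one has a strict signature with an errors parameter and no **kwargs, the other forwards backend_kwargs unchanged.

```python
# Hypothetical stand-ins for illustration only; these are NOT cfgrib functions.

def backend_open_dataset(filename, *, errors="warn"):
    """Stand-in for a backend entry point with a strict signature (no **kwargs)."""
    return {"filename": filename, "errors": errors}


def open_datasets_sketch(filename, backend_kwargs=None):
    """Stand-in for a wrapper that forwards backend_kwargs without renaming keys."""
    backend_kwargs = dict(backend_kwargs or {})
    return backend_open_dataset(filename, **backend_kwargs)


# The key that matches the backend's parameter name is accepted.
ok = open_datasets_sketch("example.grib", backend_kwargs={"errors": "raise"})

# The differently named key is rejected by Python's argument binding.
message = None
try:
    open_datasets_sketch("example.grib", backend_kwargs={"grib_errors": "raise"})
except TypeError as exc:
    message = str(exc)
```

The TypeError here comes from Python's own argument binding, which is why it surfaces only at the innermost call that lacks a matching parameter.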

Unless there is a specific reason to keep grib_errors as backend_kwargs separated from errors, I believe the easiest fix would be to rename the few occurrences of grib_errors to errors.
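If backward compatibility with the old key turned out to matter, one alternative to a plain rename would be to translate the key before forwarding it. The helper below is a hypothetical sketch of that idea; its name and behaviour are assumptions and do not describe cfgrib or any existing patch.

```python
def normalize_error_kwarg(backend_kwargs):
    """Return a copy of backend_kwargs with 'grib_errors' renamed to 'errors'.

    Hypothetical helper for illustration; not part of cfgrib.
    """
    normalized = dict(backend_kwargs or {})
    if "grib_errors" in normalized:
        legacy_value = normalized.pop("grib_errors")
        # Refuse to guess when both spellings are given with different values.
        if "errors" in normalized and normalized["errors"] != legacy_value:
            raise ValueError("conflicting values for 'errors' and 'grib_errors'")
        normalized["errors"] = legacy_value
    return normalized


renamed = normalize_error_kwarg({"grib_errors": "raise", "indexpath": ""})
untouched = normalize_error_kwarg({"errors": "warn"})
```

The input dict is copied rather than mutated, so a caller that reuses the same backend_kwargs across several calls is not affected.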

What are the steps to reproduce the bug?

Call open_datasets(filename, backend_kwargs={"grib_errors": "raise"}) where filename is a str pointing to a valid GRIB file

Version

0.9.10.4

Platform (OS and architecture)

WSL2 Ubuntu 22.04.2 LTS

Relevant log output

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/site-packages/cfgrib/xarray_store.py", line 105, in open_datasets
    datasets = open_variable_datasets(path, backend_kwargs=backend_kwargs, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/cfgrib/xarray_store.py", line 93, in open_variable_datasets
    datasets.extend(raw_open_datasets(path, bk, **kwargs))
  File "/usr/local/lib/python3.10/site-packages/cfgrib/xarray_store.py", line 66, in raw_open_datasets
    datasets.append(open_dataset(path, backend_kwargs=backend_kwargs, **kwargs))
  File "/usr/local/lib/python3.10/site-packages/cfgrib/xarray_store.py", line 39, in open_dataset
    return xr.open_dataset(path, **kwargs)  # type: ignore
  File "/usr/local/lib/python3.10/site-packages/xarray/backends/api.py", line 526, in open_dataset
    backend_ds = backend.open_dataset(
TypeError: CfGribBackend.open_dataset() got an unexpected keyword argument 'grib_errors'

Accompanying data

No response

Organisation

No response

I have created a PR (#349) that intends to fix this issue. Please let me know if any changes are required!

On a side note, I think the pinned versions in requirements-tests.txt could really use a bump. Many major packages (numpy, pandas, xarray, even pytest) have seen a lot of development since March 2020, and the currently pinned versions do not work with Python 3.10.

For what it's worth, I ran pip install -r ci/requirements-tests.txt after removing all version pinning, and all the tests pass with pytest -v, while pytest -v --flakes merely complains about some unused imports. So, as far as that is any indication, there seem to be no breaking changes with newer versions of the required packages.
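For anyone who wants to repeat that experiment, the unpinning step could be scripted roughly as follows. This is a minimal sketch that assumes a simple requirements file with one requirement per line; strip_pins is a hypothetical helper, and the sample lines are made up for illustration rather than copied from the repository.

```python
import re

# Matches a package name (with optional extras) followed by a version specifier.
# Anything after ';' (environment marker) or '#' (comment) is left alone.
_SPEC = re.compile(
    r"^(\s*[A-Za-z0-9][A-Za-z0-9._-]*(?:\[[^\]]*\])?)\s*"
    r"(?:===|==|~=|!=|<=|>=|<|>)[^;#]*"
)


def strip_pins(text):
    """Remove version specifiers from requirements text, keeping other lines."""
    result = []
    for line in text.splitlines():
        stripped = line.strip()
        # Keep blank lines, comments, and pip options such as "-r other.txt".
        if not stripped or stripped.startswith(("#", "-")):
            result.append(line)
        else:
            result.append(_SPEC.sub(r"\1", line).rstrip())
    return "\n".join(result)


# Illustrative input only; not the real contents of any requirements file.
sample = "numpy==1.18.1\npytest>=5.0,<6\n# a comment\nxarray"
unpinned = strip_pins(sample)
```

Writing the result to a temporary file and passing that file to pip install -r would leave the original ci/requirements-tests.txt unchanged.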