nws noaa grib files open_file but fail to convert OnDiskArray to numpy array (cfgrib >= 0.9.10.2)
ghaarsma opened this issue
What happened?
When downloading NWS NOAA grib files, opening them with open_file, and getting the data values, everything works perfectly under cfgrib 0.9.10.1. With any newer cfgrib version, the file still opens, but getting a numpy array from the OnDiskArray fails in the function get_values_in_order, which was newly added in 0.9.10.2.
What are the steps to reproduce the bug?
# Testing cfgrib on NWS NOAA grib files ----------------------------------------------------------------
import requests
import cfgrib
import sys
print(f"cfgrid version: {cfgrib.__version__}, Python: {sys.version}")
for var in ["waveh", "wdir", "wgust", "wspd", "wwa"]:
    with requests.get(f'https://tgftp.nws.noaa.gov/SL.us008001/ST.opnl/DF.gr2/DC.ndfd/AR.oceanic/VP.001-003/ds.{var}.bin', stream=True) as r:
        with open(f'ds.{var}.bin', 'wb') as f:
            f.write(r.content)
    print(f"downloading {var} data")
    # Open file directly and return cfgrib dataset
    ds = cfgrib.open_file(f'ds.{var}.bin', indexpath="")
    variables = ds.variables.keys()
    # The last variable is the expected data variable
    data_var = list(variables)[-1]
    # Get the last variable values as numpy array
    values = ds.variables[data_var].data[:, :]
    print(f"Data var: {data_var} has shape: {values.shape}")
Version
0.9.10.2, 0.9.10.3, & 0.9.10.4
Platform (OS and architecture)
Python: 3.11.6 (tags/v3.11.6:8b6ee5b, Oct 2 2023, 14:57:12) [MSC v.1935 64 bit (AMD64)]
Relevant log output
C:\Users\PythonProjects\degrib\venv\Scripts\python.exe C:\Users\PythonProjects\degrib\gom_forecast.py
C:\Users\PythonProjects\degrib\venv\Lib\site-packages\gribapi\__init__.py:23: UserWarning: ecCodes 2.31.0 or higher is recommended. You are running version 2.27.0
  warnings.warn(
cfgrid version: 0.9.10.1, Python: 3.11.6 (tags/v3.11.6:8b6ee5b, Oct 2 2023, 14:57:12) [MSC v.1935 64 bit (AMD64)]
downloading waveh data
Data var: shww has shape: (21, 4512981)
downloading wdir data
Data var: wdir10 has shape: (21, 4512981)
downloading wgust data
Data var: gust has shape: (21, 4512981)
downloading wspd data
Data var: si10 has shape: (21, 4512981)
downloading wwa data
Data var: unknown has shape: (21, 4512981)
Process finished with exit code 0
__________________________________________________________________________________________________
C:\Users\PythonProjects\degrib\venv\Scripts\python.exe C:\Users\PythonProjects\degrib\gom_forecast.py
C:\Users\PythonProjects\degrib\venv\Lib\site-packages\gribapi\__init__.py:23: UserWarning: ecCodes 2.31.0 or higher is recommended. You are running version 2.27.0
  warnings.warn(
cfgrid version: 0.9.10.2, Python: 3.11.6 (tags/v3.11.6:8b6ee5b, Oct 2 2023, 14:57:12) [MSC v.1935 64 bit (AMD64)]
downloading waveh data
Traceback (most recent call last):
  File "C:\Users\PythonProjects\degrib\gom_forecast.py", line 19, in <module>
    values = ds.variables[data_var].data[:, :]
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "C:\Users\PythonProjects\degrib\venv\Lib\site-packages\cfgrib\dataset.py", line 355, in __getitem__
    values = get_values_in_order(message, array_field[tuple(array_field_indexes)].shape)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\PythonProjects\degrib\venv\Lib\site-packages\cfgrib\dataset.py", line 314, in get_values_in_order
    values[1::2, :] = values[1::2, ::-1]
                      ~~~~~~^^^^^^^^^^^^
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
Process finished with exit code 1
Accompanying data
https://tgftp.nws.noaa.gov/SL.us008001/ST.opnl/DF.gr2/DC.ndfd/AR.oceanic/VP.001-003/ds.waveh.bin
Organisation
No response
I checked the downloaded files:
ds.wwa.bin
ds.wspd.bin
ds.wdir.bin
ds.wgust.bin
ds.waveh.bin
All of them use WMO encodings for their parameters except ds.wwa.bin, which uses a LOCAL encoding, i.e.,
discipline=0
parameterCategory=19
parameterNumber=217
This encoding is not recognised by ecCodes, so the shortName key has the value "unknown".
Thank you @shahramn, for your troubleshooting so far.
Yes, I agree that the wwa.bin file has an "unknown" data variable (as seen in the logs).
However, in cfgrib 0.9.10.1 all five files properly yield the 2-dimensional data array. All later cfgrib versions (0.9.10.2, 0.9.10.3, & 0.9.10.4) fail to extract the data array for all five files.
I have done a little bit of debugging, and it seems that for all five NWS NOAA files, inside dataset.py at line 316, the call message.get("alternativeRowScanning", False) returns 1 (True), which results in the crash on line 318.
Hope this helps
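The failure mode described above can be reproduced with plain NumPy, independently of cfgrib. What follows is only a minimal sketch and not the change made in cfgrib: it assumes the decoded values arrive as a flat 1-D array and that the grid dimensions (called ny and nx here) are known. The helper name unscramble_alternate_rows is hypothetical.

```python
import numpy as np


def unscramble_alternate_rows(values, ny, nx):
    """Hypothetical helper: reverse every odd row of a flat (ny * nx) array.

    Assumes the values were written with alternate rows scanned in the
    opposite direction, so that odd rows need flipping back.
    """
    grid = np.asarray(values).reshape(ny, nx).copy()
    # Copy the reversed view so source and destination never alias.
    grid[1::2, :] = grid[1::2, ::-1].copy()
    return grid.ravel()


flat = np.arange(12)  # stands in for a message's flat "values" array

# Two indices on a 1-D array raise the IndexError seen in the traceback.
error_message = ""
try:
    flat[1::2, :] = flat[1::2, ::-1]
except IndexError as exc:
    error_message = str(exc)

# Reshaping to 2-D first makes the row reversal well defined.
fixed = unscramble_alternate_rows(flat, ny=3, nx=4)
```

The point of the sketch is that the row-reversal assignment is only valid once the array is 2-dimensional. If the shape handed to the function is itself 1-dimensional, as the traceback suggests for these files, the same assignment cannot work.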
Hi @ghaarsma,
I believe I've fixed the issue (thanks for the report; it was a particular case that we had not previously encountered: alternativeRowScanning in a Mercator grid). Are you able to test my branch locally, or do you need a new release of cfgrib?
Cheers,
Iain
Hi @iainrussell,
I can confirm that the branch fix/alternate-scanning-mercator fixes the problem after some local testing. Thank you for the quick fix. Looking forward to a new release of cfgrib, so we can roll it into production.