ecmwf / cfgrib

A Python interface to map GRIB files to the NetCDF Common Data Model following the CF Convention using ecCodes

hashlib.md5 used for security

barronh opened this issue

cfgrib's messages.py line 531 fails on newer versions of Python on systems with Federal Information Processing Standards (FIPS) enabled.[2] The failure can be bypassed by passing usedforsecurity=False to hashlib.md5.

cfgrib does not use the MD5 digest for security, but the usedforsecurity option is only guaranteed to be available on Python 3.9+. So the fix needs to include a fallback to the call without the keyword.

Backward Compatible Fix

Below is a unified patch (diff -up)

--- old/lib/python3.11/site-packages/cfgrib/messages.py        2023-11-29 13:52:50.921602000 -0500
+++ new/lib/python3.11/site-packages/cfgrib/messages.py    2024-06-21 10:31:58.943205473 -0400
@@ -528,7 +528,14 @@ class FileIndex(FieldsetIndex):
         if not indexpath:
             return cls.from_fieldset(filestream, index_keys, computed_keys)
 
-        hash = hashlib.md5(repr(index_keys).encode("utf-8")).hexdigest()
+        # hash = hashlib.md5(repr(index_keys).encode("utf-8")).hexdigest()
+        keystr = repr(index_keys).encode("utf-8")
+        try:
+            # try python3.9+ keyword, which is also supported on some earlier versions
+            hash = hashlib.md5(keystr, usedforsecurity=False).hexdigest()
+        except TypeError:
+            # unknown keywords trigger TypeError so default back to basic call
+            hash = hashlib.md5(keystr).hexdigest()
         indexpath = indexpath.format(path=filestream.path, hash=hash, short_hash=hash[:5])
         try:
             with compat_create_exclusive(indexpath) as new_index_file:

Instead of try/except, it would be possible to branch on the Python version. Fixes to other libraries have gone in that direction, but folks have then noted that some older versions also support (or even require) usedforsecurity, probably depending on the OpenSSL version. So I have opted for the try-it-and-see approach.
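To make the trade-off concrete, here is a minimal sketch of the two approaches side by side. The function names are made up for illustration and are not part of cfgrib; the only assumption is that hashlib.md5 raises TypeError when it does not recognize the usedforsecurity keyword.

```python
import hashlib
import sys


def md5_hex_version_gated(data: bytes) -> str:
    """Hypothetical helper: choose the call based on the Python version."""
    if sys.version_info >= (3, 9):
        # usedforsecurity is documented for all hashlib constructors from 3.9
        return hashlib.md5(data, usedforsecurity=False).hexdigest()
    # older interpreters: may or may not accept the keyword, so it is skipped
    return hashlib.md5(data).hexdigest()


def md5_hex_try_first(data: bytes) -> str:
    """Hypothetical helper: try the keyword and fall back if it is unknown."""
    try:
        return hashlib.md5(data, usedforsecurity=False).hexdigest()
    except TypeError:
        # the keyword is not supported here, so use the plain call
        return hashlib.md5(data).hexdigest()


# Same kind of input that cfgrib hashes: the repr of the index keys.
# The key names here are only placeholders.
index_keys = ["paramId", "typeOfLevel"]
keystr = repr(index_keys).encode("utf-8")
digest = md5_hex_try_first(keystr)
```

Both helpers return the same digest wherever both calls succeed. The difference is that the version-gated one never tries the keyword on pre-3.9 builds that were patched to need it, which is the case the try-first variant covers.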

Reproduce Problem

Environment:

Red Hat Enterprise Linux 8
Python 3.11.5 (main, Sep 22 2023, 15:34:29) [GCC 8.5.0 20210514 (Red Hat 8.5.0-20)] on linux
cfgrib: 0.9.10.4
FIPS enabled

Verify that FIPS mode is on

import _hashlib
_hashlib.get_fips_mode()
# 1
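Since _hashlib is a private CPython module, the check above may not be available everywhere. A slightly more defensive sketch, assuming only that get_fips_mode may or may not exist on a given build (the function name fips_mode_enabled is made up for illustration):

```python
def fips_mode_enabled():
    """Return True/False if the interpreter reports FIPS mode, else None."""
    try:
        import _hashlib  # private module, present when CPython is built with OpenSSL
    except ImportError:
        return None
    get_mode = getattr(_hashlib, "get_fips_mode", None)
    if get_mode is None:
        # this build does not expose the FIPS query
        return None
    return bool(get_mode())


print(fips_mode_enabled())
```

A result of None only means the interpreter cannot tell; it says nothing about the system setting.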

Test code and traceback

import cfgrib
import requests
# Any GRIB2 file will work; this uses a publicly available NWS file.
r = requests.get('https://tgftp.nws.noaa.gov/SL.us008001/ST.opnl/DF.gr2/DC.ndgd/GT.aq/AR.conus/ds.apm25h24.bin')
r.raise_for_status()  # stop early if the download failed
with open('test.grib', 'wb') as tmpf:
    tmpf.write(r.content)

f = cfgrib.dataset.open_file('test.grib')

Without the fix, you should get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/test/py311/lib64/python3.11/site-packages/cfgrib/dataset.py", line 782, in open_file
    index = open_fileindex(stream, indexpath, index_keys, filter_by_keys=filter_by_keys)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/test/py311/lib64/python3.11/site-packages/cfgrib/dataset.py", line 761, in open_fileindex
    index = messages.FileIndex.from_indexpath_or_filestream(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/test/py311/lib64/python3.11/site-packages/cfgrib/messages.py", line 531, in from_indexpath_or_filestream
    hash = hashlib.md5(repr(index_keys).encode("utf-8")).hexdigest()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: [digital envelope routines: EVP_DigestInit_ex] disabled for FIPS
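The same failure can be isolated without cfgrib or a GRIB file. This is a minimal sketch, assuming the plain hashlib.md5 call raises ValueError under FIPS as in the traceback above; the function name md5_default_call_blocked is made up for illustration.

```python
import hashlib


def md5_default_call_blocked() -> bool:
    """Return True if the plain hashlib.md5 call is rejected on this system."""
    try:
        hashlib.md5(b"probe")
    except ValueError:
        # matches the "disabled for FIPS" error in the traceback
        return True
    return False


print("plain md5 blocked:", md5_default_call_blocked())
```

On a system without FIPS enabled this should print False, so it can also serve as a quick check that a test machine actually reproduces the problem.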