ecmwf / eccodes-python

Python interface to the ecCodes GRIB/BUFR decoder/encoder

Segfault from overwriting numpy array in grib_set_double_array().

direvus opened this issue

I have been chasing down an extremely weird bug in my system where a whole bunch of data gets mysteriously zeroed out when converting from NetCDF4 to GRIB1. Specifically, out of 476700 float datapoints, the first 119174 datapoints end up as zeroes in the final GRIB1 file.

I eventually narrowed this down to a bad interaction between eccodes-python and all versions of numpy from 1.17.4 onwards. numpy 1.17.3 is unaffected.

The issue seems to come from grib_set_double_array() storing a numpy array in a variable, and then overwriting that same variable with an FFI pointer object created from the array's data address.

Test case:

#!/usr/bin/env python3
import numpy as np
from gribapi.bindings import ffi


inarray = np.ones((100000,), np.float32)
print(inarray)

# Copy numpy array to another local variable.
a = inarray

# Simulate type munging from grib_set_double_array().
a = a.astype(float)
print(a)

# Simulate FFI cast from grib_set_double_array() and overwrite variable.
a = ffi.cast('double*', a.ctypes.data)
print(a)

print(a[0])  # Sad trombone

On my Amazon Linux machine with eccodes 2.18.0, Python 3.7.6, and any numpy >= 1.17.4, this code segfaults. Aren't pointers fun!? [1]

However, if we just store the FFI ctype object in a totally separate variable, and avoid clobbering the numpy array, it all works perfectly. Replace the final three lines of the test script with these:

b = ffi.cast('double*', a.ctypes.data)
print(b)

print(b[0])  # Cool sax
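For illustration only, the same idea can be wrapped in a small helper. This is a sketch under assumptions, not the actual patch: as_double_pointer is a hypothetical name, and a standalone cffi FFI() instance stands in for gribapi.bindings.ffi. The point is that the converted array and the pointer get separate names, and the caller keeps both for as long as the pointer is in use.

```python
import numpy as np
from cffi import FFI

ffi = FFI()  # assumption: stand-in for gribapi.bindings.ffi


def as_double_pointer(values):
    """Hypothetical helper: return (array, pointer) so the array stays referenced."""
    # Convert to a contiguous float64 array, like the astype(float) step above.
    arr = np.ascontiguousarray(values, dtype=np.float64)
    # Bind the pointer to a new name instead of overwriting `arr`.
    ptr = ffi.cast('double*', arr.ctypes.data)
    return arr, ptr


# The caller holds on to `arr` for as long as `ptr` is dereferenced.
arr, ptr = as_double_pointer(np.ones((100000,), np.float32))
print(ptr[0], ptr[99999])
```

Returning the array alongside the pointer makes the lifetime requirement visible at the call site, at the cost of a slightly clumsier return value.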

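Speculating wildly, I think the issue might be that we're passing a.ctypes.data in to ffi.cast. That value is just an integer: the address in memory of the numpy array's data buffer. The pointer that ffi.cast builds from it does not keep the array alive. After we overwrite a, nothing refers to the numpy array any more, so it is destroyed and that memory location is no longer valid.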
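The lifetime part of that guess can be checked without any FFI at all. The following is a minimal sketch, assuming CPython's reference counting; it uses a weak reference (not part of the original test case) to watch the array disappear once its only name is rebound to the integer address.

```python
import weakref

import numpy as np

# Fresh float64 array, referenced only by the name `a`.
a = np.ones((100000,), np.float32).astype(float)

# A weak reference lets us observe the array without keeping it alive.
alive = weakref.ref(a)

# a.ctypes.data is a plain Python int holding the buffer address.
addr = a.ctypes.data

# Rebind the only name, as the test case does with the FFI pointer.
a = addr

# Under CPython the array has already been freed at this point,
# so `addr` now refers to memory that may be reused or unmapped.
print(alive() is None)
```

Whether reading through the stale address actually crashes depends on what the allocator does with the freed buffer, which would be consistent with the behaviour differing between numpy versions.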

[1] They are not.

This fix will be in the next release. Many thanks for your diligence.

No problem, and thank you for the quick response.