Weird outputs in interp_out
gillette7 opened this issue · comments
I'm running some examples where the input data points form a regular lattice in 2D, and the points for interpolation sometimes lie on the edges or diagonals of this grid. I've found that the output values from `interp_out` in this case are sometimes things like `1e271` or `nan`. (I'm using the Python bindings.) Is this a known issue, perhaps due to the non-uniqueness of the Delaunay triangulation on a lattice? Or is this (more likely?) some other bug in my code? Thanks.
Grid-aligned data is not in "general position", so the non-uniqueness of the Delaunay triangulation can cause all sorts of problems. I'd suggest adding an acceptably small random perturbation to all of your input data (on the scale of 10^{-8}).
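A minimal sketch of such a perturbation, assuming the input points live in a NumPy array (the 4x4 lattice here is a made-up stand-in for your data):

```python
import numpy as np

# Hypothetical 4x4 lattice of 2D input points (replace with your data).
pts = np.stack(np.meshgrid(np.arange(4.0), np.arange(4.0)), axis=-1).reshape(-1, 2)

# A tiny uniform jitter on the scale of 1e-8 breaks the degenerate
# lattice structure without visibly changing the interpolation.
rng = np.random.default_rng(0)
pts_jittered = pts + rng.uniform(-1e-8, 1e-8, size=pts.shape)
```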
If the problem persists, then it may be an issue with the compilation of the package when it uses the local quadratic program solver for extrapolation (I believe this can be used on the edge of the convex hull as well as outside). In that case I'd make sure that your Python bindings are being compiled with `-lblas -llapack` and not the local files.
See the comments in the Python wrapper here.
The outputs in `interp_out` should never contain `nan` values or anything outside the range of the observed data. Even if the triangulation is non-unique, you should still get the interpolated result from a Delaunay simplex of *some* Delaunay triangulation, so there should be no need to ever perturb your data.
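That property can be checked directly on the arrays involved; the values below are made up, but the check applies to whatever you pass to and get back from DelaunaySparse:

```python
import numpy as np

# Hypothetical observed response values and interpolation results;
# substitute the actual arrays used with DelaunaySparse.
observed = np.array([0.0, -0.263418, 1.0])
interp_out = np.array([-0.1, 0.5])

# Interpolated values should be finite and bounded by the data range.
is_sane = (np.isfinite(interp_out).all()
           and observed.min() <= interp_out.min()
           and interp_out.max() <= observed.max())
```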
@gillette7 can you share such a dataset with us so I can try to check and see what is causing the failure?
Hmm... in preparing a dataset for you I may have discovered the problem. It looks like the `interp_in` array that I'm providing as input to DelaunaySparse contains unallowable(?) numbers, like `2.3e-310`, as well as `nan`s. I thought that was avoided by this line, which I put in before printing `interp_in`:
interp_in = np.require(interp_in, dtype=np.float64, requirements=['F'])
Any thoughts?
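For what it's worth, `np.require` only enforces the dtype and memory layout; it does not validate the values themselves, so `nan`s and subnormal numbers pass straight through. A quick demonstration with made-up bad values:

```python
import numpy as np

# np.require enforces dtype and memory layout ('F' = Fortran order),
# but does not inspect the values: NaNs and subnormals survive it.
bad = np.array([[2.3e-310, np.nan], [0.0, 1.0]])
arr = np.require(bad, dtype=np.float64, requirements=['F'])

has_nan = np.isnan(arr).any()
has_subnormal = ((arr != 0) & (np.abs(arr) < np.finfo(np.float64).tiny)).any()
```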
This is definitely the problem. When I generate the data, put it into a pandas dataframe, and print it, I get this:
data_train outputs =
0
0 0.000000
1 0.000000
2 0.000000
3 0.000000
4 -0.263418
5 0.000000
6 0.000000
7 0.000000
8 0.000000
Looks fine. This pandas dataframe object is returned by one Python function, then passed as input to another. By the time it gets there, it prints like this:
data_train outputs =
0
0 2.052198e-316
1 2.052161e-316
2 0.000000e+00
3 0.000000e+00
4 -2.634177e-01
5 0.000000e+00
6 0.000000e+00
7 0.000000e+00
8 1.264808e-321
This is before I try to enforce the type or do anything else. So it's seemingly some quirk of moving data around in Python...
Aha, this looks like the cause of the issue... I'm not sure what is going on without seeing your code, but these look like corrupted or uninitialized entries.
The issue was in the conversion from numpy to pandas:
data_train_outputs = pd.DataFrame(data_train_outputs)
Some of the data was not being read into pandas as floats, for reasons unknown (probably a quirk of how the data was generated, using a package called pymfem). However, I was able to fix it by adding this line:
data_train_outputs = data_train_outputs.astype(float)
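A small self-contained illustration of the failure mode and the fix (the values here are hypothetical, not the pymfem output): when the scalars arrive as generic Python objects, pandas keeps an `object` dtype column, whose storage is not a native float64 buffer; `astype(float)` coerces it to a proper float64 column.

```python
import pandas as pd

# Hypothetical: scalars wrapped as generic Python objects produce an
# 'object' dtype column instead of a native float64 column.
df = pd.DataFrame([0.0, -0.263418, 0.0], dtype=object)

# Coercing to float yields a real float64 column, whose underlying
# buffer is safe to hand to NumPy / Fortran code downstream.
df = df.astype(float)
```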
Thanks for your help!