histogram1d returns incorrect bins due to rounding error
jokabrink opened this issue · comments
Hello,
the function histogram1d
(and likely histogram2d
) can increment the wrong bin when a floating-point rounding error happened. This happens at https://github.com/astrofrog/fast-histogram/blob/main/fast_histogram/_histogram_core.c#L173, when the value of ix
is calculated just below a whole number due to rounding. The index lookup in the next line then truncates the value to a whole number resulting in an incorrectly incremented bin one below.
See here:
left, right = (0.9, 1.1)
bins = 10
norm = bins/(right-left)
value = 1.00
ix = (value-left)*norm;
print(ix, int(ix)) # 4.999999999999997 4
The incorrect behaviour of histogram1d
can be reproduced with the following script:
#!/usr/bin/env python3
import numpy as np
from fast_histogram import histogram1d
data = np.array([0.999, 1.000, 1.001, 1.099, 1.100, 1.101])
bins, edges = np.histogram(data, bins=10, range=(0.9, 1.1))
fast_bins = histogram1d(data, bins=10, range=(0.9, 1.1))
print(data)
print(edges)
print("#### NumPy")
print(bins) # [0 0 0 0 1 2 0 0 0 2]
print("#### Fast-Histogram")
print(fast_bins) # [0. 0. 0. 0. 2. 1. 0. 0. 0. 1.]
Also this includes another error (miscounted right-most value), the incorrect result due to the rounding error is the in the middle of the bins ( (1,2) vs (2,1) ).