HTM.match sometimes returns non-zero distance for self-match
NiallMac opened this issue
I was trying to use HTM.match to get the distance to the nearest neighbour for each object in a catalog, so I thought I could use maxmatch=2 and take the non-zero distance. However, with maxmatch=1, HTM.match returns some (about 20%) non-zero distances, which presumably should not happen, since every object's closest object should be itself. I've copied some ipython showing this below:
In [4]: from esutil.htm import HTM
In [5]: h=HTM()
In [7]: ra
Out[7]:
array([52.18053786, 52.45002301, 52.15060583, ..., 52.39342204,
52.39235128, 52.41923532])
In [8]: dec
Out[8]:
array([-27.12791705, -27.12528608, -27.11948308, ..., -27.2711784 ,
-27.27120412, -27.27125424])
In [9]: m1,m2,d12 = h.match(ra, dec, ra, dec, 0.01, maxmatch=1)
In [10]: d12
Out[10]:
array([8.53773646e-07, 0.00000000e+00, 0.00000000e+00, ...,
0.00000000e+00, 8.53773646e-07, 0.00000000e+00])
In [11]: (d12<1.e-9).sum()
Out[11]: 12019
In [12]: (d12>1.e-9).sum()
Out[12]: 2408
In [13]: m1
Out[13]: array([ 0, 1, 2, ..., 14424, 14425, 14426])
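For the nearest-neighbour use case, one possible workaround is to drop self-matches by index instead of by distance, so a tiny non-zero self-distance no longer matters. This is only a sketch: it assumes m1, m2 and d12 come from a call like h.match(ra, dec, ra, dec, radius, maxmatch=2) (or a larger maxmatch), and nearest_other is a hypothetical helper name, not part of esutil.

```python
import numpy as np

def nearest_other(m1, m2, d12, n):
    """For each of n objects, return the index of and distance to the
    nearest *other* object, given self-match output (m1, m2, d12).

    Objects with no other match get index -1 and distance inf.
    """
    m1 = np.asarray(m1)
    m2 = np.asarray(m2)
    d12 = np.asarray(d12, dtype=float)

    # Remove self-matches by index, not by testing d12 > 0.
    keep = m1 != m2
    m1, m2, d12 = m1[keep], m2[keep], d12[keep]

    # Sort by source index, then by distance within each source.
    order = np.lexsort((d12, m1))
    m1, m2, d12 = m1[order], m2[order], d12[order]

    # The first row for each source index is now its closest other object.
    src, first = np.unique(m1, return_index=True)

    nn_index = np.full(n, -1, dtype=np.int64)
    nn_dist = np.full(n, np.inf)
    nn_index[src] = m2[first]
    nn_dist[src] = d12[first]
    return nn_index, nn_dist
```

Usage would be something like nn_index, nn_dist = nearest_other(m1, m2, d12, ra.size). If two other objects could lie closer than the spurious self-distance, maxmatch=2 might not return the self-match at all, so a larger maxmatch is the safer assumption.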
This seems to be some imprecision in the great circle distance calculations.
The numpy version in esutil.coords.gcirc can also give a nonzero distance, but interestingly it seems to happen with different coordinates than the C++ one.
Interestingly, eu.coords.sphdist gives zero.
sphdist also doesn't always get zero.
This seems to be a floating point issue; I converted everything to long double and then, when I get nonzero, it is always at the same small value
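A small numpy sketch of why the nonzero value would always be the same. It assumes the distance is obtained from an arccos of the cosine of the separation; the two functions below are generic textbook formulas written for illustration, not the esutil or C++ code. If rounding leaves the cosine one step below 1.0 in double precision, arccos returns about 1.49e-08 rad, which is the 8.53773646e-07 degrees seen in d12 above. A haversine form avoids this for identical points, because every term is exactly zero when the coordinate differences are zero.

```python
import numpy as np

def gcirc_arccos(ra1, dec1, ra2, dec2):
    """Great circle distance via the spherical law of cosines (degrees)."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    cosd = (np.sin(dec1) * np.sin(dec2)
            + np.cos(dec1) * np.cos(dec2) * np.cos(ra1 - ra2))
    # Clip guards against cosd slightly above 1, but not slightly below.
    return np.degrees(np.arccos(np.clip(cosd, -1.0, 1.0)))

def gcirc_haversine(ra1, dec1, ra2, dec2):
    """Great circle distance via the haversine formula (degrees)."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    a = (np.sin((dec2 - dec1) / 2) ** 2
         + np.cos(dec1) * np.cos(dec2) * np.sin((ra2 - ra1) / 2) ** 2)
    return np.degrees(2 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0))))

# Largest double below 1.0, and the smallest nonzero angle arccos can
# then report, converted to degrees.
one_step_below = np.nextafter(1.0, 0.0)
floor_deg = np.degrees(np.arccos(one_step_below))
print(floor_deg)

# Self-distance with the haversine form is exactly zero.
ra = np.array([52.18053786, 52.45002301, 52.15060583])
dec = np.array([-27.12791705, -27.12528608, -27.11948308])
print(gcirc_haversine(ra, dec, ra, dec))
```

Whether the arccos form actually rounds below 1.0 for a given coordinate depends on the exact sequence of operations, which would fit the numpy and C++ versions disagreeing on which objects are affected.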