esheldon / esutil

A variety of python utilities focusing on numerical, scientific, and astrophysical computing

HTM.match sometimes returns non-zero distance for self-match

NiallMac opened this issue · comments

I was trying to use HTM.match to get the distance to the nearest neighbour for each object in a catalog, so I thought I could set maxmatch=2 and take the non-zero distance. However, with maxmatch=1, HTM.match returns non-zero distances for some objects (about 17%: 2408 of 14427 in the session below), which presumably should not happen, since every object's closest match should be itself. I've copied an ipython session showing this below:

In [4]: from esutil.htm import HTM

In [5]: h=HTM()

In [7]: ra
Out[7]: 
array([52.18053786, 52.45002301, 52.15060583, ..., 52.39342204,
       52.39235128, 52.41923532])

In [8]: dec
Out[8]: 
array([-27.12791705, -27.12528608, -27.11948308, ..., -27.2711784 ,
       -27.27120412, -27.27125424])

In [9]: m1,m2,d12 = h.match(ra, dec, ra, dec, 0.01, maxmatch=1)

In [10]: d12
Out[10]: 
array([8.53773646e-07, 0.00000000e+00, 0.00000000e+00, ...,
       0.00000000e+00, 8.53773646e-07, 0.00000000e+00])

In [11]: (d12<1.e-9).sum()
Out[11]: 12019

In [12]: (d12>1.e-9).sum()
Out[12]: 2408

In [13]: m1
Out[13]: array([    0,     1,     2, ..., 14424, 14425, 14426])
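For the original nearest-neighbour goal, one possible workaround is to identify self-matches by index rather than by a zero distance, so that rounding in the distance no longer matters. The sketch below assumes the match was run with maxmatch=2 and that m1, m2 and d12 are the three outputs shown above; `drop_self_matches` is a hypothetical helper, not part of esutil.

```python
def drop_self_matches(m1, m2, d12):
    """Drop pairs where an object was matched to itself (same index).

    Hypothetical helper, not part of esutil. Works on any equal-length
    sequences, including the NumPy arrays returned by a match call.
    """
    kept = [(i, j, d) for i, j, d in zip(m1, m2, d12) if i != j]
    nn_m1 = [i for i, _, _ in kept]
    nn_m2 = [j for _, j, _ in kept]
    nn_d12 = [d for _, _, d in kept]
    return nn_m1, nn_m2, nn_d12


# Toy input: two objects, each matched to itself and to the other one.
m1 = [0, 0, 1, 1]
m2 = [0, 1, 1, 0]
d12 = [8.53773646e-07, 0.003, 0.0, 0.003]
nn_m1, nn_m2, nn_d12 = drop_self_matches(m1, m2, d12)
```

With NumPy arrays the same filter can be written as a boolean mask, `keep = m1 != m2`. Objects with no other neighbour inside the search radius simply drop out of the result, and true duplicates at identical coordinates are kept as legitimate zero-distance neighbours.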

This seems to be some imprecision in the great-circle distance calculations.
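A small sketch of how such an imprecision could arise, assuming (this is not confirmed from the esutil source here) that the distance is obtained as the arccosine of a dot product. If the cosine of a self-match rounds to the largest double below 1 instead of exactly 1, the arccosine amplifies that one-bit error into an angle of about 8.54e-07 degrees, which agrees with the nonzero value in Out[10] if d12 is in degrees.

```python
import math

# Largest double strictly below 1.0; doubles just under 1 are spaced 2**-53 apart.
just_below_one = 1.0 - 2.0**-53

# acos(1 - x) is approximately sqrt(2 * x) for small x, so a one-bit error in
# the cosine becomes an angle of about 2**-26 radians.
angle_rad = math.acos(just_below_one)
angle_deg = math.degrees(angle_rad)

print(angle_rad, angle_deg)
```

Because the cosine can only be exactly 1 or at least one representable step below it, the nonzero self-distances would take a small set of discrete values rather than a continuous range.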

The numpy version in esutil.coords.gcirc can also give a nonzero distance, but interestingly it seems to happen for different coordinates than the C++ one.

Interestingly, eu.coords.sphdist gives zero.

sphdist also doesn't always give zero.

This seems to be a floating-point issue: I converted everything to long double, and after that, whenever I get a nonzero distance it is always the same small value.
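For comparison, a formula that avoids the arccosine near 1 returns exactly zero for identical inputs. The sketch below uses the standard haversine form in plain Python; it is an illustration only, not how esutil computes distances, and `haversine_deg` is a hypothetical name.

```python
import math


def haversine_deg(ra1, dec1, ra2, dec2):
    """Great-circle distance in degrees between two points given in degrees.

    Hypothetical helper using the haversine formula, which stays
    well-conditioned for small separations.
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec**2 + math.cos(dec1) * math.cos(dec2) * sin_dra**2
    # Clamp to 1 so rounding cannot push the asin argument out of range.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


# Self-match using the first coordinate pair from the session above.
self_dist = haversine_deg(52.18053786, -27.12791705, 52.18053786, -27.12791705)
quarter_turn = haversine_deg(0.0, 0.0, 90.0, 0.0)
```

For identical inputs both coordinate differences are exactly zero, so every term under the square root vanishes and the result is exactly 0.0. The trade-off is that haversine loses accuracy for nearly antipodal points, which does not matter for a small match radius.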