thampiman / reverse-geocoder

A fast, offline reverse geocoder in Python

San Francisco shows up as San Ramon

rchrd2 opened this issue

Hello, this might be down to the data in the CSV file, but it may also be a bug.

I looked in the CSV file and found entries for both San Ramon, California and San Francisco, California.
My coordinates come up as San Ramon, even though they are in San Francisco.

37.78674,-122.39222,ME
37.77493,-122.41942,San Francisco,California,San Francisco County,US
37.77993,-121.97802,San Ramon,California,Contra Costa County,US

The "ME" point is closer to San Francisco, but it shows up as San Ramon. See this image:

ME is blue. San Francisco is green. San Ramon is red.

[Image: screen shot 2015-04-01 at 11 33 58 pm]

I understand that latitude/longitude coordinates are not in a flat 2D space, but is there something about the math that is making ME show up as San Ramon?

>>> reverse_geocoder.search([(37.78674,-122.39222)])
[{'name': 'San Ramon', 'cc': 'US', 'lon': '-121.97802', 'admin1': 'California', 'admin2': 'Contra Costa County', 'lat': '37.77993'}]

Thank you.

Hi Richard... looks like you were finally able to use the library. I'll have a look at the Geonames data and see if something is missing. Will keep you posted.

I'm having the same issue (SF <-> San Ramon).

Another example: the point 34.6734523069,135.528030395 should be in Osaka, JP, but resolves to Nara, JP, which is a significant distance to the east (though maybe Nara has a point that is closer in latitude?)

Qualitatively, my results are better with version 1.1 than with version 1.2, which has the updated distance metrics.

There was a problem with the Geodetic -> ECEF conversion. I've rolled back this change in v1.3 (which also includes other fixes). I'm working on a better solution using the haversine formula.
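For readers unfamiliar with the conversion mentioned here, a minimal sketch of the textbook geodetic -> ECEF formula on the WGS84 ellipsoid follows. It is not the library's code, and the thread does not say what the actual bug was. The names `geodetic_to_ecef`, `nearest_city` and `CITIES` are hypothetical, and the points are the three from the original report.

```python
import math

# WGS84 ellipsoid constants (standard published values).
WGS84_A = 6378137.0                  # semi-major axis, metres
WGS84_F = 1 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, height_m=0.0):
    """Convert geodetic latitude/longitude (degrees) to ECEF x, y, z (metres)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - WGS84_E2) + height_m) * math.sin(lat)
    return (x, y, z)


# Hypothetical mini data set: the two cities from the report above.
CITIES = [
    ("San Francisco", 37.77493, -122.41942),
    ("San Ramon", 37.77993, -121.97802),
]


def nearest_city(lat, lon, cities=CITIES):
    """Brute-force nearest neighbour using straight-line distance in ECEF space."""
    query = geodetic_to_ecef(lat, lon)
    best = min(
        cities,
        key=lambda c: math.dist(query, geodetic_to_ecef(c[1], c[2])),  # Python 3.8+
    )
    return best[0]


print(nearest_city(37.78674, -122.39222))
```

Straight-line (chord) distance in ECEF space grows with surface distance for nearby points, so a Euclidean nearest-neighbour search over correctly converted coordinates should pick San Francisco for the "ME" point.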

Great, thanks for your hard work! 👍 😎

Glad someone else chimed in and this got resolved. Thank you both.

@thampiman Hello again! I am noticing a much subtler, albeit important, issue. Once again I have a lat/long in San Francisco, but it's being reverse geocoded as "Daly City", which is an adjacent town. Admittedly it's a close call, but upon inspection the San Francisco coordinate is indeed closer and should be the result of the search function. Daly City is 6.112 km away but San Francisco is 5.171 km away.

37.759748099999996,-122.4750292, ME
37.70577,-122.46192,Daly City,California,San Mateo County,US (6.112 km)
37.77493,-122.41942,San Francisco,California,San Francisco County,US (5.171 km)

Distance calculated using: http://www.movable-type.co.uk/scripts/latlong.html
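The two distances quoted above can be reproduced with a small haversine sketch. It assumes a spherical Earth with a mean radius of 6371 km (an assumption, though it appears consistent with the figures above); `haversine_km` is a hypothetical helper, not part of reverse_geocoder.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


ME = (37.759748099999996, -122.4750292)
DALY_CITY = (37.70577, -122.46192)
SAN_FRANCISCO = (37.77493, -122.41942)

d_daly = haversine_km(*ME, *DALY_CITY)
d_sf = haversine_km(*ME, *SAN_FRANCISCO)

# Should agree with the figures above to within a few metres.
print(f"Daly City:     {d_daly:.3f} km")
print(f"San Francisco: {d_sf:.3f} km")
```

By this measure San Francisco is the nearer of the two points.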

>>> import reverse_geocoder
>>> reverse_geocoder.search([(37.759748099999996,-122.4750292)])
Loading formatted geocoded file...
[{'name': 'Daly City', 'cc': 'US', 'lon': '-122.46192', 'admin1': 'California', 'admin2': 'San Mateo County', 'lat': '37.70577'}]
>>> 

To see the error visually, go to http://www.darrinward.com/lat-long/?id=540891

[Image: screen shot 2015-04-17 at 3 26 17 pm]


I also noticed that Daly City is "closer" when the distance is calculated using the basic distance formula on the raw coordinates (e.g. http://ncalculators.com/geometry/length-between-two-points-calculator.htm): 0.0555 versus 0.0576. Maybe this is why it's returning the wrong result? Does this library use the "haversine" formula, or would that be too slow?
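A short sketch of why the plain distance formula on raw degrees favours Daly City: one degree of longitude covers less ground than one degree of latitude at this latitude (by a factor of roughly cos(37.76°), about 0.79), so treating both axes equally overstates east-west separation. The helpers `degree_distance` and `scaled_degree_distance` are hypothetical names for illustration only.

```python
import math

ME = (37.759748099999996, -122.4750292)
DALY_CITY = (37.70577, -122.46192)
SAN_FRANCISCO = (37.77493, -122.41942)


def degree_distance(p, q):
    """Plain Euclidean distance on raw (lat, lon) degrees."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def scaled_degree_distance(p, q):
    """Same, but with the longitude difference shrunk by cos(mean latitude)."""
    mean_lat = math.radians((p[0] + q[0]) / 2)
    return math.hypot(p[0] - q[0], (p[1] - q[1]) * math.cos(mean_lat))


raw_daly = degree_distance(ME, DALY_CITY)       # about 0.0555, as quoted above
raw_sf = degree_distance(ME, SAN_FRANCISCO)     # about 0.0576, as quoted above
scaled_daly = scaled_degree_distance(ME, DALY_CITY)
scaled_sf = scaled_degree_distance(ME, SAN_FRANCISCO)

print(f"raw degrees:    Daly City {raw_daly:.4f}, San Francisco {raw_sf:.4f}")
print(f"scaled degrees: Daly City {scaled_daly:.4f}, San Francisco {scaled_sf:.4f}")
```

On raw degrees Daly City wins, while with the longitude axis scaled the ordering flips to San Francisco, which matches the great-circle result. The scaling is only a local approximation and is not a substitute for haversine over long distances.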

@rchrd2 You are right. The library currently uses the Euclidean distance to find the nearest neighbour. This is primarily because cKDTree in SciPy does not support the haversine formula. I'm currently working on implementing haversine in the library and will let you know when I release the update. Thanks to you I have an additional test case.