San Francisco shows up as San Ramon #10

Closed
rchrd2 opened this issue Apr 2, 2015 · 7 comments

Comments

@rchrd2

rchrd2 commented Apr 2, 2015

Hello, this might be caused by the CSV file, but it may also be a bug.

I looked in the CSV file and found San Ramon,California and San Francisco,California.
I also have my coordinates that come up as San Ramon, even though they are in San Francisco.

37.78674,-122.39222,ME
37.77493,-122.41942,San Francisco,California,San Francisco County,US
37.77993,-121.97802,San Ramon,California,Contra Costa County,US

The "ME" point is closer to San Francisco, yet it shows up as San Ramon. See this image:

ME is blue. San Francisco is green. San Ramon is red.

screen shot 2015-04-01 at 11 33 58 pm

I understand that coordinates are not in a 2d space, but is there something about the math that is making ME show up as San Ramon?

>>> reverse_geocoder.search([(37.78674,-122.39222)])
[{'name': 'San Ramon', 'cc': 'US', 'lon': '-121.97802', 'admin1': 'California', 'admin2': 'Contra Costa County', 'lat': '37.77993'}]

Thank you.

@rchrd2 rchrd2 changed the title San Francisco shows up at San Ramon San Francisco shows up as San Ramon Apr 2, 2015
@thampiman
Owner

Hi Richard... looks like you were finally able to use the library. I'll have a look at the Geonames data and see if something is missing. Will keep you posted.

@bdon
Contributor

bdon commented Apr 11, 2015

I'm having the same issue (SF <-> San Ramon).

Another example: the point 34.6734523069,135.528030395 should be in Osaka, JP, but resolves to Nara, JP, which is a significant distance to the east (but maybe has a point that is closer in latitude?).

Qualitatively my results are better with version 1.1 instead of version 1.2 with the updated distance metrics.

@thampiman
Owner

There was a problem with the Geodetic -> ECEF conversion. I've rolled back this change in v1.3 (including other fixes). I'm working on a better solution using the haversine formula.
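For reference, here is a minimal sketch of the standard geodetic-to-ECEF conversion on the WGS84 ellipsoid. It is not the library's code; the function name `geodetic_to_ecef` and the constant names are made up for illustration, and zero height is assumed for every point. It only shows that, with a correct conversion, the "ME" point from the original report sits nearer to San Francisco than to San Ramon.

```python
import math

# WGS84 ellipsoid constants (standard published values).
WGS84_A = 6378137.0                  # semi-major axis, metres
WGS84_F = 1 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, height_m=0.0):
    """Convert geodetic latitude/longitude (degrees) to ECEF (x, y, z) in metres."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime-vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - WGS84_E2) + height_m) * math.sin(lat)
    return x, y, z


# Points taken from the original report.
me = geodetic_to_ecef(37.78674, -122.39222)
san_francisco = geodetic_to_ecef(37.77493, -122.41942)
san_ramon = geodetic_to_ecef(37.77993, -121.97802)

# Straight-line (chord) distances in metres; math.dist needs Python 3.8+.
print("ME -> San Francisco:", math.dist(me, san_francisco))
print("ME -> San Ramon:    ", math.dist(me, san_ramon))
```

A common slip in this conversion is passing degrees straight into `sin`/`cos` or swapping latitude and longitude; whether either was the actual cause here is not stated in this thread.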

@bdon
Contributor

bdon commented Apr 11, 2015

Great, thanks for your hard work! 👍 😎

@rchrd2
Author

rchrd2 commented Apr 11, 2015

Glad someone else chimed in and this got resolved. Thank you both.

@rchrd2
Author

rchrd2 commented Apr 17, 2015

@thampiman Hello again! I am noticing a much smaller, albeit important, issue. Once again I have a lat/long in San Francisco, but it is being reverse geocoded as "Daly City", which is an adjacent town. Admittedly it's a close call, but upon inspection the San Francisco coordinate is indeed closer and should be the result of the search function. Daly City is 6.112 km away, but San Francisco is 5.171 km away.

37.759748099999996,-122.4750292, ME
37.70577,-122.46192,Daly City,California,San Mateo County,US (6.112 km)
37.77493,-122.41942,San Francisco,California,San Francisco County,US (5.171 km)

Distance calculated using: http://www.movable-type.co.uk/scripts/latlong.html
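The two distances can also be checked locally with a small haversine sketch. This assumes a spherical Earth with a mean radius of 6371 km (my assumption, not something taken from the linked calculator), and `haversine_km` is just an illustrative helper name.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


me = (37.759748099999996, -122.4750292)
daly_city = (37.70577, -122.46192)
san_francisco = (37.77493, -122.41942)

print("Daly City:    ", round(haversine_km(*me, *daly_city), 3), "km")
print("San Francisco:", round(haversine_km(*me, *san_francisco), 3), "km")
```

With that radius the results should land close to the 6.112 km and 5.171 km figures quoted above, with San Francisco the nearer of the two.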

>>> import reverse_geocoder
>>> reverse_geocoder.search([(37.759748099999996,-122.4750292)])
Loading formatted geocoded file...
[{'name': 'Daly City', 'cc': 'US', 'lon': '-122.46192', 'admin1': 'California', 'admin2': 'San Mateo County', 'lat': '37.70577'}]
>>> 

To visually see the error go to http://www.darrinward.com/lat-long/?id=540891

screen shot 2015-04-17 at 3 26 17 pm


I also noticed that Daly City is "closer" when calculated using the basic distance formula (i.e. http://ncalculators.com/geometry/length-between-two-points-calculator.htm): 0.0555 versus 0.0576. Maybe this is why it's returning the wrong result? Does this library use the "haversine" formula, or would that be too slow?
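To make the discrepancy concrete, here is a small sketch comparing the plain distance formula on raw degrees with a version that shrinks the longitude difference by cos(latitude) (the equirectangular approximation). The helper names are made up for illustration, and this says nothing about how the library itself computes distances.

```python
import math

me = (37.759748099999996, -122.4750292)
daly_city = (37.70577, -122.46192)
san_francisco = (37.77493, -122.41942)


def degree_distance(p, q):
    """Plain Euclidean distance on raw (lat, lon) degrees."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def scaled_degree_distance(p, q):
    """Euclidean distance in degrees with longitude scaled by cos(mean latitude)."""
    mean_lat = math.radians((p[0] + q[0]) / 2)
    return math.hypot(p[0] - q[0], (p[1] - q[1]) * math.cos(mean_lat))


for name, city in (("Daly City", daly_city), ("San Francisco", san_francisco)):
    print(name,
          round(degree_distance(me, city), 4),
          round(scaled_degree_distance(me, city), 4))
```

At this latitude a degree of longitude covers only about 79% of the ground distance of a degree of latitude, so the mostly east-west offset to San Francisco is overstated when raw degrees are treated as a flat grid. Once the longitude term is scaled, San Francisco comes out nearer.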

@thampiman
Owner

@rchrd2 You are right. The library now uses the Euclidean distance to find the nearest neighbour. This is primarily because scipy's cKDTree does not support the haversine formula. I'm currently working on implementing haversine in the library and will let you know when I release the update. Thanks to you, I have an additional test case.
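One possible workaround, sketched here as an assumption rather than as what the library does or will do: map each (lat, lon) to a point on the unit sphere and search with ordinary Euclidean distance. The straight-line chord between two points on a sphere grows monotonically with the great-circle distance, so the Euclidean nearest neighbour in 3D is also the haversine nearest neighbour. The sketch below uses a brute-force `min` over three cities to stay dependency-free; the same 3D points could in principle be handed to a k-d tree instead. All helper names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def to_unit_sphere(lat_deg, lon_deg):
    """Map (lat, lon) in degrees to (x, y, z) on the unit sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def chord_to_km(chord):
    """Convert a unit-sphere chord length back to a great-circle distance in km."""
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, chord / 2))


# Rows quoted earlier in this thread.
cities = {
    "Daly City": (37.70577, -122.46192),
    "San Francisco": (37.77493, -122.41942),
    "San Ramon": (37.77993, -121.97802),
}
points = {name: to_unit_sphere(*coords) for name, coords in cities.items()}


def nearest(lat_deg, lon_deg):
    """Return (city name, distance in km) of the nearest city by 3D Euclidean distance."""
    query = to_unit_sphere(lat_deg, lon_deg)
    # math.dist needs Python 3.8+.
    name = min(points, key=lambda n: math.dist(query, points[n]))
    return name, chord_to_km(math.dist(query, points[name]))


print(nearest(37.759748099999996, -122.4750292))
print(nearest(37.78674, -122.39222))
```

The trade-off is three coordinates per point instead of two, plus a spherical rather than ellipsoidal Earth, which should matter little for picking the nearest city.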
