Use OSM data for geocoding in all boroughs #179

Merged
danvk merged 71 commits into master from five-boro-osm
Nov 24, 2024
danvk
Owner

@danvk danvk commented Nov 22, 2024

#174

data update: oldnyc/oldnyc.github.io@171c2fb

Changed: 532
  +geom: 638

So far as I can tell, the additions are all good. The changes are mostly good. Of the 65 biggest movers, 5 are losses, 3 are neutral (neither the old nor the new location is right) and 57 are wins. Pretty good! Most of these correct Google failures or previous bugs in the grid geocoder. The truth data supports this: we pick up ~20% of the remaining locations (13 of the 61 that were previously missed) with zero losses:

-    208 / 269 = 77.32% of locatable images correctly located.
+    221 / 269 = 82.16% of locatable images correctly located.
-      7 / 215 = 3.26% incorrectly located.
+      7 / 228 = 3.07% incorrectly located.
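
For anyone checking the arithmetic, here is a minimal sketch that recomputes the percentages above and the "~20% of the remaining locations" figure. The counts are copied from the diff; the `pct` helper and the variable names are illustrative only and are not taken from geocode.py or the evaluation script.

```python
def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to 2 places."""
    return round(100 * numerator / denominator, 2)


# Counts copied from the truth-data output above.
LOCATABLE = 269
correct_before, correct_after = 208, 221
wrong_before, located_before = 7, 215
wrong_after, located_after = 7, 228

# Locatable images that were still missed before this change.
remaining_before = LOCATABLE - correct_before
# Images that are newly correct after this change.
newly_correct = correct_after - correct_before

print(f"correct before: {pct(correct_before, LOCATABLE)}%")
print(f"correct after:  {pct(correct_after, LOCATABLE)}%")
print(f"wrong before:   {pct(wrong_before, located_before)}%")
print(f"wrong after:    {pct(wrong_after, located_after)}%")
print(f"recovered {newly_correct} of {remaining_before} "
      f"remaining = {pct(newly_correct, remaining_before)}%")
```

The number of incorrectly located images stays at 7 in both runs, which is what "zero losses" refers to; the error rate drops only because the denominator grows.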

This dramatically reduces reliance on Google for geocoding. I've also removed almost all of the per-item logging, at least for now. The net effect is that geocode.py is pretty fast: it takes only ~5s to geocode the ~41k entries in images.ndjson.

TODO:

@danvk danvk marked this pull request as ready for review November 24, 2024 14:50
@danvk danvk merged commit 5b502eb into master Nov 24, 2024
4 checks passed
@danvk danvk deleted the five-boro-osm branch November 24, 2024 15:24
@danvk danvk mentioned this pull request Nov 27, 2024