Classifying more addresses for norway #171

Merged

mansoor-sajjad
Contributor

@mansoor-sajjad mansoor-sajjad commented Jan 10, 2023

👋 Adding more street types and directionals and activating the CompoundStreetClassifier and DirectionalClassifier for Norway.


Here's the reason for this change 🚀

We have a national dataset of almost 2.5 million addresses.
We found that pelias-parser classifies most of them as addresses. 💯
Almost 120,000 of them are not classified as addresses, and as a result the address layer is filtered out by pelias-api.

Norway has a lot of compound street names, but the CompoundStreetClassifier is not activated for the Norwegian dictionary (nb).
Just activating the CompoundStreetClassifier for Norway lets us classify almost 52,000 more addresses. 🎉
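
To illustrate what a compound street name is, here is a minimal Python sketch. It is not the pelias-parser implementation (which is JavaScript) and it does not reproduce the contents of the nb dictionary; the suffix list and the function name `is_compound_street` are assumptions made up for this example. The idea is that a single token such as "Storgata" fuses a name with a street-type suffix, so an exact match against a street-type list misses it.

```python
# Hypothetical suffix list for illustration only; not the actual nb dictionary.
STREET_SUFFIXES = ("gata", "gate", "veien", "vegen", "vei", "veg", "plassen", "plass")


def is_compound_street(token: str, suffixes=STREET_SUFFIXES) -> bool:
    """Return True when the token is a longer word ending in a street-type suffix.

    A bare street type such as "gate" is not a compound, so the token must be
    strictly longer than the suffix it ends with.
    """
    word = token.lower()
    return any(word.endswith(s) and len(word) > len(s) for s in suffixes)


if __name__ == "__main__":
    for token in ("Storgata", "Kirkeveien", "gate", "Oslo"):
        print(token, is_compound_street(token))
```

A suffix check like this is deliberately loose: it would also accept ordinary words that happen to end in a listed suffix, which is one reason a classifier of this kind is switched on per language rather than globally.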

Norwegian addresses also contain directional tokens. We have activated the DirectionalClassifier and added more directional tokens to the Norwegian dictionary (nb).
In addition, we have extended the list of street_types.
Together, these changes classify almost 20,000 additional addresses. 🎉
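
As a rough picture of how directionals and street types work together, the following Python sketch tags each token of an address by dictionary lookup. It is an assumption-laden illustration, not pelias-parser code: the word sets, the labels, and the function `tag_tokens` are hypothetical, and the real dictionaries contain different and far more entries.

```python
# Hypothetical word lists for illustration only; not the actual nb dictionary.
DIRECTIONALS = {"nord", "sør", "øst", "vest", "nordre", "søndre", "østre", "vestre"}
STREET_TYPES = {"gate", "gata", "vei", "veien", "veg", "vegen", "plass", "allé"}


def tag_tokens(text: str) -> list[tuple[str, str]]:
    """Label each whitespace-separated token with a coarse, made-up category."""
    tags = []
    for token in text.split():
        word = token.lower()
        if word in DIRECTIONALS:
            label = "directional"
        elif word in STREET_TYPES:
            label = "street_type"
        elif word[0].isdigit():
            label = "housenumber"
        else:
            label = "unknown"
        tags.append((token, label))
    return tags


if __name__ == "__main__":
    print(tag_tokens("Nordre gate 12"))
```

Without a directional entry for "Nordre", the first token would stay "unknown", which shows why extending the dictionary changes how many inputs can be recognised as addresses.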

We still have almost 50,000 addresses in Norway that pelias-parser fails to classify as addresses. 😞
I created the following separate issues to discuss solutions for those addresses:
Option to do the address parsing for a specific country
Option to not use the WhosonFirstClassifier for AddressParser

mansoor.sajjad added 2 commits January 3, 2023 12:31
Member

@Joxit Joxit left a comment

Hi, thank you for your contribution.

Please add your synonyms in this folder instead: https://github.com/pelias/parser/tree/master/resources/pelias/dictionaries/libpostal/nb

We want to keep custom entries separate.

Please also add some tests, both to check that everything works and to show us how a Norwegian address is written 😄

@mansoor-sajjad mansoor-sajjad requested a review from Joxit January 28, 2023 20:21
Member

@Joxit Joxit left a comment

LGTM

Thanks for your changes 😄

@Joxit Joxit merged commit ff961cc into pelias:master Jan 30, 2023