Classifying more addresses for Norway #171
Merged
👋 Adding more street types and directionals, and activating the `CompoundStreetClassifier` and `DirectionalClassifier` for Norway.

Here's the reason for this change 🚀
We have a national data set containing almost 2.5 million addresses. We have found that `pelias-parser` is able to classify most of them as addresses. 💯

Almost 120,000 addresses are not classified as addresses, and as a result the address layer is filtered out by `pelias-api`.

Norway has a lot of compound street names, and the `CompoundStreetClassifier` is not activated for the Norwegian dictionary (nb). Just by activating the `CompoundStreetClassifier` for Norway, we are able to classify almost 52,000 more addresses. 🎉

Norwegian addresses also contain directional tokens. We have activated the `DirectionalClassifier` and added some more directional tokens to the Norwegian dictionary (nb). In addition, we have extended the list of street_types. This helps classify almost 20,000 additional addresses. 🎉
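To illustrate why compound street names need their own handling, here is a minimal Python sketch of suffix-based matching. It is not the `CompoundStreetClassifier` implementation from `pelias-parser`: the function name, the suffix list, and the `min_stem` threshold are all hypothetical and chosen only for illustration.

```python
# Hypothetical suffix list for illustration only; the real nb dictionary in
# pelias-parser is larger and is not reproduced here.
STREET_SUFFIXES = ("gata", "gate", "veien", "vegen", "vei", "plassen")


def is_compound_street(token: str, suffixes=STREET_SUFFIXES, min_stem: int = 2) -> bool:
    """Return True if the token is a stem fused with a street-type suffix.

    min_stem is an assumed threshold that stops a bare street type such as
    "veien" from being treated as a compound name.
    """
    word = token.lower()
    return any(
        word.endswith(suffix) and len(word) - len(suffix) >= min_stem
        for suffix in suffixes
    )


for name in ("Storgata", "Kirkeveien", "veien", "Oslo"):
    print(name, is_compound_street(name))
```

A name like "Storgata" has no whitespace between the stem and the street type, so a classifier that only matches whole tokens against a street-type list would miss it. That is the kind of gap a compound-aware classifier is meant to close.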
We still have almost 50,000 addresses in Norway that `pelias-parser` fails to classify as addresses. 😞
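As a rough sanity check, the approximate figures quoted in this description can be tallied. The numbers below are the rounded values from the text, not exact counts, and attributing the 20,000 to the directional and street-type changes is our reading of the description.

```python
# Rounded figures taken from the description above (all approximate).
total_addresses = 2_500_000
unclassified_before = 120_000
gained_compound = 52_000   # from activating CompoundStreetClassifier
gained_other = 20_000      # assumed: directional and street-type changes

remaining = unclassified_before - gained_compound - gained_other
print(remaining)
print(f"{remaining / total_addresses:.2%} of the data set still unclassified")
```

The tally leaves about 48,000 addresses, which is consistent with the "almost 50,000" still unclassified.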
We created the following separate issues to discuss solutions for those addresses:

- Option to do the address parsing for a specific country
- Option to not use the WhosonFirstClassifier for AddressParser