Remove duplicates from taxonomic spine #214

Closed
FedorSteeman opened this issue Dec 9, 2022 · 2 comments

@FedorSteeman

FedorSteeman commented Dec 9, 2022

Issue

Once the auto-merging described in #74 has been done, we need a process for also removing these duplicates from the taxonomic spine stored in the local app database.

Remarks

Removing duplicates in a taxonomy is non-trivial as duplicates can be hard to identify due to slight spelling variations.
In a database system such as Postgres, you can find close matches by using Levenshtein distance or other string metrics to identify duplicate candidates.
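
To illustrate the idea of flagging duplicate candidates by edit distance, here is a minimal Python sketch. It is not the approach taken in this project: the function names `levenshtein` and `duplicate_candidates`, the `max_distance` threshold of 2, and the case-insensitive comparison are all assumptions made for illustration. It compares every pair of names, so it only suits small lists.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def duplicate_candidates(names, max_distance=2):
    """Return (name, name, distance) for pairs within max_distance edits.

    Hypothetical helper: the threshold and lower-casing are illustrative
    choices, not taken from the issue.
    """
    pairs = []
    for first, second in combinations(sorted(set(names)), 2):
        distance = levenshtein(first.lower(), second.lower())
        if distance <= max_distance:
            pairs.append((first, second, distance))
    return pairs


if __name__ == "__main__":
    spine = ["Quercus robur", "Quercus robor", "Fagus sylvatica"]
    for first, second, distance in duplicate_candidates(spine):
        print(f"{first!r} ~ {second!r} (distance {distance})")
```

The pairs returned are only candidates; because spelling variants and genuinely distinct taxa can be equally close, a review step would still be needed before anything is merged or removed.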

@PipBrewer PipBrewer added the 1 priority 1 label Dec 14, 2022
@FedorSteeman FedorSteeman self-assigned this Dec 21, 2022
@FedorSteeman

A prerequisite is a smarter way to handle the database, as per #185.

@FedorSteeman FedorSteeman added this to the Sprint 14 milestone Dec 21, 2022
@FedorSteeman

Solved in conjunction with #185
