Improve preferred names by boosting certain prefixes in choosing preferred name #210

Closed
gaurav wants to merge 6 commits into master from pick-better-preferred-names

@gaurav (Collaborator) commented Dec 4, 2023

This PR starts the process of improving preferred names, including:

  • Synonym files used to take their label only from the most preferred identifier. This PR changes that so they use the label of the first identifier that has one. Should close #142 ("Middigital hair (OMIM:157200) does not have a preferred_label in Babel 2023may18").
  • A first stab at implementing "boosted prefixes" configured by Biolink type, so that for e.g. biolink:ChemicalEntity we can specify prefixes to boost when choosing the preferred_name.
    • The idea comes from a comment on #158 ("reconsider name choices").
    • Note that this only sets the preferred_name for NameRes, not for NodeNorm, which calculates it on the fly. Changing that will require either adding the preferred label to the database or duplicating this algorithm there. I prefer the former approach, but that's definitely something we'll need to consider in the next release.
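
As a rough illustration of the first bullet, here is a minimal sketch of the "first identifier with a label" fallback. This is not the code in this PR: the function name `first_available_label`, the dictionary layout, and the `EXAMPLE:0001` identifier are all hypothetical, and the list is assumed to be ordered from most to least preferred.

```python
from typing import Optional


def first_available_label(identifiers: list[dict]) -> Optional[str]:
    """Return the label of the first identifier that has a non-empty label.

    Assumes `identifiers` is ordered from most to least preferred
    (a hypothetical structure, not Babel's actual data model).
    """
    for ident in identifiers:
        label = ident.get("label")
        if label and label.strip():
            return label
    # No identifier in the clique carries a usable label.
    return None


# Hypothetical clique: the most preferred identifier has no label,
# so the label comes from the next identifier that has one.
clique = [
    {"identifier": "OMIM:157200"},
    {"identifier": "EXAMPLE:0001", "label": "Middigital hair"},  # made-up identifier
]
chosen_label = first_available_label(clique)
```

Under the old behaviour described above, only the first entry would have been consulted, so this clique would have had no label at all.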

The current preferred name boost prefixes are:

  "preferred_name_boost_prefixes": {
    "biolink:ChemicalEntity": [
      "DRUGBANK",
      "GTOPDB",
      "DrugCentral",
      "CHEMBL.COMPOUND",
      "RXCUI",
      "PUBCHEM.COMPOUND",
      "CHEBI",
      "HMDB"
    ]
  }
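
To make the intent of this configuration concrete, here is a hedged sketch of how boosted prefixes could be applied when picking a preferred name. It is an assumption-laden illustration rather than the implementation in this PR: the function `choose_preferred_name`, the identifier dictionaries, and the placeholder identifiers are hypothetical, and treating the order of the prefix list as a priority order is my assumption, since the description above does not say whether order matters.

```python
from typing import Optional

# Copied from the configuration above.
PREFERRED_NAME_BOOST_PREFIXES = {
    "biolink:ChemicalEntity": [
        "DRUGBANK", "GTOPDB", "DrugCentral", "CHEMBL.COMPOUND",
        "RXCUI", "PUBCHEM.COMPOUND", "CHEBI", "HMDB",
    ],
}


def choose_preferred_name(
    biolink_types: list[str],
    identifiers: list[dict],
    boost_config: dict = PREFERRED_NAME_BOOST_PREFIXES,
) -> Optional[str]:
    """Pick a label, trying boosted prefixes first (hypothetical helper).

    `identifiers` is assumed to be ordered from most to least preferred.
    """
    labelled = [i for i in identifiers if i.get("label")]
    for biolink_type in biolink_types:
        # Assumption: earlier prefixes in the list win over later ones.
        for prefix in boost_config.get(biolink_type, []):
            for ident in labelled:
                if ident["identifier"].split(":", 1)[0] == prefix:
                    return ident["label"]
    # No boosted prefix matched: fall back to the first labelled identifier.
    return labelled[0]["label"] if labelled else None


# Placeholder identifiers and labels, for illustration only.
identifiers = [
    {"identifier": "CHEBI:0000000", "label": "label from CHEBI"},
    {"identifier": "DRUGBANK:DB00000", "label": "label from DRUGBANK"},
]
boosted = choose_preferred_name(["biolink:ChemicalEntity"], identifiers)
unboosted = choose_preferred_name(["biolink:Disease"], identifiers)
```

With this sketch, the chemical entity gets the DRUGBANK label even though the CHEBI identifier comes first, while a type with no configured prefixes falls back to the first labelled identifier.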

@gaurav force-pushed the pick-better-preferred-names branch from 792e20d to 4d8d9c2 on December 10, 2023 03:40
@gaurav changed the base branch from babel-1.3 to master on December 10, 2023 04:08
@gaurav force-pushed the pick-better-preferred-names branch from 4d8d9c2 to 6a131bd on December 10, 2023 04:09
@gaurav mentioned this pull request on Dec 10, 2023
@gaurav changed the title from "Improve preferred names" to "Improve preferred names by boosting certain prefixes in choosing preferred name" on Dec 14, 2023
@gaurav marked this pull request as ready for review on December 14, 2023 03:29
@gaurav requested a review from cbizon on December 14, 2023 03:29
@gaurav removed the request for review from cbizon on January 29, 2024 03:21
@gaurav (Collaborator, Author) commented Jan 29, 2024

This ended up getting merged in PR #201. I'll move one minor update into PR #235 and close this PR.

@gaurav closed this on Jan 29, 2024
Successfully merging this pull request may close these issues.

Middigital hair (OMIM:157200) does not have a preferred_label in Babel 2023may18