Fix DrugChemical conflation typing #266

Merged
merged 7 commits into from
Apr 22, 2024

Conversation

gaurav (Collaborator) commented Apr 16, 2024

We previously used a randomly chosen identifier from each DrugChemical conflation to choose the Biolink type for the entire conflation, which also determined the order of prefixes within the conflation. This led to issues where we used an RXCUI to determine that a conflation should be considered a biolink:Drug, when really biolink:SmallMolecule would be a better type. This PR replaces that approach with a preferred-type approach.
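As a rough illustration of what a preferred-type approach could look like, here is a minimal sketch. The function name `choose_conflation_type`, the `PREFERRED_TYPE_ORDER` list, and the specific ordering are assumptions made for this example and are not taken from the PR's code; only the Biolink type strings come from the discussion and diff.

```python
# Hypothetical sketch: names and ordering are assumptions, not the PR's actual code.
SMALL_MOLECULE = 'biolink:SmallMolecule'
COMPLEX_MOLECULAR_MIXTURE = 'biolink:ComplexMolecularMixture'
CHEMICAL_MIXTURE = 'biolink:ChemicalMixture'
DRUG = 'biolink:Drug'
CHEMICAL_ENTITY = 'biolink:ChemicalEntity'

# Assumed preference order: earlier entries win when several types are present.
PREFERRED_TYPE_ORDER = [
    SMALL_MOLECULE,
    COMPLEX_MOLECULAR_MIXTURE,
    CHEMICAL_MIXTURE,
    DRUG,
    CHEMICAL_ENTITY,
]


def choose_conflation_type(types_in_conflation,
                           preferred_order=PREFERRED_TYPE_ORDER,
                           fallback=CHEMICAL_ENTITY):
    """Return the most-preferred Biolink type found among the conflation's members."""
    present = set(types_in_conflation)
    for biolink_type in preferred_order:
        if biolink_type in present:
            return biolink_type
    # None of the preferred types were present; use the generic fallback.
    return fallback


# A conflation containing both a Drug-typed and a SmallMolecule-typed identifier.
conflation_type = choose_conflation_type([DRUG, SMALL_MOLECULE])
```

Because the result depends only on which types are present, and not on which identifier happens to be picked, the same conflation always gets the same type.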

Also replaces COMPLEX_CHEMICAL_MIXTURE with COMPLEX_MOLECULAR_MIXTURE, which is what Biolink calls it now.

Closes #264. Should be merged after PR #227.

@gaurav gaurav changed the base branch from master to add-pubmed-ids April 16, 2024 04:42
@gaurav gaurav requested a review from cbizon April 16, 2024 08:04
@@ -9,7 +9,7 @@
 CELLULAR_COMPONENT = 'biolink:CellularComponent'
 CHEMICAL_ENTITY = 'biolink:ChemicalEntity'
 CHEMICAL_MIXTURE = 'biolink:ChemicalMixture'
-COMPLEX_CHEMICAL_MIXTURE = 'biolink:ComplexMolecularMixture'
+COMPLEX_MOLECULAR_MIXTURE = 'biolink:ComplexMolecularMixture'
Contributor

This is admittedly not exactly the envisioned use case, but the idea of these consts was that when a 1:1 change occurred here, you could just change the string and save yourself many edits throughout the code.

Collaborator Author

Yes, I know! But I took advantage of the centralized constants to bulk-rename one that had fallen out of sync with the corresponding Biolink type. If it had been a large change I would have proposed it in its own PR, but since it was only used in a handful of places and was related to the overall PR changes, I decided to incorporate it in here. For future changes, I'll update the string first and only update the constant name once it's pretty clear Biolink isn't going to change it again any time soon.
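The "update the string first, rename later" workflow discussed above can be sketched roughly as follows. The transitional alias and the `is_complex_mixture` helper are hypothetical, added only for illustration; the PR itself renames the constant directly.

```python
# Hypothetical sketch of the two-step workflow; not the repository's actual module.

# Step 1: only the string changes. The old constant name stays, so no call site
# needs editing (this matches the removed line in the diff above).
COMPLEX_CHEMICAL_MIXTURE = 'biolink:ComplexMolecularMixture'

# Step 2 (later, once the Biolink name looks stable): introduce the new name.
# Keeping the old name as an alias for a while is an assumption of this sketch.
COMPLEX_MOLECULAR_MIXTURE = COMPLEX_CHEMICAL_MIXTURE


def is_complex_mixture(biolink_type):
    """Call sites compare against the constant, never against the raw string."""
    return biolink_type == COMPLEX_MOLECULAR_MIXTURE
```

Since call sites reference the constant rather than the literal, step 1 is a one-line change, and step 2 is a mechanical rename that can be deferred.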

Base automatically changed from add-pubmed-ids to master April 22, 2024 16:47
@gaurav gaurav marked this pull request as ready for review April 22, 2024 16:47
@gaurav gaurav merged commit 1db19eb into master Apr 22, 2024
@gaurav gaurav deleted the fix-conflation-typing branch April 22, 2024 16:47
@gaurav gaurav mentioned this pull request Apr 26, 2024
Development

Successfully merging this pull request may close these issues.

Water results in "WATER O 15" (PUBCHEM.COMPOUND:10129877) in NameRes because of a conflation issue