Updates to handle Bioportal normalization #16
@@ -406,3 +399,48 @@ def load_sssom_maps(maps) -> tuple:
     print(f"Loaded {len(cat_map)} category mappings.")

     return (id_map, cat_map)


 def obo_handle(old_id: str) -> str:
We should really be fixing the sources rather than writing this kind of exception-handling code. Is there a way to get a report of all "exceptions" fixed this way, so we can try to correct them in the ontologies?
I'd love a way to automate fixing this across the original ~1,000 Bioportal entries (because that's mostly what this is for), but for now, all IDs are written out to one of three different reports, as needed:
- IDs of unexpected format, e.g.,
    ID
    OBO:ExO_0000030
    OBO:ExO_0000151
    OBO:ExO_0000152
- IDs with remapped categories
    Old ID             New Category
    OBO:ExO_0000030    biolink:NamedThing
    OBO:ExO_0000151    biolink:NamedThing
    OBO:ExO_0000152    biolink:NamedThing
- IDs with remapped IDs
    Old ID             New ID
    OBO:ExO_0000030    EXO:0000030
    OBO:ExO_0000151    EXO:0000151
    OBO:ExO_0000152    EXO:0000152
So that last report would be most useful for finding the easily-solved exceptions, but the first report may also contain some candidates for repair.
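For anyone skimming the thread, here is a minimal sketch of how a remap like `OBO:ExO_0000030` to `EXO:0000030` and the "remapped IDs" report could be produced. It is not the `obo_handle` code from this PR: the names `remap_obo_id` and `format_remap_report`, the regular expression, and the tab-separated layout are all assumptions made for illustration.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Assumed shape of the problem IDs: "OBO:" + prefix + "_" + local part.
OBO_PATTERN = re.compile(r"^OBO:([A-Za-z][A-Za-z0-9]*)_(\w+)$")


def remap_obo_id(old_id: str) -> Optional[str]:
    """Return a Bioportal-style CURIE, or None if old_id does not match."""
    match = OBO_PATTERN.match(old_id)
    if match is None:
        return None
    prefix, local = match.groups()
    # Uppercase the prefix to align with the Bioportal form (e.g. EXO).
    return f"{prefix.upper()}:{local}"


def format_remap_report(ids: Iterable[str]) -> str:
    """Build a tab-separated 'Old ID / New ID' report for remapped IDs only."""
    rows: List[Tuple[str, str]] = []
    for old_id in ids:
        new_id = remap_obo_id(old_id)
        if new_id is not None:
            rows.append((old_id, new_id))
    lines = ["Old ID\tNew ID"] + [f"{old}\t{new}" for old, new in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_remap_report(["OBO:ExO_0000030", "MONDO:0005148"]))
```

IDs that do not match the assumed pattern are simply skipped here; in practice they would be candidates for the "unexpected format" report.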
Remind me, why are these not correctly understood to be ExO:0000030?
In this case, it's to align with the Bioportal ID (EXO). I'm thinking about adding a profile option for an "OBO mode", so that the Bioportal prefixes can still be used for mapping but are then normalized to the preferred forms, like ExO.
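As a rough illustration of the "OBO mode" idea, the sketch below maps a Bioportal-style prefix back to a preferred form after mapping is done. Nothing here exists in the PR: the `PREFERRED_OBO_PREFIXES` table, the `to_preferred` function, and the `obo_mode` flag are hypothetical, and the table holds only the one prefix discussed in this thread.

```python
# Hypothetical lookup from Bioportal-style prefix to preferred OBO form.
# Only the prefix discussed in this thread is listed.
PREFERRED_OBO_PREFIXES = {"EXO": "ExO"}


def to_preferred(curie: str, obo_mode: bool = False) -> str:
    """Rewrite the prefix of a CURIE to its preferred form when obo_mode is on."""
    if not obo_mode or ":" not in curie:
        return curie
    prefix, local = curie.split(":", 1)
    # Fall back to the existing prefix when no preferred form is known.
    preferred = PREFERRED_OBO_PREFIXES.get(prefix, prefix)
    return f"{preferred}:{local}"


if __name__ == "__main__":
    print(to_preferred("EXO:0000030", obo_mode=True))
    print(to_preferred("EXO:0000030"))
```

Keeping the lookup separate from the mapping step would let the Bioportal prefixes stay in use for matching, with the preferred casing applied only at output time.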
--contexts
) to provide one or more prefix contexts to use