Application of Mondo mapping file is turning MONDO IDs into OMIM instead of the reverse #721
Comments
I think this merits some more careful thinking! One idea is to define a flip method which respects the semantics of the mapping relations and knows about the rest of the SSSOM metadata as well. I think for now (so you don't have to wait for tooling), it is probably a good idea to be fault-tolerant and build some simple aspects of flipping into your own code base (scan object and subject properties for prefixes, and then flip
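A minimal sketch of what such a flip could look like, assuming each mapping row is a plain dict keyed by SSSOM slot names (`subject_id`, `predicate_id`, `object_id`, `subject_label`, ...). The `flip_mapping` helper and the `INVERSE_PREDICATE` table are hypothetical names for illustration, not part of any existing tool; the only semantics relied on are the SKOS ones (`skos:broadMatch` and `skos:narrowMatch` are inverses, the other mapping relations listed are symmetric).

```python
# Hypothetical helper, not an existing sssom-py or cat_merge function.
# SKOS mapping relations and their inverses.
INVERSE_PREDICATE = {
    "skos:exactMatch": "skos:exactMatch",
    "skos:closeMatch": "skos:closeMatch",
    "skos:relatedMatch": "skos:relatedMatch",
    "skos:broadMatch": "skos:narrowMatch",
    "skos:narrowMatch": "skos:broadMatch",
}


def flip_mapping(row):
    """Return a copy of `row` with subject and object swapped.

    Every `subject_*` slot trades places with its `object_*` counterpart,
    and the predicate is replaced by its inverse.
    """
    predicate = row["predicate_id"]
    if predicate not in INVERSE_PREDICATE:
        raise ValueError(f"Don't know how to invert {predicate!r}")

    flipped = {}
    for key, value in row.items():
        if key.startswith("subject_"):
            flipped["object_" + key[len("subject_"):]] = value
        elif key.startswith("object_"):
            flipped["subject_" + key[len("object_"):]] = value
        else:
            flipped[key] = value
    flipped["predicate_id"] = INVERSE_PREDICATE[predicate]
    return flipped


# Placeholder IDs and labels, for illustration only.
row = {
    "subject_id": "MONDO:789",
    "subject_label": "some disease",
    "predicate_id": "skos:broadMatch",
    "object_id": "OMIM:101",
    "mapping_justification": "semapv:ManualMappingCuration",
}
flipped = flip_mapping(row)
```

Slots that are not tied to one side of the mapping (such as `mapping_justification`) are carried over unchanged; whether that is always right for the rest of the SSSOM metadata is exactly the part that needs the careful thinking.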
I closed other issues as duplicates of this one, and I'll continue from here. I thought it might be useful to bring in this table from monarch-initiative/monarch-ingest#360
Also interesting is that the reverse is happening: there are G2D associations that are MONDO to MONDO:
Where are these G2Ds coming from? The OMIM IDs, at least at a quick glance, are diseases. Check this: https://omim.org/geneMap/12/657?start=-3&limit=10&highlight=657 It seems the gene column even links to a disease identifier. @sabrinatoro help :P
🕵️ morbidmap has:
Which monarch-ingest's omim_gene_to_disease turns into:
608520 isn't in the hgnc file (which has an OMIM column), and it's not in the monarch-gene-mapping SSSOM file. It is in mondo.sssom.tsv, but then it's a disease, so it should be:
My assumption has been that the bug I'm solving is that I shouldn't be using Mondo's SSSOM to replace IDs in the subject column of a G2D association... but maybe the problem is that the OMIM ingest shouldn't even be making this association? If the Mondo mapping weren't applied here, it would stay an OMIM ID and then get dropped as a dangling edge, which I think would be OK, though handling that more explicitly upstream in the Koza transform is probably a cleaner solution.
I don't know. My feeling is that this is a high-priority issue that should be discussed in the data call next week. Can you add it to the agenda? It seems so weird that I can't just universally apply an SSSOM file to a KG rewiring process. I am sure @cmungall would also be against contextualising the applicability of SSSOM files on certain ingestibles.
Detective work on one of the HGNC to HGNC g2d edges.
starts in morbidmap as:
This is one of the few cases in monarch ingest where we're still using a Koza map, for mim2gene:
The row in mim2gene is:
monarch-gene-mapping uses the hgnc file to produce:
Which ends up picking up both sides of the association. So, again in this case, is the problem that this isn't actually a G2D association in morbidmap?
Dipper's parsing of mimTitles to get type information seems incredibly important, but is missing from the monarch-ingest OMIM ingest.
Next up:
This is likely a fix that will need to happen to cat_merge, which applies mapping, or to the gene_mapping repo to change the order there, but it seems reasonable to start with an issue in this repo as the umbrella.
We're generating gene_mappings with the subject as the ID that we want to convert from, and the object as the ID that we want to convert to. So we do:
NCBI:123 skos:exactMatch HGNC:456
And the cat_merge rewiring code converts any subject or object with an NCBI:123 ID to HGNC:456.
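As a rough sketch of the behaviour described above (not cat_merge's actual code), the rewiring amounts to a subject-to-object lookup applied to both ends of every edge. The function names, the dict-based row and edge shapes, and all IDs below are placeholders assumed for illustration.

```python
# Hypothetical sketch of "replace mapping subject with mapping object".
def build_lookup(mappings):
    """Map each mapping subject (convert from) to its object (convert to)."""
    return {
        m["subject_id"]: m["object_id"]
        for m in mappings
        if m["predicate_id"] == "skos:exactMatch"
    }


def rewire(edges, lookup):
    """Replace edge subjects and objects that appear as keys in `lookup`."""
    return [
        {
            **edge,
            "subject": lookup.get(edge["subject"], edge["subject"]),
            "object": lookup.get(edge["object"], edge["object"]),
        }
        for edge in edges
    ]


# Gene mapping: the subject is the ID we convert from, so this works as intended.
gene_mappings = [
    {"subject_id": "NCBI:123", "predicate_id": "skos:exactMatch", "object_id": "HGNC:456"},
]
edges = [{"subject": "NCBI:123", "predicate": "biolink:related_to", "object": "MONDO:789"}]
rewired = rewire(edges, build_lookup(gene_mappings))

# Mondo-style mapping: the subject is the ID we want to keep, so the same
# code rewrites the MONDO ID to OMIM, which is the bug in the issue title.
mondo_mappings = [
    {"subject_id": "MONDO:789", "predicate_id": "skos:exactMatch", "object_id": "OMIM:101"},
]
wrongly_rewired = rewire(rewired, build_lookup(mondo_mappings))
```

The second call shows why a mapping file with the opposite orientation silently produces the reverse of what is wanted: nothing in the lookup knows which side is the preferred namespace.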
The Mondo SSSOM file goes in the other direction: the subject is always a Mondo ID, and the object is the ID that we're converting from.
I'm guessing the most sensible thing is to convert our gene mapping files to match the Mondo mapping file, and then update cat_merge so that it will replace the mapping file object with the mapping file subject.
@matentzn What do you think? Swapping the order feels convenient, but my gut tells me that the subject/object order in the SSSOM file shouldn't matter and I should instead tell my mapping apply-er what prefixes are allowed.
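A minimal sketch of that prefix-driven alternative, assuming mappings are dicts with `subject_id`, `predicate_id`, and `object_id`, and that the caller supplies the set of prefixes it wants to end up with. `build_prefix_aware_lookup` and `curie_prefix` are hypothetical names and the IDs are placeholders; this is not how cat_merge currently works.

```python
# Hypothetical, orientation-independent lookup builder.
def curie_prefix(curie):
    """Return the part of a CURIE before the first colon."""
    return curie.split(":", 1)[0]


def build_prefix_aware_lookup(mappings, target_prefixes):
    """Build a {from_id: to_id} lookup regardless of subject/object order.

    A mapping is used only when exactly one side has a target prefix;
    that side becomes the replacement for the other side. If the same
    from_id appears in several mappings, the last one wins.
    """
    lookup = {}
    for m in mappings:
        if m["predicate_id"] != "skos:exactMatch":
            continue
        subject, obj = m["subject_id"], m["object_id"]
        subject_is_target = curie_prefix(subject) in target_prefixes
        object_is_target = curie_prefix(obj) in target_prefixes
        if object_is_target and not subject_is_target:
            lookup[subject] = obj
        elif subject_is_target and not object_is_target:
            lookup[obj] = subject
        # Both or neither side is a target prefix: direction is ambiguous, skip.
    return lookup


mappings = [
    # Gene-mapping orientation: convert-from ID in the subject.
    {"subject_id": "NCBI:123", "predicate_id": "skos:exactMatch", "object_id": "HGNC:456"},
    # Mondo orientation: convert-from ID in the object.
    {"subject_id": "MONDO:789", "predicate_id": "skos:exactMatch", "object_id": "OMIM:101"},
]
lookup = build_prefix_aware_lookup(mappings, target_prefixes={"HGNC", "MONDO"})
```

With this shape, both files can stay as they are, and the choice of preferred namespaces lives in the applier's configuration rather than in the column order of each mapping file.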