Investigate possible missing orthology #923
Comments
Figure and analysis by Peter, tagging him since he is the person most interested in this. @hansenp
Right now I'm adding ZFIN's curated orthology, which should give us the best possible connections between human and zebrafish. I'm also planning to fix the missing XB-GENEPAGE to XB-GENE mappings, which will give us XenBase's own orthology. I just ran across issues in monarch-ingest where we looked at Panther's dangling edges: monarch-initiative/monarch-ingest#446 & #351. What we missed in #351 was that the counts in Panther didn't change, even though the counts of what came out of our ingest did. That will probably require carefully tracing through the utils functions related to the ingest to see whether some filtering happens there. I didn't include my methodology for the counting (boo past me!) there, the presence of
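Since the counting methodology wasn't recorded, here is a minimal sketch of one way dangling edges could be counted. It is not the method used in #351 or anything from monarch-ingest. It assumes edges are dicts with `subject` and `object` CURIEs and that the set of known node IDs is available. The function names and the toy IDs are made up for illustration.

```python
from collections import Counter


def curie_prefix(curie: str) -> str:
    """Return the prefix of a CURIE, e.g. 'HGNC:1100' -> 'HGNC'."""
    return curie.split(":", 1)[0]


def count_dangling(edges, node_ids):
    """Split edges into kept and dangling, and tally prefixes of missing IDs.

    An edge is treated as dangling when its subject or its object is not
    in node_ids (an assumption about what "dangling" means here).
    """
    node_ids = set(node_ids)
    kept, dangling = [], []
    missing_prefixes = Counter()
    for edge in edges:
        missing = [edge[key] for key in ("subject", "object")
                   if edge[key] not in node_ids]
        if missing:
            dangling.append(edge)
            missing_prefixes.update(curie_prefix(m) for m in missing)
        else:
            kept.append(edge)
    return kept, dangling, missing_prefixes


# Toy data: all identifiers below are hypothetical placeholders.
nodes = {"HGNC:1", "MGI:1", "ZFIN:ZDB-GENE-1"}
edges = [
    {"subject": "HGNC:1", "object": "MGI:1"},
    {"subject": "HGNC:1", "object": "Xenbase:XB-GENEPAGE-1"},
    {"subject": "HGNC:2", "object": "ZFIN:ZDB-GENE-1"},
]

kept, dangling, missing_prefixes = count_dangling(edges, nodes)
dangling_fraction = len(dangling) / len(edges)
```

Tallying the missing IDs by prefix makes it easier to see whether one identifier namespace accounts for most of the dropped edges, which is the kind of thing a missing ID mapping would produce.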
Separately, I did some looking into DIOPT updates. They claim a 2021 build, but are missing ZFIN orthology that existed in 2020. It's wonderful when we can pull from an aggregator to get many sources in one easy go, but I don't think it's a great idea when the update cadence is so irregular.
Slightly related: we asked ourselves what happens when running a reduced query (just pasting here the important part):
we get the following number of items:
whereas with the full query this changes to:
Is this a bug or a feature of uPheno? Sorry if this is trivial, thanks! @matentzn
About 70% of our Panther ingest ends up in the dangling edges bin, and @leokim-l noticed, in a process they're running, that we may have low orthology coverage between human genes and genes from species other than mouse.
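To put a number on "low coverage", one option is to compute, per species, the fraction of human genes that have at least one ortholog edge. The sketch below is only an illustration under assumptions: orthology edges are plain ID pairs, a `taxon_of` lookup exists for non-human genes, and every name and identifier here is hypothetical rather than taken from the graph.

```python
from collections import defaultdict


def ortholog_coverage(human_genes, ortholog_pairs, taxon_of):
    """Return {taxon: fraction of human genes with >= 1 ortholog in that taxon}.

    ortholog_pairs are (gene_a, gene_b) tuples in either direction;
    taxon_of maps non-human gene IDs to a taxon label.
    """
    human_genes = set(human_genes)
    covered = defaultdict(set)
    for gene_a, gene_b in ortholog_pairs:
        # Orthology is symmetric, so check both orientations of the pair.
        for human, other in ((gene_a, gene_b), (gene_b, gene_a)):
            if (human in human_genes
                    and other not in human_genes
                    and other in taxon_of):
                covered[taxon_of[other]].add(human)
    return {taxon: len(genes) / len(human_genes)
            for taxon, genes in covered.items()}


# Toy data: identifiers are hypothetical placeholders.
human_genes = {"HGNC:1", "HGNC:2", "HGNC:3", "HGNC:4"}
taxon_of = {"MGI:1": "mouse", "MGI:2": "mouse", "MGI:3": "mouse",
            "ZFIN:1": "zebrafish"}
ortholog_pairs = [
    ("HGNC:1", "MGI:1"),
    ("HGNC:2", "MGI:2"),
    ("HGNC:3", "MGI:3"),
    ("ZFIN:1", "HGNC:1"),
]

coverage = ortholog_coverage(human_genes, ortholog_pairs, taxon_of)
```

Running the same calculation on the source file and on the ingest output would show whether the gap for non-mouse species is already in the source data or is introduced when edges are dropped.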
Additional info:
This is the neo4j query that is showing few results for species other than mouse:
and here is the visualization showing the difference in counts