Indexing behavior when a sub-object pair is linked by multiple relations #35
Comments
Why not just make different associations? Doesn't each have its own evidence/provenance, etc.? |
@cmungall could you clarify your suggestion? One document per association could lead to a lot of additional documents, since we infer across variants and some genes have many causal variants for a disease (e.g. BRCA). One document per relation is possible, but IMO we would still be showing too much duplication to the user (or operating on it in ontobio). As a potential workaround for G2D, I have split up causal vs. non-causal associations so that they can be displayed separately to our end users. The downside is that there will be some redundancy between the two gene-disease lists, as CTD and Coriell will often report the causal gene in addition to those with more hypothetical evidence. |
I think your solution is along the right lines. Having a smaller set of relationship types, where we separate evidence from relation ("likely pathogenic" should not be a relation), should in theory mean that high-quality resources do not generally conflict. |
The relation that maps to ACMG likely_pathogenic is all in yaml file(s), so it's an easy change when we're ready. Thinking about this from the UI perspective, should we have one list of causal genes, and one list of all genes so that the latter list fully subsumes the list of causal genes (instead of partially overlapping sets)? |
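The two layouts discussed here (a causal list plus a full list that subsumes it) can be sketched as follows. This is only an illustration, assuming each association has already been reduced to a hypothetical `(gene, is_causal)` pair for a single disease; neither the function name nor the input shape comes from the loader or from ontobio.

```python
def gene_lists(associations):
    """Build two gene lists for one disease.

    `associations` is a list of (gene, is_causal) pairs; this shape is a
    hypothetical simplification, not the real document schema.
    Returns (causal, all_genes), where all_genes fully subsumes causal.
    """
    all_genes = sorted({gene for gene, _ in associations})
    causal = sorted({gene for gene, is_causal in associations if is_causal})
    return causal, all_genes


# Made-up identifiers for illustration only.
associations = [
    ("gene:A", True),
    ("gene:A", False),  # same gene also reported with weaker evidence
    ("gene:B", False),
]
causal, all_genes = gene_lists(associations)
```

Because the causal list is derived from the same input as the full list, it is always a subset, so the two lists never partially overlap.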
I don't have strong opinions about the UI, so long as it's clear.
On the disease page, I had envisioned showing the causal gene prominently (first entry in the table, if we have a table view), with the others beneath it.
|
Consider the following pattern:
(subject:gene)<-[has_locus]-(variant)-[relation]->(object:disease)
Where relation is one of:
...
In many cases, multiple variants of a single gene are linked to a disease via different relations (commonly pathogenic and likely pathogenic). Currently, the Solr loader appears to pick a relation at random, although the choice may in fact be deterministic for a given database.
This is also an issue when combining orthology statements from multiple sources (PANTHER and ZFIN): PANTHER specifies whether two orthologs have a 1-to-1 relationship, whereas ZFIN does not.
One option is to store the set of relations linking two nodes. Another option would be to configure a relation priority, where the relation with the highest priority is designated while the others are retrievable via the evidence graph.
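A rough sketch of how the two options could be combined in one document per subject-object pair. Everything here is an assumption for illustration: the priority list, the relation labels (placeholders borrowed from the pathogenic / likely pathogenic example above, not real relation identifiers), and the field names do not reflect the actual Solr schema or loader.

```python
from collections import defaultdict

# Hypothetical priority list, highest priority first. The labels are
# placeholders, not the real relation identifiers.
RELATION_PRIORITY = ["pathogenic", "likely_pathogenic"]


def rank(relation):
    """Position in the priority list; unknown relations sort last."""
    try:
        return RELATION_PRIORITY.index(relation)
    except ValueError:
        return len(RELATION_PRIORITY)


def collapse(edges):
    """Collapse (subject, relation, object) triples to one record per pair.

    Each record keeps the full set of relations (first option) and one
    designated relation chosen by priority (second option).
    """
    grouped = defaultdict(set)
    for subject, relation, obj in edges:
        grouped[(subject, obj)].add(relation)

    docs = []
    for subject, obj in sorted(grouped):
        # Sort by (rank, name) so the designated relation is deterministic.
        ordered = sorted(grouped[(subject, obj)], key=lambda r: (rank(r), r))
        docs.append({
            "subject": subject,
            "object": obj,
            "relation": ordered[0],
            "relations": ordered,
        })
    return docs


# Made-up identifiers for illustration only.
edges = [
    ("gene:A", "likely_pathogenic", "disease:X"),
    ("gene:A", "pathogenic", "disease:X"),
    ("gene:B", "likely_pathogenic", "disease:X"),
]
docs = collapse(edges)
```

Sorting on an explicit priority (with the relation name as a tie-breaker) removes the apparent randomness, while the stored relation set keeps the lower-priority relations visible without a trip to the evidence graph.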
@mbrush @selewis @cmungall thoughts?