Indexing behavior when a sub-object pair is linked by multiple relations #35

kshefchek · 2017-11-20T23:08:23Z

Consider the following pattern:

(subject:gene)<-[has_locus]-(variant)-[relation]->(object:disease)

Where relation is one of:

pathogenic
likely pathogenic
has phenotype
marker/mechanism
contributes to
...

In many cases, multiple variants of a single gene are linked to a disease via multiple relations (commonly pathogenic and likely pathogenic). Currently, the Solr loader seems to pick one relation at random (although it may in fact be deterministic for a given db).

This is also an issue when combining orthology statements from multiple sources (PANTHER and ZFIN): PANTHER specifies whether two orthologs have a 1-to-1 relationship, whereas ZFIN does not.

One option is to store the set of relations linking two nodes. Another option would be to configure a relation priority, where the relation with the highest priority is designated while the others are retrievable via the evidence graph.
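A minimal sketch of how the two options could be combined, assuming edges arrive as (subject, relation, object) triples. The priority order, the function name, and the output field names are all hypothetical and not taken from the actual Solr loader:

```python
from collections import defaultdict

# Hypothetical priority order (earlier = higher priority). The real ordering
# would need to be agreed on; nothing in the loader defines it today.
RELATION_PRIORITY = [
    "pathogenic",
    "likely pathogenic",
    "contributes to",
    "marker/mechanism",
    "has phenotype",
]


def collapse_relations(edges, priority=RELATION_PRIORITY):
    """Collapse (subject, relation, object) triples to one entry per pair.

    Each entry keeps the full set of relations (option 1) and designates the
    highest-priority one (option 2). Relations missing from the priority list
    sort last, alphabetically, so the result never depends on input order.
    """
    rank = {relation: i for i, relation in enumerate(priority)}
    grouped = defaultdict(set)
    for subject, relation, obj in edges:
        grouped[(subject, obj)].add(relation)

    collapsed = {}
    for pair, relations in grouped.items():
        ordered = sorted(relations, key=lambda r: (rank.get(r, len(rank)), r))
        collapsed[pair] = {"relation": ordered[0], "relations": ordered}
    return collapsed
```

Because the designated relation comes from an explicit sort rather than from iteration order, the same input always yields the same indexed relation.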

@mbrush @selewis @cmungall thoughts?

cmungall · 2018-12-13T17:49:03Z

Why not just make different associations? Doesn't each have its own evidence/provenance etc?

kshefchek · 2019-02-18T17:23:16Z

@cmungall could you clarify your suggestion? One document per association could lead to a lot of additional documents since we infer across variants; some genes have a lot of causal variants for a disease (e.g. BRCA). One document per relation is possible, but IMO we'll still be showing too much duplication to the user (or operating on it in ontobio).

As a potential workaround for G2D, I have split up causal vs non-causal associations. This way they can be displayed separately to our end users. The downside is that there will be some redundancy between the two gene-disease lists, as CTD and Coriell will often report the causal gene in addition to those with more hypothetical evidence.

causal g2d

hypothetical g2d - gwas, ctd, coriell
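A rough sketch of the split, assuming each association is a dict with `gene` and `relation` keys and that a fixed set of relations counts as causal. The set contents, key names, and function names are assumptions for illustration, not the actual configuration:

```python
# Hypothetical set of relations treated as causal; the real set lives in
# configuration and may differ.
CAUSAL_RELATIONS = frozenset({"pathogenic", "likely pathogenic"})


def split_causal(associations, causal_relations=CAUSAL_RELATIONS):
    """Partition associations into (causal, non_causal) lists, keeping order."""
    causal, non_causal = [], []
    for assoc in associations:
        if assoc["relation"] in causal_relations:
            causal.append(assoc)
        else:
            non_causal.append(assoc)
    return causal, non_causal


def overlapping_genes(causal, non_causal):
    """Return the genes that appear in both lists (the redundancy noted above)."""
    return {a["gene"] for a in causal} & {a["gene"] for a in non_causal}
```

`overlapping_genes` gives a quick way to measure how much redundancy the two lists actually carry for a given disease.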

cmungall · 2019-02-19T16:36:58Z

I think your solution is along the right lines. With a smaller set of relationship types, where we separate evidence from relation ("likely pathogenic" should not be a relation), high-quality resources should in theory not generally conflict.

kshefchek · 2019-02-19T22:27:12Z

The relation that maps to ACMG likely_pathogenic is defined entirely in YAML file(s), so it's an easy change when we're ready.

Thinking about this from the UI perspective, should we have one list of causal genes, and one list of all genes so that the latter list fully subsumes the list of causal genes (instead of partially overlapping sets)?

cmungall · 2019-02-19T23:07:40Z

I don't have strong opinions about the UI so long as it's clear. I had envisioned the disease page showing the causal gene prominently (first entry in the table, if we have a table view) with the others beneath it.
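As an illustration only, a single table with causal genes first could be produced with a stable sort. The row shape (`gene`, `relation` keys), the function name, and the set of causal relations are assumptions, not what the UI currently does:

```python
def order_for_display(rows, causal_relations=frozenset({"pathogenic"})):
    """Return rows with causal associations first.

    sorted() is stable, so rows keep their original relative order within
    the causal group and within the non-causal group.
    """
    # False sorts before True, so causal rows (key False) come first.
    return sorted(rows, key=lambda row: row["relation"] not in causal_relations)
```

This keeps one list that fully contains the causal genes while still making them the first entries.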


monicacecilia · 2019-07-02T00:50:53Z

Adding a little reminder that Chris' suggestion is still not implemented. Instead, we have a list of all genes, and the causal gene in this, our favorite example, shows up 6th on the list.

kshefchek mentioned this issue Nov 29, 2018

Quality control on gene-disease assocs monarch-initiative/dipper#685

Closed

kshefchek mentioned this issue Feb 15, 2019

Show relationship type in G2D and other pages monarch-initiative/monarch-ui#43

Closed
