Replies: 2 comments
-
Very important question - I think I would like to hear from @deepakunni3 (and by proxy Chris) first. My very initial thought is: named graphs are weird in neo4j, and maybe we should first decide what exactly we mean by LPG: anything that can be queried in Cypher? Anything that can be queried in neo (think graphlib)? In any case, provenance on node properties is a huge issue, and I'm not sure how much I want to conflate it with the question here. Maybe you can walk us through your thoughts in the next meeting!
-
I think this ends up being out of scope for koza, although we're still working out how exporting to kgx will work.
-
Our TSV serialization aims to be compatible with property graphs that support node and edge properties as scalars (string, int, float) and as list properties (e.g. neo4j).
Background:
Given a triple in a named graph, we can model this in a property graph in two ways:
Triple:
named_graph: gene:A RO:has_phenotype HP:phenotype
As node properties:
(
id: "gene:A"
has_phenotype: "HP:phenotype"
defined_by: "named_graph"
)
As an edge:
(id: "gene:A") - [has_phenotype {defined_by: "named_graph"}] -> (id: "HP:phenotype")
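The two mappings above can be sketched in Python. This is only an illustration, not koza code: the `Quad` type and the helpers `local_name`, `as_node_properties`, and `as_edge` are hypothetical names, and dropping the CURIE prefix from the predicate (`RO:has_phenotype` becomes `has_phenotype`) is an assumption taken from the example above.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """A triple plus the named graph it belongs to (hypothetical type)."""
    graph: str
    subject: str
    predicate: str
    obj: str


def local_name(curie: str) -> str:
    """Drop the CURIE prefix, e.g. 'RO:has_phenotype' -> 'has_phenotype'."""
    return curie.split(":", 1)[-1]


def as_node_properties(quad: Quad) -> dict:
    """Option 1: fold the triple into properties on the subject node."""
    return {
        "id": quad.subject,
        local_name(quad.predicate): quad.obj,
        "defined_by": quad.graph,
    }


def as_edge(quad: Quad) -> tuple:
    """Option 2: keep both ends as nodes and put provenance on the edge."""
    source = {"id": quad.subject}
    edge = {"type": local_name(quad.predicate), "defined_by": quad.graph}
    target = {"id": quad.obj}
    return source, edge, target


quad = Quad("named_graph", "gene:A", "RO:has_phenotype", "HP:phenotype")
node = as_node_properties(quad)
source, edge, target = as_edge(quad)
```

In the node form, `defined_by` is attached to the whole node, so it cannot say which property it describes once a second property arrives from another graph. In the edge form, each statement carries its own `defined_by`.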
The node property approach does not scale well for tracking provenance and metadata for attributes, or complex objects - for example:
In many cases it is useful to store data both as properties and as new nodes.
Previously this was partially hardcoded in scigraph and partially a configuration option.
This also doesn't touch on edge properties linked to objects. However, there's no great way around this unless we want to store associations as nodes again (see the scigraph transform as an example). So I think it's best to enforce that edge properties have to be primitives.
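A minimal sketch of what enforcing primitive edge properties could look like, assuming "primitive" means the scalars listed above (string, int, float) or a flat list of them. The names `is_primitive` and `validate_edge_properties` are hypothetical and not part of koza or kgx.

```python
# Scalar types allowed as property values (assumption based on the list above).
# Note: bool is a subclass of int in Python, so booleans pass this check too.
SCALARS = (str, int, float)


def is_primitive(value) -> bool:
    """Return True for a scalar or a flat list of scalars."""
    if isinstance(value, SCALARS):
        return True
    if isinstance(value, list):
        return all(isinstance(item, SCALARS) for item in value)
    return False


def validate_edge_properties(props: dict) -> None:
    """Raise ValueError naming every property whose value is not primitive."""
    bad = sorted(key for key, value in props.items() if not is_primitive(value))
    if bad:
        raise ValueError("non-primitive edge properties: " + ", ".join(bad))


# Scalars and flat lists are accepted.
validate_edge_properties({"defined_by": "named_graph", "publications": ["PMID:1", "PMID:2"]})

# A nested object is rejected.
try:
    validate_edge_properties({"evidence": {"type": "ECO:0000033"}})
    rejected = False
except ValueError:
    rejected = True
```

Nested lists and dictionaries fail the check, which matches the idea that anything needing its own metadata should become a node rather than an edge property.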
cc @TomConlin @matentzn @deepakunni3