Lazy instantiate a node with unique property #353
Comments
In what sense? All nodes have to have a node id associated with them and node ids are expected to be consistent at least within the context of a session. Node ids would be unreliable if business logic assumed that a node id never changes for as long as a node remains stored in a database instance.
From the point of view of Neomodel, the way you demonstrate
This could be a use-case for a batched mode way of running operations on a database though. As long as
What / where is the
Basically, what you are saying is that any property with a unique constraint should be used to fetch a node because it is guaranteed that it will reference that node uniquely (?). Would your use-case require this kind of referencing at the level of
This kind of thing sounds useful (to me at least) but the description is probably a bit simplistic at the moment. The scenario needs to become a bit more specific (what if there are more than one
Just as a side note which you might not find entirely helpful. There are certain things you can do with Neomodel and others that you currently cannot but could do with other ways. For those things that you cannot do with Neomodel at the moment (and provided that they are not bugs) it would be useful to capture how you went about solving it in your specific use-case and whether it could be turned into a generic solution expressed over the Graph data model. This could then be reviewed on how it fits with the current way Neomodel works and see how it could be included in an upcoming release.
Indeed, our basic use case is a SQL record associated with a Neo4j node. Therefore I need to store a unique reference to this Neo4j node.
Exactly. The main issue I'm trying to raise is that the Neo4j-recommended way to store a persistent reference to a Neo4j node is to use such a unique property. Yet the only way in neomodel to inflate a node without fetching it first is to use internal ids (thanks to .inflate). Conversely, the only way to perform batch merges without fetching the nodes first is to use a unique id property (using .create_or_update). Internal ids won't work in this scenario, as stated previously. This seems inconsistent to me and causes significant performance issues on large batch operations. To sum it all up:
This API could solve those limitations:
With multiple unique properties, this approach would still work because the name of the property is passed in each of those calls. What are your thoughts?
I have been working along the lines of this in a different package that makes use of neomodel and (I am hoping) I will be able to release soon. However, there is something we have briefly discussed with the rest of the people involved in Neomodel around "Batch Operations". This is nothing more than CRUD calls that appear to return immediately because they don't really apply the changes to the database but simply add the transaction to a list and return. The main reason for this is that when you know that you have large quantities of operations then you can take this into account when translating operations to queries (unwind lists of values, execute operations in transactions, etc). Once the size of this list reaches a user-defined level, the operations are "auto-committed" to the database. This can be done in an asynchronous or blocking way (the default is blocking). At the moment, these "batch" operations cover Creation of nodes and Creation of relationships. The way these are implemented respects the principles behind Neomodel. So, re-use the abstract objects to communicate data back and forth to the backend rather than raw queries, re-use established (and very sensible) data structures and operations and extend them. Updating properties en masse looks like another broad pattern we can bring in to Neomodel, possibly within the context of "Batch Operations". I think that the point about So:
@robinedwards would be interesting to get your thoughts on this too at your convenience.
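The "Batch Operations" idea described in the comment above (calls return immediately, operations are queued, and the queue is committed once it reaches a user-defined size) could be sketched roughly as follows. This is only an illustration in plain Python and not the implementation discussed in this thread: `BatchQueue`, `flush` and `max_size` are hypothetical names, and only the blocking mode is shown.

```python
from typing import Any, Callable, Dict, List

Operation = Dict[str, Any]


class BatchQueue:
    """Hypothetical queue that defers operations and commits them in batches."""

    def __init__(self, flush: Callable[[List[Operation]], None], max_size: int = 1000):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._flush = flush          # callback that actually talks to the database
        self._max_size = max_size    # user-defined auto-commit threshold
        self._pending: List[Operation] = []

    def add(self, operation: Operation) -> None:
        """Queue an operation; auto-commit (blocking) once the threshold is reached."""
        self._pending.append(operation)
        if len(self._pending) >= self._max_size:
            self.commit()

    def commit(self) -> None:
        """Send all pending operations to the flush callback in one batch."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._flush(batch)

    def __len__(self) -> int:
        return len(self._pending)
```

The `flush` callback is where a single query (for example one using UNWIND over the whole batch) would be run inside a transaction; a final explicit `commit()` is needed to send whatever is left below the threshold.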
@lerela Shall we close this for the moment? Did |
You can close the ticket but I don't feel .nodes.get() solves my use case. It returns the expected node indeed, but to update a node it's still making one unneeded query (since the uid was enough to uniquely identify the node in the MERGE query).
So instead of fetching one node, waiting for the Neo4j response, fetching another node, waiting for the Neo4j response, and then creating the relationship, we'd only have one request. I understand your points, and that it might require quite a few changes in different places, but implementing more batch operations and the ability to manipulate a node without fetching it when there's no need to do so would greatly improve performance in our use cases.
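For illustration, the single request described above could look like the following sketch, which only builds a Cypher query and its parameters. `build_connect_query` and its arguments are hypothetical names, not part of neomodel; the label, relationship type and property name are interpolated (Cypher cannot take them as parameters), so they are validated first.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(name: str) -> str:
    """Reject anything that is not a plain identifier before interpolating it."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe Cypher identifier: {name!r}")
    return name


def build_connect_query(start_label, end_label, rel_type, key, start_value, end_value):
    """Build one query that links two nodes identified only by a unique property."""
    key = _check(key)
    query = (
        f"MATCH (a:{_check(start_label)} {{{key}: $start_value}}) "
        f"MATCH (b:{_check(end_label)} {{{key}: $end_value}}) "
        f"MERGE (a)-[r:{_check(rel_type)}]->(b) "
        "RETURN count(r) AS merged"
    )
    # The unique values themselves travel as query parameters.
    params = {"start_value": start_value, "end_value": end_value}
    return query, params
```

With neomodel, the resulting pair could presumably be passed to `db.cypher_query(query, params)`. If either node does not exist, the MATCH clauses yield no rows and no relationship is created, so the caller can check the returned count instead of fetching the nodes beforehand.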
@lerela There are two things here:
All the best
Thanks for the tips, I'll definitely try those! I'm already wrapping queries in transactions, but of course things are slower when the Neo4j response must be awaited. I can work on specifying the requirements more thoroughly, but the minimal API I described above sums up most of the needs. If my company can allocate time to develop those, we'll do it for sure. It would probably be helpful to get some feedback from the neomodel team first, though, to give us some clues about the implementation and to make sure it follows your standards and expectations so that it can be merged at some point.
Can we unpick this a bit please, I am not sure I get it. What is the "problem" here? The fact that the OGM returns a complete record for an operation when it could be ignoring the response from the server? Or something else? Is your server on a different machine or the same one? What sort of delays are we talking about and at what point? (Is it possible to profile?) EDIT: Sorry, forgot to add: Regarding the rest, I understand, I had similar concerns before I took the plunge to work with Neomodel. |
@lerela I just had another read through this thread. After this, you can now do
I feel that this is the least amount of steps you could perform and at the same time ensuring that the operation will indeed run on an existing node. What I think you are describing here is something like
If that is correct (?), then this series of operations does not guarantee that a Node with
But, here is what I am thinking:
All the best
Hello,
To delete a node without loading it first (because why load it to delete it?) one can do:
However, using the internal Neo4j id is bad practice. Also, this approach fails for updates because the query built by neomodel does not take the node id into account, meaning that:
fails to update the object (it creates a new one).
The recommended way to reference Neo4j objects is to use uids; however, neomodel does not support lazily inflating objects through their uids, for instance with:
All logic currently relies on the internal Neo4j id being loaded, which requires a database hit to fetch it from the uid. This database hit is unnecessary in many cases (deletes, merges) and has a huge performance impact when updating a batch of nodes (say, setting the properties of 10,000 nodes for which we have the ids/uids).
I believe that being able to lazily inflate a node through a unique property, and to perform queries that rely on it instead of id, would solve this. What are your thoughts?
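As a rough sketch of the batch case mentioned above (setting properties on many nodes whose uids are already known), a single parameterised query can address every node by its unique property instead of loading each one first. `build_batch_update` is a hypothetical helper, not a neomodel function, and it assumes the label and key name are plain identifiers.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_batch_update(label, key, updates):
    """Build one UNWIND query that updates many nodes addressed by a unique property.

    `updates` maps each unique value to the dict of properties to set on that node.
    """
    for name in (label, key):
        # Labels and property names cannot be Cypher parameters, so validate them.
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe Cypher identifier: {name!r}")
    rows = [{"key": value, "props": dict(props)} for value, props in updates.items()]
    query = (
        "UNWIND $rows AS row "
        f"MATCH (n:{label} {{{key}: row.key}}) "
        "SET n += row.props"
    )
    return query, {"rows": rows}
```

The query and parameters could then be run in one round trip, for example through neomodel's `db.cypher_query`, with the list of rows split into chunks if it grows very large. Nodes whose uid does not match anything are simply skipped by the MATCH, which is a behaviour to keep in mind rather than a guarantee that every update was applied.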