Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not treat missing ancestor as input mismatch #14

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jl-wynen
Copy link
Member

This should fix scipp/sciline#180.

If new_branch has no ancestors, it should be safe to insert it and use the ancestors of self.

@SimonHeybrock test_setitem_fails_when_grandparents_change is failing. I think this test should pass. But in __setitem__, we get intersection_nodes = set() because

        if branch in self.graph:
            graph = _remove_ancestors(self.graph, branch)
            graph.nodes[branch].clear()

removes all ancestors but 'c' such that

intersection_nodes = set(graph.nodes) & set(new_branch.nodes) - {branch}
# equals
intersection_nodes = {'c'} & {'a1', 'b', 'c'} - {'c'}

Is this how it should behave? Or is there currently a bug?

Comment on lines +433 to +434
new_pred = new_branch.pred[node]
if new_pred and graph.pred[node] != new_pred:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More generally, would it make sense to check if all keys in "new" have the same values in existing, basically if a dict update would have no effect?

@SimonHeybrock
Copy link
Member

This should fix scipp/sciline#180.

If new_branch has no ancestors, it should be safe to insert it and use the ancestors of self.

@SimonHeybrock test_setitem_fails_when_grandparents_change is failing. I think this test should pass. But in __setitem__, we get intersection_nodes = set() because

        if branch in self.graph:
            graph = _remove_ancestors(self.graph, branch)
            graph.nodes[branch].clear()

removes all ancestors but 'c' such that

intersection_nodes = set(graph.nodes) & set(new_branch.nodes) - {branch}
# equals
intersection_nodes = {'c'} & {'a1', 'b', 'c'} - {'c'}

Is this how it should behave? Or is there currently a bug?

I may be wrong, but your call to __setitem__ will completely replace b. It is therefore ok to have different data/input nodes. If you add another node in the first graph which depends on the "old" b then we would want an exception.

Please see around

def test_setitem_raises_on_conflicting_input_nodes_in_ancestor() -> None:
g1 = nx.DiGraph()
g1.add_edge('a1', 'b')
g1.add_edge('b', 'c')
g1.add_edge('x', 'c')
g2 = nx.DiGraph()
g2.add_edge('a2', 'b')
g2.add_edge('b', 'x')
graph = cb.Graph(g1)
with pytest.raises(ValueError, match="Node inputs differ for node 'b'"):
graph['x'] = cb.Graph(g2)
def test_setitem_replaces_nodes_that_are_not_ancestors_of_unrelated_node() -> None:
g1 = nx.DiGraph()
g1.add_edge('a', 'b')
g1.add_edge('b', 'c')
g1.add_edge('c', 'd')
graph = cb.Graph(g1)
g2 = nx.DiGraph()
g2.add_edge('b', 'c')
graph['c'] = cb.Graph(g2)
assert 'a' not in graph.to_networkx()
for existing tests. Do you see any that are missing?

@SimonHeybrock
Copy link
Member

Having though about this some more, I still have doubts if auto-joining at nodes other than the key in __setitem__ is conceptually sound. However, I was wondering if this is not simply what a update or compose or union function should do (not sure which one, see NetworkX docs)? I have needed one in other circumstances (but managed to work around it), so maybe we should consider adding one?

@jl-wynen
Copy link
Member Author

jl-wynen commented Oct 2, 2024

I'm in favour of using more explicitly named functions. But in this case, we should reconsider whether __setitem__ is the right choice. What does a graph contain? That is, what can you insert with __setitem__ and extract with __getitem__? And does the graph really behave like a container of these things?
I struggle to come up with an answer.

@SimonHeybrock
Copy link
Member

I'm in favour of using more explicitly named functions. But in this case, we should reconsider whether __setitem__ is the right choice. What does a graph contain? That is, what can you insert with __setitem__ and extract with __getitem__? And does the graph really behave like a container of these things? I struggle to come up with an answer.

Not really like a container in the Python sense. I have found it most useful to think about Cyclebane graphs as functions, with input nodes being function arguments, and output nodes are function return values. Then __setitem__ would "extend" the function by attaching another function at an existing function argument. Similar, __detitem__ cuts the function, yielding a new function with arguments that were previously internal variables in the "function".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Blocked
Development

Successfully merging this pull request may close these issues.

Inserting interior subgraphs fails
2 participants