-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add compute_mapped
#170
Add compute_mapped
#170
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider how you could integrate this functionality into compute
? E.g., check whether a given key is mapped and then do the equivalent of compute_series
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like that, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok for a prototype. But in the long run, I would prefer merging this with compute
. And I would also prefer not using pandas if possible because users might not use pandas either.
This is not meant to be a prototype.
We could use a plain
It is not yet clear to me that this is a good idea, so I refrained from expressing a preference at this point. |
"\n", | ||
"**Note**\n", | ||
"\n", | ||
"[compute_mapped](https://scipp.github.io/sciline/generated/functions/sciline.compute_mapped.html) depends on Pandas, which is not a dependency of Sciline and must be installed separately, e.g., using pip:\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This link and the link below should be relative links.
|
||
candidates = [ | ||
node | ||
for node in graph._cbgraph.graph.nodes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like that this uses a protected attribute of graph
. Should get_mapped_node_names
be a method of Pipeline
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is definitely something that we need to consider for a lot of other functionality (in particular around planner graph operations) that we intend to implement. Either we need to make those properties accessible on the public interface, or potentially add a lot more methods to Pipeline
. Right now I cannot say which is better? So unless you can clearly say which solution should be chosen, I'd like to keep this as it is for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what operations you have in mind that need direct access to the graph. But I would say that we should expose a minimal set of primitive operations that can be composed in, e.g., methods of workflow classes.
src/sciline/pipeline.py
Outdated
|
||
|
||
def get_mapped_node_names( | ||
graph: Pipeline, key: type, index_names: Sequence[Hashable] | None = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a graph or a pipeline? We should be consistent with names.
@@ -194,3 +199,93 @@ def bind_and_call( | |||
def _repr_html_(self) -> str: | |||
nodes = ((key, data) for key, data in self._graph.nodes.items()) | |||
return pipeline_html_repr(nodes) | |||
|
|||
|
|||
def get_mapped_node_names( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'name' or 'key'? What is the difference?
src/sciline/pipeline.py
Outdated
node for node in candidates if set(node.indices) == set(index_names) | ||
] | ||
if len(candidates) == 0: | ||
raise ValueError(f"'{key}' is not a mapped node.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this check be before filtering by index names? It seems that key
refers to a mapped node but that is filtered out by the index names.
Co-authored-by: Jan-Lukas Wynen <[email protected]>
This fixes a big UX hurdle when computing results that use mapped nodes.
We will later on consider if this should be integrated more tightly with
Pipeline.compute
, for now this should serve as a solution that is 80% there an can be used to gather more insights/experience.