-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add check to determine if graph uses all avaliable parameters #86
Conversation
def _kind_of_provider(p: Callable[..., Any]) -> str: | ||
if _qualname(p) == f'{_qualname(Pipeline.__setitem__)}.<locals>.<lambda>': | ||
return 'parameter' | ||
if _qualname(p) == f'{_qualname(Pipeline.set_param_table)}.<locals>.<lambda>': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it be better to make a dummy Pipeline
with a parameter and a table
and retrieve their names, and use them to check these instead?
Or if you add the passing case unit test I mentioned, it might be fine like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand what you mean. Do you mean to compute the expected _qualname
s for each of the different kinds of providers, instead of hardcoding something like f'{_qualname(Pipeline.set_param_table)}.<locals>.<lambda>'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No sth like..
_DUMMY_PIPELINE = sl.Pipeline(params={int: 0})
_PARAMETER_PROVIDER_NAME = _DUMMY_PIPELINE._get_provider(int)[0].__qualname__
pipeline[Unused] = 'Should raise' | ||
assert pipeline.compute(float) == 1.5 | ||
with pytest.raises(RuntimeError): | ||
assert pipeline.compute(float, check=True) == 1.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add a test that pass,
even if the compute
passes with the check=True
?
@@ -710,6 +725,8 @@ def get( | |||
During development and debugging it can be helpful to use a handler that | |||
raises an exception only when the graph is computed. This can be achieved | |||
by passing :py:class:`HandleAsComputeTimeException` as the handler. | |||
check: | |||
Checks if the constructed graph uses all available parameters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #43, detecting missing providers was also mentioned, so maybe we need a more specific name for unused parameters check here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need a more meaningful name.
if check: | ||
all_providers = chain( | ||
( | ||
(p[c], v) | ||
for p in self._subproviders | ||
for c, v in self._subproviders[p].items() | ||
), | ||
((p, v) for p, v in self._providers.items()), | ||
) | ||
|
||
for p, v in all_providers: | ||
if p not in graph and _kind_of_provider(v) == 'parameter': | ||
raise RuntimeError( | ||
'Graph was expected to use all available parameters.' | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we take a step back, I am not sure this is sufficiently capturing the problem. In general, I think what we want to do is check whether a graph is connected, I suppose? Raising errors about individual disconnected pieces may not be meaningful?
Instead of the check
flag and handling things via exceptions, would it be better to return some list (or other data structure) of disconnected pieces of the graph? This might be a separate method, instead of extending the compute
and get
methods with keyword args.
See #74 (comment), I suggest to put this on hold so we can have a think. |
Fixes #43.
I wasn't sure if this should be added to the task graph rather than to the pipeline. What do you think?
I think there is also an opportunity here to extract the functionality of retrieving all providers to a helper method, it's also used in the
_repr_html_
for example, but I didn't want to complicate things for now.