-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve handling of event data in NXdetector #123
Conversation
Is the dense data typically used as a histogram of the events? If so, there should be an option to select which one to load. Or at least a postprocessing function. |
@@ -84,15 +86,24 @@ def _init_signal(self, name: Optional[str], children): | |||
if name is not None and name in children: | |||
self._signal_name = name | |||
self._signal = children[name] | |||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fishy. There are now 3 blocks that each can set self._signal
regardless of whether it is already set. So if more than one signal-detection method succeeds, only the result of the last will be used.
Is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this is intentional and I fear also required (even though I am far from happy about this), and not new (just added one more case, in a sense). For example, McStas NeXus files use the "legacy" mechanism for defining the signal. However, for some reason (bad luck?) they also define the non-legacy signal
group-attr, but with an unrelated value. Therefore, the search continues, even if that is found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So can all 3 methods appear in the same file? Or only the 2 that McStas uses?
Can you reverse the order and return after the first signal was found? This would make it clearer that there are multiple, independent encodings. As it stands, one has to read everything in detail to understand that they don't depend on each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And can you log which one gets used to help with debugging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ScippNexus does not log. Printing warnings is also quite user-unfriendly, as there can easily be hundreds of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting to change the current behavior?
No. Why would this change behaviour? From what I understood, the 3rd block will override the signal set earlier (if it finds one). And it will always set aux signals. So moving this to the top and returning if it find a signal should be the same behaviour, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I understood, the 3rd block will override the signal set earlier (if it finds one). And it will always set aux signals. So moving this to the top and returning if it find a signal should be the same behaviour, no?
No. It does not override previously found signals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I see. How about
self._aux_signals = self._find_aux_signals()
if self._signal is None:
self._signal_name = self._aux_signals.pop(0) # or w/o 0 if the order is irrelevant
self._signal = children[self._signal_name]
to make it more explicit?
I keep asking for changes because I am, as you can tell, confused by this function and the precise logic behind it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am considering changing the mechanism such that a user has to select via definitions which "version" of the standard they want to use. The would prevent support for mixed files though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add comments on #126, as that changed this function some more.
src/scippnexus/v2/nxdata.py
Outdated
sizes = dict(self._grouping.sizes) | ||
sizes.update(self._event_data.sizes) | ||
return sizes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sizes = dict(self._grouping.sizes) | |
sizes.update(self._event_data.sizes) | |
return sizes | |
return {**self._grouping.sizes, **self._event_data.sizes} |
def __init__(self, event_data: Group, grouping_name: str, grouping: Field) -> None: | ||
self._event_data = event_data | ||
self._grouping_name = grouping_name | ||
self._grouping = grouping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what 'grouping' means?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a docstring.
I guess that the dense data is typically just meant as a "preview" of what you would get when histogramming the events. But good luck finding that documented anywhere. It is not even clear if having both is legal. |
Based on your answer, I suspect that it will be most useful to return the events as a data array and log that there was dense data that was not read. But this is just a hunch |
This resolves a number of inconsistencies and broken things. For example, slicing pixels works now. Behavior wise, we are now closer to the pre-v2 implementation.
When an NXdetector contains both dense data and events, this is not returning a
Dataset
. I don't know if this is useful in practice, or if we should just go back to returning the events by default?