Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Big rewrite #117

Merged
merged 99 commits into from
Apr 5, 2023
Merged

Big rewrite #117

merged 99 commits into from
Apr 5, 2023

Conversation

SimonHeybrock
Copy link
Member

@SimonHeybrock SimonHeybrock commented Mar 24, 2023

This adds, as scippnexus.v2, the future version of ScippNexus. The trigger for this rewrite were significant performance issues with small files that contain thousands of groups and datasets. However, it also attempts to resolve some complicated logic of the old implementation, which did lead, e.g., to hard to understand stack traces in exceptions.

To use this, just try import scippnexus.v2 as snx. The basic API is the same as before.

Apart from internal design and significant performance improvements (not affecting large datasets), there are a number of changes:

  • Fallback to loading as DataGroup is not better at preserving "partial" information, if assembly fails.
  • Transformation chains are not executed and not resolved. We will provide post-processing functions for this.
  • NX*geometry classes are loaded as DataGroup without processing. We will provide post-processing functions for grouping this by detector_number, etc.
  • NXevent_data is not automatically grouped by detector_number. There is a post-processing function for this, but we will need to add some that operate recursively.
  • Error-fields are "absorbed" into data fields. Accessing them independently is not possible any more. May change my mind about this again...

For review:

  • Start with base.py. class File and its helper functions are mostly (but not completely) unchanged.
  • Continue with nxdata.py.
  • Look at the rest in any order.
  • There is some unused code in nxtransformations.py, which I plan to refactor in a follow-up into a couple of user-facing functions for resolving and executing depends_on-chains.

Bugfixes:

  • Fields with shape=(0,) are now correctly loaded as datetime if they should. Previously this always returned a time (offset).
  • Fields with shape=(0,) are now correctly loaded as transformation if they should. Previously this always returned the plain value of the transformation.

Add error handling

Fix reading of fields

Add NXobject subclass handling

Refactor for more automatic child handling

Introduce multiple map layers, I think this does not work

Refactor to new approach

Customize field dims via strategy

Refactor

Rename

Basic NXdata support

Cleanup

Absorb errors handling into Field

Cleanup

Allow for dtype override

Pass (base) definitions from parents

Translate sel for children

Bin-edge handling

Undo change to unrelated file
@SimonHeybrock
Copy link
Member Author

Should be ready for another review @jl-wynen.

Copy link
Member

@jl-wynen jl-wynen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have gone through all files now.

tests/nxtransformations_test.py Outdated Show resolved Hide resolved
src/scippnexus/v2/nxdata.py Outdated Show resolved Hide resolved
src/scippnexus/v2/nxdata.py Outdated Show resolved Hide resolved
tests/nx2_test.py Outdated Show resolved Hide resolved
tests/nx2_test.py Outdated Show resolved Hide resolved
@SimonHeybrock
Copy link
Member Author

Tracking next steps in #120.

@SimonHeybrock
Copy link
Member Author

@jl-wynen I fixed another small issue, and pushed some extra horrible code to deal with event-data fields embedded in NXdetector, to somewhat restore the ability to load SNS files. I am not planning further changes in this PR, so this is ready for a "final" look.

@SimonHeybrock SimonHeybrock requested a review from jl-wynen April 5, 2023 07:58
@SimonHeybrock SimonHeybrock merged commit 40e4d3b into main Apr 5, 2023
@SimonHeybrock SimonHeybrock deleted the nx2 branch April 5, 2023 08:18
SimonHeybrock added a commit that referenced this pull request Nov 2, 2023
I believe this was added in #117 when considering to not group by pixel
during loading. In the end that was not the chosen solution, but
apparently I forgot to remove this now unused helper.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants