-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mesh: Remove Data Order #194
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplifying the standard by integrating dataOrder
into axisLabels
improves understandability a lot. The user will have to be aware of this order, but now ambiguities that reference axisLabels
(but not dataOrder
) are resolved.
STANDARD.md
Outdated
same definition) | ||
- description: this attribute assigns for a matrix `A_{i,j,k}` | ||
human-readible namings for the indices `i`, `j` and `k` | ||
- advice to implementors: dimensions shall be ordered from fastest to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This clarification really makes the PR valuable.
If I understand correctly,
|
@mccoys update: we changed the order from originally proposed fastest-to-slowest to slowest-to-fastest. From a Fortran writer perspective, yes just invert the order of those attributes when writing openPMD 2.0 |
There is also an alternative solution to this change, ordering from slowest-to-fastest varying index. With that the existing Fortran-written files would need to change their index and we would generally have everything in C-like order. Due to the dominance of Py/C++ in our code bases, one could decide for that solution and have more consistency with e.g. numpy attributes >:-) |
STANDARD.md
Outdated
information if you need to invert the access to | ||
`axisLabels` (and other attributes that use the | ||
same definition) | ||
- description: this attribute assigns for a matrix `A_{i,j,k}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you clarify the meaning of assigns for
, or maybe use some other phrasing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One possibility is
- description: human-reading labels for the `N` axes of the mesh,
ordered *from fastest to slowest varying index when accessing the
mesh contiguously (as 1D flattened logical memory)`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I wanted to write assigns human-readible namings for the indices ... to a matrix ...
@RemiLehe is it ok for you as well if I change the order to slowest to fastest varying index? :) VC: ok with all. |
c8bc604
to
08a6d76
Compare
all right, I updated the PR and PR description! :) |
The data order for indices arises naturally from its flattened memory layout. This also holds true for logical access to the data.
08a6d76
to
a6a5b53
Compare
STANDARD.md
Outdated
a matrix `A[i,j,k]` will have its first index | ||
as the slowest varying index (e.g. `i`); | ||
- if you access a ND array Fortran-like, | ||
a matrix `A[i,j,k]` will have its last index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fortran spells arrays A(i,j,k)
STANDARD.md
Outdated
its slowest varying index name and `axisLabels`: `("z", "y", "x")` | ||
- 3D `cartesian` mesh accessed in C-like as `A[x,y,z]` will have `x` as | ||
its slowest varying index name and `axisLabels`: `("x", "y", "z")` | ||
- 2D `cartesian` mesh accessed in Fortran-like as `A[x,y]` will have `y` as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A(x,y)
STANDARD.md
Outdated
its slowest varying index name and `axisLabels`: `("x", "y", "z")` | ||
- 2D `cartesian` mesh accessed in Fortran-like as `A[x,y]` will have `y` as | ||
its slowest varying index name and `axisLabels`: `("y", "x")` | ||
- `thetaMode` accessed Fortran-like `A[r,z]`, `axisLabels`: `("z", "r")` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A(r,z)
@RemiLehe this PR should be ready to merge. Validator and Updater are up-to-date now :) |
The data order for indices arises naturally from its flattened memory layout.
This also holds true for logical access to the data.
This affects the base standard attributes:
axisLabels
gridSpacing
gridGlobalOffset
shape
of constant record componentsand the ED-PIC extension attributes:
fieldBoundary
particleBoundary
Implements issues: #125 and #129
Description
dataOrder
is superfluous.By defining 1D-flattened dimensions like
axisLabel
are ordered by slowest to fastest varying index one can omitdataOrder
. Slowest to fastest varying index here only means the logical flattened memory layout that one assumes for this data format.This is possible since one does not care how data was written but just about how to label/name indices when reading data. E.g., one would increase from fastest to slowest dimension for all flattened descriptions over dimensions to read data efficiently without seeking.
Let's have a C-like to Fortran-like example
A C-matrix
A[i,j,k]
is written withi
index standing for"x"
,j
for"y"
andk
for"z"
. The associatedaxisLabels
is in the index order k-j-i (slowest is first) which is("x", "y", "z")
.A Fortran-reader reads this matrix as
M(a,b,c)
with the index data order a-b-c (c
is varies slowest). TheaxisLabels
attribute reads("x", "y", "z")
. So one will assign the fastest varying indexc
the label"z"
,b
the label"y"
and the fastest varying indexa
an"x"
. Index order and index label/name conserved - hurray! ✨Affected Components
base
EXT: ED-PIC
Logic Changes
Before one had to to write the access-index order of the writing code. Now one writes the labels in the order "slowest-to-fastest varying index".
This does not change C-style, but Fortran-style access.
Writer Changes
How does this change affect data writers?
Yes, Fortran-style writers need to invert their order now since the fastest varying index is the last.
C-style writers (C/C++/Python/Java/...) can just omit the
dataOrder
attribute now.Unaffected besides the now removed attribute:
PIConGPU
(C++): https://github.com/ComputationalRadiationPhysics/picongpu/...Warp
(Python-Writer): ...FBPIC
(Python): ...Smilei
(C++): https://github.com/SmileiPIC/SmileiSIMEX Platform
(Python): https://github.com/eucall-software/simex_platformopenPMD-api
(C++): https://github.com/openPMD/openPMD-api/...Does this pull request change the interpretation of existing data writers?
Fortran/Matlab/R/Julia codes: ...
Reader Changes
How does this change affect data readers?
Yes, all readers can now just label/name flattened dimension accesses to e.g. labels of indices in the order "slowest-to-fastest index".
What would a reader need to change? Link implementation examples!
openPMD-validator
: Remove Data Order openPMD-validator#42openPMD-viewer
: https://github.com/openPMD/openPMD-viewer/...postpic
: https://github.com/skuschel/postpicyt
: https://github.com/yt-project/yt/...VisIt
: https://github.com/openPMD/openPMD-visit-pluginopenPMD-api
: https://github.com/openPMD/openPMD-api/...Data Converter
Mesh records that declare
dataOrder='F'
need to invert the attributes mentioned above.The unused
dataOrder
attribute should be removed as well.