Functionality to write ORSO dataset from combination of several runs #87

Closed
jokasimr opened this issue Oct 4, 2024 · 3 comments · Fixed by #92

@jokasimr
Contributor

jokasimr commented Oct 4, 2024

Currently the Orso provider looks for the NormalizedIofQ in the workflow and produces an ORSO dataset from it.
But often we want to put several curves in one ORSO file, either as stitched curves or as individual curves stacked in the file.

Some things are not clear and will have to be figured out:

  • Does ORSO already have support for listing several reflectometry curves in the same file?
  • How is the metadata from different runs combined?
  • How is the Q-resolution from different runs combined?

It's possible that we will have to do a simple version for now and revisit this when we know more.

@paracini

paracini commented Oct 8, 2024

Let's start by putting multiple non-stitched angles measured on the same sample into one ORSO file.
Some clarifications:

  • ORSO support for multiple datasets in a single file: https://www.reflectometry.org/advanced_and_expert_level/file_format#multiple-data-sets-1
  • Metadata: There are no strict indications on how to treat the metadata in the case of multiple datasets, so there is flexibility. Let's start by having most information in the main header, such as the identifier of each of the reduced reflectivity curves contained in the file and what they are (e.g. what sample rotation was used for each one in the case of measurements at different angles). This parameter can then be repeated as a single-line header above the 4 columns of each dataset for readability.
  • Resolution: The resolution is reported for each individual curve. The main reason for keeping the datasets separate is that they have different Q-resolution in the overlapping regions, so they should not be combined.
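
As a rough illustration of the layout proposed above, the sketch below formats several unstitched curves into one text block. Each curve keeps its own four columns (Qz, R, sR, sQz) and gets a one-line header repeating its sample rotation. This is plain Python and does not use orsopy. The function name `format_curves`, the dictionary keys, the `# data_sets:` summary key and the example numbers are all hypothetical, and the `# data_set:` separator line is an assumption based on the linked ORSO page, not a verified implementation of the format.

```python
def format_curves(curves):
    """Return text with one block per curve; curves are never merged."""
    # Main header: list the identifier and sample rotation of every curve.
    # The "# data_sets:" key is a hypothetical name used for illustration.
    lines = ["# data_sets:"]
    for c in curves:
        lines.append(
            f"#   - {c['name']}: sample_rotation = {c['sample_rotation_deg']} deg"
        )
    for c in curves:
        # Separator line that starts a new dataset (assumed form).
        lines.append(f"# data_set: {c['name']}")
        # Repeat the distinguishing parameter right above the columns.
        lines.append(f"# sample_rotation: {c['sample_rotation_deg']} deg")
        lines.append("# Qz (1/angstrom)  R  sR  sQz (1/angstrom)")
        for q, r, sr, sq in c["rows"]:
            # sQz stays attached to its own curve, so resolutions are not mixed.
            lines.append(f"{q:.6e} {r:.6e} {sr:.6e} {sq:.6e}")
    return "\n".join(lines) + "\n"


# Made-up example values, two angles with an overlapping Qz point.
curves = [
    {"name": "angle_1", "sample_rotation_deg": 0.5,
     "rows": [(0.010, 1.0, 0.01, 0.0005), (0.020, 0.8, 0.01, 0.0010)]},
    {"name": "angle_2", "sample_rotation_deg": 2.0,
     "rows": [(0.020, 0.79, 0.02, 0.0020), (0.040, 0.1, 0.01, 0.0030)]},
]
text = format_curves(curves)
print(text)
```

Both curves contain a point at Qz = 0.020, but with different sQz, which is why they are written as separate blocks instead of being averaged.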

@jl-wynen
Member

jl-wynen commented Oct 9, 2024

There is actually an explanation for how to encode metadata:

overwrite meta data

Below the separator line, metadata might be added. These overwrite the metadata supplied in the initial main header (i.e. data set 2 does not know anything about the changes made for data set 1 but keeps any values from data set 0 (the header) which is not overwritten).

And note the example below this block that shows how multiple input files might be combined.
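
The overwrite rule quoted above can be sketched as a merge in which every dataset starts from the main header only, never from the previous dataset. This is a minimal sketch, not orsopy's behaviour: the function `effective_metadata` and all metadata keys are hypothetical, and merging nested keys recursively (instead of replacing a whole top-level entry) is an assumption.

```python
import copy


def effective_metadata(main_header, overrides):
    """Apply one dataset's overrides on top of the main header only."""
    out = copy.deepcopy(main_header)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            # Assumption: nested entries are merged key by key.
            out[key] = effective_metadata(out[key], value)
        else:
            out[key] = copy.deepcopy(value)
    return out


# Hypothetical keys and values, for illustration only.
main_header = {
    "sample": {"name": "sample_A"},
    "settings": {"sample_rotation_deg": 0.5, "polarization": "unpolarized"},
}
per_dataset_overrides = [
    {},                                           # data set 0: the header itself
    {"settings": {"polarization": "p"}},          # data set 1
    {"settings": {"sample_rotation_deg": 2.0}},   # data set 2
]
resolved = [effective_metadata(main_header, o) for o in per_dataset_overrides]

# Data set 2 does not see the polarization change made for data set 1.
print(resolved[2]["settings"])
```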

@paracini

paracini commented Oct 9, 2024

Thanks, I should read the links I post. In addition, the main header should have a short section that lists all the measurements present in the file. Currently the main header has:
# data_files: raw data from sample
# - file: file name or identifier doi

Since these are then overwritten by the specific metadata above each dataset, the main header should contain a list of identifiers of all the data files present in the file.
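
One way to picture this: collect the file identifiers of every run and list them all once under `data_files` in the main header. The sketch below is plain Python with hypothetical names throughout (the function `main_header_data_files`, the `runs` structure, and the file names), and it only mirrors the `data_files` / `file` layout shown above.

```python
def main_header_data_files(runs):
    """Collect every run's file identifier once, in first-seen order."""
    seen = []
    for run in runs:
        for name in run["files"]:
            if name not in seen:
                seen.append(name)
    return {"data_files": [{"file": name} for name in seen]}


# Hypothetical runs; each dataset may come from one or more raw files.
runs = [
    {"data_set": "angle_1", "files": ["run_0001.hdf"]},
    {"data_set": "angle_2", "files": ["run_0002.hdf", "run_0003.hdf"]},
]
header = main_header_data_files(runs)

print("# data_files:")
for entry in header["data_files"]:
    print(f"# - file: {entry['file']}")
```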

3 participants