[NIAC 2018 Proposals] Clarify/generalize use of uncertainties #27

Closed
vasole opened this issue Mar 23, 2018 · 14 comments

@vasole

vasole commented Mar 23, 2018

Rationale

In recent presentations at the Research Data Alliance meeting in Berlin, the subject of uncertainties associated with the data was mentioned, in particular in the context of application definitions.

Reading the documentation at:

http://download.nexusformat.org/doc/html/design.html?higlight=uncertainty#design-fields

and

http://download.nexusformat.org/doc/html/classes/base_classes/NXdata.html#nxdata

I see two possible ways of specifying uncertainties for a dataset, but it is not clear to me whether appending _errors to a dataset name is to be considered the official generic solution. To me it looks more like an example. If it is the official solution, please make that absolutely clear in the documentation and this issue can be closed.

Proposal

  • Decide on the proper way to associate uncertainties with datasets:

    • One way (recommend only one: either appending _errors or using the uncertainties attribute)
    • Two ways (both ways recommended)
    • No way (specific to application definitions, so each application definition decides)

My View

I can see arguments for and against each of the above, and I could easily defend any of them.

I only ask you to take into account in your evaluation the fact that we will most likely be dealing with links (internal or even external).

@vasole
Author

vasole commented Apr 24, 2018

Just to clarify: the page describing NXdata alone presents three different ways of associating uncertainties:

http://download.nexusformat.org/doc/html/classes/base_classes/NXdata.html#nxdata

prjemian added a commit to nexusformat/definitions that referenced this issue Apr 24, 2018
@prjemian
Contributor

And even the link needs to be updated (things change). Here's the new address of that proposition:

http://www.nexusformat.org/2014_axes_and_uncertainties

Since then, things have changed; for example, uncertainty became the plural uncertainties.

@prjemian
Contributor

NeXus and the NIAC are very conservative in many ways. The existence at this time of more than one way to express uncertainties is an example of such conservatism: a trade-off between preserving backwards compatibility and providing clear directions. This proposition came from developments by canSAS to develop a durable data file standard for communicating reduced small-angle scattering data.

The use of an @uncertainties attribute (to declare explicitly the name of the field holding such data) was an improvement over the existing method of searching for a field called FIELD_errors. With the @uncertainties attribute proposition came the suggestion that NeXus drop FIELD_errors.
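
To make the difference concrete, here is a minimal sketch of how a reader might try the explicit attribute first and fall back to the older naming convention. It is not taken from any NeXus library: plain Python dicts stand in for an HDF5 group, and the function name `find_uncertainties` and the field names `I` and `Idev` are hypothetical.

```python
def find_uncertainties(group, field_name):
    """Return the name of the field holding uncertainties, or None."""
    field = group[field_name]
    # Preferred: the field declares its uncertainties explicitly.
    declared = field.get("attrs", {}).get("uncertainties")
    if declared is not None and declared in group:
        return declared
    # Fallback: search for a sibling named FIELD_errors (older convention).
    legacy = field_name + "_errors"
    if legacy in group:
        return legacy
    return None


# Hypothetical groups: one written with the attribute, one with the suffix.
new_style = {
    "I": {"attrs": {"uncertainties": "Idev"}, "data": [10.0, 12.0]},
    "Idev": {"attrs": {}, "data": [0.5, 0.6]},
}
old_style = {
    "I": {"attrs": {}, "data": [10.0, 12.0]},
    "I_errors": {"attrs": {}, "data": [0.5, 0.6]},
}

print(find_uncertainties(new_style, "I"))
print(find_uncertainties(old_style, "I"))
```

With the attribute, the uncertainties field can have any name; with the suffix convention, the reader has to guess the name from the data field.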

To be conservative, NeXus should first deprecate the use of FIELD_errors and allow time for data writers (and readers) to refactor their code and use the @uncertainties attribute. Now that we have a versioning scheme in place for the NeXus definitions and schema, the burden of determining which version of the NeXus standard has been used in a given data file should become easier.

@prjemian
Contributor

Considering the use of links and the @uncertainties attribute, there must be a field so named in the same group, either as a dataset or a link (internal or external).

One complication is that this relationship must be considered when linking a field. If that field has the @uncertainties attribute, the field named in the attribute must also be linked.
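
A rough sketch of that rule, again with plain dicts standing in for groups (this is an illustration, not h5py or a NeXus API; `link_with_uncertainties` and the field names are hypothetical, and sharing the same dict object stands in for a link):

```python
def link_with_uncertainties(source, target, field_name):
    """Link field_name from source into target, plus its uncertainties field."""
    field = source[field_name]
    target[field_name] = field  # same object, standing in for a link
    companion = field["attrs"].get("uncertainties")
    if companion is not None:
        if companion not in source:
            raise KeyError(
                f"{field_name!r} declares a missing uncertainties field {companion!r}"
            )
        # The field named in the attribute must be linked as well.
        target[companion] = source[companion]
    return target


detector = {
    "I": {"attrs": {"uncertainties": "Idev"}, "data": [10.0, 12.0]},
    "Idev": {"attrs": {}, "data": [0.5, 0.6]},
}
plot_group = link_with_uncertainties(detector, {}, "I")
print(sorted(plot_group))
```

Linking only `I` would leave the attribute in the destination group pointing at a name that does not exist there.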

@vasole
Author

vasole commented May 4, 2018

Good point.

On the one hand, one can think of the uncertainty as associated with a dataset, but then the use of links makes things difficult to work with. If one makes sure the @uncertainties name is resolved relative to the target, it may work. I guess reading software will end up looking for the @uncertainties dataset relative to both the link and the target until it is found.
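
Something like the following is what I have in mind for the reading side. It is only a sketch under assumptions: a file is modelled as a dict of group paths, a link as a `link_to` entry holding a hypothetical (group path, field name) pair, and only a single level of linking is followed.

```python
def resolve_uncertainties(groups, group_path, field_name):
    """Return (group_path, field_name) of the uncertainties, or None."""
    entry = groups[group_path][field_name]
    # Follow one level of link; a plain dataset is its own target.
    target_path, target_name = entry.get("link_to", (group_path, field_name))
    target = groups[target_path][target_name]
    declared = target["attrs"].get("uncertainties")
    if declared is None:
        return None
    # Look next to the link first, then next to the link's target.
    for path in (group_path, target_path):
        if declared in groups[path]:
            return (path, declared)
    return None


groups = {
    "/entry/detector": {
        "I": {"attrs": {"uncertainties": "Idev"}},
        "Idev": {"attrs": {}},
    },
    # Only I was linked here; Idev was left behind in the detector group.
    "/entry/data": {"I": {"link_to": ("/entry/detector", "I")}},
}
print(resolve_uncertainties(groups, "/entry/data", "I"))
```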

@vasole
Author

vasole commented May 4, 2018

So, if I have understood things correctly, the most recent solution historically is to use the attribute @uncertainties associated with a dataset.

Since it is the most recent, can a recommendation be made about it? I would be happy to tell my colleagues what the recommended way is.

@prjemian
Contributor

prjemian commented May 4, 2018

👍

@rayosborn

Some of this was addressed in a PR, which is still pending, probably because I missed a couple of telcos: nexusformat/definitions#602. Also, note that uncertainties is lower-case.

@vasole
Author

vasole commented May 4, 2018

Thanks.

Yes, it is lower case, but when I write @uncertainties GitHub capitalizes it. I will have to use quotes each time...

@zjttoefs
Contributor

zjttoefs commented May 8, 2018

There is an open issue for this: nexusformat/definitions#370

@prjemian
Contributor

prjemian commented May 8, 2018

But definitions#370 is about documentation. This issue is for the NIAC to make a clear decision about using uncertainties as "the official generic solution".

@zjttoefs
Contributor

zjttoefs commented May 8, 2018

We made the decision in 2010 as you commented in the definitions ticket. Unless there are problems with that official generic solution this is nothing but a documentation issue.

@vasole
Author

vasole commented May 8, 2018

If the use of @uncertainties is the recommended way, then please state it clearly in the documentation and mark all other possibilities (errors dataset in NXdata, use of VARIABLE_errors, ...) as deprecated.

@prjemian
Contributor

Closing this now, since a definitions issue has been created.

@prjemian prjemian added this to the NIAC 2018 milestone Oct 12, 2020

4 participants