Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(issue 492) NXcanSAS v1.0 release #516

Merged
merged 35 commits into from
Dec 16, 2016
Merged

(issue 492) NXcanSAS v1.0 release #516

merged 35 commits into from
Dec 16, 2016

Conversation

prjemian
Copy link
Contributor

@prjemian prjemian commented Dec 1, 2016

Reviewers: check your box after you have reviewed the NXcanSAS NXDL code

Copy of HTML documentation:
NXcanSAS.pdf

@prjemian
Copy link
Contributor Author

prjemian commented Dec 1, 2016

Sent emails to each reviewer (@ajj, @butlerpd, @smk78, @toqduj, and Adrian Rennie) requesting their review.

@toqduj
Copy link

toqduj commented Dec 2, 2016

Hi. Just a few small items based on the HTML documentation:

In the section on Q uncertainties:
"Generally, this is the estimate of the uncertainty of each Q. Typically the estimated standard deviation."
should be changed to:
"This is the estimate of the uncertainty of each Q. By default, this will be interpreted to be the estimated standard deviation. In special cases, when a standard deviation cannot possibly be used [note: do such cases exist?], its value can specify another measure of distribution width."
Note that a standard deviation always exists for a population of values, so I do not know if that last clause should be in there or not.

In the section on resolution:
Defining the mean value µ and the standard deviation s, the span described by µ-s to µ+s contains 68% of the population (the standard deviation is, in effect, a half-width of the distribution). That means that the sentence introducing the distribution types in the resolution, which now states:
"Square : note that the width of the square would be ~1.4 times the standard deviation specified in the vector",
should state:
"Square : note that the full width of the square would be ~2.9 times the standard deviation specified in the vector"

On the intensity vector I:
Do we want to stick with typical units in centimetres, or give them proper SI units of metres? That would make the typical units 1/m/sr and m^2. Since both 1/cm and 1/m are commonly applied units, I think we should go for the more correct one. Don't worry, I'm not going to argue (anymore) for using 1/m for Q. That one "should" be 1/nm to stick with common use (ugh).

I/@uncertainties�:
"For Poisson statistics, use 1/√I."
-->
"For Poisson statistics, can not be less than √I."

On dQw/dQl:
w and l specify directions that are ill-defined in 2D systems. Can we modify this to correspond to x and y?

On Qmean:
Is there a possibility for both Q and Qmean to be specified? If so, perhaps an example in the text?

General remark: are the directions x, y specified and/or in accordance with the NeXus coordinate system definitions?

Other than that, I think it looks very good!

@ajj
Copy link

ajj commented Dec 2, 2016

Docs for @uncertainties on Q : agree with @toqduj on rewording. I would prefer people to use the natural definition for each distribution e.g. Full Width would be the natural one for "Square" rather than having to convert to/from standard deviation. However, that gets thorny quickly definition wise.

Do we allow similar syntax for @uncertainties - since we might have components and their sources. Tend to think this should always be standard deviation if it is to be interpreted as an "error bar" in the normal usage.

Docs for @resolution: Says "This may be used to describe the slit-length at each datum" after the Qdev example. This is a bit confusing as Qdev is not the slit-length. I suggest removing that sentence.

Otherwise looks OK to me.

@ajj
Copy link

ajj commented Dec 2, 2016

@toqduj -> typical units in SANS are 1/Å for Q and 1/cm for I(q). At least 1/m would annoy everyone equally ;)

@toqduj -> Q & Qmean -> Yes, always in fact. Shouldn't give Qmean without Q (derived from NIST usage where Qmean is centre for resolution distribution which might be different from nominal Q value of bin).

@toqduj
Copy link

toqduj commented Dec 2, 2016

Regarding @ajj in here
in case you want to use "full width" instead of standard deviation for a square distribution, you will need to specify for every distribution shape what the uncertainties values indicate. This, as you say, gets thorny quickly in the interpreting code. I'd recommend sticking to standard deviations, nicely related to the normal use of the error bar.

As for Q in units of 1/m, there is something nice about not having to worry about the meaning of your reciprocal length variable to figure out which units it should be in. I now use meters and reciprocal meters throughout the McSAS code, which avoids having to pepper the code with conversion factors (actually, we have a nice "units" library, which converts all input to SI units for internal use). These days, an exponent of 10^9 (or its square) is no problem in storing or programming anymore, at least in modern programming languages.

As an added bonus, all your intensity-vs-q plots are now in 1/m vs 1/m units, which nicely cancels out :).

@smk78
Copy link

smk78 commented Dec 2, 2016

toqdui said: 'I now use meters and reciprocal meters throughout the McSAS code, which avoids having to pepper the code with conversion factors (actually, we have a nice "units" library, which converts all input to SI units for internal use).'

I think we need to remember that what we are putting the finishing touches to here is a file format, and not an attempt to dictate how the global SAS community ought/should/could represent their data. To mandate that NXcanSAS only accept data that is 1/m vs 1/m will mean that either every program that writes/reads data will need modifying, or that no one - except maybe toqdui :-) - ever uses NXcanSAS, in which case we are wasting our time. Sorry if this sounds harsh, but we have to be pragmatic. A file format is a data wrapper, it should accept as wide a variety of data (of the specified type) as possible provided that data can be described accordingly.

@smk78
Copy link

smk78 commented Dec 2, 2016

To prjemian: You’ve previously also circulated a structure file. Can you post the latest current example, please? Thanks.

@prjemian
Copy link
Contributor Author

prjemian commented Dec 2, 2016

Here is the structure of the current revision:

NXcanSAS
  entry : NXentry
    @canSAS_class : NX_CHAR = SASentry
    @default : NX_CHAR
    @version : NX_CHAR = 1.0
    definition = NXcanSAS
    run
    title
    collection : NXcollection
      @canSAS_class : NX_CHAR = SASnote
    data : NXdata
      @I_axes : NX_CHAR
      @Mask_indices : NX_CHAR
      @Q_indices : NX_INT
      @canSAS_class : NX_CHAR = SASdata
      @signal : NX_CHAR = I
      I : NX_NUMBER
      Idev : NX_NUMBER
      Q : NX_NUMBER
      Qdev : NX_NUMBER
      Qmean : NX_NUMBER
      ShadowFactor
      dQl : NX_NUMBER
      dQw : NX_NUMBER
    data : NXdata
      @T_axes : NX_CHAR = T
      @canSAS_class : NX_CHAR = SAStransmission_spectrum
      @name : NX_CHAR
      @signal : NX_CHAR = T
      @timestamp : NX_DATE_TIME
      T : NX_NUMBER
      Tdev : NX_NUMBER
      lambda : NX_NUMBER
    instrument : NXinstrument
      @canSAS_class : NX_CHAR = SASinstrument
      collimator : NXcollimator
        @canSAS_class : NX_CHAR = SAScollimation
        distance : NX_NUMBER
        length : NX_NUMBER
        aperture : NXaperture
          shape
          x_gap : NX_NUMBER
          y_gap : NX_NUMBER
      detector : NXdetector
        @canSAS_class : NX_CHAR = SASdetector
        SDD : NX_NUMBER
        beam_center_x : NX_FLOAT
        beam_center_y : NX_FLOAT
        name
        pitch
        roll
        slit_length : NX_NUMBER
        x_pixel_size : NX_FLOAT
        x_position
        y_pixel_size : NX_FLOAT
        y_position
        yaw
      source : NXsource
        @canSAS_class : NX_CHAR = SASsource
        beam_shape
        beam_size_x : NX_NUMBER
        beam_size_y : NX_NUMBER
        incident_wavelength : NX_NUMBER
        incident_wavelength_spread : NX_NUMBER
        radiation = Spallation Neutron Source 
                  | Pulsed Reactor Neutron Source 
                  | Reactor Neutron Source 
                  | Synchrotron X-ray Source 
                  | Pulsed Muon Source 
                  | Rotating Anode X-ray 
                  | Fixed Tube X-ray 
                  | UV Laser 
                  | Free-Electron Laser 
                  | Optical Laser 
                  | Ion Source 
                  | UV Plasma Source 
                  | neutron 
                  | x-ray 
                  | muon 
                  | electron 
                  | ultraviolet 
                  | visible light 
                  | positron 
                  | proton
        wavelength_max : NX_NUMBER
        wavelength_min : NX_NUMBER
    process : NXprocess
      @canSAS_class : NX_CHAR = SASprocess
      date : NX_DATE_TIME
      description
      name
      term
      collection : NXcollection
        @canSAS_class : NX_CHAR = SASprocessnote
      note : NXnote
    sample : NXsample
      @canSAS_class : NX_CHAR = SASsample
      details
      name
      pitch
      roll
      temperature : NX_NUMBER
      thickness : NX_FLOAT
      transmission : NX_NUMBER
      x_position
      y_position
      yaw

@toqduj
Copy link

toqduj commented Dec 2, 2016

@smk78 The units are not fixed in any case (that's what the @Units attribute is for). However, there are recommendations for units in the document. If we don't want to recommend any particular form, that's fine too.

Note, however, that any flexibility in the format specification, requires additional programming effort on the ingestion side to deal with. Therefore, specifying something is not so bad either, to avoid having to write for an infinity number of possible cases.

@prjemian
Copy link
Contributor Author

prjemian commented Dec 2, 2016

read this section in the NeXus manual regarding units.
http://download.nexusformat.org/doc/html/datarules.html#nexus-data-units

@ajj
Copy link

ajj commented Dec 2, 2016

@smk78 there was definitely no intention on my part to impose unit usage. It is true that our examples use the customary units from the neutron world (1/cm and 1/A) rather than those from the x-ray world (1/mm or 1/m and 1/nm), but I think that it is clear in the docs that the writers and readers of the data need to ensure that they are properly handling the units as per NeXus standards.

I was being a bit tongue in cheek in my comments to @toqduj about the 1/m for Q - that is one that is not worth fighting. The nice thing about 1/cm or 1/A give values that are range nicely from around 1 and 3 orders up for I and from 1 and 3 orders down for Q, which makes them more "human scale". There is also the calculation precision question and pre-cancelling your exponents.

We need to ensure that the barrier to entry is sufficiently low - if people have to do unit conversions from their existing data workflows they will just ignore this.

I'm happy as is aside from fixing the resolution slit-length part of the docs (would have ticked box, but that doesn't work!). @prjemian - fix the slit-length comment and then count me as a tick!

@toqduj
Copy link

toqduj commented Dec 2, 2016

Likewise, I'm happy with the unit description in the NeXus docs as linked to by @prjemian. Count me ticked off as well, with the minor changes touched on in my first post.

@prjemian
Copy link
Contributor Author

revised PDF and structure files:

@smk78
Copy link

smk78 commented Dec 12, 2016

In NXcanSAS-xture I note that Idev, Q and Qdev all have @Units = NX_PER_LENGTH but that I has no specified @Units. I can sort of understand why this is (eg, SAXS data that's just counts) but then surely Idev would not have any units, so either way there seems to be an inconsistency?

@smk78
Copy link

smk78 commented Dec 12, 2016

Re my comments 7 days ago: why does the data block for SAStransmission_spectrum have @timestamp but the data block for SASdata does not?

@smk78
Copy link

smk78 commented Dec 12, 2016

Wouldn't @Units for Tdev be NX_DIMENSIONLESS?

@smk78
Copy link

smk78 commented Dec 12, 2016

In NXcanSAS.pdf I notice we still have the direction to use 1/SQRT(I) for Poisson statistics. @adrianrennie cautioned against this above.

@smk78
Copy link

smk78 commented Dec 12, 2016

Actually @prjemian , before I get carried away here, is the pdf at https://github.com/nexusformat/definitions/files/645016/NXcanSAS.pdf (your link 9 hours ago) really the revised version? Only I notice it is timestamped 10/19/2016 04:19 PM !!!

##516 (comment)

Should Tdev minOccurs=0?  Is this field optional?
@prjemian
Copy link
Contributor Author

re #516 (comment): confirmed that "Poisson" text is removed from the current version. PDF file is stale.

added @timestamp to SASdata per comment
#516 (comment)
@prjemian
Copy link
Contributor Author

to ensure that HTML is completely rebuilt, use this command from root-level directory:

make {rm,make}builddir

@prjemian
Copy link
Contributor Author

prjemian commented Dec 12, 2016

revised PDF and structure files:

(PDF is timestamped: 12/12/2016 10:46 AM)

@butlerpd
Copy link

Have lost track of where we are with this. Is it released? or still waiting for check marks from me, Steve and Adrian? If the latter I would check but as mentioned somewhere else I apparently have no such privilege ... The only thing I see is that resolution should be plural as there are many sources to the resolution not just one ..... JOKE .. no really ... that was a joke :-)

Seriously feel free to check me off -- As Steve states "we are putting the finishing touches to here is a file format, and not an attempt to dictate how the global SAS community ought/should/could represent their data" -- I think the current state of the spec, as I've said before, represents a giant step forward and now is sufficiently agnostic to issues which may still be controversial to allow future versions to improve on it if and when consensus is reached, and further does not overly prescribe requirements that could stifle innovation. Are there more issues that need addressing before releasing this version?

@prjemian
Copy link
Contributor Author

With a4a0477, butlerpd will be "ticked off".

@smk78 & @adrianrennie, are you both satisfied? I'd like to merge this over the weekend.

@prjemian
Copy link
Contributor Author

revised PDF and structure files:

@adrianrennie
Copy link

A tick can be placed by my name.

@smk78
Copy link

smk78 commented Dec 16, 2016 via email

@butlerpd
Copy link

Right ... teach me to make a joke --- won't happen again.
Thanks Pete for the hard work and perserverence!

@prjemian
Copy link
Contributor Author

Ok, we're all in. Once I figure out the tagging procedure that NeXus has agreed upon, I'll merge.

Thanks, all, for the hard work. And the giggles, too.

@prjemian prjemian merged commit 2b4424e into master Dec 16, 2016
@prjemian prjemian deleted the NXcansas-492 branch December 23, 2016 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants