First submission for NUrF at LoKI #129

Open
wants to merge 59 commits into base: main

Conversation

@gudlot commented Jun 21, 2022

This is my first submission for the NUrF module. Work in progress. nurf.py contains all functions and scipp_947.ipynb contains a Jupyter notebook.
This is a PR to open the discussion on code for the NUrF module. New_fluo_plot_function is my current feature development branch.

@SimonHeybrock (Member) left a comment

Thanks for the contribution, you have obviously put a lot of effort into this 👍

See comments and questions below. I have at this point not looked into all the details of the more specific functions, since I anticipate that they may change significantly if/once some of my more general comments are addressed.

Comment on lines 22 to 24
print("This is scippneutron ", scn.__version__)
print("This is scippnexus ", snx.__version__)
print("This is scipp ", sc.__version__)
Member

We should not have print statements at module-level. If output is needed we should use the logging facilities. I would not advocate logging the versions though.

Member

The versions get logged when you configure the ess logger for a workflow. Have a look at cell 2 in https://scipp.github.io/ess/instruments/amor/amor_reduction.html

But this should be done in a workflow, not in a module like nurf.py.
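To illustrate the suggestion, a minimal sketch of the logging-based alternative (the function name and message are hypothetical, not from the PR):

```python
import logging

# Module-level logger; the module itself configures no handlers or levels --
# that is left to the workflow/application that imports it.
logger = logging.getLogger(__name__)


def load_spectra(filename):
    # Hypothetical example function: emit a debug-level message on demand,
    # instead of printing at import time.
    logger.debug("Loading spectra from %s", filename)
    return filename  # placeholder for the actual loading logic
```

The workflow (e.g. a notebook) then decides whether these messages are shown by configuring the logger, so importing the module stays silent.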

Author

Ok, I will remove them. They were more for myself. I am not familiar with the logging facilities, but would like to see them.

Comment on lines 27 to 28
def complete_fname(scan_numbers):
"""Converts a list of input numbers to a filename uses at ILL.
Member

Is this relevant for ESS? If it is just for testing now while you have sample files from ILL I think we should consider moving it elsewhere?

Author

No. That is a function I wrote for my ILL test data. I could put all ILL related functions in a different .py file.


Parameters
----------
scan_numbers: list of int or a single int
Member

We are using a Sphinx extension so the type information here is handled automatically based on type-hints. That is, here we should just have:

Suggested change
scan_numbers: list of int or a single int
scan_numbers:

and the corresponding type hints need to be added to the function signature.

For reference, see the 'sphinx_autodoc_typehints' entry in our Sphinx configuration for the extension.
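As a rough sketch of what the suggestion amounts to (the zero-padded filename pattern here is invented for illustration, not the actual ILL convention):

```python
from typing import List, Union


def complete_fname(scan_numbers: Union[int, List[int]]) -> Union[str, List[str]]:
    """Converts a list of input numbers to a filename used at ILL.

    Parameters
    ----------
    scan_numbers:
        One scan number or a list of scan numbers.

    Returns
    -------
    :
        The corresponding filename or list of filenames.
    """
    # Hypothetical naming scheme, purely for illustration.
    if isinstance(scan_numbers, int):
        return f"{scan_numbers:06d}.nxs"
    return [f"{n:06d}.nxs" for n in scan_numbers]
```

With sphinx_autodoc_typehints, the types in the signature are rendered into the generated docs automatically, so the docstring carries only the descriptions.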

Author

Ok. All data types will go into the function signature, and only the description (with an empty type slot) remains in the docstring.


Returns:
----------
flist_num: list of str or one str
Member

See above: do not include type info in the docstring; use type hints and our extension does the rest.

Suggested change
flist_num: list of str or one str
flist_num:

See also https://scipp.github.io/reference/developer/coding-conventions.html#docstrings.

Comment on lines 74 to 75
#TODO: wpotrzebowski points out that dark, reference are also data. Should we convert the entry into the proposed LoKI.nxs to 'is_sample'?
data = da[da.coords["is_data"]]
Member

is_sample sounds ok to me.

Author

Ok. It will be is_sample in the next iteration.

Author

I tried this, but then I got into trouble while loading the file: it suddenly asked for other parameters not applicable to my case. For the moment, I will keep is_data.

Comment on lines +1344 to +1354
for i in range(num_spec):
    # start parameters
    p0 = [b0[i], m0[i]]
    popt, pcov = leastsq(
        residual,
        p0,
        args=(
            uv_da_filt.coords["wavelength"].values,
            uv_da_filt["spectrum", i].values,
        ),
    )
Member

Can we use scipp.optimize.curve_fit (which uses scipy) instead?

@gudlot (Author) commented Jun 22, 2022

I am not sure. I had a look at the scipy.optimize.curve_fit function while writing my function. But here I need to minimise the difference between the measured spectral data and the fitted turbidity. I was not able to achieve this with the curve_fit function, hence I chose scipy.optimize.leastsq.
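For what it is worth, curve_fit internally minimises the same least-squares objective, sum((ydata - f(xdata, *p))**2), that leastsq minimises when the residual is data minus model. A sketch with an invented power-law turbidity model and synthetic data (the model and numbers are assumptions for illustration, not the PR's actual fit):

```python
import numpy as np
from scipy.optimize import curve_fit


# Hypothetical turbidity model: scattering background ~ b * wavelength**(-m).
def turbidity(wavelength, b, m):
    return b * wavelength ** (-m)


# Synthetic "measured" spectrum for demonstration.
rng = np.random.default_rng(0)
wavelength = np.linspace(300.0, 800.0, 200)
data = turbidity(wavelength, 2.0e5, 2.5) + rng.normal(0.0, 1e-3, wavelength.size)

# curve_fit minimises sum((data - model)**2), i.e. the same objective as
# leastsq with residual = data - model.
popt, pcov = curve_fit(turbidity, wavelength, data, p0=[1.0e5, 2.0])

# Subtract the fitted turbidity from the measured spectrum.
corrected = data - turbidity(wavelength, *popt)
```

Whether this covers the actual minimisation in the PR depends on the exact residual used there; if the residual is anything other than data minus model, leastsq (or scipy.optimize.least_squares) remains the right tool.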

Comment on lines +1383 to +1384
# Switch on plots for verification
if plot_corrections:
Member

Having such options on a function is often not a good design choice. Is this necessary? Can it be moved to a separate function that can be called on the return value of this function?

Author

I chose it to declutter the output in the notebook. Sometimes it is useful to see the in-between plotting steps, sometimes not. How would you do this, Simon? Would you have some dummy code for a design pattern, by chance?
And yes, I agree with your other comment - to separate processing tools from plotting workflows.
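One common pattern for this separation (a hypothetical sketch, not code from the PR): the processing function returns only data, and a separate plotting helper takes that return value, so the notebook decides whether to plot.

```python
import numpy as np


def correct_spectrum(spectrum: np.ndarray, background: np.ndarray) -> np.ndarray:
    """Processing only: return the corrected spectrum, no plotting side effects."""
    return spectrum - background


def plot_correction(original: np.ndarray, corrected: np.ndarray):
    """Optional inspection helper, called on the returned value when wanted."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(original, label="original")
    ax.plot(corrected, label="corrected")
    ax.legend()
    return fig


corrected = correct_spectrum(np.ones(5), np.full(5, 0.2))
# In a notebook, only when in-between inspection is wanted:
# plot_correction(np.ones(5), corrected)
```

This removes the plot_corrections flag entirely: running the processing in a script produces no figures, and the notebook opts in by calling the plot helper.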

Comment on lines +1515 to +1527
display(multi_uv_turb_corr_da)

fig, ax = plt.subplots(ncols=3, figsize=(21, 7))
legend_props = {"show": True, "loc": 1}
num_spectra = multi_uv_turb_corr_da.sizes["spectrum"]

out = sc.plot(
sc.collapse(multi_uv_turb_corr_da, keep="wavelength"),
grid=True,
ax=ax[0],
legend=legend_props,
title=f"All turbidity corrected UV spectra for {num_spectra} spectra",
)
Member

As mentioned above, I think we should avoid mixing the "business logic" (such as fitting) with visualization/plotting. Keep in mind that we may want to run the processing code in a script instead of a notebook, and we don't want to create plots then (or plot to files). Can you find a way to disentangle all the code that creates outputs (HTML or plots) from the rest? This applies to many functions here, not just this one.

Author

Yes, I agree to separate the business logic (fitting, reducing, normalising, regrouping) from the plotting. I might have to repeat some code.
Yes, I know it applies to many functions I wrote. The plots were helpful for visual inspection as well.

docs/instruments/loki/nurf/nurf.py (outdated, resolved)
Comment on lines +1627 to +1629
assert (
wllim < wulim
), "Lower wavelength limit needs to be smaller than upper wavelength limit"
Member

If this error is meant for the user, raise an exception instead of using assert.

@gudlot (Author) commented Jun 22, 2022

I am not sure; it can be if I want. At least it makes the user aware of their wrong choice.
What is the rule, please? assert has the possibility to break the code, and I was taught that if the code breaks in a controlled way, that is a good thing.
I don't know when to go for raise ValueError versus assert.
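The usual rule of thumb: assert is for internal invariants that should never fail in correct code (and assertions are removed entirely when Python runs with the -O flag), while invalid user input should raise an exception. A sketch using the wavelength limits from this function (the helper name is invented):

```python
def check_wavelength_limits(wllim: float, wulim: float) -> None:
    # User-facing validation: raise, don't assert, because assert
    # statements are stripped when Python runs with -O.
    if wllim >= wulim:
        raise ValueError(
            f"Lower wavelength limit ({wllim}) must be smaller than "
            f"the upper wavelength limit ({wulim})."
        )


check_wavelength_limits(300, 400)  # valid input: no exception
```

A caller passing swapped limits then gets a ValueError with a readable message instead of an AssertionError that may silently disappear under -O.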

@SimonHeybrock (Member)

General suggestion: Have you considered submitting the main "logic part" of this as a separate PR? That is, exclude all the plotting and debugging stuff, just keep the actual code for loading, normalization, filtering, fitting, etc. This would make it easier to bring it into a state everyone is happy with to include in the package. You can then, as a second step, work out which plotting utilities and helpers the user/scientist may need and do all that in follow-up work.

@gudlot (Author) commented Jun 22, 2022

General suggestion: Have you considered submitting the main "logic part" of this as a separate PR? That is, exclude all the plotting and debugging stuff, just keep the actual code for loading, normalization, filtering, fitting, etc. This would make it easier to bring it into a state everyone is happy with to include in the package. You can then, as a second step, work out which plotting utilities and helpers the user/scientist may need and do all that in follow-up work.

Yes, it occurred to me while writing the functions. I am constantly torn between writing code "quick and dirty" (to quote somebody else) and writing better code following good programming standards; at least I am trying. The idea was first to show the functionality/workflow and get an idea of the data. Please don't forget that my data comes from an experiment where not everything worked well technically. That affected the datasets (=multiple nxs files). Priority was on the beamtime, and resources to support the implementation of the spectrometer were limited. It will perhaps be a task for ESS in the future to ensure that data from the spectrometers is correctly recorded and that errors are handled. You and the other commentators are the first people I can discuss my code with in terms of code design/strategy.
My idea would be to write a process function that returns a DataArray or Dataset (or a function layer in between that gathers the values from multiple files) and build the corresponding plot function around it. I would prefix all plot function names with plot_.
Thank you very much for the discussion, the helpful comments, and your patience. I try my best to improve and learn.

@wpotrzebowski (Contributor)

used at ILL

@gudlot Yes, I know it is ILL data, but I think there was a typo in the docstring ("at" missing), which I think has been corrected now.

@wpotrzebowski (Contributor) commented Jun 23, 2022

I think this is mostly a question for @nvaytet. Sorry if I missed this in the conversation, but is there an intention to keep all files (including Python files) in docs/instruments/loki/nurf/? As far as I know, NUrF is available both at ISIS and ILL, and I think in the future they may also be interested in these scripts for data processing. It can always be addressed later, but it may be worth thinking about it now.


def nurf_file_creator(loki_file, path_to_loki_file, data):
"""
Appends NUrF group to LOKI NeXus file for ESS
Member

Is this the way NUrF data will always be added to the LoKI files? Or is this a temporary way to do things until all the data is properly stored in the Nexus files?
If the latter, I am not sure this should become part of the ess package. To me it looks like a (still experimental?) utility script that patches the data?

@gudlot (Author) commented Jun 23, 2022

When I started this project, we discussed how the NUrF data could be part of a LoKI file, and I appended my content to an example LoKI file. At the moment it is more of a temporary solution, and my current nxs files contain only the UV and fluo relevant parts, nothing else. But for Kenan I append my data to the LoKI file.
No, it does not have to become part of any package. Please consider all functions in this file as helper functions that I needed to convert my data into a better format.
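For context, appending such a group typically looks something like the sketch below (the group path "entry/nurf" and dataset names are assumptions for illustration, not the actual LoKI/NUrF NeXus layout):

```python
import os
import tempfile

import h5py
import numpy as np


def nurf_file_creator(loki_file: str, uv_signal, fluo_signal):
    """Append a NUrF group to an existing NeXus/HDF5 file (hypothetical layout)."""
    # Open in append mode so the existing LoKI content is preserved.
    with h5py.File(loki_file, "a") as f:
        # Group path is an assumption, purely for illustration.
        grp = f.require_group("entry/nurf")
        grp.create_dataset("uv_signal", data=uv_signal)
        grp.create_dataset("fluo_signal", data=fluo_signal)


# Example usage with a scratch file standing in for an existing LoKI file:
path = os.path.join(tempfile.mkdtemp(), "loki_example.nxs")
with h5py.File(path, "w"):
    pass
nurf_file_creator(path, np.arange(3.0), np.arange(4.0))
```

As discussed above, this is a patching utility for the transition period, not something that necessarily belongs in the ess package once the data is written properly at the source.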

wllim=300,
wulim=400,
wl_unit=None,
medfilter=True,
Member

You seem to have several functions that say they do one thing, "keep the data between min and max wavelength" in this case, but actually do more under the hood, since you can also apply a median filter.
This is usually a sign that you should either make two functions and call both one after the other, or change the function name.
In this case, I feel that selecting a wavelength range is quite different from applying a filter, so I would vote for two functions.

Try to have a look at other places where you are doing something similar, and change accordingly.
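A sketch of the suggested split (function names invented), with the caller composing the two single-purpose steps explicitly:

```python
import numpy as np
from scipy.ndimage import median_filter


def apply_median_filter(spectrum: np.ndarray, kernel_size: int = 3) -> np.ndarray:
    """Smooth outliers only; does not touch the wavelength range."""
    return median_filter(spectrum, size=kernel_size)


def select_wavelength_range(wavelength, spectrum, wllim, wulim):
    """Select only; keep data between the wavelength limits."""
    mask = (wavelength >= wllim) & (wavelength <= wulim)
    return wavelength[mask], spectrum[mask]


# The caller chains them in whatever order the data quality requires:
wavelength = np.array([250.0, 300.0, 350.0, 400.0, 450.0])
spectrum = np.array([1.0, 1.0, 100.0, 1.0, 1.0])  # one outlier
smoothed = apply_median_filter(spectrum, kernel_size=3)
wl, sp = select_wavelength_range(wavelength, smoothed, 300.0, 400.0)
```

With this split, data that does not need smoothing simply skips the filter call, and no boolean flag is needed on the selection function.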

Author

Thank you. My thoughts were that using a median filter is optional, depending on the data quality. Sometimes the data has a lot of outliers. With the median filter I smooth it first, and then I select the maximum intensity in a given wavelength range. If there were an outlier, that could be misinterpreted.
I could separate them into two functions, but would probably need a third to bring the two together.

Author

I have now written a little wrapper nurf_median_filter(da, kernel_size) for utils.py to apply scipp.ndimage.median_filter in the direction of the wavelength.

@@ -0,0 +1,1892 @@
# standard library imports
Member

As Simon suggested, we need to split things up before we can move forward here.
Splitting up the processing from the plotting is a good start. I think I would like to break it up even further.

You seem to have two different workflows, UV and fluo. Are the two related (i.e. you cannot process one without the other?), or could you run them independently? Basically, could you submit changes for just one of these first (e.g. just the UV?), and once we nail the design, the general layout could be copied for processing the other one?

My suggestion:

  1. Identify the minimal list of steps you need to do for a workflow (I think for now we should just start with processing and leave exporting data to ascii out of this PR? Maybe things like fitting can also be added later?)
  2. Ideally, your notebook should just read like a list of function calls from the ess.loki.nurf module.
  3. Move the nurf.py file out of the docs and into its own submodule in ess.loki (i.e. in a folder called src/ess/loki/nurf)
  4. Strip out all the plotting code for now, and just have the list of processing steps in a notebook that you put in the docs
  5. Add some final simple plotting in the notebook for now (we will add plotting of intermediate steps later)
  6. Note that if a user wants a highly customized figure made for a paper, it will not be stored here. We will just have simple plots

Don't do all this in one go. Start with item 1. here, once we have a list, we can look at the other items.

@gudlot (Author) commented Jun 23, 2022

Thank you very much.

You seem to have two different workflows, UV and fluo. Are the two related (i.e. you cannot process one without the other?), or could you run them independently?

Yes, there are two workflows, one for the UV spectroscopy and one for the fluo spectroscopy. I can process each without the other. Only later in the data interpretation will both come together, as they are complementary.

Basically, could you submit changes for just one of these first (e.g. just the UV?),

Does that mean two or even three separate python files? UV, fluo, medfilter (because the latter is a function that can be applied to both workflows)?

and once we nail the design,

Should this be graphically outlined?

the general layout could be copied for processing the other one?

No, the workflows for UV and fluo are really different and extract different parts of each spectrum.

  1. Identify the minimal list of steps you need to do for a workflow (I think for now we should just start with processing and leave exporting data to ascii out of this PR? Maybe things like fitting can also be added later?)

Ok to the minimal list of steps.
Re: export to ascii: this is a wish of the user. The idea behind it is that the user may need to further process the data with another program that cannot handle hdf5. The users wish to have access to all raw data and processed data.
Re fitting: turbidity correction is, as I understand it, a standard correction in UV spectroscopy.

  2. Ideally, your notebook should just read like a list of function calls from the ess.loki.nurf module.

What will be part then of the nurf module? Will it be composed of UV, fluo, plotting?
Will the plot functions not be part of ess.loki.nurf? They are also part of the data interpretation/visualisation and an important part during the experiment.

  4. Strip out all the plotting code for now, and just have the list of processing steps in a notebook that you put in the docs

Okay, that can be done.

  5. Add some final simple plotting in the notebook for now (we will add plotting of intermediate steps later)

What would be considered a final simple plot? UV signal as a function of wavelength, maybe processed? Same for fluo spectra?

  6. Note that if a user wants a highly customized figure made for a paper, it will not be stored here. We will just have simple plots

The user does not request highly customized figures. The plots that I generated are standard plots for UV, fluo spectroscopy. They are just a bit more complicated than an I(q) plot for example. The idea is to measure a perturbation, record SANS, fluo, UV along with it and show this during the experiment to the user.

Don't do all this in one go. Start with item 1. here, once we have a list, we can look at the other items.

Okay. I could provide a workflow list.

Member

A few comments.

For the file/module organisation, I think separating uv, fluo and plotting is good.
So something like:

src/ess/loki/
        |--> nurf/
              |--> __init__.py
              |--> fluo.py
              |--> uv.py
              |--> tools.py  # or utils.py, or common.py, however you want to name it
              |--> plot.py

?

We would start with either fluo.py or uv.py, probably with the tools.py as well, and then add the rest later.

Should this be graphically outlined?

No need to draw things, just outlining a list of steps or function calls should be fine.

@nvaytet (Member) commented Sep 28, 2022

@gudlot What is the status on this? Can I help you with something to get this moving again?

@wpotrzebowski mentioned this pull request Dec 1, 2022

6 participants