Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Question][fNIRS] The use of .dat file #7894

Closed
swy7ch opened this issue Jun 12, 2020 · 23 comments · Fixed by #7898
Closed

[Question][fNIRS] The use of .dat file #7894

swy7ch opened this issue Jun 12, 2020 · 23 comments · Fixed by #7898

Comments

@swy7ch
Copy link
Contributor

swy7ch commented Jun 12, 2020

Dear MNE devs and users,

I'm not quite used to fNIRS nor the MNE package (started my internship last week) and I can't figure out the use the *.dat file.

In your tutorial concerning the Preprocessing of fNIRS data, the function mne.io.read_raw_nirx(fnirs_raw_dir, verbose=True) (line 11 of the first box) requires a *.dat file to create a RawNIRX instance, although my tutor explained to me that *.dat are essentially files used to display data in real-time (i.e. during the experiment) and as such, are based on standard parameters : they're not usable after the experiment because they do not contain useful (to be processed) data, since the probed data is stored in *.wl1 and *.wl1, events in *.evt and the metadata in *.hdr and *_probeInfo.mat.

I searched through the code and even though you do load the file, I can't find where you use it (or if you use it all)? Am I missing something ?

Thanks in advance,

David

@swy7ch swy7ch changed the title Use of .dat file [Question] The use of .dat file Jun 12, 2020
@swy7ch swy7ch changed the title [Question] The use of .dat file [Question][fNIRS] The use of .dat file Jun 12, 2020
@agramfort
Copy link
Member

agramfort commented Jun 12, 2020 via email

@swy7ch
Copy link
Contributor Author

swy7ch commented Jun 12, 2020

.dat is a terrible extension choice as it is usedfor very different things.

Definitely...

The fNIRS is still pretty new in our lab, but I'm almsot sure of it yeah. We're using both NIRScout and NIRSport, if it may help. The data provided by machines happen to perfectly suit (in most cases) the requirements (*.wl1, *.wl2, *.hdr ...) but some of our directories lack a *.dat file, which is required by the fNIRS module : RuntimeError: Expect one dat file, got 0

My tutor explained ro me what they are for the machines, I don't know if the required *.dat file is exactly the same or not. Regardless, I can't find where it is used (apart from being loaded)

@agramfort
Copy link
Member

agramfort commented Jun 12, 2020 via email

@swy7ch
Copy link
Contributor Author

swy7ch commented Jun 12, 2020

Yes, both NIRScout and NIRSport. I'm not the one using them though, I'm "just" here to process the data

@agramfort
Copy link
Member

@rob-luke ?

@cbrnr
Copy link
Contributor

cbrnr commented Jun 12, 2020

Actually, read_raw_nirx requires either the name of the directory with all the data files, or the header file (extension .hdr). Where exactly in the tutorial is a .dat file required?

@swy7ch
Copy link
Contributor Author

swy7ch commented Jun 12, 2020

The read_raw_nirx does not require a *.dat file per se : it returns a RawNIRX instance. Thing is, the RawNIRX constructor does require a *.dat file :

# Check if required files exist and store names for later use
files = dict()
keys = ('dat', 'evt', 'hdr', 'inf', 'set', 'tpl', 'wl1', 'wl2',
        'config.txt', 'probeInfo.mat')
for key in keys:
    files[key] = glob.glob('%s/*%s' % (fname, key))
    if len(files[key]) != 1:
        raise RuntimeError('Expect one %s file, got %d' %
                            (key, len(files[key]),))
    files[key] = files[key][0]

The forloop raises an error if the number of <any_of_the_keys> file differs from 1, especially in my case, a 'dat' file.

EDIT : For what it's worth, here's the commit that introduced said loop; it was authored by @rob-luke and committed by @larsoner

@cbrnr
Copy link
Contributor

cbrnr commented Jun 12, 2020

Got it. Indeed it looks like 'dat', 'set', 'tpl', and 'config.txt' are never used in the code. Therefore, a quick fix would be to issue a warning if one of these files are missing (since I assume that they could be necessary). Even better would be to determine which of these files are required, and which are just optional (in which case it wouldn't even be necessary to warn). Let's wait for @rob-luke.

@rob-luke
Copy link
Member

Thanks for addressing this @swy7ch,

You are correct that the dat is not used in the current MNE read nirx function. We could exclude this from the file check as you have pointed out. All the NIRScout data I have seen from several labs over the last two years has contained a dat file. Did your device not create it (perhaps this is a NIRSport thing?), or did you delete it?

Regarding the other files that are currently not used. The current nirx reader is in minimum working form, I am still expanding it to read additional information (from last few days see #7891). The data saved by the device has lots of duplication, for example the same trigger/event information is stored in both the .evt and hdr file, currently I am just reading one of these, but the plan is to eventually read both and test to ensure they are consistent. Other files contain information that I would like to add to the reader but haven't had a chance yet, such as channel noise measurements (perhaps useful for setting bad channels).

My understanding is that all of these files are always saved by the vendor software (please correct me if I am wrong here, I don't have access to all devices). My preference is that people simply copy the saved directory and load that directly in MNE. If files are missing it means people have played with the data, and that is something I try to avoid.

So my preference is that we keep the code as is (provided the dat file is always created by vendor software). People shouldn't modify their raw data, and for one hour measurements the dat file is approximately 8 MB which shouldn't break the data bank.

However, if everyone disagrees with me then my second preference is to throw a warning if the dat file is missing saying something like "Warning: The NIRX dat file is missing, this means the data has been modified."

Third preference is to simply remove dat from the list of files to check. I'm ok with this solution too. However, I think we should keep the check for set, tp, and config as I do plan to read this in future versions.


On another note. @swy7ch I do not have access to a NIRSport device and so there are no NIRSport recordings in the test set. Would you be willing to take a short measurement (10 seconds or so, doesn't need a head attached) and send it to me so I can add it to the test set? If so, here is an example of the detail I would require from you.

@swy7ch
Copy link
Contributor Author

swy7ch commented Jun 13, 2020

Thanks for addressing this @swy7ch,

You are correct that the dat is not used in the current MNE read nirx function. We could exclude this from the file check as you have pointed out. All the NIRScout data I have seen from several labs over the last two years has contained a dat file. Did your device not create it (perhaps this is a NIRSport thing?), or did you delete it?

As I said earlier in the thread, I'm a newcomer in the lab I work at (I've been there for 2 weeks) and I'm not the one using the machines : I'm part of the "Technical Team" whose purpose is to provide tools for non-developpers to ease data processing. I will ask my coworkers about this.

My understanding is that all of these files are always saved by the vendor software (please correct me if I am wrong here, I don't have access to all devices). My preference is that people simply copy the saved directory and load that directly in MNE. If files are missing it means people have played with the data, and that is something I try to avoid.

Haven't had the time to fully scrap the NIRx website, but the data they provide here as samples do not contain *.dat nor *.avg files. They also do not provide *_probeInfo.mat files, so I don't know if they deleted stuff from the output or not.

So my preference is that we keep the code as is (provided the dat file is always created by vendor software). People shouldn't modify their raw data, and for one hour measurements the dat file is approximately 8 MB which shouldn't break the data bank.

However, if everyone disagrees with me then my second preference is to throw a warning if the dat file is missing saying something like "Warning: The NIRX dat file is missing, this means the data has been modified."

I tend to prefer the 2nd option : if we don't actually use a *.dat file, I think it's better to throw a warning instead of exiting. Now of course, if the file data that may be useful, it must be kept in the code, just like the files you cite next. Of course, I agree with the fact that "people shouldn't modify their raw data".

Third preference is to simply remove dat from the list of files to check. I'm ok with this solution too. However, I think we should keep the check for set, tp, and config as I do plan to read this in future versions.

On another note. @swy7ch I do not have access to a NIRSport device and so there are no NIRSport recordings in the test set. Would you be willing to take a short measurement (10 seconds or so, doesn't need a head attached) and send it to me so I can add it to the test set? If so, here is an example of the detail I would require from you.

As I said I'm a newcomer and didn't conduct the experiment myself, but I will definitely see with my tutor and other researchers in the lab, no problem !

--

Alright I had an epiphany. You said you only worked with NIRScout datasets, and I know some of ours was produced through NIRSport (I just don't know which one of them)... Correct me if I'm wrong, but the NIRSport doesn't provide real-time data display, right ? On the NIRx website, no photo of people wearing a NIRSport shows the device being linked to a display... And if the *.dat (and *.avg) files are only created to provide real-time data display and the NIRSport doesn't support it, then maybe the files are not even created in the first place ?

I'll talk about it with my coworkers and try to sort our datasets out to see if my hypothesis makes sense, and I'll come back to you next week. Thanks for the answer, and have a good week-end :)

@cbrnr
Copy link
Contributor

cbrnr commented Jun 13, 2020

I'd prefer the second option too. It is better to have a loader that works with a minimum of required files. If there's additional information contained in separate files, and these files are missing, a warning would be much better than not loading the file at all. Also, I wouldn't say that if a file is missing automatically means that the data has been modified - it means that a file is missing, which is different from having modified the data.

Also, some files do not seem to be created by other NIRX devices, which means they should really be treated as optional.

Even if you do plan to add support for reading some other files, in the meantime I'd still fall back to warn if these are not present.

@rob-luke
Copy link
Member

rob-luke commented Jun 14, 2020

the NIRx website, but the data they provide here as samples

I have repeatedly encountered large differences between the NIRX manual, the website, and data created by the latest NIRX software. For example I just went to the link you posted and downloaded the first finger tapping example, it does not include a .inf file and the header file does not state the version it was recorded with. I would recommend you verify everything based on data you record yourself. And similarly, I will only approve code based on measured data, not on NIRX docs.

Also, some files do not seem to be created by other NIRX devices, which means they should really be treated as optional.

This is yet to be established. We should wait for Swy7ch to get clean files from the devices, not data that's been passed between people.

Even if you do plan to add support for reading some other files, in the meantime I'd still fall back to warn if these are not present.

I disagree, if the file is not required now, people will think it is not required and may delete it. Then in the future we will never be able to retrieve it. I think people should simply copy the data saved by the machine and load that. We will require these files in the future, so I don't want to create compatibility issues. The situation I imagine is that someone says "in MNE version 0.X you didn't require these files so I deleted them, now in MNE version 0.Y you require these files", then we are going to have to maintain a code base for with and without the files.

I will open a PR for option two stated above. I will warn if the dat file is missing. If Swy7ch demonstrates that the NIRSport does not save a dat file, I will add a check for the device type (based on header) and not warn for NIRSport.

@rob-luke
Copy link
Member

For example I just went to the link you posted and downloaded the first finger tapping example

Ok I also went and downloaded their example NIRSport2 device example files and they have a very different folder structure as shown in screenshot below. This seems very different to the NIRScout structure. It may be that a more serious rewrite is required to support NIRSport devices.

image

To move forward I think we...

  1. Change the missing dat from a fail to warn (in PR above).
  2. Wait for someone with access to a NIRSport device to take some test data and confirm the data format with latest vendor software
  3. Determine if dats are created with NIRSport. If not, create a PR to not warn if they are missing for this device type.
  4. Determine if a larger rewrite is required for this device type. If so, we can read the device type from the header and throw a warning to user that NIRSport is not fully supported/tested.

@swy7ch
Copy link
Contributor Author

swy7ch commented Jun 15, 2020

Sorry for the late reply.

I have repeatedly encountered large differences between the NIRX manual, the website, and data created by the latest NIRX software. [...] I would recommend you verify everything based on data you record yourself. And similarly, I will only approve code based on measured data, not on NIRX docs.

Well, that's what I figured out. That's a shame really.

I disagree, if the file is not required now, people will think it is not required and may delete it.

Indeed, better safe than sorry.

I think people should simply copy the data saved by the machine and load that.

Definitely ! No pre-preprocessing is better.

We should wait for @swy7ch to get clean files from the devices, not data that's been passed between people.

I've asked my coworkers about the NIRSport data, I'll come back when I get some news.

@rob-luke
Copy link
Member

I've asked my coworkers about the NIRSport data, I'll come back when I get some news.

Thanks @swy7ch, no rush, we will get the dat file issue merged and then tackle the rest as it comes.

@cbrnr
Copy link
Contributor

cbrnr commented Jun 15, 2020

I totally agree that it is extremely important not to modify the raw data. However, in practice I'd still prefer a reader that can read NIRS data even though one or two optional files are missing. At the end of the day, this makes the difference between being able to extract the signal vs. reading an error message and not getting anything at all. If the reader emits clear warnings I don't see a reason to intentionally not read the data even though we could. I've noticed that EDFbrowser is very strict in parsing EDF files vs. our own EDF readers - I have seen several EDF files that I could open with MNE, whereas EDFbrowser just informed me that the file was not a valid EDF file and simply did not open it.

@cbrnr
Copy link
Contributor

cbrnr commented Jun 15, 2020

BTW, I just found out that we have a NIRScout NIRSport and a NIRScout NIRSport 2 devices in our lab. I will check which files these devices generate. In the near future, we will also have a NIRScout, so I can also go and check there.

@swy7ch
Copy link
Contributor Author

swy7ch commented Jun 15, 2020

I just found out that we have a NIRScout and a NIRScout 2

NIRSport, right ?

@cbrnr
Copy link
Contributor

cbrnr commented Jun 15, 2020

Ups, sorry, yes, NIRSport 1 and 2.

@swy7ch
Copy link
Contributor Author

swy7ch commented Jun 15, 2020

Well, that was quick ! Thank you guys !

@cbrnr
Copy link
Contributor

cbrnr commented Jun 17, 2020

@rob-luke @swy7ch here is the list of files produced by a NIRSport 2 device:

 34K Jun 17 11:49 2020-06-17_001.nirs
 29K Jun 17 11:49 2020-06-17_001.wl1
 29K Jun 17 11:49 2020-06-17_001.wl2
 77K Jun 17 11:49 2020-06-17_001.zip
4.1K Jun 17 11:48 2020-06-17_001_config.json
123B Jun 17 11:49 2020-06-17_001_description.json
3.4K Jun 17 11:48 2020-06-17_001_probeInfo.mat
795B Jun 17 11:49 digpts.txt

The zip file contains a .roh text file (containing I think the raw data).

And here is the list for the older NIRSport device:

NIRS-2014-01-01_002.evt
NIRS-2014-01-01_002.hdr
NIRS-2014-01-01_002.set
NIRS-2014-01-01_002.wl1
NIRS-2014-01-01_002.wl2
NIRS-2014-01-01_002_config.txt

So it looks like these devices produce very different sets of files, but the core data is present in both devices: .wl1 and .wl2. The metainfo is contained in an .hdr file for NIRSport, but NIRSport2 uses a .json file for the same purpose. Some data is redundant, e.g. markers are already present in the .hdr file, but also in an .evt file (for NIRSport).

IMO a NIRX reader should be able to read signals recorded from all three devices, even if only the bare minimum is available (i.e. .wl1 and .wl2 and some form of header). But I guess we should discuss this in a new issue and/or PR.

@rob-luke
Copy link
Member

Thanks @cbrnr!

IMO a NIRX reader should be able to read signals recorded from all three devices, even if only the bare minimum is available (i.e. .wl1 and .wl2 and some form of header).

Based on this new information I totally agree. Once we get some NIRScout test files from both version 1 and 2 we should do a rewrite. I do not have access to a NIRScout device.

34K Jun 17 11:49 2020-06-17_001.nirs

The nirs file can also be generated by the NIRScout system. It is an extra checkbox that you click. So this may be the same for the NIRScout. The .nirs format is from the Homer toolbox and has been embraced by several other toolboxes. However, it is being replaced by the SNIRF format so I do not think its a priority to implement a .nirs reader. But we absolutely should have a .snirf reader as this is the new standard and will be an export option in future devices. I am waiting for a NIRX to SNIRF converter, then was planning to implement this. SNIRF will also become the approved BIDS file format.

@cbrnr
Copy link
Contributor

cbrnr commented Jun 17, 2020

Based on this new information I totally agree. Once we get some NIRScout test files from both version 1 and 2 we should do a rewrite. I do not have access to a NIRScout device.

I have the files I've listed above. I will link them in #7905 once I get permission to share them. Also, my lab will soon have a NIRScout device, so I can provide all the NIRX info we need 😄.

I completely agree that supporting .nirs files should be very low priority and that .snirf is much more important. But I also think that an option to load the primary raw data directly (what we currently have) is the most important thing.

Let's continue our discussion in #7905.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants