Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: muscle artifact detection #7407

Merged
merged 35 commits into from
Apr 7, 2020
Merged

Conversation

AdoNunes
Copy link
Contributor

@AdoNunes AdoNunes commented Mar 9, 2020

closes #7359

I have added a new function detect_muscle in artifact_detection.py. I also included the example and test.

@codecov
Copy link

codecov bot commented Mar 9, 2020

Codecov Report

Merging #7407 into master will not change coverage by %.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #7407   +/-   ##
=======================================
  Coverage   90.10%   90.10%           
=======================================
  Files         452      452           
  Lines       83102    83102           
  Branches    13165    13165           
=======================================
  Hits        74882    74882           
  Misses       5387     5387           
  Partials     2833     2833           

@AdoNunes
Copy link
Contributor Author

AdoNunes commented Mar 9, 2020

@larsoner I am getting again a test_nesting error. Not sure if it's my fault...

Copy link
Member

@agramfort agramfort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks pretty clean @AdoNunes !

did you test this code on many subjects @AdoNunes ?

My biggest concern is on the validation for EEG data.

@mmagnuski
Copy link
Member

I can test with eeg data later (today or tomorrow), a few steps down the road it would be nice to have a simple detection GUI similar to what is available in fieldtrip.

@AdoNunes
Copy link
Contributor Author

AdoNunes commented Mar 9, 2020

looks pretty clean @AdoNunes !

did you test this code on many subjects @AdoNunes ?

My biggest concern is on the validation for EEG data.

@agramfort thanks for your comments. I run the example on the Brainstorm just because the annotate_movement was run on it. The only difference with EEG will be the threshold, as it has fewer channels.
The function was written by @bloyl (hence his name is in the script authorship). In the code sprint I tested against FieldTrip function and was almost equivalent. I also run it on our data and the detection was as expected.

Latter, in between my thesis writing rests I will address your comments.

@agramfort
Copy link
Member

agramfort commented Mar 9, 2020 via email

@mmagnuski
Copy link
Member

can you find a parametrization that is immune to the number of channels?

In fieldtrip IIRC they divide by sqrt(n_channels)

@AdoNunes
Copy link
Contributor Author

AdoNunes commented Mar 9, 2020

Actually, in fieldtrip they just take the sum of all the channels zscores, thus the threshold can be very high, typically 10. Here the mean is taken. Do you suggest to divide the sum by the sqrt(n_channels)?

@agramfort
Copy link
Member

agramfort commented Mar 9, 2020 via email

@AdoNunes
Copy link
Contributor Author

I will work on testing sum(zcores)/sqrt(n_chans) tomorrow.

I wanted to use eegbci data for the example but the sampling rate is 160Hz. I will look for other eeg datasets.

raw_copy.pick_types(ref_meg=False) # Remove ref chans just in case
# Only one type of channel, otherwise z-score will be biased
assert(len(set(raw_copy.get_channel_types())) == 1), 'Different channel ' \
'types, pick one type'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw an explicit ValueError error instead of AssertionError

@mmagnuski
Copy link
Member

I wanted to use eegbci data for the example but the sampling rate is 160Hz. I will look for other eeg datasets.

I think it is better to be more flexible in the filtering step (for example allow the user to specify what frequency range should be considered) than to ignore datasets with such sampling rate.

I am also wondering whether ignoring signal segments with bad annotations in the z score computation would be helpful. (imagine there is some known period with huge artifacts that would bias the z scores) If so, these segments could be nan'ed out (or removed) prior to zscoring.

@larsoner
Copy link
Member

I am also wondering whether ignoring signal segments with bad annotations in the z score computation would be helpful. (imagine there is some known period with huge artifacts that would bias the z scores) If so, these segments could be nan'ed out (or removed) prior to zscoring.

The way I would do it is make it so that this function ignores any segments that are already marked BAD, it's more or less standard way of handling these things nowadays via a skip_by_annotation kwarg.

@AdoNunes
Copy link
Contributor Author

The way I would do it is make it so that this function ignores any segments that are already marked BAD, it's more or less standard way of handling these things nowadays via a skip_by_annotation kwarg.

I can't find how to use skip_by_annotation, it seems that is for filtering the data. I add it in the last PR, but I guess it shouldn't be in the filtering function...

@agramfort
Copy link
Member

agramfort commented Mar 11, 2020 via email

Copy link
Member

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the specific comments below, generally speaking I'd encourage you to interleave the code and explanation a little more, instead of having just a few big code blocks. One example is the note about notch-filtering before running muscle artifact detection; it would make sense to have this note come after all the data is loaded, right before a small code block where you run the notch filtering. Similarly, the text about only using one channel type could be right next to the pick_types line.

@AdoNunes
Copy link
Contributor Author

Following @mmagnuski conversation on the code suggestions debating over if it is a good idea to slow pass filtering of the z-scores to smooth peaks. I looked at two datasets, the sample elekta data, and the brainstorm auditory data.

I have to say that both recordings are very clean, probably with well trained subjects, and in my case, with children or patients data it does not look that clean. So the low-pass smoothing advantage is not that clear.
Below are the z-scores, first row: low-passed, second: no low-pass, first col: magnetometers, second: gradiometers
image
Magnetometers are more sensitive to muscle compared to planar gradiometers. Looking at the data, I would mark the three peaks of the low-passed z-scores. The non low-passed if the threshold is increased it would also find three muscle artifacts. However, the third peak magnitude is almost twice as big. The first is more sustained but less intense, and in more noisy data it might not get picked up.

Below are the z-scores for brainstorm auditory subject one:
image
and the dataplot for the second peak detected with no low-pass:
image
Also a very well behaved subject, in a visual annotation I would have selected the main peak. When not doing the smoothing transient noisy peaks get closer to real artifacts. When selecting muscle artifacts in my data, it was annoying to unmark false positive sporadic peaks.

LMKWYT

@AdoNunes
Copy link
Contributor Author

The raw.copy() inside the function is necessary otherwise it will change the raw

If you use, for example, mne.pick_types(raw.info, meg='mag') instead of raw.pick_types(meg='mag') then you'll get back an array of integers, which you can use in a call to raw.get_data(), and then just work with a numpy array instead of the raw object from that point on. That should allow you to avoid making a copy of the whole raw object (including all the channels you're not. For that matter, raw.get_data() allows you to pass a channel type string, so you could possibly even skip the mne.pick_types() step and just do raw.get_data(picks='mag'). It will always make a copy (but only of the channels you ask for and only the data values), so there is no risk of modifying the original raw object.

I see. I think that at this point, I would prefer to leave it for another PR. This function will go into the 0.21 version and there will be time if it is needed.

@AdoNunes AdoNunes requested a review from drammock March 26, 2020 04:25
@bloyl
Copy link
Contributor

bloyl commented Mar 26, 2020

+1 for delaying removing raw.copy() until a different PR.

The complication is raw_copy.filter() and raw_copy.apply_hilbert() which act in place. While doable i think some reorganization of those methods would be need to act directly on data arrays.

Copy link
Member

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only a few minor docstring nitpicks left at this point. +1 for merge.

@drammock
Copy link
Member

drammock commented Mar 26, 2020 via email

@AdoNunes
Copy link
Contributor Author

only a few minor docstring nitpicks left at this point. +1 for merge.

+10 for merge

@AdoNunes
Copy link
Contributor Author

@larsoner @agramfort I thought that the PR was already merged after 2 approvals?

@drammock
Copy link
Member

@larsoner @agramfort I thought that the PR was already merged after 2 approvals?

Please exhibit some patience. Version 0.20 just shipped (which is a lot of work); everyone is dealing with a global pandemic, and it is the weekend. I understand that you have invested a lot of time and energy in this PR, and maybe are eager to be done with it. Don't get pushy. Some PRs are merged after one approval, some after two, sometimes more. It depends on the content of the PR and the expertise of the person(s) reviewing it.

@AdoNunes
Copy link
Contributor Author

AdoNunes commented Apr 6, 2020

In case you are waiting for me to do something, plz let me know it. I just marked the conversations as resolved in case you were waiting for it.

@mmagnuski
Copy link
Member

Hi @AdoNunes, it would be good to just add a test for the ValueError part you added, you can use:

with pytest.raises(ValueError, match="No M/EEG channel types found"):
    # call annotate_muscle_zscore with data without meg or eeg

and all would be good. :)

@AdoNunes
Copy link
Contributor Author

AdoNunes commented Apr 7, 2020

with pytest.raises(ValueError, match="No M/EEG channel types found"):
    # call annotate_muscle_zscore with data without meg or eeg

@mmagnuski what a surprise 😆
What should I put as an indented block from the pytest.raises?

@drammock
Copy link
Member

drammock commented Apr 7, 2020

@mmagnuski what a surprise

It shouldn't be a surprise. We try to test every single warning and error that we raise, to make sure that they are actually happening in the conditions that we want them to happen.

What should I put as an indented block from the pytest.raises?

As @mmagnuski said, put in a call to your function, passing in data that doesn't have meg or eeg channels.

@AdoNunes
Copy link
Contributor Author

AdoNunes commented Apr 7, 2020

@drammock Sorry I don't get it. Now I am more surprised -well confused-, I thought that I had to put it in the annotate_muscle_zscore.
I have to create a test function in the test_artifact_detection?

@drammock
Copy link
Member

drammock commented Apr 7, 2020

Quoting from the contributor guide:

All new functionality must have test coverage

For example, a new mne.Evoked method in mne/evoked.py should have a corresponding test in mne/tests/test_evoked.py.

So yes, the test should go in mne/preprocessing/tests/test_artifact_detection.py

@AdoNunes
Copy link
Contributor Author

AdoNunes commented Apr 7, 2020

@mmagnuski done, the new test went through all the checks. All good now? 😄

@larsoner larsoner merged commit e3cc4b1 into mne-tools:master Apr 7, 2020
@larsoner
Copy link
Member

larsoner commented Apr 7, 2020

Thanks @AdoNunes !

@mmagnuski
Copy link
Member

👍

larsoner added a commit to larsoner/mne-python that referenced this pull request Apr 10, 2020
* upstream/master: (1522 commits)
  FIX: Show bug
  MRG, FIX: Datetime call in gdf 2.x age calculation (mne-tools#7581)
  DOC: Simplify Darwin installation (mne-tools#7584)
  MRG, ENH: Allow picking without preload (mne-tools#7507)
  DOC: Document anonymization better (mne-tools#7587)
  Rework _Brain show (mne-tools#7580)
  DOC: Fixes in tutorial (mne-tools#7579)
  ENH: muscle artifact detection (mne-tools#7407)
  MRG: Remove toolbars in PyVista plotter (mne-tools#7572)
  WIP: Deregister plotter from the figure list in close() (mne-tools#7573)
  MRG: Fix mouse wheel event in _TimeViewer (mne-tools#7563)
  FIX: Fix toggle all (mne-tools#7567)
  MRG, FIX: parallel n_jobs check (mne-tools#7566)
  Rename artifact detection to movement detection (mne-tools#7569)
  ENH: Update spelling check [ci skip] (mne-tools#7565)
  MRG, ENH: Dont require preload for raw resample (mne-tools#7508)
  MRG: Add interpolation for NIRS signals (mne-tools#7428)
  WIP: Add temporal derivative distribution repair algorithm (mne-tools#7556)
  DOC: fix link in docstr [skip ci] (mne-tools#7562)
  ENH: Custom figure title when plotting Dipole locations (mne-tools#7558)
  ...
larsoner added a commit to larsoner/mne-python that referenced this pull request Apr 25, 2023
* upstream/master: (1522 commits)
  FIX: Show bug
  MRG, FIX: Datetime call in gdf 2.x age calculation (mne-tools#7581)
  DOC: Simplify Darwin installation (mne-tools#7584)
  MRG, ENH: Allow picking without preload (mne-tools#7507)
  DOC: Document anonymization better (mne-tools#7587)
  Rework _Brain show (mne-tools#7580)
  DOC: Fixes in tutorial (mne-tools#7579)
  ENH: muscle artifact detection (mne-tools#7407)
  MRG: Remove toolbars in PyVista plotter (mne-tools#7572)
  WIP: Deregister plotter from the figure list in close() (mne-tools#7573)
  MRG: Fix mouse wheel event in _TimeViewer (mne-tools#7563)
  FIX: Fix toggle all (mne-tools#7567)
  MRG, FIX: parallel n_jobs check (mne-tools#7566)
  Rename artifact detection to movement detection (mne-tools#7569)
  ENH: Update spelling check [ci skip] (mne-tools#7565)
  MRG, ENH: Dont require preload for raw resample (mne-tools#7508)
  MRG: Add interpolation for NIRS signals (mne-tools#7428)
  WIP: Add temporal derivative distribution repair algorithm (mne-tools#7556)
  DOC: fix link in docstr [skip ci] (mne-tools#7562)
  ENH: Custom figure title when plotting Dipole locations (mne-tools#7558)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: what to do with muscle artifacts
7 participants