Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MRG, ENH: Allow picking without preload #7507

Merged
merged 9 commits into from
Apr 8, 2020
Merged

Conversation

larsoner
Copy link
Member

@larsoner larsoner commented Mar 24, 2020

I realized that a clean path to picking without preload is to follow this to-do list:

  • Ensure that self._read_segment_file for all readers is implemented such that:

    1. self._raw_extras is always a list of dict, and self._raw_extras[fi]['orig_nchan'] gets automatically added by BaseRaw to store the original number of channels on disk (as it's used by many readers, might as well add it in one place).
    2. It only uses the attributes self._filenames and self._raw_extras. This mostly requires removing uses of self.info, but these uses are fairly rare.

    Most of our readers were already close to being stuctured this way, so this was a fairly simple refactoring. This is in the current commit. It uses a little wrapper _ReadSegmentFileProtector to make sure only the correct two attributes are used. It also changes idx to always be a ndarray of int rather than possibly being a slice because this simplifies some code and shouldn't incur much of any performance penalty (will check against master before merge).

    The one annoying thing I had to do to make this work was update egimff annotation setting, which was previously being done using a hack marked with XXX but now uses self.set_annotations like elsewhere in mne/io/*/*.

  • Make a new variable self._read_picks that is used to sub-index when reading. Currently this is just np.arange(self._raw_extras[fi]['orig_nchan']), so it's a no-op, but should allow us to ...

  • Add support for raw.pick_types without preload by subselecting self._read_picks. Currently this is not done because I want to make sure the changes thus far lead to green CIs (I think they will).

  • Add tests for pick-without-preload.

  • Update examples where picking is used to do it before load_data to save memory.

  • Double-check that this PR does not slow down I/O with profiling.

  • Update latest.inc

TL;DR: BaseRaw keeps track of which picks from the original data on disk we actually need and doing the appropriate subselection (the new thing), and each individual reader just needs to care about reading a set of idx indices of data from the original ones on disk (which has always been the case, just clearer now).

Marking for 0.21 because this is a big enough change that we should probably do it at the beginning of a release cycle rather than the end.

Closes #1766.

@larsoner larsoner added this to the 0.21 milestone Mar 24, 2020
@codecov
Copy link

codecov bot commented Mar 24, 2020

Codecov Report

Merging #7507 into master will increase coverage by 0.10%.
The diff coverage is 93.46%.

@@            Coverage Diff             @@
##           master    #7507      +/-   ##
==========================================
+ Coverage   90.12%   90.23%   +0.10%     
==========================================
  Files         450      452       +2     
  Lines       82709    83638     +929     
  Branches    13074    13377     +303     
==========================================
+ Hits        74544    75469     +925     
- Misses       5344     5349       +5     
+ Partials     2821     2820       -1     

@larsoner
Copy link
Member Author

Okay so using Hubert's benchmarking script benchmark_fif_edf (uploaded a variant here) I did see a 1-2 ms slowdown due to idx not being a slice, so I put code back in to slice-ify it where applicable. Now timings are the same here as on master.

@larsoner larsoner changed the title WIP, ENH: Allow picking without preload MRG, ENH: Allow picking without preload Mar 27, 2020
@larsoner
Copy link
Member Author

This is now ready for review/merge

Copy link
Member

@agramfort agramfort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I see a red Azure but it's weird as it's green when I click....

@larsoner
Copy link
Member Author

larsoner commented Apr 8, 2020

It was red due to download problems and I restarted it, no idea why GitHub didn't update to reflect the pass

@larsoner larsoner merged commit c0b139b into mne-tools:master Apr 8, 2020
@larsoner larsoner deleted the pick branch April 8, 2020 20:39
larsoner added a commit to larsoner/mne-python that referenced this pull request Apr 10, 2020
* upstream/master: (1522 commits)
  FIX: Show bug
  MRG, FIX: Datetime call in gdf 2.x age calculation (mne-tools#7581)
  DOC: Simplify Darwin installation (mne-tools#7584)
  MRG, ENH: Allow picking without preload (mne-tools#7507)
  DOC: Document anonymization better (mne-tools#7587)
  Rework _Brain show (mne-tools#7580)
  DOC: Fixes in tutorial (mne-tools#7579)
  ENH: muscle artifact detection (mne-tools#7407)
  MRG: Remove toolbars in PyVista plotter (mne-tools#7572)
  WIP: Deregister plotter from the figure list in close() (mne-tools#7573)
  MRG: Fix mouse wheel event in _TimeViewer (mne-tools#7563)
  FIX: Fix toggle all (mne-tools#7567)
  MRG, FIX: parallel n_jobs check (mne-tools#7566)
  Rename artifact detection to movement detection (mne-tools#7569)
  ENH: Update spelling check [ci skip] (mne-tools#7565)
  MRG, ENH: Dont require preload for raw resample (mne-tools#7508)
  MRG: Add interpolation for NIRS signals (mne-tools#7428)
  WIP: Add temporal derivative distribution repair algorithm (mne-tools#7556)
  DOC: fix link in docstr [skip ci] (mne-tools#7562)
  ENH: Custom figure title when plotting Dipole locations (mne-tools#7558)
  ...
larsoner added a commit to larsoner/mne-python that referenced this pull request Apr 25, 2023
* upstream/master: (1522 commits)
  FIX: Show bug
  MRG, FIX: Datetime call in gdf 2.x age calculation (mne-tools#7581)
  DOC: Simplify Darwin installation (mne-tools#7584)
  MRG, ENH: Allow picking without preload (mne-tools#7507)
  DOC: Document anonymization better (mne-tools#7587)
  Rework _Brain show (mne-tools#7580)
  DOC: Fixes in tutorial (mne-tools#7579)
  ENH: muscle artifact detection (mne-tools#7407)
  MRG: Remove toolbars in PyVista plotter (mne-tools#7572)
  WIP: Deregister plotter from the figure list in close() (mne-tools#7573)
  MRG: Fix mouse wheel event in _TimeViewer (mne-tools#7563)
  FIX: Fix toggle all (mne-tools#7567)
  MRG, FIX: parallel n_jobs check (mne-tools#7566)
  Rename artifact detection to movement detection (mne-tools#7569)
  ENH: Update spelling check [ci skip] (mne-tools#7565)
  MRG, ENH: Dont require preload for raw resample (mne-tools#7508)
  MRG: Add interpolation for NIRS signals (mne-tools#7428)
  WIP: Add temporal derivative distribution repair algorithm (mne-tools#7556)
  DOC: fix link in docstr [skip ci] (mne-tools#7562)
  ENH: Custom figure title when plotting Dipole locations (mne-tools#7558)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Preload subset of channels or channel types from raw.fif
2 participants