Preload subset of channels or channel types from raw.fif #1766

haribharadwaj · 2015-01-28T15:56:10Z

I have simultaneous MEG/EEG data sampled at 5 kHz (because I'm also recording brainstem activity) and it would nice to be able to preload say only EEG or only subsets of channels etc. and do analysis/eye-balling.. There are probably other good use cases for this too..

I can see this being possible quickly in a few different ways:
(1) Add kwargs like eeg=True, meg=False etc. to pick only certain channel types..
(2) Take channel names (as opposed to indices) list for a pickNames argument..
(3) Add a regular picks argument and then the user does two steps... (i) load with preload=False and determine picks from the info and then (ii) re-run the file read with preload=True and picks=picks

Thoughts?

larsoner · 2015-01-28T16:07:24Z

There is now a simple preload_data function in master. Whatever you do, I'm
-1 on adding more arguments to the Raw constructor and +1 on putting them
there for such specialized use cases. Your option 3 would fit nicely there.

I suspect that adding these options will make things quite a bit more
complex, though. Have you looked at the raw data reading functionality? It
is already somewhat complex, so I worry that making it more complex will
lead to bugs.

If we do go this route, you'll probably have to make a ton of tests to make
sure we don't miss corner cases, and do performance profiling to make sure
we aren't reducing read performance for standard use cases.

agramfort · 2015-01-28T16:18:55Z

something along these lines:

raw = mne.io.Raw(...)
picks = mne.pick_types(raw.info, ...)
ch_names = [raw.ch_names[k] for k in picks]
raw.pick_channels(ch_names)
raw.preload_data()

?

but currently you cannot pick_channels on non-preloaded data....

larsoner · 2015-01-28T16:25:37Z

The API I'm proposing is more like this:

raw = mne.io.Raw(..., preload=False)
picks = mne.pick_types(raw.info, ...)
raw.preload_data(picks=picks)

And the raw data fetching functions, and all functions that use raw._preloaded (or whatever we call it) are made smart enough to deal with partially preloaded data.

You're right that doing pick_channels might also work for the use case, especially if there is a complementary function that allows the user to re-combine the two (or more) datasets after they've been split. But in that case, you're likely to end up back at having data partially pre-loaded again, so I'm not sure it buys too much. It does depend on the specific use cases, though.

agramfort · 2015-01-28T16:28:02Z

I like what you suggest. +1 for picks argument in preload_data

haribharadwaj · 2015-01-28T16:29:18Z

Ah... This is more complex than I imagined.. I just learnt about raw.preload_data()... +1 for picks arg in preload_data().. Is it Ok that the raw object would be left with a sort of out-dated info?

larsoner · 2015-01-28T16:33:02Z

No -- the idea is that the info would still be correct, as all channels would still be present. But doing this would only load the selected data from disk into memory:

picks_eeg = pick_types(raw.info, meg=False, eeg=True)
raw.preload_data(picks=picks_eeg)

So that this operation would be immediate:

raw[picks_eeg]

And things like filtering only the EEG channels would work. But operations like this would still be permitted:

picks_meg = pick_types(raw.info, meg=True, eeg=False)
raw[picks_meg]

This would thus read from disk, instead of memory. And filtering the MEG channels would not be permitted, at least not until someone did this:

raw.preload_data(picks=picks_meg)

Which would cause the MEG data to also be preloaded (kind of like an append operation). Eventually we could add a preload_data(picks='clear') or something that would make it un-preload data if we need it, but we wouldn't need it for the first pass.

haribharadwaj · 2015-01-28T16:41:03Z

Got it! So if the picks is simply propagated to the sel arg of _read_segment(.) it would do the job, wouldn't it?

larsoner · 2015-01-28T16:50:28Z

I'm sure that would be part of it, but the overall job would be much more complex. You'd probably need to, at a minimum:

Allow raw._preloaded to be a list, or a boolean.
Change all checks of raw._preloaded to raw._preloaded is True or <all picks in _preloaded list>.
Deal with multiple calls to .preload_data(picks=picks) with proper creation / expansion / checking of raw._data array.
Deal with calls to the get data segment functions, so that they pull from the partially loaded raw._data, or the disk, as necessary.
Deal with how raw concatenation works with ._preloaded -- if two files have the same subset of channels preloaded, they should stay preloaded when being concatenated.
Implement tests for all functions that check the preload state that check different cases of partial preload_data calls.
Check that performance has not decreased due to the changes.

There are probably other things, but this is what comes to mind when I think about how I'd implement it.

agramfort · 2015-01-28T17:10:14Z

let's keep it simple: preload_data can only be called once and everything
stays almost the same.

dengemann · 2015-01-28T17:14:38Z

Do we really need this? If so please keep it as simple as possible.

On 28 Jan 2015, at 18:10, Alexandre Gramfort [email protected] wrote:

let's keep it simple: preload_data can only be called once and everything
stays almost the same.
—
Reply to this email directly or view it on GitHub.

haribharadwaj · 2015-01-28T18:27:06Z

@agramfort, @dengemann: So you would suggest the

raw.pick_channels(...)
raw.preload_data()

workflow?

dengemann · 2015-01-28T18:35:43Z

@haribharadwaj we could make our pick/drop_channels function make return self
raw.pick_channels(...).preload_data() if you need a one liner.

I'm also fine with only allowing to call preload_data onece and add a picks etc argument.

agramfort · 2015-01-28T21:54:38Z

I would do:

raw = mne.io.Raw(...)
picks = mne.pick_types(raw.info, ...)
raw.preload_data(picks=picks)

and no later possibility to load other channels

larsoner · 2015-01-28T21:56:29Z

FWIW I don't think the possibility of later loading channels is the really challenging part of the coding. Most of the challenge lies in keeping track of which channels are preloaded and which are not, and triaging behavior appropriately. Adding more channels to preload later is pretty trivial by comparison (append/mix matrices, and extend list of loaded channels)...

agramfort · 2015-01-28T21:59:31Z

doable but it's more work. You'd need to have raw._data maybe as a list of
arrays to avoid reallocs. With my suggestion you just call pick_info and
you're done. You then have in info only the preloaded channels.

larsoner · 2015-01-28T22:01:47Z

As a user I wouldn't expect the function preload_data(picks=...) to restrict my raw file (and info) to then only contain a subset of channels. It seems like the API workflow would be better as raw.pick_channels(...).preload_data() as @dengemann suggested.

dengemann · 2015-01-28T22:02:15Z

+1 let's add one basic version now. it's a rare usecase anyways, I suppose

On 28 Jan 2015, at 22:59, Alexandre Gramfort [email protected] wrote:

doable but it's more work. You'd need to have raw._data maybe as a list of
arrays to avoid reallocs. With my suggestion you just call pick_info and
you're done. You then have in info only the preloaded channels.
—
Reply to this email directly or view it on GitHub.

agramfort · 2015-01-28T22:04:32Z

As a user I wouldn't expect the function preload_data(picks=...) to restrict my raw file (and info) to then only contain a subset of channels.

I would say as long as it's documented and error message are clear
that would be fine with me. It's also a really simple change. But you
may feel brave and nerd snipped :)

larsoner · 2015-01-28T22:08:18Z

I see how it would be simpler from an implementation/developer perspective, but I don't think it makes sense from an API/user perspective. To me, channel picking/restricting should be done by pick_channels, and data preloading should be done by preload_data (regardless of whether it loads all or a subset of the data). Mixing both channel-picking and preloading functionality into preload_data seems like a strange, unpythonic move to me.

Maybe it's not so difficult to have raw.pick_channels() work with non-preloaded data... I'd have to think about it :) The _read_raw_segment function already can read a specific section of channels, so I think it would just require keeping track of which original file indices are the correct ones.

dengemann · 2015-01-28T22:09:30Z

I agree with hari that chaining would be maximum clean and self-documenting

It's already supported as is I think, so nothing to add. just make sure that the pick/drop methods return self if copy=False --- if not a minimum change

On 28 Jan 2015, at 23:04, Alexandre Gramfort [email protected] wrote:

As a user I wouldn't expect the function preload_data(picks=...) to restrict my raw file (and info) to then only contain a subset of channels.

I would say as long as it's documented and error message are clear
that would be fine with me. It's also a really simple change. But you
may feel brave and nerd snipped :)
—
Reply to this email directly or view it on GitHub.

agramfort · 2015-01-28T22:12:46Z

I'll review the PR :)

larsoner · 2015-01-28T23:01:12Z

@dengemann IIUC the challenge with this approach is that pick_channels currently only works for preloaded data. Thus my comment about needing to keep track of which indices in the original data file need to be used to load the data after doing raw.pick_channels().

larsoner · 2015-01-28T23:01:38Z

In any case, @haribharadwaj feel free to take a stab at it. If you're too busy I'll try to find time this week or next.

haribharadwaj · 2015-01-29T02:15:29Z

Thanks all..
@Eric89GXL, I am always too busy but lets see where I am in the next couple of days :)

dengemann · 2015-01-29T09:32:32Z

IIUC the challenge with this approach is that pick_channels currently only works for preloaded data.

but we could change that policy. I don't remember the exact reaosons why we introduced that in the first place. If data are not preloaded it would amount to picking the info and some attributes.

agramfort · 2015-01-29T11:00:57Z

simplicity was the reason.

dengemann · 2015-01-29T11:05:18Z

simplicity was the reason.

no there was another reason .... I remember havin opened an issue once / added this constraint.
But thinking about simplicity, picking without preload + chaining would be an optimal and clean API.
If not preloaded it should ammount returning/ making an object with a picked info.

larsoner · 2015-12-12T15:28:30Z

@haribharadwaj do you still need this, or were there suitable workarounds?

larsoner · 2017-03-01T00:26:40Z

I think most of these use cases can (hopefully) be solved by doing allowing pick_channels without data being loaded. @jaeilepp did something similar for EDF or some other reader recently, so hopefully it can be generalized.

kingjr · 2017-03-01T01:43:05Z

+1

…

On 28 February 2017 at 19:26, Eric Larson ***@***.***> wrote: I think most of these use cases can (hopefully) be solved by doing allowing pick_channels without data being loaded. @jaeilepp <https://github.com/jaeilepp> did something similar for EDF or some other reader recently, so hopefully it can be generalized. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#1766 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AEp7DHM2bR23NaaumNt9tq89hvFtul-_ks5rhLtBgaJpZM4DYaS3> .

larsoner mentioned this issue Jan 18, 2017

Q: Cleaning stimulus channel without preloading everything #3914

Closed

larsoner self-assigned this Mar 1, 2017

larsoner added the mne sprint label Mar 21, 2017

larsoner mentioned this issue Apr 16, 2018

MRG: Refactor EDF channel picking #5137

Merged

larsoner added Core and removed mne sprint labels Mar 26, 2019

lkorczowski mentioned this issue Mar 24, 2020

pick_channels or pick doesn't work when preload=False #7503

Closed

larsoner mentioned this issue Mar 24, 2020

MRG, ENH: Allow picking without preload #7507

Merged

7 tasks

larsoner closed this as completed in #7507 Apr 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Preload subset of channels or channel types from raw.fif #1766

Preload subset of channels or channel types from raw.fif #1766

haribharadwaj commented Jan 28, 2015

larsoner commented Jan 28, 2015

agramfort commented Jan 28, 2015

larsoner commented Jan 28, 2015

agramfort commented Jan 28, 2015

haribharadwaj commented Jan 28, 2015

larsoner commented Jan 28, 2015

haribharadwaj commented Jan 28, 2015

larsoner commented Jan 28, 2015

agramfort commented Jan 28, 2015

dengemann commented Jan 28, 2015

haribharadwaj commented Jan 28, 2015

dengemann commented Jan 28, 2015

agramfort commented Jan 28, 2015

larsoner commented Jan 28, 2015

agramfort commented Jan 28, 2015

larsoner commented Jan 28, 2015

dengemann commented Jan 28, 2015

agramfort commented Jan 28, 2015

larsoner commented Jan 28, 2015

dengemann commented Jan 28, 2015

agramfort commented Jan 28, 2015

larsoner commented Jan 28, 2015

larsoner commented Jan 28, 2015

haribharadwaj commented Jan 29, 2015

dengemann commented Jan 29, 2015

agramfort commented Jan 29, 2015

dengemann commented Jan 29, 2015

larsoner commented Dec 12, 2015

larsoner commented Mar 1, 2017

kingjr commented Mar 1, 2017 via email

Preload subset of channels or channel types from raw.fif #1766

Preload subset of channels or channel types from raw.fif #1766

Comments

haribharadwaj commented Jan 28, 2015

larsoner commented Jan 28, 2015

agramfort commented Jan 28, 2015

larsoner commented Jan 28, 2015

agramfort commented Jan 28, 2015

haribharadwaj commented Jan 28, 2015

larsoner commented Jan 28, 2015

haribharadwaj commented Jan 28, 2015

larsoner commented Jan 28, 2015

agramfort commented Jan 28, 2015

dengemann commented Jan 28, 2015

haribharadwaj commented Jan 28, 2015

dengemann commented Jan 28, 2015

agramfort commented Jan 28, 2015

larsoner commented Jan 28, 2015

agramfort commented Jan 28, 2015

larsoner commented Jan 28, 2015

dengemann commented Jan 28, 2015

agramfort commented Jan 28, 2015

larsoner commented Jan 28, 2015

dengemann commented Jan 28, 2015

agramfort commented Jan 28, 2015

larsoner commented Jan 28, 2015

larsoner commented Jan 28, 2015

haribharadwaj commented Jan 29, 2015

dengemann commented Jan 29, 2015

agramfort commented Jan 29, 2015

dengemann commented Jan 29, 2015

larsoner commented Dec 12, 2015

larsoner commented Mar 1, 2017

kingjr commented Mar 1, 2017 via email