Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

combine_fooofs fails when there are "gaps" in frequencies #237

Closed
cusinatr opened this issue Apr 27, 2022 · 3 comments
Closed

combine_fooofs fails when there are "gaps" in frequencies #237

cusinatr opened this issue Apr 27, 2022 · 3 comments
Labels

Comments

@cusinatr
Copy link

Hi guys,

I found an error when using the above function inside the fooof.objs.utils module, if a FOOOFGroup object is created by non-consecutive frequencies (as can happen when notch-filtering for example).

A minimal code to reproduce the error is the following

import numpy as np
from fooof import FOOOFGroup
from fooof.objs.utils import combine_fooofs

freqs = np.array([1, 2, 3, 5, 6])   # missing 4
PSD = np.random.randn(5, len(freqs)) ** 2

fg = FOOOFGroup()
fg.fit(freqs, PSD)
fgs = combine_fooofs([fg])

And the error is:

Exception has occurred: ValueError
all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 6 and the array at index 1 has size 5

The problem stems in the combine_fooofs function where the frequencies and PSDs are initialized as:

n_freqs = len(gen_freqs(meta_data.freq_range, meta_data.freq_res))
temp_power_spectra = np.empty([0, n_freqs])

Hence, ending up with more frequencies than there actually are and causing the error in the np.vstack function. A possible solution could be to get rid of the initial definitions and initialize from the first object in the passed list, as:

meta_data = fooofs[0].get_meta_data()

  # Add FOOOF results from each FOOOF object to group
  for i, f_obj in enumerate(fooofs):

      # Add FOOOFGroup object
      if isinstance(f_obj, FOOOFGroup):
          fg.group_results.extend(f_obj.group_results)
          if f_obj.power_spectra is not None:
              if i == 0:
                  temp_power_spectra = f_obj.power_spectra
              else:
                  temp_power_spectra = np.vstack([temp_power_spectra, f_obj.power_spectra])

Thanks again for the wonderful tool!

@TomDonoghue
Copy link
Member

TomDonoghue commented Apr 28, 2022

Hmm, interesting. You are right that this situation would fail, although I would say that as currently organized, this is basically by design rather than a bug per se, in that we do basically assume a consistent / continuous frequency definition. Various parts of the code do this, for example FOOOF._regenerate_freqs, so this issue would therefore arise if saving out and reloading data.

The reason we try to impose this constraint is that things could potential get weird fitting peaks around missing frequency values. Granted, we don't typically expect peaks in line noise frequencies, but in general it could get a little tricky to generally support missing frequencies (I haven't ever looked too far into trying to do this).

Rather than, or in addition to, notch filtering, another strategy for dealing with line noise regions is to interpolate away those values, which maintains the frequency range.

If nothing else, I agree that this error message is not helpful, and that given the assumption of the data we have, we could do a much more explicit check for this situation, and provide a clearer warning / error.

Long story short - in the current design, this behaviour is basically expected / by design. The easiest thing here is perhaps to switch strategy for dealing with line noise regions. In a future version we could think about relaxing the even-frequency sampling requirement - but I'm a little wary of being able to do this in a clean way that doesn't have some potentially quirky side effects.

@cusinatr
Copy link
Author

cusinatr commented May 2, 2022

Thank you for your detailed reply. I see the point in having this behaviour coded by design and agree that otherwise potential pitfalls may arise. Maybe the only suggestion would be make it more explicit in the documentation, together with the solution of interpolation if some "gaps" are present; so one can decide if an interpolation is good for the data in the interpolation range or if a different range needs to be selected.

Thanks again for the great work!

@TomDonoghue
Copy link
Member

Agreed on making this a bit more clear & explicit - related to which the following updates are in progress:

Further practical discussion can happen in the linked issues / PRs - so I'm going to close this issue now!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants