mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python

Home Page:https://mne.tools

Inconsistent behavior between `Epochs` and `Evoked` in `get_data()` method when picking EEG channels and bads are present

hoechenberger opened this issue

Description of the problem

Epochs.get_data(picks="eeg") excludes bad channels, but Evoked.get_data(picks="eeg") doesn't.

Steps to reproduce

# %%
import mne

sample_dir = mne.datasets.sample.data_path()
sample_fname = sample_dir / "MEG" / "sample" / "sample_audvis_raw.fif"

raw = mne.io.read_raw_fif(sample_fname)
raw.crop(tmax=60)

events = mne.find_events(raw, stim_channel="STI 014")
raw.pick("eeg")

epochs = mne.Epochs(raw, events=events, preload=True)
evoked = epochs.average()

# %%
for inst_type, inst in zip(["epochs", "evoked"], [epochs, evoked]):
    print(f"{inst_type}:")
    print(f"    bads: {inst.info['bads']}")
    print(f"    data shape: {inst.get_data(picks='eeg').shape}")

Link to data

No response

Expected results

Both methods should treat bad channels the same way.

Actual results

epochs:
    bads: ['EEG 053']
    data shape: (86, 59, 421)
evoked:
    bads: ['EEG 053']
    data shape: (60, 421)

As you can see, the data from Evoked contains the bad channel, while the data from Epochs does not.

Additional information

This is with main

The docstrings are identical, so I think both methods should behave identically. In addition, the docstring says:

None (default) will pick all channels.

I think this should be:

None (default) will pick all good channels.

At least that's how it currently works for Epochs, but not for Evoked apparently.

https://mne.tools/dev/generated/mne.Epochs.html#mne.Epochs.get_data
https://mne.tools/dev/generated/mne.Evoked.html#mne.Evoked.get_data

@cbrnr Note that I did not pass None in my MWE! I did pass picks="eeg"

The docstring says:

Note that channels in info['bads'] will be included if their names or indices are explicitly provided.

So something about this point is inconsistent between epochs and evoked at least when passing picks="eeg".

I think what you're mentioning here is being tracked at #12197

So something about this point is inconsistent between epochs and evoked at least when passing picks="eeg".

Neither get_data has an exclude param, so it sounds like one is implicitly using exclude="bads" and the other exclude=() under the hood. Changing the behavior of one to match the other would, I think, need a deprecation cycle, and I'm not sure it's worth the code churn. Updating the docstring of both to reflect their actual behavior wouldn't require a deprecation cycle. Adding exclude to both (to make the behavior more explicit and controllable) is probably worthwhile either way.
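To make the exclude="bads" versus exclude=() hypothesis concrete, here is a toy model in plain Python. It is not MNE's implementation: resolve_picks is a hypothetical helper, and the 60 channel names are only meant to mirror the counts in the report.

```python
# Toy model only: this is NOT how MNE resolves picks internally.
def resolve_picks(ch_names, ch_types, bads, picks, exclude="bads"):
    """Return the channel names selected by a channel-type string like "eeg"."""
    # exclude="bads" means "drop whatever is listed in bads";
    # any other value is treated as an explicit collection of names to drop.
    excluded = set(bads) if exclude == "bads" else set(exclude)
    return [
        name
        for name, kind in zip(ch_names, ch_types)
        if kind == picks and name not in excluded
    ]


# Illustrative channel list: 60 EEG channels, one marked bad.
ch_names = [f"EEG {i:03d}" for i in range(1, 61)]
ch_types = ["eeg"] * 60
bads = ["EEG 053"]

with_bads_dropped = resolve_picks(ch_names, ch_types, bads, "eeg", exclude="bads")
with_bads_kept = resolve_picks(ch_names, ch_types, bads, "eeg", exclude=())

print(len(with_bads_dropped))  # 59, the channel count reported for Epochs
print(len(with_bads_kept))     # 60, the channel count reported for Evoked
```

The same picks="eeg" request yields 59 or 60 channels depending only on the hidden exclude default, which is the kind of difference an explicit parameter would expose.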

Even if documented I think it would be a bit surprising if Epochs and Evoked used different pick defaults. So I'd document the behavior with whatever version we think makes more sense, be consistent for both methods and deprecate the inconsistent behavior. And yes, an explicit exclude parameter would definitely make sense to me.

I tested the MWE with Raw, and now it seems it's actually Epochs that's the odd one out:

# %%
import mne

sample_dir = mne.datasets.sample.data_path()
sample_fname = sample_dir / "MEG" / "sample" / "sample_audvis_raw.fif"

raw = mne.io.read_raw_fif(sample_fname)
raw.crop(tmax=60)

events = mne.find_events(raw, stim_channel="STI 014")
raw.pick("eeg")

epochs = mne.Epochs(raw, events=events, preload=True)
evoked = epochs.average()

# %%
for inst_type, inst in zip(["raw", "epochs", "evoked"], [raw, epochs, evoked]):
    print(f"{inst_type}:")
    print(f"    bads: {inst.info['bads']}")
    print(f"    data shape: {inst.get_data(picks='eeg').shape}")

raw:
    bads: ['EEG 053']
    data shape: (60, 36038)
epochs:
    bads: ['EEG 053']
    data shape: (86, 59, 421)
evoked:
    bads: ['EEG 053']
    data shape: (60, 421)

More digging:

  • BaseSpectrum.get_data() features exclude="bads"
  • BaseTFR ditto

My feeling is that we should add exclude parameters to all these get_data() methods if they don't have one already, retaining the current behavior, and then do a deprecation cycle for transitioning to either exclude=() or exclude="bads" for all of them.
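A rough sketch of what such a transition could look like, in plain Python rather than MNE code. Everything here is hypothetical: the get_channel_names helper, the legacy_exclude argument standing in for each class's current implicit behavior, and the warning text are illustrative assumptions, not a proposed API.

```python
import warnings

_UNSET = object()  # sentinel meaning "the caller did not pass exclude"


def get_channel_names(ch_names, bads, legacy_exclude, exclude=_UNSET):
    """Hypothetical stand-in for a get_data() that gains an exclude parameter."""
    if exclude is _UNSET:
        # Keep today's per-class behavior, but tell the user it will change.
        warnings.warn(
            "The default for exclude will change; pass exclude explicitly.",
            FutureWarning,
            stacklevel=2,
        )
        exclude = legacy_exclude
    dropped = set(bads) if exclude == "bads" else set(exclude)
    return [name for name in ch_names if name not in dropped]


names = ["EEG 051", "EEG 052", "EEG 053"]
bads = ["EEG 053"]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Omitting exclude keeps the old behavior and warns in both cases.
    legacy_epochs = get_channel_names(names, bads, legacy_exclude="bads")
    legacy_evoked = get_channel_names(names, bads, legacy_exclude=())

# Passing exclude explicitly is silent and behaves the same for every class.
explicit = get_channel_names(names, bads, legacy_exclude="bads", exclude=())

print(len(legacy_epochs), len(legacy_evoked), len(explicit), len(caught))
```

The sentinel default lets existing calls keep their current output during the deprecation period while explicit callers get uniform behavior immediately.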