Inconsistent behavior between `Epochs` and `Evoked` in `get_data()` method when picking EEG channels and bads are present

hoechenberger opened this issue 2 months ago · comments

Richard Höchenberger commented 2 months ago

Description of the problem

`Epochs.get_data(picks="eeg")` excludes bad channels, but `Evoked.get_data(picks="eeg")` doesn't.

Steps to reproduce

# %%
import mne

sample_dir = mne.datasets.sample.data_path()
sample_fname = sample_dir / "MEG" / "sample" / "sample_audvis_raw.fif"

raw = mne.io.read_raw_fif(sample_fname)
raw.crop(tmax=60)

events = mne.find_events(raw, stim_channel="STI 014")
raw.pick("eeg")

epochs = mne.Epochs(raw, events=events, preload=True)
evoked = epochs.average()

# %%
for inst_type, inst in zip(["epochs", "evoked"], [epochs, evoked]):
    print(f"{inst_type}:")
    print(f"    bads: {inst.info['bads']}")
    print(f"    data shape: {inst.get_data(picks='eeg').shape}")

Link to data

No response

Expected results

Both methods should behave the same with regard to bad channel treatment.

Actual results

epochs:
    bads: ['EEG 053']
    data shape: (86, 59, 421)
evoked:
    bads: ['EEG 053']
    data shape: (60, 421)

As you can see, the data from Evoked contains the bad channel, while the data from Epochs does not.

Additional information

This is with the current main branch.

Clemens Brunner · Answer 1 · Fri Apr 26 2024 16:29:51 GMT+0800 (China Standard Time)

The docstrings are identical, so I think both methods should behave identically. In addition, the docstring says:

None (default) will pick all channels.

I think this should be:

None (default) will pick all good channels.

At least that's how it currently works for Epochs, but not for Evoked apparently.

https://mne.tools/dev/generated/mne.Epochs.html#mne.Epochs.get_data
https://mne.tools/dev/generated/mne.Evoked.html#mne.Evoked.get_data

Richard Höchenberger · Answer 2 · Fri Apr 26 2024 16:55:00 GMT+0800 (China Standard Time)

@cbrnr Note that I did not pass None in my MWE! I passed picks="eeg".

The docstring says:

Note that channels in info['bads'] will be included if their names or indices are explicitly provided.

So something about this point is inconsistent between epochs and evoked at least when passing picks="eeg".

I think what you're mentioning here is being tracked at #12197

Eric Larson · Answer 3 · Sat Apr 27 2024 01:09:32 GMT+0800 (China Standard Time)

> So something about this point is inconsistent between epochs and evoked at least when passing picks="eeg".

Neither get_data has an exclude param; it sounds like one is implicitly using exclude="bads" and the other exclude=() under the hood. To change the behavior of one to match the other and be consistent, I think we'd need a deprecation cycle, and I'm not sure whether it's worth the code churn. Updating the docstring of both to reflect their actual behavior wouldn't require a deprecation cycle. Adding exclude to both (to make the behavior more explicit and controllable) is probably worthwhile either way.
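To make the exclude="bads" versus exclude=() distinction concrete, here is a toy model in plain Python. It is only a sketch, not MNE's implementation: `pick_by_type` is a hypothetical helper, and the channel names are an assumption that mimics the naming in the sample dataset. It only illustrates how the two implicit defaults would produce the 59 versus 60 channel counts reported above.

```python
def pick_by_type(ch_names, ch_types, bads, kind, exclude="bads"):
    """Return indices of channels of type `kind`, minus excluded ones.

    Hypothetical helper for illustration; this is NOT MNE code.
    """
    # exclude="bads" drops every channel marked bad; otherwise `exclude`
    # is treated as an explicit collection of channel names to drop.
    if exclude == "bads":
        excluded = set(bads)
    else:
        excluded = set(exclude)
    return [
        idx
        for idx, (name, ch_type) in enumerate(zip(ch_names, ch_types))
        if ch_type == kind and name not in excluded
    ]


# Assumed setup: 60 EEG channels, one of them marked bad.
ch_names = [f"EEG {i:03d}" for i in range(1, 61)]
ch_types = ["eeg"] * len(ch_names)
bads = ["EEG 053"]

# Behavior matching what Epochs appears to do (bads dropped).
epochs_like = pick_by_type(ch_names, ch_types, bads, "eeg", exclude="bads")
# Behavior matching what Evoked appears to do (bads kept).
evoked_like = pick_by_type(ch_names, ch_types, bads, "eeg", exclude=())

print(len(epochs_like), len(evoked_like))
```

With an explicit exclude argument, the caller decides which of the two results they get instead of depending on the class.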

Clemens Brunner · Answer 4 · Sat Apr 27 2024 02:36:12 GMT+0800 (China Standard Time)

Even if documented, I think it would be a bit surprising if Epochs and Evoked used different pick defaults. So I'd document the behavior with whatever version we think makes more sense, be consistent for both methods, and deprecate the inconsistent behavior. And yes, an explicit exclude parameter would definitely make sense to me.

Richard Höchenberger · Answer 5 · Sat Apr 27 2024 03:47:40 GMT+0800 (China Standard Time)

I tested the MWE with Raw, and now it seems it's actually Epochs that's the odd one out:

# %%
import mne

sample_dir = mne.datasets.sample.data_path()
sample_fname = sample_dir / "MEG" / "sample" / "sample_audvis_raw.fif"

raw = mne.io.read_raw_fif(sample_fname)
raw.crop(tmax=60)

events = mne.find_events(raw, stim_channel="STI 014")
raw.pick("eeg")

epochs = mne.Epochs(raw, events=events, preload=True)
evoked = epochs.average()

# %%
for inst_type, inst in zip(["raw", "epochs", "evoked"], [raw, epochs, evoked]):
    print(f"{inst_type}:")
    print(f"    bads: {inst.info['bads']}")
    print(f"    data shape: {inst.get_data(picks='eeg').shape}")

raw:
    bads: ['EEG 053']
    data shape: (60, 36038)
epochs:
    bads: ['EEG 053']
    data shape: (86, 59, 421)
evoked:
    bads: ['EEG 053']
    data shape: (60, 421)

Richard Höchenberger · Answer 6 · Sat Apr 27 2024 03:50:48 GMT+0800 (China Standard Time)

More digging:

BaseSpectrum.get_data() features exclude="bads"
BaseTFR ditto

Richard Höchenberger · Answer 7 · Sat Apr 27 2024 03:53:25 GMT+0800 (China Standard Time)

My feeling is that we should add exclude parameters to all these get_data() methods that don't have one already, retaining the current behavior, and then do a deprecation cycle to transition all of them to either exclude=() or exclude="bads".
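One way such a transition could look, sketched in plain Python under stated assumptions: `ToyInst` is a hypothetical stand-in for Raw/Epochs/Evoked, its `get_data` returns channel names rather than data, and using a sentinel default with a FutureWarning is my guess at a mechanism, not something decided in this thread.

```python
import warnings

_UNSET = object()  # sentinel: distinguishes "not passed" from any real value


class ToyInst:
    """Hypothetical stand-in for Raw/Epochs/Evoked; this is NOT MNE code."""

    # What this class implicitly does today when exclude is not given.
    _legacy_exclude = "bads"

    def __init__(self, ch_names, bads):
        self.ch_names = list(ch_names)
        self.bads = list(bads)

    def get_data(self, exclude=_UNSET):
        """Return the names of the channels whose data would be returned."""
        if exclude is _UNSET:
            # Keep the current behavior, but tell callers it will change.
            warnings.warn(
                "The default value of exclude will change in a future "
                "release; pass exclude explicitly to silence this warning.",
                FutureWarning,
                stacklevel=2,
            )
            exclude = self._legacy_exclude
        dropped = set(self.bads) if exclude == "bads" else set(exclude)
        return [name for name in self.ch_names if name not in dropped]


inst = ToyInst(["EEG 052", "EEG 053", "EEG 054"], bads=["EEG 053"])

# Not passing exclude: legacy behavior is retained and a warning is raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = inst.get_data()

# Passing exclude explicitly: no warning, behavior is unambiguous.
explicit = inst.get_data(exclude=())
```

Each class would keep its own legacy default during the cycle, so existing code keeps working until the shared default is switched.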