Explore Paradigm Object

A paradigm defines how the raw data will be converted to trials ready to be processed by a decoding algorithm. The conversion depends on the paradigm used: for example, motor imagery can be framed as a two-class, multi-class, or continuous paradigm, and ERP and ERD paradigms require different preprocessing.

A paradigm also defines the appropriate evaluation metric, for example AUC for binary classification problems, accuracy for multiclass, or kappa coefficients for continuous paradigms.
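As a rough illustration of the metric choice for classification paradigms, the sketch below maps a class count to a scikit-learn style scorer name. The helper `select_metric` is hypothetical and not part of MOABB; it only mirrors the rule stated above and leaves out continuous paradigms.

```python
def select_metric(n_classes: int) -> str:
    """Hypothetical helper: pick a scorer name from the number of classes."""
    if n_classes < 2:
        raise ValueError("a classification paradigm needs at least 2 classes")
    # Binary problems are scored with AUC, multiclass problems with accuracy.
    return "roc_auc" if n_classes == 2 else "accuracy"


print(select_metric(2))  # roc_auc
print(select_metric(4))  # accuracy
```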

This tutorial explores the paradigm object through 3 example paradigms:
  • MotorImagery

  • FilterBankMotorImagery

  • LeftRightImagery

# Authors: Alexandre Barachant <alexandre.barachant@gmail.com>
#          Sylvain Chevallier <sylvain.chevallier@uvsq.fr>
# License: BSD (3-clause)

import numpy as np

from moabb.datasets import BNCI2014_001
from moabb.paradigms import FilterBankMotorImagery, LeftRightImagery, MotorImagery



First, let’s take an example of the MotorImagery paradigm.

paradigm = MotorImagery(n_classes=4)

N-class motor imagery.

    Metric is 'roc-auc' if 2 classes and 'accuracy' if more


    events: List of str
        event labels used to filter datasets (e.g. if only motor imagery is

    n_classes: int,
        number of classes each dataset must have. If events is given,
        requires all imagery sorts to be within the events list.

    fmin: float (default 8)
        cutoff frequency (Hz) for the high pass filter

    fmax: float (default 32)
        cutoff frequency (Hz) for the low pass filter

    tmin: float (default 0.0)
        Start time (in second) of the epoch, relative to the dataset specific
        task interval e.g. tmin = 1 would mean the epoch will start 1 second
        after the beginning of the task as defined by the dataset.

    tmax: float | None, (default None)
        End time (in second) of the epoch, relative to the beginning of the
        dataset specific task interval. tmax = 5 would mean the epoch will end
        5 second after the beginning of the task as defined in the dataset. If
        None, use the dataset value.

    baseline: None | tuple of length 2
            The time interval to consider as “baseline” when applying baseline
            correction. If None, do not apply baseline correction.
            If a tuple (a, b), the interval is between a and b (in seconds),
            including the endpoints.
            Correction is applied by computing the mean of the baseline period
            and subtracting it from the data (see mne.Epochs)

    channels: list of str | None (default None)
        list of channel to select. If None, use all EEG channels available in
        the dataset.

    resample: float | None (default None)
        If not None, resample the eeg data with the sampling rate provided.
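The baseline parameter described above can be sketched with plain NumPy. This is a minimal illustration of mean subtraction over an interval (a, b) with both endpoints included, not the code MOABB or MNE runs; the function name `subtract_baseline`, the 250 Hz sampling rate, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def subtract_baseline(epochs, times, baseline):
    """Subtract the per-channel mean over [a, b] (endpoints included)."""
    a, b = baseline
    in_baseline = (times >= a) & (times <= b)
    baseline_mean = epochs[..., in_baseline].mean(axis=-1, keepdims=True)
    return epochs - baseline_mean


sfreq = 250.0  # assumed sampling rate, in Hz
times = np.arange(-125, 501) / sfreq  # -0.5 s to 2.0 s
rng = np.random.default_rng(0)
# Synthetic epochs (trials x channels x samples) with a constant offset of 5.
epochs = rng.standard_normal((3, 22, times.size)) + 5.0

corrected = subtract_baseline(epochs, times, baseline=(-0.5, 0.0))
print(corrected.shape)  # (3, 22, 626)
```

After the correction, the mean of every trial and channel over the baseline window is zero, which removes the constant offset from the whole epoch.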

The function get_data allows you to access preprocessed data from a dataset. It returns 3 objects: a numpy array containing the preprocessed EEG data, the labels, and a dataframe with metadata.

Return the data for a list of subject.

return the data, labels and a dataframe with metadata. the dataframe
will contain at least the following columns

- subject : the subject indice
- session : the session indice
- run : the run indice

    A dataset instance.
subjects: List of int
    List of subject number
return_epochs: boolean
    This flag specifies whether to return only the data array or the
    complete processed mne.Epochs
return_raws: boolean
    To return raw files and events, to ensure compatibility with braindecode.
    Mutually exclusive with return_epochs
cache_config: dict | CacheConfig
    Configuration for caching of datasets. See :class:`moabb.datasets.base.CacheConfig` for details.
postprocess_pipeline: Pipeline | None
    Optional pipeline to apply to the data after the preprocessing.
    This pipeline will either receive :class:`mne.io.BaseRaw`, :class:`mne.Epochs`
    or :func:`np.ndarray` as input, depending on the values of ``return_epochs``
    and ``return_raws``.
    This pipeline must return an ``np.ndarray``.
    This pipeline must be "fixed" because it will not be trained,
    i.e. no call to ``fit`` will be made.

X : Union[np.ndarray, mne.Epochs]
    the data that will be used as features for the model
    Note: if return_epochs=True,  this is mne.Epochs
    if return_epochs=False, this is np.ndarray
labels: np.ndarray
    the labels for training / evaluating the model
metadata: pd.DataFrame
    A dataframe containing the metadata.

Let's take the example of the BNCI2014_001 dataset, known as dataset IIa from BCI Competition IV. We will load the data from subject 1. When calling get_data, the paradigm will retrieve the data from the specified list of subjects, apply preprocessing (by default, a bandpass between 8 and 32 Hz, the fmin and fmax defaults listed above), epoch the data (with the interval specified by the dataset, unless superseded by the paradigm) and return the corresponding objects.

The epoched data is a 3D array, with epochs on the first dimension (here 576 trials), channels on the second (22 channels) and time samples on the last one.

(576, 22, 1001)
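The length of the time axis follows from the epoch window and the sampling rate. The sketch below shows how tmin and tmax shift that window relative to the dataset's task interval, following the parameter descriptions above. The helpers `epoch_window` and `n_samples` are hypothetical, and the 2 s to 6 s task interval and 250 Hz sampling rate are assumptions chosen because they reproduce the 1001 samples shown here.

```python
def epoch_window(interval, tmin=0.0, tmax=None):
    """Return (start, end) of the epoch, in the same time reference as `interval`."""
    task_start, task_end = interval
    start = task_start + tmin
    # tmax is counted from the beginning of the task; None keeps the dataset value.
    end = task_end if tmax is None else task_start + tmax
    return start, end


def n_samples(start, end, sfreq):
    """Number of samples in [start, end], both endpoints included."""
    return int(round((end - start) * sfreq)) + 1


interval = (2.0, 6.0)  # assumed dataset task interval, in seconds
sfreq = 250.0          # assumed sampling rate, in Hz

start, end = epoch_window(interval)
print(start, end, n_samples(start, end, sfreq))  # 2.0 6.0 1001

start, end = epoch_window(interval, tmin=1.0, tmax=3.0)
print(start, end, n_samples(start, end, sfreq))  # 3.0 5.0 501
```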

Labels contains the label of each trial. In the case of this dataset, these are the 4 types of motor imagery that were performed.

['feet' 'left_hand' 'right_hand' 'tongue']
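A label array like this one can be summarized and integer-encoded with NumPy. The sketch below uses a synthetic, perfectly balanced label array as a stand-in for the labels returned by get_data; the balance of 144 trials per class is an assumption of the example.

```python
import numpy as np

classes = np.array(["feet", "left_hand", "right_hand", "tongue"])
labels = np.tile(classes, 144)  # synthetic: 576 labels, 144 per class

# Distinct class names (sorted) and the number of trials in each class.
names, counts = np.unique(labels, return_counts=True)
print(names)   # ['feet' 'left_hand' 'right_hand' 'tongue']
print(counts)  # [144 144 144 144]

# Integer-encode the string labels for estimators that need numeric targets.
y = np.searchsorted(names, labels)
print(y[:4])  # [0 1 2 3]
```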

The metadata has at least 3 columns: subject, session and run.

  • subject is the subject id of the corresponding trial

  • session is the session id. A session denotes a recording made without removing the EEG cap.

  • run is the individual continuous recording made during a session. A session may or may not contain multiple runs.

   subject session run
0        1  0train   0
1        1  0train   0
2        1  0train   0
3        1  0train   0
4        1  0train   0

For this data, we have one subject, 2 sessions (2 different recording days) and 6 runs per session.

        subject session  run
count     576.0     576  576
unique      NaN       2    6
top         NaN  0train    0
freq        NaN     288   96
mean        1.0     NaN  NaN
std         0.0     NaN  NaN
min         1.0     NaN  NaN
25%         1.0     NaN  NaN
50%         1.0     NaN  NaN
75%         1.0     NaN  NaN
max         1.0     NaN  NaN
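The session column is what makes a cross-session split possible. The sketch below selects trials by session with boolean masks. Plain NumPy arrays stand in for the metadata columns, the data array is synthetic with a shortened time axis, and the second session name "1test" is an assumption (only "0train" appears in the output above).

```python
import numpy as np

n_trials = 576
# Stand-ins for the metadata columns; "1test" is an assumed second session name.
session = np.repeat(np.array(["0train", "1test"]), n_trials // 2)
run = np.tile(np.repeat(np.arange(6), 48), 2)
X = np.zeros((n_trials, 22, 8))  # time axis shortened to keep the sketch light

# Use the first session for training and the other one for testing.
train_mask = session == "0train"
X_train, X_test = X[train_mask], X[~train_mask]
print(X_train.shape, X_test.shape)  # (288, 22, 8) (288, 22, 8)
print(np.unique(run[train_mask]))   # [0 1 2 3 4 5]
```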

Paradigm objects can also return the list of all compatible datasets. Here it returns the list of all the imagery datasets in MOABB.

['AlexandreMotorImagery', 'BNCI2014-001', 'BNCI2014-002', 'BNCI2014-004', 'BNCI2015-001', 'BNCI2015-004', 'Cho2017', 'FakeDataset-imagery-10-2--60-60--120-120--fake1-fake2-fake3--c3-cz-c4', 'GrosseWentrup2009', 'Lee2019-MI', 'Liu2024', 'Ofner2017', 'PhysionetMotorImagery', 'Schirrmeister2017', 'Shin2017A', 'Stieger2021', 'Weibo2014', 'Zhou2016']

FilterBank MotorImagery

FilterBankMotorImagery is the same paradigm, but with different preprocessing. In this case, it applies a bank of 6 bandpass filters to the data and concatenates their outputs.

Filter bank n-class motor imagery.

    Metric is 'roc-auc' if 2 classes and 'accuracy' if more


    events: List of str
        event labels used to filter datasets (e.g. if only motor imagery is

    n_classes: int,
        number of classes each dataset must have. If events is given,
        requires all imagery sorts to be within the events list.

Therefore, the output X is a 4D array, with dimensions trial x channel x time x filter.

(288, 22, 1001, 6)
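To make the extra dimension concrete, the sketch below filters a synthetic array with 6 bands and stacks the results on a new last axis. It is not MOABB's implementation: an ideal FFT mask stands in for the actual bandpass filters, and the band edges, the 250 Hz sampling rate, and the function names `fft_bandpass` and `filter_bank` are assumptions made for the example.

```python
import numpy as np


def fft_bandpass(x, fmin, fmax, sfreq):
    """Zero every frequency bin outside [fmin, fmax] along the last axis."""
    freqs = np.fft.rfftfreq(x.shape[-1], d=1.0 / sfreq)
    spectrum = np.fft.rfft(x, axis=-1)
    spectrum[..., (freqs < fmin) | (freqs > fmax)] = 0.0
    return np.fft.irfft(spectrum, n=x.shape[-1], axis=-1)


def filter_bank(x, bands, sfreq):
    """Filter x once per band and stack the outputs on a new last axis."""
    return np.stack([fft_bandpass(x, lo, hi, sfreq) for lo, hi in bands], axis=-1)


sfreq = 250.0  # assumed sampling rate, in Hz
# Illustrative band edges in Hz (an assumption, not read from the library).
bands = [(8, 12), (12, 16), (16, 20), (20, 24), (24, 28), (28, 35)]

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 22, 1001))  # synthetic trials x channels x time
X_bank = filter_bank(X, bands, sfreq)
print(X_bank.shape)  # (4, 22, 1001, 6)
```

Each slice `X_bank[..., k]` holds the signal restricted to band k, so a downstream pipeline can process the bands separately before combining them.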

LeftRight MotorImagery

LeftRightImagery is a variation over the BaseMotorImagery paradigm, restricted to left- and right-hand events.

paradigm = LeftRightImagery()

Motor Imagery for left hand/right hand classification.

    Metric is 'roc_auc'

The compatible dataset list is the subset of motor imagery datasets that contain at least left- and right-hand events.

['BNCI2014-001', 'BNCI2014-004', 'Cho2017', 'GrosseWentrup2009', 'Lee2019-MI', 'Liu2024', 'PhysionetMotorImagery', 'Schirrmeister2017', 'Shin2017A', 'Stieger2021', 'Weibo2014', 'Zhou2016']
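The subset relation can be checked directly from the two printed lists. The sketch below copies both lists from the outputs on this page and uses Python sets to find the motor imagery datasets that the left/right paradigm leaves out; the variable names are illustrative.

```python
motor_imagery = [
    "AlexandreMotorImagery", "BNCI2014-001", "BNCI2014-002", "BNCI2014-004",
    "BNCI2015-001", "BNCI2015-004", "Cho2017",
    "FakeDataset-imagery-10-2--60-60--120-120--fake1-fake2-fake3--c3-cz-c4",
    "GrosseWentrup2009", "Lee2019-MI", "Liu2024", "Ofner2017",
    "PhysionetMotorImagery", "Schirrmeister2017", "Shin2017A", "Stieger2021",
    "Weibo2014", "Zhou2016",
]
left_right = [
    "BNCI2014-001", "BNCI2014-004", "Cho2017", "GrosseWentrup2009",
    "Lee2019-MI", "Liu2024", "PhysionetMotorImagery", "Schirrmeister2017",
    "Shin2017A", "Stieger2021", "Weibo2014", "Zhou2016",
]

# Every left/right dataset is also a motor imagery dataset.
is_subset = set(left_right) <= set(motor_imagery)
# Datasets listed for motor imagery but not for left/right imagery.
excluded = sorted(set(motor_imagery) - set(left_right))
print(is_subset, len(excluded))  # True 6
```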

So if we apply this to our original dataset, it will only return trials corresponding to left- and right-hand motor imagination.

['left_hand' 'right_hand']
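The effect of this restriction on a 4-class recording can be mimicked with a boolean mask. The sketch below is only an illustration on synthetic arrays, not what LeftRightImagery does internally; the balanced labels and the shortened time axis are assumptions.

```python
import numpy as np

classes = np.array(["feet", "left_hand", "right_hand", "tongue"])
labels = np.tile(classes, 144)      # synthetic: 576 labels, 144 per class
X = np.zeros((labels.size, 22, 8))  # time axis shortened to keep the sketch light

# Keep only the trials labeled as left- or right-hand imagery.
keep = np.isin(labels, ["left_hand", "right_hand"])
X_lr, labels_lr = X[keep], labels[keep]
print(X_lr.shape)            # (288, 22, 8)
print(np.unique(labels_lr))  # ['left_hand' 'right_hand']
```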

