Benchmark function (#264)
* benchmark init commit

move pipeline-processing functions to moabb.pipelines.utils, initial commit of the benchmark function

* minor doc string corr

* fixing a missing line in download.py

* Update benchmark.py

* Update utils.py

* adding a print statement for visibility in output

* remove verbose, debug params; add evaluations save folder params

* update run.py to use benchmark function

* correct analyze statement

* including, excluding datasets functionality

* some descript edits

* Update benchmark.py

* correct include datasets and return dataframe

* correct n_jobs in run.py

* Apply suggestions from code review

Co-authored-by: Sylvain Chevallier <[email protected]>

* whatsnew and remove MI result combo

- Updating whats new
- Removing MotorImagery for combination of Filterbank and Single Pass results

* Adding select_paradigms

Using an exception for the flake8 "function too complex" error. May want to find a better implementation and remove the exception.

* rectify select_paradigms implementation

* fix error when select_paradigms is None

* add benchmark to __init__ for direct call

* fix overwriting results for different paradigms, add printing results

* unittest for benchmark

* fix docstring and error msg, remove unused code

* remove unwanted print

* fix warning for dataframe

* abstractproperty is deprecated since 3.3

* fix doc typo, add is_valid to fake paradigms

* update whatsnew

* add an example for benchmark

* update readme, correct typos, add link to wiki

* fix typos, introduce example

* add link to wiki in dataset documentation

* fix typos, rename force to overwrite, select_paradigms to paradigms

* fix refactoring typo

* fix refactoring typo 2

* fix refactoring typo 3

* fix example typo

Co-authored-by: Sylvain Chevallier <[email protected]>
4 people authored Jan 2, 2023
1 parent 6b41ad1 commit ae9f7c7
Showing 24 changed files with 716 additions and 172 deletions.
11 changes: 7 additions & 4 deletions README.md
@@ -65,8 +65,8 @@ The Mother of all BCI Benchmarks allows to:

 - Build a comprehensive benchmark of popular BCI algorithms applied on an extensive list
   of freely available EEG datasets.
-- The code will be made available on github, serving as a reference point for the future
-  algorithmic developments.
+- The code is available on GitHub, serving as a reference point for the future algorithmic
+  developments.
 - Algorithms can be ranked and promoted on a website, providing a clear picture of the
   different solutions available in the field.

@@ -131,6 +131,9 @@ can upgrade your pip version using: `pip install -U pip` before installing `moabb`

 The list of supported datasets can be found here :
 https://neurotechx.github.io/moabb/datasets.html

+Detailed information regarding datasets (electrodes, trials, sessions) are indicated on
+the wiki: https://github.com/NeuroTechX/moabb/wiki/Datasets-Support
+
 ### Submit a new dataset

 you can submit a new dataset by mentioning it to this
@@ -173,13 +176,13 @@ our [code of conduct](CODE_OF_CONDUCT.md) in all interactions both on and offline.
 ## Contact us

 If you want to report a problem or suggest an enhancement, we'd love for you to
-[open an issue](../../issues) at this github repository because then we can get right on
+[open an issue](../../issues) at this GitHub repository because then we can get right on
 it.

 For a less formal discussion or exchanging ideas, you can also reach us on the [Gitter
 channel][link_gitter] or join our weekly office hours! This an open video meeting
 happening on a [regular basis](https://github.com/NeuroTechX/moabb/issues/191), please ask
-the link on the gitter channel. We are also on [NeuroTechX slack #moabb
+the link on the gitter channel. We are also on [NeuroTechX Slack #moabb
 channel][link_neurotechx_signup].

## Architecture and Main Concepts
14 changes: 8 additions & 6 deletions docs/source/README.md
@@ -32,7 +32,6 @@ one of the sections below, or just scroll down to find out more.
 - [Supported datasets](#supported-datasets)
 - [Who are we?](#who-are-we)
 - [Get in touch](#contact-us)
-- [Documentation](#documentation)
 - [Architecture and main concepts](#architecture-and-main-concepts)
 - [Citing MOABB and related publications](#citing-moabb-and-related-publications)

@@ -64,8 +63,8 @@ The Mother of all BCI Benchmarks allows to:

 - Build a comprehensive benchmark of popular BCI algorithms applied on an extensive list
   of freely available EEG datasets.
-- The code will be made available on github, serving as a reference point for the future
-  algorithmic developments.
+- The code is available on GitHub, serving as a reference point for the future algorithmic
+  developments.
 - Algorithms can be ranked and promoted on a website, providing a clear picture of the
   different solutions available in the field.

@@ -130,6 +129,9 @@ can upgrade your pip version using: `pip install -U pip` before installing `moabb`

 The list of supported datasets can be found here : https://neurotechx.github.io/moabb/

+Detailed information regarding datasets (electrodes, trials, sessions) are indicated on
+the wiki: https://github.com/NeuroTechX/moabb/wiki/Datasets-Support
+
 ### Submit a new dataset

 you can submit a new dataset by mentioning it to this
@@ -174,13 +176,13 @@ in all interactions both on and offline.
 ## Contact us

 If you want to report a problem or suggest an enhancement, we'd love for you to
-[open an issue](https://github.com/NeuroTechX/moabb/issues) at this github repository
+[open an issue](https://github.com/NeuroTechX/moabb/issues) at this GitHub repository
 because then we can get right on it.

 For a less formal discussion or exchanging ideas, you can also reach us on the [Gitter
 channel][link_gitter] or join our weekly office hours! This an open video meeting
 happening on a [regular basis](https://github.com/NeuroTechX/moabb/issues/191), please ask
-the link on the gitter channel. We are also on NeuroTechX slack channel
+the link on the gitter channel. We are also on NeuroTechX Slack channel
 [#moabb][link_neurotechx_signup].

## Architecture and Main Concepts
@@ -195,7 +197,7 @@ the workflow.

 A dataset handles and abstracts low-level access to the data. The dataset will read data
 stored locally, in the format in which they have been downloaded, and will convert them
-into a MNE raw object. There are options to pool all the different recording sessions per
+into an MNE raw object. There are options to pool all the different recording sessions per
 subject or to evaluate them separately.

### Paradigm
2 changes: 1 addition & 1 deletion docs/source/whats_new.rst
@@ -18,7 +18,7 @@ Develop branch
 Enhancements
 ~~~~~~~~~~~~

-- None
+- Adding a comprehensive benchmarking function (:gh:`264` by `Divyesh Narayanan`_ and `Sylvain Chevallier`_)

Bugs
~~~~
8 changes: 8 additions & 0 deletions examples/README.txt
@@ -0,0 +1,8 @@
Simple examples
-----------------

These examples demonstrate how to use MOABB and its main concepts: the
``dataset``, the ``paradigm`` and the ``evaluation``. They use only a
small number of subjects and sessions to keep the execution time short.
In practice, you should use all the subjects and sessions available in
the dataset.
2 changes: 1 addition & 1 deletion examples/advanced_examples/README.txt
@@ -1,7 +1,7 @@
 Advanced examples
 -----------------

-These examples shows various advanced topics:
+These examples show various advanced topics:

 * using scikit-learn pipeline with MNE inputs
 * selecting electrodes or resampling signal
101 changes: 101 additions & 0 deletions examples/plot_benchmark.py
@@ -0,0 +1,101 @@
"""
=======================
Benchmarking with MOABB
=======================
This example shows how to use MOABB to benchmark a set of pipelines
on all available datasets. For this example, we will use only one
dataset to keep the computation time low, but this benchmark is designed
to easily scale to many datasets.
"""
# Authors: Sylvain Chevallier <[email protected]>
#
# License: BSD (3-clause)

import matplotlib.pyplot as plt

from moabb import benchmark, set_log_level
from moabb.analysis.plotting import score_plot
from moabb.paradigms import LeftRightImagery


set_log_level("info")

###############################################################################
# Loading the pipelines
# ---------------------
#
# The ML pipelines used in the benchmark are defined in YAML files, following a
# simple format. This simplifies sharing and reusing pipelines across benchmarks,
# and reproducing state-of-the-art results.
#
# MOABB comes with a complete list of pipelines that cover most of the successful
# approaches in the literature. You can find them in the
# `pipelines folder <https://github.com/NeuroTechX/moabb/tree/develop/pipelines>`_.
# For this example, we will use a folder with only 2 pipelines, to keep the
# computation time low.
#
# This is an example of a pipeline defined in YAML, specifying on which paradigms it
# can be used, the original publication, and the steps to perform using a
# scikit-learn API. In this case, a CSP + SVM pipeline: the covariances are estimated
# to compute a CSP filter, and then a linear SVM is trained on the CSP-filtered
# signals.

with open("sample_pipelines/CSP_SVM.yml", "r") as f:
    lines = f.readlines()
    for line in lines:
        print(line, end="")

###############################################################################
# The ``sample_pipelines`` folder contains a second pipeline, a logistic regression
# performed in the tangent space using Riemannian geometry.
#
# Selecting the datasets (optional)
# ---------------------------------
#
# If you want to limit your benchmark to a subset of datasets, you can use the
# ``include_datasets`` and ``exclude_datasets`` arguments. You will need either
# to provide the dataset's object, or the dataset's code. To get the list of
# available dataset codes for a given paradigm, you can use the following command:

paradigm = LeftRightImagery()
for d in paradigm.datasets:
    print(d.code)

###############################################################################
# In this example, we will use only the last dataset, 'Zhou 2016'.
#
# Running the benchmark
# ---------------------
#
# The benchmark is run using the ``benchmark`` function. You need to specify the
# folder containing the pipelines to use, the kind of evaluation, and the paradigm
# to use. By default, the benchmark will use all available datasets for all
# paradigms listed in the pipelines. You can restrict it to specific evaluations
# and paradigms using the ``evaluations`` and ``paradigms`` arguments.
#
# To save computation time, the results are cached. If you want to re-run the
# benchmark, you can set the ``overwrite`` argument to ``True``.
#
# It is possible to indicate the folder to cache the results and the one to save
# the analysis & figures. By default, the results are saved in the ``results``
# folder, and the analysis & figures are saved in the ``benchmark`` folder.

results = benchmark(
    pipelines="./sample_pipelines/",
    evaluations=["WithinSession"],
    paradigms=["LeftRightImagery"],
    include_datasets=["Zhou 2016"],
    results="./results/",
    overwrite=False,
    plot=False,
    output="./benchmark/",
)

###############################################################################
# The benchmark prints a summary of the results. Detailed results are returned in a
# pandas dataframe, and can be used to generate figures. The analysis & figures
# are saved in the ``benchmark`` folder.

score_plot(results)
plt.show()
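As a quick check of the returned dataframe before plotting, a simple groupby works well. This is only a sketch: the column names ("pipeline", "score") follow the usual MOABB results format and are assumed here rather than taken from this commit.

# Average score per pipeline across datasets and subjects (assumed column names).
print(results.groupby("pipeline")["score"].mean())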
23 changes: 23 additions & 0 deletions examples/sample_pipelines/CSP_SVM.yml
@@ -0,0 +1,23 @@
name: CSP + SVM
paradigms:
  - LeftRightImagery

citations:
  - https://doi.org/10.1007/BF01129656
  - https://doi.org/10.1109/MSP.2008.4408441

pipeline:
  - name: Covariances
    from: pyriemann.estimation
    parameters:
      estimator: oas

  - name: CSP
    from: pyriemann.spatialfilters
    parameters:
      nfilter: 6

  - name: SVC
    from: sklearn.svm
    parameters:
      kernel: "linear"
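For readers more comfortable with plain scikit-learn code, the YAML above corresponds roughly to the pipeline below. This is a sketch derived from the classes and parameters listed in the file, not the loader MOABB itself uses to parse it.

from pyriemann.estimation import Covariances
from pyriemann.spatialfilters import CSP
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Rough scikit-learn equivalent of the CSP_SVM.yml steps above.
csp_svm = make_pipeline(
    Covariances(estimator="oas"),
    CSP(nfilter=6),
    SVC(kernel="linear"),
)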
23 changes: 23 additions & 0 deletions examples/sample_pipelines/TSLR.yml
@@ -0,0 +1,23 @@
name: Tangent Space LR

paradigms:
  - LeftRightImagery

citations:
  - https://doi.org/10.1016/j.neucom.2012.12.039

pipeline:
  - name: Covariances
    from: pyriemann.estimation
    parameters:
      estimator: oas

  - name: TangentSpace
    from: pyriemann.tangentspace
    parameters:
      metric: "riemann"

  - name: LogisticRegression
    from: sklearn.linear_model
    parameters:
      C: 1.0
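The same mapping applies to the tangent-space pipeline; again a sketch built from the listed classes and parameters, not MOABB's own loader.

from pyriemann.estimation import Covariances
from pyriemann.tangentspace import TangentSpace
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Rough scikit-learn equivalent of the TSLR.yml steps above.
tslr = make_pipeline(
    Covariances(estimator="oas"),
    TangentSpace(metric="riemann"),
    LogisticRegression(C=1.0),
)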
3 changes: 2 additions & 1 deletion moabb/__init__.py
@@ -1,4 +1,5 @@
 # flake8: noqa
 __version__ = "0.4.6"

-from moabb.utils import set_log_level
+from .benchmark import benchmark
+from .utils import set_log_level
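With this export, the benchmark entry point can be called directly from the package namespace. A minimal sketch, reusing parameter names from examples/plot_benchmark.py above; other arguments keep their defaults.

import moabb

# Direct call through the package namespace added by this change.
results = moabb.benchmark(
    pipelines="./sample_pipelines/",
    evaluations=["WithinSession"],
    paradigms=["LeftRightImagery"],
    overwrite=False,
)
print(results.head())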
4 changes: 1 addition & 3 deletions moabb/analysis/__init__.py
@@ -19,7 +19,7 @@ def analyze(results, out_path, name="analysis", plot=False):
     Given a results dataframe, generates a folder with
     results and a dataframe of the exact data used to generate those results,
-    aswell as introspection to return information on the computer
+    as well as introspection to return information on the computer

     parameters
     ----------
@@ -44,8 +44,6 @@ def analyze(results, out_path, name="analysis", plot=False):

     unique_ids = [plt._simplify_names(x) for x in results.pipeline.unique()]
     simplify = True
-    print(unique_ids)
-    print(set(unique_ids))
     if len(unique_ids) != len(set(unique_ids)):
         log.warning("Pipeline names are too similar, turning off name shortening")
         simplify = False
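For context, a hedged usage sketch of the analyze helper touched above, following the signature shown in the hunk headers; it assumes `results` is the dataframe returned by the benchmark call earlier in this commit, and the output path is illustrative.

from moabb.analysis import analyze

# Writes the analysis folder (results dataframe, figures, platform info) for a
# results dataframe; signature as shown in the diff above.
analyze(results, out_path="./benchmark/", name="analysis", plot=True)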
6 changes: 5 additions & 1 deletion moabb/analysis/meta_analysis.py
@@ -23,7 +23,11 @@ def collapse_session_scores(df):
     Aggregated results, samples are index, columns are pipelines,
     and values are scores
     """
-    return df.groupby(["pipeline", "dataset", "subject"]).mean().reset_index()
+    return (
+        df.groupby(["pipeline", "dataset", "subject"])
+        .mean(numeric_only=True)
+        .reset_index()
+    )


 def compute_pvals_wilcoxon(df, order=None):
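The switch to numeric_only=True matters because MOABB results frames mix string and numeric columns. Below is a small standalone sketch of the behaviour; the column names are illustrative, not taken from this commit.

import pandas as pd

# Toy results frame: grouping keys plus a string column ("session") and a score.
df = pd.DataFrame(
    {
        "pipeline": ["CSP+SVM", "CSP+SVM", "TSLR"],
        "dataset": ["Zhou 2016"] * 3,
        "subject": [1, 1, 2],
        "session": ["session_0", "session_1", "session_0"],
        "score": [0.81, 0.79, 0.90],
    }
)

# With numeric_only=True, non-numeric columns such as "session" are dropped instead of
# triggering a FutureWarning (pandas 1.5) or TypeError (pandas 2.0) during mean().
collapsed = (
    df.groupby(["pipeline", "dataset", "subject"])
    .mean(numeric_only=True)
    .reset_index()
)
print(collapsed)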
(Diffs for the remaining changed files are not shown.)