Best practices in data analysis and sharing in neuroimaging using MEEG

Non-invasive neuroimaging methods, including magnetoencephalography and electroencephalography (MEEG), have been critical in advancing the understanding of brain function in healthy people and in individuals with neurological or psychiatric disorders. Currently, scientific practice is undergoing a tremendous change, aiming to improve both research reproducibility and transparency in data collection, documentation and analysis, and in manuscript review. To advance the practice of open science, the Organization for Human Brain Mapping created the Committee on Best Practice in Data Analysis and Sharing (COBIDAS), which produced a report for MRI-based data in 2016. This effort continues with the OHBM’s COBIDAS MEEG committee whose task was to create a similar document that describes best practice recommendations for MEEG data. The document was drafted by OHBM experts in MEEG, with input from the world-wide brain imaging community, including OHBM members who volunteered to help with this effort, as well as Executive Committee members of the International Federation for Clinical Neurophysiology. This document outlines the principles of performing open and reproducible research in MEEG. Not all MEEG data practices are described in this document. Instead, we propose principles that we believe are current best practice for most recordings and common analyses. Furthermore, we suggest reporting guidelines for Authors that will enable others in the field to fully understand and potentially replicate any study. This document should be helpful to Authors, Reviewers of manuscripts, as well as Editors of neuroscience journals.


Introduction
Over the last decade or so, discussion has increasingly focused on concerns regarding the reproducibility of scientific findings and a potential lack of transparency in data analysis, prompted in large part by the Open Science Collaboration (2015), which could replicate only 39 out of 100 previously published psychological studies. Since then, there has been an ongoing discussion about these issues in the wider scientific community, including the neuroimaging field. There has also been a push to implement the practice of 'open science', which among other things promotes: (1) transparency in reporting data acquisition and analysis parameters, (2) sharing of analysis code and the data itself with other scientists, as well as (3) implementing an open peer review of scientific manuscripts. Within the Organization for Human Brain Mapping (OHBM) community, there have been ongoing discussions both at OHBM Council level as well as at the grassroots level regarding how the neuroimaging community can improve its standards for performing and reporting research studies. In June 2014, OHBM Council created a "Statement on Neuroimaging Research and Data Integrity", and in a practical move created a Committee on Best Practices in Data Analysis and Sharing (COBIDAS). The COBIDAS committee's brief was to create a white paper based on best practices in MRI-based data analysis and sharing in the neuroimaging community. The COBIDAS MRI report was completed and made available to the OHBM community on its website, as well as in a preprint that was submitted in 2016 and subsequently published (preprint: Nichols et al., 2016; published paper: Nichols et al., 2017).
At the OHBM "Town Hall" or General Assembly and Feedback Forum in 2017, the issue of an additional COBIDAS initiative, this time focused on EEG and MEG data (or MEEG for short), was suggested by Aina Puce and Cyril Pernet. Over the remaining time of the OHBM 2017 scientific meeting, discussions with the OHBM Council Chair and Chair-Elect involved making plans to organize and constitute a COBIDAS MEEG committee, made up of OHBM members with varying expertise in electroencephalography (EEG) and magnetoencephalography (MEG), with Puce and Pernet as Co-Chairs. A general email call for volunteers to serve on the committee was made in August 2017, and 115 OHBM members signalled their interest in working on the committee. In October 2017, an 11-member COBIDAS MEEG committee was formed. Contact with members of the International Federation for Clinical Neurophysiology (IFCN) was also made, as the IFCN is involved in generating guideline documents for best practices in clinical neurophysiology. A draft of the COBIDAS MEEG document was shared with the IFCN Executive prior to Pernet speaking on the topic of data sharing at their annual scientific meeting in Washington DC in May, as well as with the remaining 104 OHBM members who had answered the original call to help with drafting the COBIDAS MEEG document. Close to 300 comments and edits were made on the initial document. Edits on the first complete draft of the document were completed before sharing the white paper with OHBM Council in early June 2018. The draft of the white paper was discussed by OHBM Council and Puce made a progress report to OHBM Members at the OHBM General Assembly and Feedback Forum in Singapore in June 2018. Feedback subsequently collected from these two OHBM sources was incorporated into the draft before re-circulation to the COBIDAS MEEG Committee in July 2018.
A progress report on the COBIDAS MEEG process was presented at the OHBM General Assembly and Feedback Forum at the Rome OHBM meeting in June 2019 by Pernet. The https://cobidasmeeg.wordpress.com/ blog will be maintained, so that the COBIDAS MEEG document can remain as a living entity that can be responsive to future changes in hardware, software as well as scientific practice.

Approach
The approach taken in this document parallels that for COBIDAS MRI. Our aim is to generate a set of best practice guidelines for research methods, data analysis and data sharing in the MEEG discipline. Tables with recommendations and checklists (see Appendix) may seem very detailed, but we recommend these as a reference source for essential details that should be reported in any MEEG study, in order to ensure its reproducibility/replicability. The replication of MEEG studies is currently a challenge, as many reported studies continue to omit basic and important methodological details. These details should also assist those who are new to the area in considering what is important in designing an experiment, collecting and analysing data, as well as reporting the study. Additionally, we hope that the COBIDAS MEEG document will be useful for Authors, Editors and Reviewers of scientific manuscripts employing MEEG, in the same way that the COBIDAS MRI document has been used by the MRI community.

Scope
The COBIDAS MEEG document focuses on best practices in non-invasively recorded MEG and EEG data. The practices are broken down into six components for reporting: (1) experimental design, (2) data acquisition, (3) preprocessing and processing, (4) biophysical and statistical modelling, (5) results, as well as (6) data sharing and reproducibility.
Similar to the COBIDAS MRI document, we also make a clear distinction between reproducibility and replicability (see definitions in Barba, 2018). Reproducibility relates to working with (possibly) the same data and analysis methods to reproduce the same final observations/results. Replicability relates to using different data (and potentially different methods) to demonstrate similar findings across laboratories. Internal replication, i.e., across experiments within the same laboratory, is a practice that investigators might consider as a means of validation and of mitigating biases induced by exploratory analysis.
It should also be said at the outset that the MEEG community has always been exceptionally proactive in the discussion of good experimental practice and reporting, as evidenced by a long history of published guidelines (e.g., Donchin et al., 1977; Pivik et al., 1993; Picton et al., 2000; Duncan et al., 2009; Gross et al., 2013; Keil et al., 2014; Kane et al., 2017; Hari et al., 2018). The continual update of guidelines has been necessitated by rapidly changing developments in hardware and software and has come from various parts of the MEEG community (Hansen et al., 2010), including both the research and clinical areas. The OHBM-endorsed COBIDAS MEEG recommendations follow this tradition, while also highlighting practices that aid with reproducibility, something that has not been a focus of previous guidelines. For instance, Section 2 deals with issues pertaining to experimental design: explaining the use of common and desirable terminology, criteria for participant selection and statistical power, most of which are topics that have not previously been addressed at length. Some of the best basic practices proposed to date have unfortunately remained confined to the earlier MEEG literature and have not easily made the transition to the general neuroimaging community. This is largely because the MEEG field has now grown to include new neuroimaging researchers, who are beginning to work with these established methods, may not be familiar with the earlier literature, and may not have contact with established investigators in the field. Hence, we have extended and updated these guidelines to also tackle known common pitfalls in data recording and analysis. This problem will continue to grow in the MEEG field with recent and new developments in, e.g., increasingly high-density recordings, portable EEG systems, non-helium based MEG systems, and data analysis and modelling approaches including estimates of functional and effective connectivity.

Experimental Design
With respect to experimental design, the goal of replicable research requires the reporting of how the participants were screened and selected, as well as what type of experimental paradigm was employed. This enables a critical Reader to evaluate, for instance, whether the findings will generalize to other populations. In case there was an experimental manipulation that included a task, a specification of the instructions given to the participant is very important. All pertinent information regarding the experiment and the recording environment (cf. Section 3) should be noted to facilitate the efforts of others wishing to replicate the work (e.g., stimuli, timing, apparatus, sessions, runs, trial numbers, conditions, randomization or other condition-ordering procedures, periods of rest, or other intervals etc.). Ideally, the scripts and stimuli used (when not collecting resting state data) are shared together with the manuscript, thus making exact experimental reproduction possible.

Lexicon of MEEG design
Below is a list of MEEG terminology commonly used to describe stimulation and task parameters and protocols. Although we recognize that some wording is used more often than others (e.g., a block versus a run, and a trial versus an event), the list follows the terminology used by the Brain Imaging Data Structure (BIDS; http://bids.neuroimaging.io/) for MEG (Galan et al., 2017), EEG and intracranial EEG (Holdgraf et al., 2019). Because some terms are used interchangeably in the literature, this can add to the confusion in trying to replicate experiments or analyses, hence this effort in standardizing the nomenclature across all areas of human brain imaging. Where applicable, we highlight distinctions between the COBIDAS MEEG and COBIDAS MRI documents.
Session. A logical grouping of neuroimaging and behavioural data collected consistently across participants. A session includes the time involved in completing all experimental tasks. It runs from the moment a participant enters the research environment until they leave it. This would typically start with informed consent procedures followed by participant preparation (i.e., electrode placement and impedance check for EEG; fiducial and other sensor placement for MEG) and ends when the electrodes are removed (for EEG) or the participant exits the MEG room, but could potentially also include a number of pre- or post-MEEG observations and measurements (e.g., anatomical MRI, additional behavioural or clinical testing, questionnaires), even on different days. Defining multiple sessions is appropriate when several identical or similar data acquisitions are planned and performed on all (or most) participants, often in the case of some intervention between sessions (e.g., training or therapeutics) or for longitudinal studies.
Run. An uninterrupted period of continuous data acquisition without operator involvement. Note that continuous data need not be saved continuously; in some paradigms, especially with long inter-trial intervals, only a segment of the data (before and after the stimulus of interest) is saved. In the MEEG literature, this is also sometimes referred to as a block. (Note the difference with the 'block' term in COBIDAS MRI, where multiple stimuli in one condition can be presented over a prolonged and continuous period of time.)
Event. An isolated occurrence of a presented stimulus, or a subject response recorded during a task. It is essential to have exact timing information in addition to the identity of the events, synchronized to the MEEG signals. For this, a digital trigger channel with specific marker values, or a text file with marker values and timing information, can be used. (This term has been defined here in a narrower and more explicit sense than that for COBIDAS MRI, mainly because of the specialized requirements surrounding the high temporal resolution acquisition of MEEG data.)
Trial. A period of time that includes a sequence of one or more events with a prescribed order and timing, which is the basic, repeating element of an experiment. For example, a trial may consist of a cue followed after some time by a stimulus, followed by a response, followed by feedback. An experimental condition is a functional unit defined by the design and usually includes many trials of the same type. Critical events within trials are usually represented as time-stamps or "triggers" stored in the MEEG data file, or documented in a marker file.
Epoch. In the MEEG literature, the term epoch designates the outcome of a data segmentation process. Typically, epochs in event-related designs (for analysis of event-related potentials or event-related spectral perturbations) are time-locked to a particular event (such as a stimulus or a response). Epochs can also include an entire trial, made up of multiple events, if the data analysis plan calls for it. (This terminology is not used in the COBIDAS MRI specification.)
Sensors. Sensors are the physical objects or transducers that are used to perform the analogue recording, i.e., EEG electrodes and MEG magnetometers/gradiometers. Sensors are connected to amplifiers, which not only amplify, but also filter the MEEG activity.
Channels. Channels refer to the digital signals that have been recorded by the amplifiers. It is thus important to distinguish them from sensors. A 'bad channel' refers to a channel that is producing a consistently artifactual or low-quality signal.
Fiducials. Fiducials are markers placed within a well-defined location, which are used to facilitate the localization and co-registration of sensors with other geometric data (e.g., the participant's own anatomical MRI image, an anatomical MRI template or a spherical model). Some examples are vitamin-E markers, reflective disks, felt-tip marker dots placed on the face, or sometimes even the EEG electrodes themselves. Fiducials are typically placed at a known location relative to, or overlying, anatomical landmarks.
Anatomical landmarks. These are well-known, easily identifiable physical locations on the head (e.g., nasion at the bridge of the nose; inion at the bony protrusion on the midline occipital scalp) that have been acknowledged to be of practical use in the field. Fiducials are typically placed at anatomical landmarks to aid localization of sensors relative to geometric data.
Sensor space. Sensor space refers to a representation of the MEEG data at the level of the original sensors, where each of the signals maps onto the spatial location of one of the sensors.
Source space. Source space refers to MEEG data reconstructed at the level of potential neural sources that presumably gave rise to the measured signals (according to an assumed biophysical model). Each signal maps onto a spatial location that is readily interpretable in relation to individual or template-based brain anatomy.
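The relationship between a trigger channel, events and epochs defined in the lexicon above can be illustrated with a minimal sketch. It assumes a single channel held in a plain Python list and a trigger channel that rests at zero between markers; the function names `find_events` and `cut_epochs`, the toy sampling rate and the epoch window are hypothetical choices made for illustration, not part of any particular analysis package.

```python
from typing import List, Tuple


def find_events(trigger: List[int]) -> List[Tuple[int, int]]:
    """Return (sample_index, marker_value) for every marker onset.

    An onset is a sample at which the trigger channel changes to a
    non-zero value (assumption: the channel rests at zero).
    """
    events = []
    previous = 0
    for index, value in enumerate(trigger):
        if value != 0 and value != previous:
            events.append((index, value))
        previous = value
    return events


def cut_epochs(signal: List[float], events: List[Tuple[int, int]],
               sfreq: float, tmin: float, tmax: float):
    """Segment one channel into epochs time-locked to each event.

    tmin and tmax are in seconds relative to the event. Epochs that
    would extend beyond the recording are skipped, not padded.
    """
    start_offset = round(tmin * sfreq)
    stop_offset = round(tmax * sfreq)
    epochs = []
    for sample, marker in events:
        start, stop = sample + start_offset, sample + stop_offset
        if start < 0 or stop > len(signal):
            continue
        epochs.append((marker, signal[start:stop]))
    return epochs


# Toy recording: 20 samples at a hypothetical 10 Hz sampling rate.
sfreq = 10.0
trigger = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0]
signal = [float(i) for i in range(len(trigger))]

events = find_events(trigger)
epochs = cut_epochs(signal, events, sfreq, tmin=-0.2, tmax=0.5)
for marker, data in epochs:
    print(marker, data)
```

Each epoch keeps its marker value, so the identity of the time-locking event stays attached to the data segment.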

Statistical power
There is currently no agreed-upon single method for computing statistical power for MEEG data. The committee recommendations are: (1) that all decisions related to computing statistical power be made prior to starting the experiment; (2) to define from the literature (if available) the main data feature(s) of interest; and (3) to estimate the minimal effect size of interest to determine power. A minimal effect size is the smallest effect considered relevant for a given hypothesis. An effect size should be determined using estimates from independent data, existing literature and/or pilot data that should not be included in the final sample (e.g., if the hypothesis states a modulation of a given spectral band, estimate from the literature or pilot data the amount of change expected and compute the required statistical power). It is, however, important to keep in mind that errors in effect size calculations and subsequent power calculations can be introduced by small sample sizes when using pilot data (e.g., see Albers & Lakens, 2018).
Statistical power determines the researcher's ability to observe an experimental effect. Under the assumption that the effect exists, and along with the quality of the experiment, statistical power thus determines the replicability of a study and is, therefore, an important factor to consider. For instance, in order to observe a behavioural effect in terms of response times, an estimated number of at least 1600 observations (e.g., 40 participants with 40 trials each for a given condition) has been suggested when using a mixed model analysis approach (Brysbaert & Stevens, 2018). As the neural effects in MEEG studies likely have a lower signal-to-noise ratio than response time effects, and some trials/epochs will be rejected due to artifacts, thus diminishing the number of trials/epochs included in statistical analyses, there is a need for more events and/or participants than has been used in current common practice. However, there is a complex balance between the number of trials and the number of participants that depends, on the one hand, on the experimental design (within versus between participants; e.g., see Boudewyn et al., 2017) and the statistical method to be used, and, on the other hand, on the MEEG feature of interest, its location, orientation and distance to the detectors.
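As an illustration of how a minimal effect size translates into a sample size, the sketch below uses the textbook normal approximation for a two-sided paired (or one-sample) comparison, n ≈ ((z_{1-α/2} + z_{power}) / d)². This is one simple option among many and is not a method prescribed by the committee; the function name `required_sample_size` and the default α and power values are assumptions. The approximation slightly underestimates the sample size that an exact t-based calculation would give.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(effect_size: float, alpha: float = 0.05,
                         power: float = 0.80) -> int:
    """Approximate number of participants for a two-sided paired test.

    effect_size is the standardized minimal effect of interest
    (Cohen's d for the within-participant difference).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical minimal effect sizes of interest.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: n >= {required_sample_size(d)}")
```

Because the required sample size scales with 1/d², an overestimated effect size taken from a small pilot sample can leave a study badly underpowered.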
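The arithmetic behind the example above (40 participants with 40 trials each giving 1600 observations) can be extended to budget for artifact rejection. The sketch below assumes a single expected rejection percentage, which is a hypothetical planning value that would in practice come from pilot data or prior experience with the same setup; the function name `trials_to_record` is likewise an illustrative choice.

```python
def trials_to_record(target_trials: int, rejection_percent: int) -> int:
    """Trials to present so that target_trials are expected to survive.

    rejection_percent is the expected percentage of trials lost to
    artifacts (integer, 0 to 99). Integer arithmetic avoids rounding
    surprises; the result is rounded up.
    """
    if not 0 <= rejection_percent < 100:
        raise ValueError("rejection_percent must be between 0 and 99")
    kept_percent = 100 - rejection_percent
    return -(-target_trials * 100 // kept_percent)  # ceiling division


participants = 40
retained_per_condition = 40
print(participants * retained_per_condition)         # total observations
print(trials_to_record(retained_per_condition, 20))  # trials to present
```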

Participants
The population from which the participants are sampled is critical to any experiment, not just to those from clinical samples. The method of participant selection (Martínez-Mesa et al., 2016), the population from which they were selected (e.g., laboratory members, university undergraduates, hospital community, general population), recruitment method (e.g., direct mailing, advertisements), specific inclusion and exclusion criteria, and compensation (financial or other type) should be described clearly. Any specific sampling strategies that constrain inclusion to a particular group should be reported.
One should take special care with defining a "typical" versus "healthy" sample. Screening for lifetime neurological or psychiatric illness (as opposed to "current" illness) could have unintended consequences. For example, in older individuals this could exclude up to 50% of the population (Kessler, 2005) and this restriction could induce a bias towards a "super-healthy" atypical participant sample, thus limiting the generalization to the population as a whole. The use of inclusive language when recruiting participants is also recommended (e.g., using gender-neutral pronouns in recruiting materials).
Participant demographic information such as age, gender, handedness, education (total years of education) and highest qualification should be included in the experimental description at a minimum, as these variables have been associated with changes in brain structure and function (BRAINS, 2017). Medications that affect the central nervous system should be reported (unless these were part of the exclusion criteria). Additional ancillary investigations (e.g., questionnaires, psychological assessments etc.) should also be reported. Finally, it is important to include information related to obtaining written informed consent for adult participants (or parental informed consent/informed assent in minors), with a specific mention of the institutional review board that approved the study.

Task and stimulation parameters
It is helpful to describe the characteristics of the overall testing environment, task-related instructions and number of experimenters. In task-free recordings of resting state activity, while there are no stimulation parameters, it is important to report the instructions given to the participant. As a minimum, even in resting state studies, whether the eyes were open or closed needs to be noted, and for studies with eyes open whether there was a fixation point or not. Participant position (e.g., seated or lying down) should also be noted.
If there is a task with stimuli, stimulus properties need to be described in sufficient detail to allow replication, and this includes standardization procedures used in stimulus creation. The means of producing the stimuli should be reported: for example, if stimuli from existing stimulus sets or databases are used, the name/website of the database (or subset of stimuli used) should be provided. If stimuli are created or manipulated, specific software or algorithms (and their versions) need to be identified.
It is important to note that the high time resolution of MEEG signals makes them highly sensitive to stimulus properties and stimulus/task timing. For visual presentation, stimulus size in degrees of visual angle, viewing distance, clarity (i.e., visual contrast, intensity, etc.), colour, site of stimulation (i.e., monocular versus binocular, full-field versus hemifield/quadrant), position in the visual field, as well as the display device and method of projection (including refresh rate or response time of the monitors) should be reported. Any differences in intensity or contrast between different stimulus conditions should be noted. For auditory presentation, stimulus properties (e.g., frequency content, duration, onset/offset envelope, etc.), intensity (e.g., relative to the subject's individual hearing threshold, or as Sound Pressure Level [dB SPL]), ear of stimulation, and the type, manufacturer and model of the delivery device (e.g., ear inserts, panel speakers, etc.) are important to include. Further, the presence of contralateral ear masking stimulation, and its intensity, should be noted. For somatosensory stimulation, stimulus type (e.g., electrical, air puff) and characteristics (e.g., duration, frequency), manufacturer and model of delivery device, location on the body with reference to anatomical landmarks, and strength (ideally with respect to some sensory or motor threshold) should be reported. The distance between the site of peripheral stimulation and brain, and skin temperature are also important as they will affect response latency independent of the experimental manipulation. For other modalities of stimulation, providing sufficient details regarding stimulus properties, timing and intensity will be critical for replicability. Calibration procedures, including software (type, version and operating system) and hardware used, should also be described. 
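Several of the quantities listed above follow from simple, standard formulas, sketched here under stated assumptions: visual angle is computed as 2·atan(size / (2·distance)), sound pressure level uses the conventional 20 µPa reference pressure in air, and the nominal frame duration is the reciprocal of the display refresh rate. The function names and the example values are hypothetical.

```python
from math import atan, degrees, log10


def visual_angle_deg(stimulus_size: float, viewing_distance: float) -> float:
    """Visual angle subtended by a stimulus, in degrees.

    stimulus_size and viewing_distance must share the same unit.
    """
    return degrees(2 * atan(stimulus_size / (2 * viewing_distance)))


def db_spl(pressure_pa: float, reference_pa: float = 20e-6) -> float:
    """Sound pressure level in dB SPL for an RMS pressure in pascals."""
    return 20 * log10(pressure_pa / reference_pa)


def frame_duration_ms(refresh_rate_hz: float) -> float:
    """Nominal duration of one display frame in milliseconds."""
    return 1000.0 / refresh_rate_hz


# Hypothetical setup: 10 cm stimulus viewed from 57.3 cm on a 60 Hz monitor,
# and a tone with an RMS pressure of 0.02 Pa.
print(f"visual angle: {visual_angle_deg(10.0, 57.3):.2f} deg")
print(f"sound level: {db_spl(0.02):.1f} dB SPL")
print(f"frame duration: {frame_duration_ms(60.0):.2f} ms")
```

Reporting both the physical size and the viewing distance alongside the computed angle lets a Reader recompute the value if a different convention is preferred.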
Where relevant, the rationale for selecting a specific parameter (e.g., contrast, harmonic content) should be indicated. If features were determined individually for each participant, the criteria and the psychophysical method used should be detailed.
For tasks that are self-paced and not explicitly driven by stimuli, e.g., voluntary movements in readiness potential (Bereitschaftspotential) experiments, the instructions given for each run or block of the experiment, and how the task-relevant events (e.g., movement onset or offset) are determined, quantified and stored need to be reported.
For all tasks, it is essential to describe the overall structure and timing of the task including practice sessions, number of trials per condition, the interstimulus (offset to onset) or stimulus-onset-asynchrony (SOA, onset-to-onset) intervals and any temporal jitter in these intervals between sequential events (whether intended or not), the order of stimulus presentation, feedback or handling of errors, and whether conditions were counterbalanced. Storage of stimulus and response triggers in the datafile should also be mentioned (discussed in more detail in Section 3).
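The distinction between the interstimulus interval (offset to onset) and the SOA (onset to onset), and the jitter in these intervals, can be checked directly from logged stimulus times. The sketch below assumes onsets and durations logged in milliseconds; the function name `timing_summary` and the example values are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def timing_summary(onsets_ms: List[int], durations_ms: List[int]) -> Dict:
    """Summarize SOA (onset to onset) and ISI (offset to onset) in ms."""
    # SOA: time from one stimulus onset to the next stimulus onset.
    soas = [nxt - cur for cur, nxt in zip(onsets_ms, onsets_ms[1:])]
    # ISI: time from one stimulus offset to the next stimulus onset.
    isis = [nxt - (cur + dur)
            for cur, dur, nxt in zip(onsets_ms, durations_ms, onsets_ms[1:])]
    return {
        "soa": soas,
        "isi": isis,
        "soa_mean": mean(soas),
        "soa_sd": pstdev(soas),           # spread of the SOA, i.e. jitter
        "soa_range": (min(soas), max(soas)),
        "isi_range": (min(isis), max(isis)),
    }


# Hypothetical log: four stimuli of 200 ms each with a jittered SOA.
summary = timing_summary([0, 1000, 2100, 3050], [200, 200, 200, 200])
print(summary)
```

Running such a summary on the actually logged times, not only on the intended design values, also reveals unintended jitter.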

Behavioural measures collected during an MEEG session
A number of behavioural measures can be acquired during an MEEG experiment. The most common measures are obtained via a button press on a response pad or keyboard, mouse or joystick; however, many other response types are possible. These can include responses by voice, movements of the hands, fingers, feet (most typically assessed via accelerometry recordings), eyes (assessed via electrooculography (EOG), infra-red video recordings or eye tracker), or specific contractions of muscles (most typically assessed via electromyographic (EMG) recordings). In case accelerometry or EMG is used, the positioning of accelerometers or recording electrodes for EMG and data acquisition parameters should be described (see Sections 3 and 6; BIDS standards). The same applies to EOG recordings, where ideally, separate recordings of horizontal and vertical eye movements should be captured.
Regardless of the actual type of response, it is imperative to describe the exact nature of the response acquisition device, including product name, model numbers, manufacturer, as well as any pertinent recording parameters. Further, the method by which the device interfaces with the MEEG data needs to be described, as well as any modifications made to the off-the-shelf product. If devices are built in-house, the components and basic function of the device need to be well described (ideally with a schematic diagram of the device or a description of the basic circuit).
In addition to response devices, appropriate descriptions of the assessment of behavioural response metrics (e.g., central measures like mean or median as well as measures of variability) and performance (e.g., response time, accuracy, false alarms, etc.) should be provided in the Results section.
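A minimal sketch of such a behavioural summary follows, reporting a central measure together with a measure of variability for response times, plus accuracy and the number of missed responses. The trial structure (a dictionary with `rt_ms` and `correct`), the choice to summarize response times over correct trials only, and the function name `summarize_responses` are assumptions made for illustration; the appropriate choices depend on the task.

```python
from statistics import mean, median, stdev
from typing import Dict, List, Optional


def summarize_responses(trials: List[Dict]) -> Dict[str, Optional[float]]:
    """Summarize response times (correct trials only) and accuracy.

    Each trial is a dict with 'rt_ms' (None if no response was given)
    and 'correct' (bool).
    """
    correct_rts = [t["rt_ms"] for t in trials
                   if t["correct"] and t["rt_ms"] is not None]
    return {
        "n_trials": len(trials),
        "n_missed": sum(t["rt_ms"] is None for t in trials),
        "accuracy": sum(t["correct"] for t in trials) / len(trials),
        "rt_mean_ms": mean(correct_rts) if correct_rts else None,
        "rt_median_ms": median(correct_rts) if correct_rts else None,
        # A standard deviation needs at least two observations.
        "rt_sd_ms": stdev(correct_rts) if len(correct_rts) > 1 else None,
    }


# Hypothetical data from one participant in one condition.
trials = [
    {"rt_ms": 420, "correct": True},
    {"rt_ms": 380, "correct": True},
    {"rt_ms": 510, "correct": False},
    {"rt_ms": None, "correct": False},
    {"rt_ms": 450, "correct": True},
]
report = summarize_responses(trials)
print(report)
```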

MEEG device
MEEG studies should report basic information on the type of acquisition system being used (including the manufacturer and model), the number of sensors and their spatial layout. For example, for EEG studies spatial layout will most likely correspond to the International 10-20 (Jasper, 1985; Klem et al., 1999), International 10-10 (Chatrian et al., 1985), International 10-5 (Oostenveld & Praamstra, 2001), or geodesic systems (Tucker, 1993). Additionally, the sensor material should be specified (e.g., Ag/AgCl electrodes) and whether the electrodes are active or passive.
For MEG studies, the type of sensors should also be specified (e.g., planar or axial gradiometers, or magnetometers; cryogenic or room-temperature), as well as the location and type of any reference sensors. Means of determining the position of the participant's head with respect to the MEG sensor array should be reported, and also when this operation was performed (e.g., continuously, or at the start of each session). The type of shielded room (when used) should also be specified.
Additionally, for MEG studies, it is advisable to include "empty room" recordings using the same experimental set-up as during the experiment (but without the participant present) to characterize any participant-unrelated artifacts. For EEG studies, access to the data from the calibration procedure (carried out on the amplifiers prior to each recording session) would ideally allow the variations in channel gains to be documented. In a similar fashion, it would be desirable to store and report the electrode impedances measured in each subject. This information would allow raw effect sizes (in fT or µV) to be compared between studies and could help to harmonize data across laboratories.

Acquisition parameters
For MEEG studies it is mandatory to specify basic parameters such as acquisition type (continuous, epoched), sampling rate and analogue filter bandwidth (including the parameters of the low-pass anti-aliasing filter, which is an obligatory part of the recording system, as well as any high-pass filtering). Notch filtering (to eliminate line noise), if used during recording, should also be reported. The inclusion of digitisation resolution (e.g., 16-bit or 24-bit) is also helpful. It should be noted that during data acquisition all MEEG recording systems apply some filter bandpass, potentially as a default that cannot be altered by the user. The inclusion of parameters related to filter type and roll-offs is essential in some situations (e.g., when discussing the timing of ERP components or spectral components). Note that the filter bandpass may also be adjusted post hoc for analysis, and this should also be reported when describing analysis procedures (see Section 4.3).
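The acquisition parameters listed above are related by two simple rules that can be checked before reporting: the anti-aliasing low-pass cutoff should lie below the Nyquist frequency (half the sampling rate), and the amplitude step represented by one bit equals the amplifier input range divided by 2 to the power of the bit depth. The sketch below collects these in a small record; the class name `AcquisitionSettings`, its fields and all example values (including the input range) are hypothetical and would need to be taken from the actual system documentation.

```python
from dataclasses import dataclass


@dataclass
class AcquisitionSettings:
    sampling_rate_hz: float
    highpass_hz: float
    lowpass_hz: float
    n_bits: int
    input_range_uv: float  # full peak-to-peak input range (assumed value)

    def nyquist_hz(self) -> float:
        """Highest frequency representable at this sampling rate."""
        return self.sampling_rate_hz / 2.0

    def lowpass_below_nyquist(self) -> bool:
        """True if the anti-aliasing cutoff lies below Nyquist."""
        return self.lowpass_hz < self.nyquist_hz()

    def resolution_uv(self) -> float:
        """Amplitude represented by one least significant bit."""
        return self.input_range_uv / (2 ** self.n_bits)

    def methods_sentence(self) -> str:
        """Format the settings as a sentence for a Methods section."""
        return (f"Data were sampled at {self.sampling_rate_hz:g} Hz with an "
                f"analogue bandpass of {self.highpass_hz:g}-"
                f"{self.lowpass_hz:g} Hz and {self.n_bits}-bit digitisation "
                f"({self.resolution_uv():g} uV per bit).")


# Hypothetical EEG amplifier settings.
settings = AcquisitionSettings(sampling_rate_hz=1000.0, highpass_hz=0.1,
                               lowpass_hz=250.0, n_bits=16,
                               input_range_uv=6553.6)
print(settings.lowpass_below_nyquist())
print(settings.methods_sentence())
```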
For EEG recordings, the location of reference and ground electrodes used in data acquisition should be specified. Similarly, reference electrode(s) used in data analysis should also be reported (see Section 4.4). For data acquisition, physically linked earlobe/mastoid electrodes should not be used, as they are not actually a neutral reference and make further modelling intractable (see also Katznelson, 1981). Further, distortions in EEG activity can occur as a result of relative differences in impedances between two earlobe electrodes. While it has been recommended by various investigators that the left earlobe/mastoid be used as acquisition reference, it should be noted that cardiac artifacts could be exaggerated for a left earlobe/mastoid reference. An alternative would be to use the right earlobe instead.
Sensor position digitization procedures, if performed, should be described. For EEG, the type of approach used, and the manufacturer and model of the device should be specified, as well as the time in relation to the experiment that this procedure was performed. In MEG studies, when determining the position of the head with respect to the sensor array, the locations of EEG, other electrodes, or head localisation coils may be digitized at the same time. If high-resolution anatomical MRI scans of participants' heads are acquired for the purposes of source localization, details of MRI scanning protocol, as well as fiducial types, their locations relative to anatomical landmarks, and the native coordinate system, should be described. If less commonly used fiducial positions are adopted, example photographs of fiducial placement might be helpful. Methods for co-registering MEEG sensors and fiducials to individual anatomical MRI scans or templates (including software name and version) should be reported (see also Sections 2.1 and 4.6).
Skin preparation methods used during electrode application, as well as the electrode material and the conducting gels or saline solutions (if used) should be described. The procedure used to measure impedances should be reported, especially for passive electrode systems. For systems using active electrodes, recording impedances is neither required nor always possible; where it is possible, we nevertheless recommend reporting the impedance measurement procedure and values. Note that acceptable levels for electrode impedances vary relative to the ambient noise levels (e.g., whether recordings are done in a Faraday cage), the amplifier's input impedance, and the type of electrodes being used (passive or active). Therefore it is advisable to include a statement on what the acceptable electrode impedances are for the specific setup (as suggested by the manufacturers), as well as what the actual values were (on average, or an upper bound). The time(s) at which impedances were measured during the course of the experiment, e.g., start, middle, end, should also be noted. It is advisable to store the impedance measurements digitally, together with the EEG data, if at all possible.
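A minimal sketch of the impedance summary suggested above (average, upper bound, and electrodes exceeding the acceptable level for the setup) is given below. The threshold of 10 kΩ and the electrode values are purely hypothetical, since acceptable impedances depend on the amplifier, the electrode type and the recording environment; the function name `summarize_impedances` is also an illustrative choice.

```python
from statistics import mean
from typing import Dict


def summarize_impedances(impedances_kohm: Dict[str, float],
                         acceptable_kohm: float) -> Dict:
    """Summarize electrode impedances against a setup-specific limit."""
    values = list(impedances_kohm.values())
    return {
        "mean_kohm": mean(values),
        "max_kohm": max(values),
        "above_limit": sorted(name for name, value in impedances_kohm.items()
                              if value > acceptable_kohm),
    }


# Hypothetical measurement taken at the start of a session.
start_of_session = {"Fz": 4.0, "Cz": 6.0, "Pz": 12.0, "Oz": 2.0}
impedance_report = summarize_impedances(start_of_session, acceptable_kohm=10.0)
print(impedance_report)
```

Repeating the same summary for measurements taken at the middle and end of the experiment documents how impedances drifted over the session.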
Additional electrodes may be applied to the scalp/face to measure electro-oculographic (EOG) signals in either EEG or MEG studies. Additionally, EMG activity may be recorded from any part of the body. For EOG and EMG electrodes, their exact spatial locations should be specified, preferably relative to well-known anatomical landmarks (e.g., outer canthus of the eye). It should be specified if these data are collected with the same or different filter and gain settings to the MEEG data.
In MEEG recordings the position of the participant (e.g. sitting, lying supine) should be clearly documented. Head position is known to affect the strength of different EEG rhythms as it produces displacements of brain compartments and therefore has an appreciable effect on source modelling ( Rice et al., 2013 ). This is likely to be an issue for MEG recordings also, as well as being an additional source of variance in comparison to fMRI data in the same participants where in one session the participant sits upright (in EEG or MEG) and in another (fMRI) the participant lies supine.
In some clinically based studies, some participants may be studied under sedation or anaesthesia. The anaesthetic agents may affect the MEEG data significantly, hence the agent, dosage and administration method (intravenous, intramuscular, etc) should be reported.

Stimulus presentation and recording of peripheral signals
Information on the type of stimulator (including manufacturer and model) should be provided (see Section 2). If being digitally controlled, the type and version of the software should also be reported. Calibration procedures for stimulators, if applicable, should be described. Similarly, manufacturer and model of devices used for collecting peripheral signals, such as a microphone to record speech output should be reported.
As MEEG methods have a very high temporal resolution, it is also essential to measure and report any time delays between stimulus timing or recording of peripheral signals with respect to the time course of the MEEG signals. For example, a visual or auditory stimulus setup may include a systematic delay from the trigger sent by the stimulus software to the actual arrival of the stimulus at the sensory organs. While a fixed delay is common and easy to correct a posteriori during analysis, random temporal jitter can be highly problematic. Any information that may influence the interpretation of the results, such as stimulus strength or timing, visual angle, microphone placement, etc., should be reported. For studies involving hyperscanning, a description of the synchronization of multiple data acquisition systems (e.g. EEG-EEG, MEG-EEG, EEG-fMRI) should be provided.
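As an illustration of the point about fixed delays versus jitter, the following minimal sketch summarises the delay between software triggers and measured stimulus onsets, and shifts the triggers by a fixed delay. It assumes that true onsets were measured independently (e.g., with a photodiode or microphone) and are available as sample indices; the function names and example values are hypothetical.

```python
import statistics


def summarise_delay(trigger_samples, onset_samples, sfreq_hz):
    """Return (mean delay, jitter as SD), both in ms, between triggers and measured onsets."""
    delays_ms = [
        (onset - trigger) / sfreq_hz * 1000.0
        for trigger, onset in zip(trigger_samples, onset_samples)
    ]
    return statistics.mean(delays_ms), statistics.pstdev(delays_ms)


def correct_fixed_delay(trigger_samples, delay_ms, sfreq_hz):
    """Shift trigger sample indices by a fixed, measured delay."""
    shift = round(delay_ms / 1000.0 * sfreq_hz)
    return [trigger + shift for trigger in trigger_samples]


triggers = [1000, 3000, 5000, 7000]   # hypothetical trigger samples
onsets = [1017, 3018, 5017, 7016]     # hypothetical measured onsets
mean_delay_ms, jitter_ms = summarise_delay(triggers, onsets, sfreq_hz=1000.0)
corrected = correct_fixed_delay(triggers, mean_delay_ms, sfreq_hz=1000.0)
```

Reporting both numbers makes clear how much of the delay was corrected and how much residual jitter remains in the event timing.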

Vendor specific information
When providing acquisition information in a manuscript keep in mind that Readers may use a different manufacturer of EEG or MEG device, and thus one should minimize the use of vendor-specific terminology. To provide comprehensive acquisition detail we recommend reporting vendor-specific information in particular regarding hardware parameters, but with generic and agreed terminology (see e.g. the brain imaging data structure, or BIDS). If space constraints are a problem in manuscript preparation, these details could be provided as supplementary material.

Preprocessing and processing reporting

4.1. Software-related issues
Many of the available EEG and MEG systems come with analysis software packages with varying levels of detailed descriptions of how the different preprocessing tools are implemented. In addition, several freely available software packages that run on MATLAB/Python/R platforms, or commercial data analysis packages, offer alternative implementations of data analysis tools. Custom-written software can also be used. The software that has been used for the preprocessing and subsequent analysis must be indicated (including the version). In-house software should be described in explicit detail with reference to the peer-reviewed or pre-print materials. The source code should be publicly released and access links should be provided (e.g., GitHub or another readily accessible internet-based location).

Defining workflows
Preprocessing is a crucial step in MEEG signal analysis, as data are typically distorted by various factors. Both the steps included in the preprocessing pipeline and their order influence the data used for subsequent analysis. The workflow, therefore, has to be described step-by-step and with such a level of detail that it could be exactly reproduced by another researcher. For most studies, recommended steps after general visual data inspection include:
1) Identification and removal of electrodes/sensors with poor signal quality, i.e. identification of bad channels. It is essential to clearly describe the methodology and the criteria used, particularly if interpolation is used.
2) Artifact identification and removal. State the method and criteria used to identify artifacts. If a tool is used to automate this step, details on its implementation and parameters used should be provided.
3) Detrending (when and if appropriate).
4) Downsampling (if performed).
5) Digital low- and high-pass filtering with filter-type characteristics (see below).
6) Data segmentation (if performed).
7) Additional identification/elimination of physiological artifacts (blinks, cardiac activity, muscle activity etc.).
8) Baseline correction (when, and if, appropriate).
9) Re-referencing for EEG (e.g., earlobe/mastoid-reference, common-average reference, bipolar) and expression of the data in another form (e.g. surface Laplacian; when and if desired).
The steps and sequence described above are appropriate for most basic analyses of data. That said, for specific analyses, or due to specific data characteristics, the order of processing may vary for scientific reasons. For example, data segmentation could occur at different points in the pipeline, depending in part on the specific artifact removal methods used. Note, however, that filtering should be performed before data segmentation to avoid edge effects, or alternatively sufficient data padding should be used.
Data re-referencing could also theoretically be performed at various points in the pipeline, but it is important to note that re-referencing can introduce a spatial spread of artifacts. The committee recognizes that investigators require a pipeline where the order of steps is taken for specific reasons, and hence we are not prescriptive about a particular order of data analysis. That said, for each study, the order of the steps in the preprocessing pipeline should be motivated and made explicit, so that other investigators can replicate the study.
Visual inspection of the spatiotemporal structure in the signals after each step is recommended and, if needed, remaining segments of poor data quality should be marked and excluded from further analysis. When such epochs are additionally rejected, a record should be provided such that the same analysis could be reproduced from the raw data. Ideally, this record should be stored in samples relative to the onset of the data record, to avoid the potential ambiguity that can arise when reporting more or less arbitrary ordinal epoch numbers. During preprocessing, topographic maps of the distribution of the means and variances of scalp voltages (for EEG) and magnetic fields (for MEG) can serve as an additional tool for spotting channels with poor data quality that might escape detection in waveform displays (Michel et al., 2009).
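The recommendation to store rejected segments in samples relative to the onset of the recording can be sketched as follows. This is only an illustration, assuming epochs are defined by event samples plus a fixed time window; the function name, field names and example values are hypothetical.

```python
import json


def epochs_to_sample_ranges(event_samples, tmin_s, tmax_s, sfreq_hz, rejected_epochs):
    """Convert rejected epoch indices into start/stop samples of the continuous record."""
    start_offset = round(tmin_s * sfreq_hz)
    stop_offset = round(tmax_s * sfreq_hz)
    return [
        {
            "epoch": index,
            "start_sample": event_samples[index] + start_offset,
            "stop_sample": event_samples[index] + stop_offset,
        }
        for index in sorted(rejected_epochs)
    ]


events = [1200, 2400, 3600, 4800]  # hypothetical event onsets, in samples
record = epochs_to_sample_ranges(
    events, tmin_s=-0.2, tmax_s=0.8, sfreq_hz=500.0, rejected_epochs={1, 3}
)
print(json.dumps(record, indent=2))  # could be shared alongside the raw data
```

A record of this kind remains valid even if the epoching parameters change later, which ordinal epoch numbers alone do not guarantee.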

Artifacts and filtering
Artifacts from many different sources can contaminate MEEG data and must be identified and/or removed. Artifacts can be of non-physiological (bad electrode contact, power line noise, flat MEG or EEG channel etc.) or physiological (pulse, muscle activity, sweating, movement, ocular blinks etc.) origin. The data should first be visually inspected to assess what types of artifact are actually present in the data. This evaluation should not be biased by the knowledge of the experimental conditions. Subsequently, established artifact identification/removal pipelines can be run, or an alternative motivated cleaning procedure can be implemented. Artifacts can be dealt with in different ways, from simply removing artifact-contaminated segments or channels from the data, to separating signal from noise using e.g. linear projection/spatial filtering techniques.
If automatic artifact detection methods are used, they should be followed up by visual inspection of the data. Any operations performed on the data (see Section 4.1 workflow) should therefore be described, specifying the parameters of the algorithm used. It is recommended to describe in detail the type of detrending performed and the algorithm order (e.g., linear 1st order, piecewise, etc). When automatic artifact rejection/correction is performed, the method used and the range of parameters should be reported (e.g., EEG data with a range larger than 75 µV, epochs rejected based on 3 standard deviations from the mean kurtosis). Similarly for channel interpolation, it is essential to specify the interpolation method and additional parameters (e.g., trilinear, spline order). For example, when independent component analysis (ICA; Brown et al., 2001; Jung et al., 2001) is used, describe the algorithm and parameters used, including the number of ICs that were obtained. If artifacts are rejected using ICA or other signal space separation methods, it is important to report how these were identified and how back projection was performed. For instance, ICA can be performed in combination with a high-pass filter; the back-projected data without the artifact component can then be obtained with or without that filter. Such a level of detail is necessary if one wants Readers to be able to reproduce the method used. It is worthwhile to also consider including topographies of components in the Supplementary Materials section of manuscripts (when available). If interactive artifact rejection procedures are used, it is essential to describe what types of features in the MEEG signal were identified and define the criteria used to reject segments of data. This also allows the Reader to reproduce the results, as well as to be able to compare results between studies (see above on reporting visually removed trials, or epochs, for instance).
Once artifacts have been removed, the average number of remaining trials per condition should be reported.
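To make the reporting of rejection criteria and remaining trial counts concrete, here is a minimal sketch of a threshold-based rejection. It assumes epochs are stored as nested lists in microvolts and uses the 75 µV range mentioned above purely as an example threshold; the function names and data are hypothetical.

```python
from collections import Counter


def peak_to_peak(signal):
    """Range (max minus min) of one channel within one epoch."""
    return max(signal) - min(signal)


def reject_epochs(epochs, conditions, threshold_uv=75.0):
    """Keep epochs in which no channel exceeds the peak-to-peak threshold.

    epochs: list of epochs, each a list of channels (lists of samples in microvolts).
    Returns the kept epoch indices and the number of kept trials per condition.
    """
    kept = [
        index
        for index, epoch in enumerate(epochs)
        if all(peak_to_peak(channel) <= threshold_uv for channel in epoch)
    ]
    counts = Counter(conditions[index] for index in kept)
    return kept, dict(counts)


epochs = [
    [[0, 10, -5], [3, -2, 1]],    # condition A, clean
    [[0, 90, -5], [3, -2, 1]],    # condition A, range of 95 on first channel
    [[1, 2, 3], [40, -30, 0]],    # condition B, range of 70
    [[1, 2, 3], [50, -30, 0]],    # condition B, range of 80 on second channel
    [[5, 5, 5], [0, 1, 0]],       # condition B, clean
]
conditions = ["A", "A", "B", "B", "B"]
kept, trials_per_condition = reject_epochs(epochs, conditions)
```

The per-condition counts returned here are the kind of summary that can be averaged across participants and reported.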
In addition to removing artifact-contaminated segments or using ICA as a popular linear projection technique, MEG allows for the application of specialized linear projection techniques, which in some situations can be used in isolation. For example, signal-space projection methods (SSP; Uusitalo & Ilmoniemi, 1997) use "empty room" measurements to estimate the topographic properties of the sensor noise and project it out from recordings containing brain activity. Related tools with a similar purpose include signal space separation (SSS) methods and their temporally extended variants (tSSS; Taulu et al., 2004; Taulu & Simola, 2006) that rely on the geometric separation of brain activity from noise signals in MEG data. SSS methods have been recommended as being superior to SSP (Haumann et al., 2016). The ordering of preprocessing steps for cleaning MEG data is particularly important, due to potential data transformation (for some caveats see Gross et al., 2013). For both MEG and EEG data, particular care must be taken to describe temporal filtering, both for data acquisition and post-processing, as this can have dramatic consequences on estimating time-courses and phases (Rousselet, 2012; Widmann et al., 2015). Although temporal filters have no effect on scalp topography (which may nevertheless be shifted), they may affect the topography of non-stationary dynamic signals (e.g. components). Some investigators have advocated the use of an acquisition sampling rate that is 4 times above the intended cut-off frequency of the low pass filter (Luck et al., 2014 and latest IFCN guidelines). That said, the roll-off rate/slope of the filter should also be taken into consideration, because there will still be some signal that is present above the filter cut-off frequency.
Therefore, the type and parameters of any applied post-hoc filter and re-computed references (for EEG, EOG and EMG) have to be specified: filter type (high-pass, low-pass, band-pass, band-stop; FIR: e.g., windowed sinc incl. window type and parameters, Parks-McClellan, etc.; IIR: e.g., Butterworth, Chebyshev, etc.), cutoff frequency (including definition: e.g., -3 dB/half-energy, -6 dB/half-amplitude, etc.), filter order (or length), roll-off or transition bandwidth, passband ripple and stopband attenuation, filter delay (zero-phase, linear-phase, non-linear phase) and causality, and direction of computation (one-pass forward/reverse, or two-pass forward and reverse). In the case of two-pass filtering it must be specified whether reported cutoff frequencies and filter order apply to the one-pass or the final two-pass filter.
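The distinction between one-pass and two-pass cutoff definitions can be illustrated with a small sketch. It designs a Hamming-windowed sinc low-pass FIR filter in plain Python and evaluates its gain at the nominal cutoff for one and two passes. The filter length, cutoff and sampling rate are arbitrary example values, and the function and field names are hypothetical; established toolboxes should be preferred for real analyses.

```python
import cmath
import math


def windowed_sinc_lowpass(cutoff_hz, sfreq_hz, num_taps):
    """Design a Hamming-windowed sinc low-pass FIR filter.

    The cutoff is the half-amplitude (about -6 dB) point of the one-pass filter.
    """
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd so the delay is a whole number of samples")
    fc = cutoff_hz / sfreq_hz      # cutoff in cycles per sample
    mid = (num_taps - 1) // 2      # one-pass group delay in samples
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    dc_gain = sum(taps)
    return [t / dc_gain for t in taps]  # normalise to unit gain at 0 Hz


def gain_db(taps, freq_hz, sfreq_hz, passes=1):
    """Gain in dB at one frequency; passes=2 models forward and reverse filtering."""
    omega = 2.0 * math.pi * freq_hz / sfreq_hz
    response = sum(t * cmath.exp(-1j * omega * n) for n, t in enumerate(taps))
    return passes * 20.0 * math.log10(abs(response))


taps = windowed_sinc_lowpass(cutoff_hz=40.0, sfreq_hz=1000.0, num_taps=331)
report = {
    "filter_type": "low-pass FIR, Hamming-windowed sinc",
    "order": len(taps) - 1,
    "one_pass_delay_samples": (len(taps) - 1) // 2,
    "gain_at_cutoff_one_pass_db": gain_db(taps, 40.0, 1000.0),
    "gain_at_cutoff_two_pass_db": gain_db(taps, 40.0, 1000.0, passes=2),
}
```

Under these assumptions the one-pass gain at the nominal cutoff is close to -6 dB, whereas the two-pass gain at the same frequency is roughly twice that in dB, which is why a reported cutoff is ambiguous unless the number of passes is stated.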
Data preprocessing also forms an essential part of multivariate techniques, and can dramatically affect decoding performance (Guggenmos et al., 2018). We recommend carefully describing the method used, in particular whether noise normalization is performed channel-wise (univariate normalization) or for all channels together (multivariate normalization, or whitening). For the latter, the covariance estimation procedure must be specified (based on baseline, epochs, or for each time point), as its strong impact on results (Engemann & Gramfort, 2015) can hinder any attempt to reproduce the analyses.

Re-referencing
EEG is a differential measure and, in non-clinical settings, is usually recorded relative to a fixed reference (in contrast to clinical practice, which usually uses bipolar montages). While EEG is always recorded relative to some reference, it can later be re-referenced by subtracting the values of another channel or weighted sum of channels from all channels. The need for re-referencing depends on the goals of the analysis and EEG measures used (e.g., common average reference, see below) and can be beneficial for evaluation of connectivity and for source modelling. However, note that, independently of the actual re-referencing scheme, sensor level interpretation of connectivity is invariably confounded by spatial leakage of source signals (Schoffelen & Gross, 2009). Re-referencing does not change the contours of the overall scalp topography since relative amplitude differences are maintained. It can, however, cause issues when working on single channels or clusters, because amplitudes do change locally with referencing (Hari & Puce, 2017). Specifically, the shape of the recorded waveforms at specific electrodes can be altered and this will also affect the degree of distortion of waveforms by artifacts. Hence, when comparing across experiments, the references used should be taken into account, and if unusual, the reference choice should be justified. For EEG, the channel(s) or method used for re-referencing must be specified. MEG is essentially reference free, but some systems may allow for "re-referencing" of the signals recorded close to the brain, using signals recorded at a set of reference coils far away from the brain. If these types of balancing techniques are used, they should be adequately described.
Re-referencing relative to the average of all channels (common average reference, CAR) is most common for high-density recordings as the first step in current practice. The main assumption behind the CAR is that the summed potentials from electrodes spaced evenly across the entire head should be zero (Bertrand et al., 1985; Yao, 2017). Although it is generally accepted that this is a good approximation for EEG data sets of 128 channels or more (Srinivasan et al., 1998; Nunez & Srinivasan, 2006), the effect of re-referencing to a CAR has been found not to be closely related to electrode density. The sum of the potentials is mainly affected by the coverage area and the orientation of the active neural sources (Hu et al., 2018a). For low-density recordings and ROI-based analyses in sensor space, there is a serious risk of violating the assumptions for the average reference and the possibility of introducing shifts in potentials (Hari & Puce, 2017), and thus CAR should be avoided in low-density recordings (<128 channels).
An alternative to the CAR approach is the "infinite reference" one, also known as Reference Electrode Standardization Technique (REST and regularized REST) (Yao, 2001). Both the CAR and REST have been shown to be the extremes of a family of Bayesian reference estimators (Hu et al., 2018b). REST utilizes the prior that EEG signals are correlated across electrodes due to volume conduction, while CAR takes the prior that EEG signals are independent over electrodes (for reviews see Yao et al., 2019; Hu et al., 2019). If the focus of the data analysis is on source space inference (see Section 4.6), re-referencing is, in theory, not necessary but may be useful for comparisons to existing literature. Of note, any linear transform applied to the data (e.g. CAR) should also be applied to the forward matrix used for source space analysis. Such important details are generally taken care of by software tools in the field (and some require data to be in CAR form), but it is worthwhile ensuring that this is done. Finally, it should also be noted that there are so-called "reference-free" methods, the most common one being the current source density (CSD) transformation, that usually relies on the spatial Laplacian of the scalp potential, i.e. the second spatial derivative of the scalp voltage topography (Tenke & Kayser, 2005). Such techniques attempt to compensate, in EEG, for the signal smoothing due to the low electrical conductivity of the scalp and skull. When this is used, the software and parameter settings (interpolation method at the channel level and algorithm of the transform) must be specified.
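The statement that a linear transform such as the CAR should be applied to both the data and the forward matrix can be checked with a toy example. The sketch below is not tied to any toolbox: the lead field and source values are made up, and the helper names are hypothetical.

```python
def common_average_reference(matrix):
    """Subtract the across-channel mean from every column.

    matrix: list of rows, one row per channel; columns may be samples or sources.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    col_means = [sum(row[c] for row in matrix) / n_rows for c in range(n_cols)]
    return [[row[c] - col_means[c] for c in range(n_cols)] for row in matrix]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


leadfield = [[1.0, 0.5], [0.2, 1.5], [-0.4, 0.3]]   # 3 channels x 2 sources (made up)
sources = [[2.0, 0.0, -1.0], [1.0, 3.0, 0.5]]        # 2 sources x 3 samples (made up)

data = matmul(leadfield, sources)                    # simulated sensor data
data_car = common_average_reference(data)            # re-referenced data
model_car = matmul(common_average_reference(leadfield), sources)  # re-referenced model
```

Because the CAR is linear, the re-referenced data match the prediction of the re-referenced forward matrix; if only the data were re-referenced, the model and the data would no longer be expressed relative to the same reference.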

Spectral and time-frequency analysis
A common approach for the analysis of MEEG data is to examine the data in terms of its frequency content, and these analyses are applicable for both task-related as well as resting state designs. One important caveat for these types of analyses is that the highest frequencies that could occur in the data must be considered first. The selected data acquisition rate must be at least 2 times (Nyquist theorem) the highest frequency in the data, but is often higher because of the filter roll-off (see Section 4.3), underscoring the importance of planning all data analyses prior to data acquisition, ideally during the design of the study. Similarly, the lowest frequencies of interest should also be considered, as an adequate pre-stimulus baseline should be specified for evoked MEEG data, i.e. the baseline duration should be equal to at least 3 cycles of the slowest frequency to be examined (Cohen, 2014).
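These two planning constraints amount to simple arithmetic, sketched below. The function names are hypothetical, and the factor of 4 is only the more conservative example mentioned earlier for the low-pass cut-off, not a fixed rule.

```python
def minimum_sampling_rate_hz(highest_freq_hz, factor=2.0):
    """Lower bound on the acquisition rate; factor=2 is the Nyquist minimum."""
    if factor < 2.0:
        raise ValueError("factor must be at least 2 (Nyquist theorem)")
    return factor * highest_freq_hz


def minimum_baseline_s(lowest_freq_hz, n_cycles=3):
    """Baseline duration covering n_cycles of the slowest frequency of interest."""
    return n_cycles / lowest_freq_hz


nyquist_rate = minimum_sampling_rate_hz(100.0)           # 200.0 Hz
conservative_rate = minimum_sampling_rate_hz(100.0, 4.0)  # 400.0 Hz
baseline = minimum_baseline_s(2.0)                        # 1.5 s for a 2 Hz rhythm
```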
In task-related designs, MEEG activity can be classified as evoked (i.e., be phase-locked to task events/stimulus presentation) or induced (i.e., related to the event, but not exactly phase-locked to it). Hence, it is important to specify what type of activity is being studied. The domain in which the analysis proceeds (time and frequency or frequency alone) should be specified, as should the spectral decomposition method used (see below), and whether the data are expressed in sensor or source space. These methods can be the precursor to the assessment of functional connectivity (see Section 4.6).
The spectral decomposition algorithm, as well as parameters used, should be specified in sufficient detail since these crucially affect the outcome. Therefore, depending on the decomposition method used (e.g., wavelet convolution, Fourier decomposition, Hilbert transformation of bandpass-filtered signals, or parametric spectral estimation), one should describe the type of wavelet (including the tuning parameters), the exact frequency or time-frequency parameters (frequency and time resolutions), exact frequency bands, number of data points, zero padding, windowing (e.g., a Hann or Hanning window), and spectral smoothing (Cohen, 2018). It is relevant to note that the required frequency resolution is defined as the minimum frequency interval that two distinct underlying oscillatory components need to have in order to be dissociated in the analysis (Bloomfield, 2004; Boashash, 2003). This should not be mistaken for the increments at which the frequency values are reported (e.g., when smoothing or oversampling is used in the analyses). When using overlapping windows (e.g., in Welch's method) or multi-taper windows for robust estimation, the resulting spectral smoothing may cause closely spaced narrow frequency bands to blend. This should be carefully considered and reported.

4.6. Source modelling
MEEG data are recorded from outside the head. Source modelling is an attempt to explain the spatio-temporal pattern of the recorded data in sensor space as resulting from the activity of specific neural sources within the brain (in source space), a process known as solving the inverse problem. Since there is no unique solution to the inverse problem (i.e. it is mathematically ill-posed), additional assumptions are needed to constrain the solution.
Source modelling requires a forward model , which models the sensor level distribution of the EEG potential or MEG magnetic field for a (set of) known source(s), modelling the effect of the tissues in the head on the propagation of activity to MEEG sensors. Forward and inverse modelling require a volume conduction model of the head and a source model, both of which can crucially influence the accuracy and reliability of the results ( Baillet et al., 2001 ;Michel & He, 2018 ). Practically, the forward model (or lead field matrix) describes the magnetic field or potential distributions in sensor space that result from a predefined set of (unit amplitude) sources. The sources are typically defined either in a volumetric grid, or on a cortically constrained sheet. Information from the forward model is then used to estimate the solution of the inverse problem, in which the measured MEEG signals are attributed to active sources within the brain. It is important to note that source modelling procedures essentially provide approximations of the inverse solution as solved under very specific assumptions or constraints.
In addition to the MEEG data itself, forward and inverse modelling requires a specification of the spatial locations of the sensors relative to the head (Section 3.2), a specification of the candidate source locations, the source model, and geometric data that are used as a volume conduction model of the head, e.g., a spherical head model, or a more anatomically realistic model, based on an individual anatomical MRI of the entire head (i.e. including the scalp and face). Note that this may have implications for subject privacy when sharing data (see Section 7.2). The procedure used to coregister the locations of measurement sensors and fiducials with geometric data must be described (see Section 2.1 for definitions; Section 3.2 for sensor digitization methods). If using anatomical MRI data, it should be made clear if a normalized anatomical MRI volume such as the MNI152 template, or individual participant MRIs have been used for data analysis. If individual MRIs have been used, the data acquisition parameters should be described.
It is essential that all details of the head model and the source model are given. The numerical method used for the forward model (e.g., boundary element modelling (BEM), finite element modelling (FEM)) must be reported, and the values of electrical conductivity of the different tissues that were used in the calculations must be specified. This is less of a problem for MEG where magnetic fields are not greatly distorted by passing through different tissue types ( Baillet, 2017). The procedure for the segmentation of the anatomical MRI into the different tissue types should be described. For the source model, the number of dipole locations should be reported, as well as their average positions. Moreover, it should be specified how the source model was constructed, whether it describes a volumetric 3D-grid, or a cortically constrained mesh. When using cortically constrained (surface-based or volumetric) source models, these should ideally be based on an individual MRI of the participant's head, especially in clinical studies where brain lesions or malformations may be involved, or in pediatric studies where the status of the fontanelles can vary across individuals of the same young age. That said, it has been argued that in certain clinical settings, approximate head models might be adequate, although their limitations should be explicitly acknowledged ( Valdés-Hernández et al., 2009 ). The source localization method (e.g., equivalent current dipole fitting, distributed model, dipole scanning), software and its version (e.g., BESA, Brainstorm ( Tadel et al., 2011 ), Fieldtrip , EEGLAB ( Delorme & Makeig, 2004 ), LORETA, MNE ( Gramfort et al., 2013 ), Nutmeg ( Dalal et al., 2004 ), SPM ( Litvak et al., 2011 ), etc.) must be reported, with inclusion of parameters used (e.g., the regularization parameter) and appropriate reference to the technical paper describing the method in detail. 
Finally, it should be noted that the original mixing from the neural sources to the scalp/sensors signals cannot be completely undone with even perfect source reconstruction, and this is specifically an important confounder for connectivity analyses (Schoffelen & Gross, 2009; Palva et al., 2018; Pascual-Marqui et al., 2018).

Connectivity analysis
We refer here to connectivity analyses as any method that aims to detect the coupling between two or more channels or sources, and re-emphasise that the distinction between functional (correlational) and effective (causal) connectivity should be respected ( Friston 1994 ). It is also important to report and justify the use of either sensor, or source space for the calculation of derived metrics of coupling (e.g., network measures such as centrality or complexity).

Making Networks
Networks are typically derived in one of two ways: data driven (e.g. clustering of correlations, ICA) or model driven. For MEEG, temporal ICA is typically used to partition the data into separate networks of maximally independent temporal dynamics (Eichele et al., 2011), from which metrics are derived. For anatomically/model driven networks, particular attention should be given to the parcellation scheme, explaining how this was performed (see e.g. Douw et al., 2017). Recent results have also shown strong differences for connectivity computed in subject space vs. template space (Farahibozorg et al., 2018; Mahjoory et al., 2017), and choices must be explained.

Sensor vs. Source connectivity
While the committee agrees that statistical metrics of dependency can be obtained at the channel level, it should be clear that these are not per se measures of neural connectivity (Haufe et al., 2012). The latter can only be obtained by an inferential process that compensates for volume conduction and spurious connections due to unobserved common sources or cascade effects. In spite of that, dependency measures can be useful for e.g., biomarking. Connectivity from ICA falls in between these two approaches, as ICA acts as a spatial filter separating out neural sources (see e.g. Brookes et al., 2012) but does not reconstruct them per se, nor does it account for volume conduction, common sources, etc. The possible insight into brain function derived from these measures should be critically discussed. This is particularly important since the interpretation of MEEG-based connectivity metrics may be confounded by aspects of the data that do not directly reflect true neural events (Schoffelen & Gross, 2009; Valdes Sosa et al., 2011). Inference about connectivity between neural masses can only be performed with dependency measures at the source level and correct inferential procedures. For potential issues in dealing with connectivity analyses across channels versus sources, see Lai et al., 2018.
Special care must be taken when describing the metric used. Epoch length must be reported as it greatly influences connectivity values, especially when comparing sensor and source space (Fraschini et al., 2016), and if dynamic connectivity is computed, measures must be described by including temporal parameters (window size, overlap, wavelet frequency, etc.; see Tewarie et al., 2019 for an overview). When computing measures of data-driven spectral coherence or synchrony (Halliday et al., 1995), the following aspects should be considered and reported: the exact formulation (or reference), whether the measure has been debiased, and any subtraction or normalisation with respect to an experimental condition or a mathematical criterion. When using multivariate measures (either data-driven or model-based) such as partial coherence and multiple coherence, all of the variables used must be described. Importantly, it must be described with respect to which variables the data are partialised, marginalised, conditioned, or orthogonalized (e.g. Brookes et al., 2012; Colclough et al., 2015). In the case of Auto-Regressive (AR)-based multivariate modelling (e.g., in the Partial Directed Coherence group of measures; Baccala & Sameshima, 2001), the exact model parameters (number of variables, data points and window lengths, as well as the estimation methods and fitting criteria) should be reported.

Properties of the data submitted to statistical analysis
When analysis focuses on specific channels, source-level regions of interest, peaks, components (see also Section 6.1.1 on nomenclature related to this term), time and/or frequency windows, it is essential to report how these were determined, and where appropriate, why this mode of selection is unbiased. One should also report whether specific data were left out and how much of the total data this represents.
Special care must be taken to avoid circular analyses, also known as "double dipping", e.g., by selecting for analysis specific channels on the grounds that they show grand average differences and then performing statistical testing with the same data on those channels (Kriegeskorte et al., 2009; Kriegeskorte et al., 2010). In other words, the criteria for selecting a given channel or component must be independent from the statistical test of interest (e.g., based on an orthogonal contrast or the adequacy of the component to reflect the data, independently of the effect), or based on a priori assumptions derived from previous studies/independent data.

Region-of-interest analyses
There are many ways MEEG data can be analyzed. The committee does not make any recommendations regarding which features in the data are best, or which statistical method is best. Indeed, the most important aspect is that the feature selection and the statistical method best answer the particular scientific question being asked ( Kass et al., 2016 ).
Region-of-interest (ROI) analysis in time, frequency or space (peak analysis, window average, etc) is as legitimate as any other analysis approach, but it should be used with caution. Unless justified a priori or via independent data (session or run), it is better accompanied by an analysis incorporating the full data space, as post-hoc selection (e.g. using the grand average) greatly increases the false positive rate (see Luck & Gaspelin, 2017). For time/frequency ROIs, defining how peaks, components and latencies were measured (e.g., manually or automatically) and whether peak amplitude (or peak-to-peak amplitude), averages around the peak or area under the curve measures were used is paramount, both to ensure that no bias was introduced during feature selection and because this is a key element in reproducing the analysis. When peaks are the object of analysis, the following items should be specified: whether the peak latency was determined on the group average and then the amplitude was measured at or around this latency for every participant, or whether the peak latency was determined individually for each participant and by which criterion (e.g., the most negative value within a given window). If automated methods were used, report which criteria/parameters were applied or, if applicable, which peak detection method (and software) was used. Reporting this information is especially pertinent in ERP studies because of the specification of the "baseline" period relative to which sensory, cognitive or motor activity is referenced. For spatial ROIs, because of the smooth spatial distribution of MEEG data, focus on isolated regions of interest, without consideration of spatial distribution of signal strength in their wider neighbourhood, may yield incorrect estimates of activation and connectivity patterns.
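The difference between measuring amplitude at a group-level latency and at individually determined latencies can be made explicit with a short sketch. It uses the "most negative value within a given window" criterion from the text; the waveforms, time axis and function names are hypothetical.

```python
def most_negative_peak(waveform, times_ms, window_ms):
    """Return (latency_ms, amplitude) of the most negative sample within the window."""
    low, high = window_ms
    candidates = [(amp, t) for amp, t in zip(waveform, times_ms) if low <= t <= high]
    amplitude, latency = min(candidates)
    return latency, amplitude


def mean_amplitude_around(waveform, times_ms, centre_ms, half_width_ms):
    """Mean amplitude of the samples within half_width_ms of centre_ms."""
    values = [amp for amp, t in zip(waveform, times_ms) if abs(t - centre_ms) <= half_width_ms]
    return sum(values) / len(values)


times_ms = [0, 50, 100, 150, 200, 250]
participant_1 = [0.0, -1.0, -4.0, -2.0, 0.5, 0.0]   # made-up ERP values
participant_2 = [0.0, -0.5, -2.0, -5.0, -1.0, 0.0]
group_average = [(a + b) / 2 for a, b in zip(participant_1, participant_2)]

window = (50, 200)
group_latency, _ = most_negative_peak(group_average, times_ms, window)
# Strategy 1: amplitude of each participant at the group-level latency.
at_group_latency = [p[times_ms.index(group_latency)] for p in (participant_1, participant_2)]
# Strategy 2: peak determined individually for each participant.
individual_peaks = [most_negative_peak(p, times_ms, window) for p in (participant_1, participant_2)]
# Alternative measure: mean amplitude in a window around the group latency.
window_mean_p1 = mean_amplitude_around(participant_1, times_ms, group_latency, 50)
```

The two strategies give different amplitudes for the first participant in this toy example, which is why the chosen strategy, window and criterion need to be reported.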
The dimensionality of source-level descriptions may be reduced by merging neural signals for a reasonable number of cortical parcels; the parcellation scheme must be defined.
Regardless of the statistical framework employed to analyse ROI data, it is recommended that assumptions used in the model be checked (e.g., normality of residuals) and appropriate corrections be performed to make the statistical tests more conservative and maintain the false positive rate at the nominal level.
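As an illustration of the peak-measurement details discussed above, the following minimal sketch shows one way an automated, per-participant peak criterion could be made explicit. The function name, the 10 ms half-width and the sample values are hypothetical choices made for this example only; they are not recommendations.

```python
def individual_peak(times_ms, amplitudes_uv, window_ms,
                    polarity="negative", half_width_ms=10.0):
    """Measure one participant's peak inside an a priori search window.

    Returns (peak latency, peak amplitude, mean amplitude around the peak).
    All argument names and defaults are illustrative assumptions.
    """
    start, stop = window_ms
    # Keep only the samples that fall inside the search window.
    in_window = [(t, a) for t, a in zip(times_ms, amplitudes_uv)
                 if start <= t <= stop]
    if not in_window:
        raise ValueError("search window contains no samples")
    # Criterion: most negative (or most positive) value within the window.
    pick = min if polarity == "negative" else max
    peak_latency, peak_amplitude = pick(in_window, key=lambda pair: pair[1])
    # Mean amplitude in a symmetric interval around the individual peak.
    around = [a for t, a in zip(times_ms, amplitudes_uv)
              if abs(t - peak_latency) <= half_width_ms]
    mean_amplitude = sum(around) / len(around)
    return peak_latency, peak_amplitude, mean_amplitude
```

Whatever implementation is used, the search window, the polarity criterion and the averaging interval are exactly the parameters that need to appear in the methods section.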

Mass univariate statistical modelling
Mass-univariate statistics can be performed at the participant level, group level, or both, using a hierarchical or mixed model approach, and for the whole data volume (3D space for source analysis), and/or the spatio-temporal space for channel analysis over time (Kilner et al., 2005; Pernet et al., 2011). It is essential to report the details of each design, including the software (and its version), as well as its functions. For instance, all regressors included at the participant level should be described, as well as which ones were used at the group level. When stimuli or participant parameters are regressed, describe how the regressors (predictors and interactions) in the final model were selected and which model selection procedures were used, if any. If only group-level analyses are performed on averages, specify if weighting has been performed and/or if a pooling of channels was implemented. Compared to tomographic methods, MEEG can have missing data (e.g., bad channels, or transient intervals with artifacts). It is essential to report whether missing data have been dealt with within the dataset itself, e.g., replacement of bad channels by means of interpolation (see Section 4), or whether missing data have been handled in the statistical analyses.
Since many statistical tests are typically performed on MEEG datasets, results must be corrected for multiple testing/comparisons (e.g., full brain analyses or multiple feature/component maxima). The method used (e.g., Bonferroni, false discovery rate, empirical Bayes, random field theory, maximum statistics based on permutation or bootstrap (max value, max cluster, max threshold-free cluster enhancement)) must be reported together with the adopted threshold. Note that both a priori and a posteriori (i.e., derived from autocorrelation on observed data) thresholds based on successive data points (Guthrie & Buchwald, 1991) do not provide adequate techniques to control for Type 1 family-wise error and should, therefore, be avoided (Piai et al., 2015). This is in contrast to a posteriori thresholds using null distributions (bootstrap and permutations), which have been shown to control well for the family-wise Type 1 error rate (Maris & Oostenveld, 2007; Pernet et al., 2015). Special attention must also be given to the data smoothness when using random field theory (Eklund et al., 2016). Whichever technique is used, report the technique and the software (and version).
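To make the idea of a maximum-statistic null distribution concrete, the sketch below implements a simple sign-flipping permutation test for a paired design, with per-participant condition differences at several test points (e.g., channels by time frames). It is a minimal illustration under stated assumptions (one-sample t-values, random sign flips, hypothetical function names), not the procedure of any particular toolbox.

```python
import math
import random

def one_sample_t(values):
    """One-sample t-value of a list of paired differences."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    if var == 0.0:
        return 0.0 if mean == 0.0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)

def max_stat_permutation(diffs, n_perm=2000, seed=0):
    """diffs[participant][point] holds condition differences.

    Returns the observed t-values and p-values corrected for
    multiple comparisons with the maximum |t| across all points.
    """
    rng = random.Random(seed)
    observed = [one_sample_t(col) for col in zip(*diffs)]
    max_null = []
    for _ in range(n_perm):
        # Under the null hypothesis the sign of each participant's
        # difference is exchangeable: flip whole participants at random.
        flipped = []
        for row in diffs:
            sign = rng.choice((-1.0, 1.0))
            flipped.append([sign * x for x in row])
        max_null.append(max(abs(one_sample_t(col)) for col in zip(*flipped)))
    # The observed data count as one permutation (hence the +1 terms).
    p_corrected = [(1 + sum(m >= abs(t) for m in max_null)) / (n_perm + 1)
                   for t in observed]
    return observed, p_corrected
```

Because every point is compared against the same distribution of maxima, the family-wise error rate is controlled across the full set of tests; the number of permutations and the random seed are among the parameters worth reporting.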

Multivariate statistical inference
Multivariate statistical tests (e.g. MANOVA, Linear Discriminant Analysis) can be performed on MEEG data and often proceed using one data dimension, leading to many statistical tests. For example, a linear discriminant analysis (LDA) can be performed over sensor space repeatedly over time and/or frequencies. Conversely, multiple predetermined time/frequency points for each channel (or source location) can be used, and the classification can be performed per channel. In any case, this results in a multiple comparisons problem that needs to be properly addressed, typically incorporated into a resampling scheme (bootstrap or permutation) ( Pantazis et al., 2005 ).

Multivariate pattern classification
When a decoding approach is used, one must describe: (i) the classifier used (e.g., LDA, Support Vector Machine (SVM), Naive Bayes, Elastic Net, etc.) and its implementation/software; (ii) the distance metric (e.g., Euclidean distance, Pearson correlation, Spearman correlation); (iii) whether there was any parameter selection for the classifier (e.g., by optimizing parameters within a grid of possible values, in a subset of trials/participants, keeping the default options of some software); (iv) how chance performance was computed (e.g., empirically, with random permutations, etc.); (v) the validation scheme (e.g., leave one/two out, N-fold cross-validation) in which the test set is independent of the training set, minimising bias and unrealistically high classification rates, commonly referred to as "overfitting". To avoid overfitting while setting model parameters, a nested cross-validation should be employed. This consists of optimizing parameters on a "validation set" that is different from the left-out "test set" used to report prediction performance. It is also important to motivate the data-split choice, as leave-one-out approaches are likely to give biased estimates (Varoquaux et al., 2017). Finally, if surrogate data creation is part of the analysis, then the technique and the details of the parameters used to generate surrogate data to evaluate the chance performance of the decoder should be recorded.
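The structure of a nested cross-validation can be sketched as follows. This minimal example only generates the index sets; the fold counts, the interleaved fold assignment and the function names are assumptions made for illustration, and shuffling or stratification by class (often desirable in practice) is deliberately omitted.

```python
def k_folds(indices, k):
    """Split a list of indices into k non-overlapping, interleaved folds."""
    return [indices[i::k] for i in range(k)]

def nested_cv_splits(n_trials, outer_k=5, inner_k=4):
    """Yield (train, validation, test) index lists.

    The outer loop sets aside a test fold used only for reporting
    performance; the inner loop splits the remaining trials into
    training and validation folds used for parameter selection.
    """
    all_idx = list(range(n_trials))
    for test in k_folds(all_idx, outer_k):
        test_set = set(test)
        remaining = [i for i in all_idx if i not in test_set]
        for validation in k_folds(remaining, inner_k):
            val_set = set(validation)
            train = [i for i in remaining if i not in val_set]
            yield train, validation, test
```

The key property is that no test trial ever appears in a training or validation set, so parameter tuning cannot leak information into the reported performance.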

Source modelling
Source modelling and reconstruction can be regarded as a step in the processing pipeline (see Section 4.4.) that is used to obtain a dependent variable (e.g., amount of power in a particular frequency band at location X), which can subsequently be subjected to a univariate statistical test. However, before analyzing the source activity, it is essential to provide readers with information on the quality of the reconstruction.
For EEG, source reconstruction based on low-density electrode coverage should be justified given that the number of electrodes impacts the accuracy of localization (Michel et al., 2004; Michel & Brunet, 2019) and connectivity (Hassan et al., 2014). While it has been suggested that > 64 EEG electrodes are needed to avoid mislocalisations in source modelling, more recently it is believed that between 128 and 256 EEG electrodes are needed to effectively model oscillatory EEG activity (Michel & Brunet, 2019). For both MEG and EEG, since there are multiple methods available to estimate sources, the expected accuracy, errors and robustness of the method should ideally be described. For instance, one could report the point-spread-function and localization error for sources (i.e., spatial confidence bounds of dipoles, Fuchs et al., 2004). In addition, where estimates are performed on multiple participants, error measures (variance) captured by the model should be reported. In general, it is critical to report all parameters used in the modelling procedure, so that the analysis can be reproduced by other investigators.

Biophysical modelling and connectivity analyses
Functional and effective connectivity metrics need to be clearly stated and justified. The type of statistical dependence measure in either sensor or source space used should be specified (e.g., correlation, phase coupling, amplitude coupling, spectral coherence, entropy, DCM, Granger causality), as well as the assumptions underlying the analysis (e.g., linear versus unspecified; directional versus non-directional). The calculation of specific graph theoretical measures on the basis of dependency measures should be motivated and correctly associated to the data (e.g., the interpretation of shorter path length is often used, but in the context of functional adjacency matrices, its meaning has been questioned, see Sporns, 2014 ). It should be clearly stated whether a generative model is used (and what data types form inputs for it), or whether the measure makes specific assumptions about the data distribution (e.g., one versus two different populations of participants). It is necessary to state the nodes used for the connectivity matrix (e.g., channels, sources), the function used for the time-frequency decomposition (e.g., Morlet, Hilbert, Fourier, etc.) and the type of statistics used.
For biophysical methods such as Dynamic Causal Modelling (Kiebel et al., 2008), details should be given of the neural model employed (e.g., event-related potential, canonical microcircuit), the full space of functional architectures considered and connectivity matrices present/modulated (forward, backward, lateral, if intrinsic), the vector of between-trial effects, the number of modes, the temporal window modelled, and the priors on source locations. Finally, information should be provided on the statistical approach used for inference at the level of models or the family of models (Fixed- or Random-effects, FFX or RFX) as well as at the level of connectivity parameters (Frequentist versus Bayesian, Bayesian Model Averaging (BMA) over all models or conditioned on the winning family/model, etc.; see Kiebel et al., 2010).

Results reporting
Recorded MEEG data contains rich spatial, temporal and oscillatory information. Analysis of these spatiotemporal data matrices typically leads to results that may be described across different dimensions. Signals vary in frequency, time and space. Moreover, relationships between signals (connectivity) and different signal components (e.g., cross-frequency interactions) may further increase the dimensionality of the results. Depending on the study's scientific question, results are reported in one or several of these dimensions. Different conventions and requirements exist for the reporting of results in time, space, or frequency, and for the reporting of connectivity results. Thus, in this section we consider these dimensions separately.
6.1. Time-domain analysis

6.1.1. Naming conventions
In the current MEEG literature there is quite a bit of variability in component nomenclature. The word "component" traditionally referred to a functional brain process that has a characteristic spatial distribution (Donchin et al., 1978). Because of the loaded meaning of the term "component", the use of the term "deflection" is a potentially useful alternative.
Traditionally, event-related response components have been named using an established nomenclature, where the polarity of the (EEG) response and its nominal latency form the elements of the name (e.g., N100, N170, P300, N400, etc.). This convention appears in guidelines first published by the International Federation for Clinical Neurophysiology (IFCN) in 1983, and those updated in 1999 ( http://www.clinph-journal.com/content/guidelinesIFCN ). This convention was also advocated for reporting of data in clinical populations ( Duncan et al., 2009 ), based on original guidelines (Donchin et al., 1977). For MEG data, two conventions are used to refer to these analogous components. One can add an "m" to the name (e.g., N100m, N170m), or simply refer to them as M100, M170 etc. It should be noted that in MEEG there are also other names for certain event-related responses such as the mismatch negativity (MMN), contingent negative variation (CNV), error-related negativity (ERN), that refer to a specific neurophysiological response that is elicited under a particular type of paradigm or which refers to a presumed mental state (e.g., error detection). Some early investigators have referred to event-related components by successive deflections in the EEG waveform (e.g., P1, N1, P2, N2 etc.), however, this system of nomenclature is not generally recommended. Following the IFCN guidelines would, for example, ensure parity across the clinical and healthy participant literature. This applies particularly to neural responses that occur early in time e.g., the somatosensory N20, the auditory N100, the visual N170 etc. That said, there is an established literature on some later ERP components such as P3a and P3b (also known as P300 or the late positive component (LPC)), as well as the MMN, CNV etc. In these cases, referring to their well-established names could be more appropriate (or adapted e.g., P300a, P300b), ideally citing the original article describing the component.
To achieve transparency in results reporting, it is important to explicitly mention the latency window that was used to quantify the amplitude components, especially when the results are subjected to subsequent statistical evaluation. Additionally, for EEG results being reported in sensor space, the recording site(s) should be noted (e.g., vertex N100) to alleviate confusion in the literature, as the polarity of the response can vary as a function of the reference electrode position on the scalp and the underlying cortical folding.

A. Regions of interest
Results of event-related paradigms are often reported as averaged event-related potentials/fields either in sensor space, or as time courses of activation at the source level. For group or experimental condition differences, the difference test statistic (e.g., F-values or t-values with degrees of freedom and p-values, or Bayes Factor) should be reported along with model assumptions; for instance, in linear models this would be the Gaussianity of residuals. Any statistics should be complemented by a description of the effect size (e.g., Cohen's d, percentage difference and/or raw magnitude) and its variability (e.g., confidence intervals). Each effect should also be reported, significant or not, allowing readers to evaluate the dataset. This not only facilitates comparison with other similar studies, but also enables an informed power analysis when planning future studies and allows building a quantitative, more reproducible view of brain dynamics.
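As an example of reporting an effect size together with its variability, the sketch below computes Cohen's d for two independent groups and a percentile bootstrap confidence interval. The pooled-standard-deviation form of d, the percentile method and the function names are illustrative assumptions; other estimators (e.g., for paired designs) may be more appropriate for a given study.

```python
import random
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * statistics.variance(group_a)
                  + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_var ** 0.5

def bootstrap_ci(group_a, group_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for Cohen's d."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample participants with replacement within each group.
        a = rng.choices(group_a, k=len(group_a))
        b = rng.choices(group_b, k=len(group_b))
        try:
            estimates.append(cohens_d(a, b))
        except ZeroDivisionError:
            continue  # degenerate resample with zero pooled variance
    if not estimates:
        raise ValueError("no valid bootstrap resamples")
    estimates.sort()
    n = len(estimates)
    lower = estimates[int(n * alpha / 2)]
    upper = estimates[max(int(n * (1 - alpha / 2)) - 1, 0)]
    return lower, upper
```

Reporting the interval alongside the point estimate, and stating how it was obtained, lets readers judge the precision of the effect regardless of whether it reached significance.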

B. Mass univariate statistical modelling
Unless event-related data analyses in sensor or source space are based on a priori hypotheses drawn from independent data, it is recommended to perform statistical analyses in the full data space (all channels, all independent components, all sources) and across the entire epoch. Reporting must indicate what method (and software) was used to account for multiple testing/comparisons of temporally and spatially uncorrelated values (see Section 5) and indicate the statistical threshold that defines significance. In general, it is good practice to report the explained model variance and data fit (both R-squared and RMSE http://data.library.virginia.edu/is-r-squared-useless/ ), while parameters deriving from the model(s) (e.g., weight estimates, maximum statistical values) can be reported in tables. Each effect should be reported (significant or not), along with details such as the onset, duration, and amplitude of the responses, thereby allowing readers to evaluate the dataset.
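For reference, the two data-fit measures mentioned above can be computed as in the following minimal sketch (the function name is hypothetical). It assumes that the observed and model-predicted values are provided as two equally long sequences.

```python
import math

def r_squared_and_rmse(observed, predicted):
    """Return (R-squared, RMSE) for observed versus predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual and total sums of squares.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r_squared = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    return r_squared, rmse
```

R-squared is unitless and relative to the variance of the data, whereas RMSE is expressed in the units of the data (e.g., microvolts or femtotesla), which is why reporting both is informative.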

C. Multivariate modelling and predictive models
For classical multivariate analysis, as for univariate analysis, reporting the quality of the data fit is recommended, along with a description of all the effects (significant or not).
For predictive models, decoding accuracy (classification), R-squared or RMSE (regression) are the measures of choice. Because these computations in MEEG are typically performed in space, it is essential to specify how accuracy is computed (e.g., over a specified time and/or frequency window, for entire epochs, for individual participants or groups). It is also important to report how input data were preprocessed, e.g., if some scaling or standardization of features was performed. Accuracy scores with respect to chance level will depend in part on the type of experimental design used (e.g., balanced vs unbalanced). Chance level should, if possible, be illustrated in figures and mentioned in the text (Jas et al., 2018). In addition to overall decoding accuracy, the time course of classification accuracy is also useful to report. The display of confusion matrices can also be helpful to reveal the structure of the errors made by the model. If permutation tests are used, associated p-values should be reported. It is commonly recommended to use a large number of random splits in the cross-validation to better evaluate the variance in scores due to the choice of data partitions between train and test (Varoquaux et al., 2017). The area under a ROC curve can also be used when doing binary classification.
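The sketch below illustrates how a confusion matrix and an empirical chance level might be obtained from a set of true and predicted labels. As a simplifying assumption, the permutation step shuffles the true labels against fixed predictions; a full permutation test would typically retrain the classifier on permuted labels. All function names are hypothetical.

```python
import random
from collections import Counter

def confusion_matrix(true_labels, predicted_labels):
    """Return (classes, matrix) with rows = true and columns = predicted."""
    classes = sorted(set(true_labels) | set(predicted_labels))
    counts = Counter(zip(true_labels, predicted_labels))
    return classes, [[counts[(t, p)] for p in classes] for t in classes]

def accuracy(true_labels, predicted_labels):
    """Proportion of correctly predicted labels."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

def permutation_chance(true_labels, predicted_labels, n_perm=1000, seed=0):
    """Empirical chance accuracy and permutation p-value (simplified)."""
    rng = random.Random(seed)
    observed = accuracy(true_labels, predicted_labels)
    shuffled = list(true_labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the label/prediction association
        null.append(accuracy(shuffled, predicted_labels))
    chance = sum(null) / n_perm
    p_value = (1 + sum(a >= observed for a in null)) / (n_perm + 1)
    return chance, p_value
```

With unbalanced classes the empirical chance level can lie well above 1 divided by the number of classes, which is one reason why it should be reported explicitly rather than assumed.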

Figures
When displaying MEEG waveforms, it is important to make explicit whether these waveforms represent sensor space (e.g., single channel data, averaged waveforms across a set of channels) or source space, and whether they represent single participant or group data. For evoked activity, a marker indicating stimulus onset/offset on the time-frequency plot is recommended. Similarly, if motor activity is studied, a marker depicting (the onset of) the motor response is suggested. When waveforms are presented in different figures and are compared, they should be presented with identical x- and y-scales and units should be noted. When difference waves (subtraction of one condition from another) are presented, the waveforms of the two conditions compared should also be depicted, to allow the reader to evaluate the nature of the difference (Jas et al., 2018). If any form of averaging is performed, the variability should also be depicted and the figure caption should explicitly state the nature of the variability measure being depicted (e.g., confidence interval or standard deviation of the mean around the waveform). Moreover, since MEEG deflections are not only defined by their latency, but also by their topography across the head, it is recommended that the waveforms of the full set of channels be shown. If this is impractical due to the number of channels, a relevant sample of channels should be depicted. In addition, for time windows of interest, not only should the peak latency be reported, but the associated topographic maps or source locations in the brain should also be shown. This is particularly important when reporting amplitude differences between conditions or groups at certain channels, as the differences could be due to changes in the topography of the potential or magnetic field.
When topographical distributions of activation (maps) are shown, it is important to make explicit what is being displayed (e.g., magnetic field strength, power, mean voltages, voltage difference, Laplacians, etc.), as well as the time point (or window) that was used to create the topographical display. The number of participants used to generate the topographical plot should be specified, and displays of selected channels of interest on the maps themselves can also be helpful. If multiple topographies are presented (e.g., of different conditions at a given window), they should be presented with the same scale to allow comparison. However, it may also be appropriate to scale each topography to its own range to highlight the pattern more clearly over a smaller range of amplitude. In such cases, the caption of the figure should clearly highlight the different amplitude scales. Colour scales should be chosen to reflect the nature of the data (linear vs diverging color scales) and be linear in luminance to aid with inference ( Pernet & Madan, 2019 ). Colour legends (bars) are also essential.
For mass-univariate and multivariate analyses, statistical maps of the space tested should be displayed, along with the corresponding waveforms and topographic maps. While statistical significance matters, showing only thresholded maps hinders reproducibility. We thus recommend displaying thresholded maps in manuscripts (along with an adequate description of the thresholding method), while providing raw maps for all channels and time/frequency frames in supplementary materials (ideally as a data matrix in a repository and not just a figure). To allow the reader to evaluate the observed effect, both the time course of the model parameters and of the underlying data should be presented. Make sure to label axes appropriately (e.g., average µV, fT, average parameter estimates, T-, F- or p-values) with units of the quantities clearly specified. As for statistical maps, consideration should be given to which figures should appear in the main manuscript versus those that should appear in a Supplementary Materials section.
6.2. Frequency-domain analysis

6.2.1. Naming conventions
When reporting spectral analysis results in specific frequency bands, it is recommended to explicitly report the boundaries of the different bands. This is important because the designation of canonical frequency bands (e.g., delta, theta, alpha, beta, gamma) has been subject to considerable variability (e.g., compare Kane, 2017, to Jobert, 2012). We recommend the IFCN guidelines (Kane, 2017) for the delineation of frequency bands, as these remain close to the original frequency bands proposed by Berger in the late 1920s, Walter, and Jasper and Andrews in the 1930s (Hari & Puce, 2017), and are consistent with those recommended in the main clinical textbook in the field (Krishnan et al., 2018). That said, because of an inconsistency across the literature, we have made a slight adjustment to the transition between the alpha and beta ranges to aid the description of results from time-frequency analyses.
These are: infra-slow : < 0. It should be noted that gamma band signals can be recorded at frequencies higher than 80 Hz for MEEG (Amzica & Lopes da Silva, 2018), but that the majority of MEEG studies tend to use the lower values of the gamma range, as originally postulated in the canonical frequency ranges above. Indeed, for MEG the gamma band can extend out to 1 kHz (Baillet, 2017). Similarly, for field potentials recorded from intracranial electrodes, the gamma band can also be extended to 600 Hz, and gamma sub-bands also have their own nomenclature (e.g., see Uhlhaas et al., 2011). Statistical analysis of gamma activity may identify ranges of activity within this very broad frequency band. Therefore, in some cases, reporting the specific values of the frequencies of interest may be more useful than referring to the canonical frequency band.
It is important to be able to distinguish gamma activity from physiological activity elicited by willful saccades, an artifact that is particularly problematic in EEG studies (Yuval-Greenberg et al., 2008). Methods used to separate saccadic potentials (different from eye movement artifacts discussed in pre-processing) from gamma activity need to be clearly described, particularly for studies where gamma activity is the focus of the investigation. Similarly, muscle activity can also contaminate MEEG data. Although the gamma range is the most susceptible, the power spectrum of muscle activity varies according to the muscle involved, and can also invade MEEG beta and alpha ranges (see Goncharova et al., 2003).
The mu rhythm is a complex sensorimotor arc-shaped rhythm that occurs in the healthy brain with two main frequency components, one at ~10 Hz and the other at ~20 Hz (Hari, 2006). Source localization indicates that these two frequency components have separate sources: the 10 Hz mu component tends to be posterior (relative to the central sulcus), whereas the 20 Hz mu component is anterior, suggesting different functional roles in sensory versus motor cortex, respectively (Salmelin & Hari, 1994). There has been a tendency in the mu rhythm literature to examine only one of these components, and also to label this activity as "alpha" or "beta", creating confusion for those running literature searches. We recommend specifying the actual frequencies and referring to the different "mu rhythm components" as a means to reduce confusion in the literature.
It should be noted that cortical rhythms such as the posterior alpha rhythm change throughout the lifespan in healthy brains. In young infants (3-4 months of age) a reactive posterior rhythm first appears at ~4 Hz, increasing to ~6 Hz at 12 months of age and to ~8 Hz at 36 months, and reaching adult frequencies of ~10 Hz by 6 to 12 years (Pearl et al., 2018). With normal ageing in a healthy elderly individual, the posterior alpha rhythm will slow (Krishnan et al., 2018). There has been some confusion in the literature regarding this posterior reactive rhythm that we know as alpha in the normal adult brain. When studying infants and children, it would be preferable to specify the frequency and distribution of the activity and comment on its reactivity. It is best to avoid using terms such as "baby alpha", as this creates ambiguity in the literature. (One reason is that central/rolandic ("mu") rhythms can actually appear in infants before the manifestation of the posterior reactive rhythm that will become fully-fledged "alpha" activity (Krishnan et al., 2018).)

Frequency and time-frequency decomposition
Point estimates of time-frequency resolved activity (both for measures of power and phase) always reflect integration across a frequency range and time window. Since different methods for time-frequency analysis exist, it is important to report the method and associated parameters (i.e., type of filtering) that have been used.
Reporting the referencing scheme is crucial for EEG, since the results of time-frequency analysis depend on this, as in the case of the analysis of event-related activity.
A point needs to be made regarding the use of the term "oscillation", which is specifically used to describe a spectral peak within a frequency band of interest, and not a general increase in MEEG power within a canonical frequency band (Lopes da Silva, 2013). The oscillation can then be exactly defined by its peak frequency, bandwidth and power. Note that a simple (isolated) spectral peak may be a damped linear resonance giving more energy to that frequency band, but this is not in itself evidence for a self-sustained oscillation, which is usually characterised by a strong fundamental frequency, but also by (sub-)harmonics at integer multiples of the fundamental frequency.
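The three descriptors named above (peak frequency, bandwidth and power) can be extracted from a power spectrum as in the following minimal sketch. It assumes a spectrum that has already been estimated, defines bandwidth as the full width at half of the peak power on the sampled frequency grid (without interpolation or removal of any aperiodic background), and treats a band maximum located at the band edge as the absence of a peak. These are illustrative assumptions, and the function name is hypothetical.

```python
def describe_spectral_peak(freqs_hz, power, band_hz):
    """Return peak frequency, peak power and bandwidth within a band.

    Returns None when the band maximum lies at a band edge, i.e. when
    power merely rises or falls across the band without a local peak.
    """
    low, high = band_hz
    idx = [i for i, f in enumerate(freqs_hz) if low <= f <= high]
    if not idx:
        raise ValueError("no frequency samples inside the band")
    peak_i = max(idx, key=lambda i: power[i])
    if peak_i in (idx[0], idx[-1]):
        return None
    # Walk outwards from the peak until power drops below half maximum.
    half = power[peak_i] / 2.0
    left = peak_i
    while left > 0 and power[left - 1] >= half:
        left -= 1
    right = peak_i
    while right < len(power) - 1 and power[right + 1] >= half:
        right += 1
    return {
        "peak_frequency_hz": freqs_hz[peak_i],
        "peak_power": power[peak_i],
        "bandwidth_hz": freqs_hz[right] - freqs_hz[left],
    }
```

A check of this kind helps to separate a genuine spectral peak from a broadband change in power, although it cannot by itself establish that the underlying activity is a self-sustained oscillation.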
MEEG activity at lower frequencies (e.g. theta, alpha) can modulate the amplitude (power), frequency or phase of activity at higher frequencies (e.g. gamma), a phenomenon known as cross-frequency coupling (CFC). When describing CFC analyses, the type of coupling (Jensen & Colgin, 2007) and overall analysis method should be explicitly noted (including keywords regarding the method used may also be helpful for future meta-analyses). Given that even one type of CFC can be extracted using multiple methods (e.g. phase-amplitude coupling; Tort et al., 2010; van Wijk et al., 2015; Dupré la Tour et al., 2017), the analysis methods and all associated parameters, such as filtering parameters, must be specified in detail, and the software used should be identified. Finally, quality checks (e.g. Lozano-Soldevilla et al., 2016) and dedicated statistical methods should be used to avoid estimation biases (van Driel et al., 2015).
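As one concrete instance of a phase-amplitude coupling measure, the sketch below computes a modulation index in the style of Tort et al. (2010): the amplitude of the faster activity is averaged within phase bins of the slower activity, and the departure of the resulting distribution from uniformity is quantified with a normalised Kullback-Leibler divergence. The sketch assumes that the phase and amplitude time series have already been extracted (e.g., after band-pass filtering), that amplitudes are non-negative, and that 18 phase bins are used; the function name is hypothetical.

```python
import math

def modulation_index(phases_rad, amplitudes, n_bins=18):
    """Normalised KL divergence of phase-binned amplitude from uniform."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    width = 2.0 * math.pi / n_bins
    for phase, amp in zip(phases_rad, amplitudes):
        # Map the phase from [-pi, pi) onto a bin index in [0, n_bins).
        b = int(((phase + math.pi) % (2.0 * math.pi)) / width) % n_bins
        sums[b] += amp
        counts[b] += 1
    if 0 in counts:
        raise ValueError("at least one phase bin is empty")
    means = [s / c for s, c in zip(sums, counts)]
    total = sum(means)
    distribution = [m / total for m in means]
    entropy = -sum(p * math.log(p) for p in distribution if p > 0.0)
    return (math.log(n_bins) - entropy) / math.log(n_bins)
```

The number of phase bins, together with the filter settings used to obtain the phase and amplitude series, directly influences the resulting value and therefore belongs among the reported parameters.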

Statistical results
Results reporting should follow the same principles as already described for the time domain (see Section 6.1.2).

Figures
As for all other figures, when displaying frequency spectra, all axes should be clearly labelled and units should be shown. For data with a large range in power/amplitude, a logarithmic scale might be considered as it may be better suited to displaying important features in the data. In time-frequency plots, care should be taken to not only label and display all axes clearly but to also include a calibration bar to show the range of power/amplitude values (or statistical values). For evoked sensory activity, a marker indicating stimulus onset/offset on the time-frequency plot is recommended. Similarly, if motor activity is studied, a marker depicting (the onset of) the motor response is suggested.
Other considerations regarding mass-univariate and multivariate modelling follow the same recommendations that were made for time-domain analyses (see Section 6.1.2).

Spatial and source analyses
Spatial analysis can be restricted to the topographical distribution of potential differences or magnetic fields, or can include source localization and reports of estimated activity in source space. As already described in Section 6.1.3, it is recommended that topographic maps are shown together with the time- or frequency-analysis results so that the reader can appreciate both the spatial distribution of the effects reported and their temporal evolution. When maps are displayed, a layout of the electrodes/sensors on the head surface should be shown. Since the topography of the maps in the time domain is independent of the reference in EEG recording, it is recommended that maps displayed against the average reference are centred at zero, so as to optimally and meaningfully exploit the colour scale. If sequential maps are presented, in most cases they should be presented with the same colour scale and if not, this point should be made explicit in the caption and justified. Two-dimensional projections of the sensor layout (seen from the top) are commonly used for display, mainly because activity from the entire set of electrodes can be easily seen at one time. If it is necessary to display 3D maps, different views should be shown, or a cross-sectional cut given, so that no activity is hidden. Ideally, EOG electrodes should also be included in the topographic maps. It is recommended that extrapolation of the maps beyond the part of the scalp covered by the electrodes, which some algorithms perform by default, be kept to a minimum. The inclusion of contour lines in topographic maps can also be helpful for showing differences between experimental conditions.
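The recommendation to centre the colour scale at zero and to share it across sequential maps amounts to a very small computation, sketched below. The function name is hypothetical, and each map is assumed to be a flat sequence of sensor values.

```python
def shared_symmetric_limits(maps):
    """Return (lower, upper) colour limits centred at zero.

    The limits are taken from the largest absolute value across all
    maps, so that every map in a sequence uses the same scale.
    """
    extreme = max(abs(value) for sensor_map in maps for value in sensor_map)
    return -extreme, extreme
```

The resulting pair can be supplied as the lower and upper colour limits of whichever plotting routine is used, so that zero maps onto the midpoint of a diverging colour scale.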
Results of source reconstructions with distributed models (e.g., minimum-norm estimates, dSPM, LORETA, sLORETA) or beamformers (LCMV, DICS, SAM) are often displayed as thresholded (contrast) maps. In this case, the thresholding has to be reported, and it has to be clearly stated whether it is based on statistical analysis (in which case, a p-value or a q-value (false discovery rate) should also be provided, as well as the method used for correction of multiple comparisons) or what exact criteria for the cutoff were used. Ideally, the non-thresholded maps could also be displayed (in the form of selected orthogonal 2D slices, perhaps as supplementary materials), in order to give an impression of the amount of spatial blur of the reconstruction. If results are based on ROIs or virtual channels, specify whether or not the ROI was identified prior to any data analysis and how it was defined. If brain areas are masked in the display or excluded in the source reconstruction or the ROI definition, this has to be clearly stated, and preferably be immediately apparent on the image. If individual head models are used for source reconstruction and then co-registered to a template head model (e.g., MNI 152) for group averages and statistical comparisons, the co-registration procedure (and parameters) has to be specified.
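Where a map is thresholded using the false discovery rate, the threshold can be derived as in the following minimal sketch of the Benjamini-Hochberg step-up procedure. The function name is hypothetical, and the p-values are assumed to come from one test per source location (or per any other set of tests forming the family).

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return (significant mask, p-value threshold) at FDR level q.

    The threshold is the largest sorted p-value p(k) satisfying
    p(k) <= (k / m) * q; it is None when no test survives.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    threshold = None
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            threshold = p_values[i]
    if threshold is None:
        return [False] * m, None
    return [p <= threshold for p in p_values], threshold
```

Reporting both the chosen q level and the resulting p-value threshold makes it straightforward for readers to relate a thresholded map to the underlying statistics.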
If 3D coordinates for source analysis are reported in tables, each table should be clearly labeled as to which contrast/effect it refers to (nature and direction of the contrast, individual versus group result, group size), and should have columns for: Anatomical region, X-Y-Z coordinate, t/Z/F statistic, and the p-value or Bayes factor on which the inference is based. The table caption should clearly state (even if repeated in the body of the text) the significance criterion used to obtain these coordinates, and whether they represent a subset of all such significant results (e.g., all findings from whole-brain significance, or just those in a selected anatomical region).

Connectivity analysis
The Committee acknowledges that the term "connectivity" has been somewhat problematic since it is often loosely used and is an umbrella term to refer to multiple methods, creating some confusion in the literature (see O'Neill et al., 2018; He et al., 2019). We thus recommend (i) always referring explicitly to effective (i.e. causal) or functional (i.e. correlational) connectivity and (ii) adopting a more informative approach in which one specifies the exact method used, e.g. effective Granger connectivity, functional partial coherence connectivity, functional power envelope correlation connectivity, etc. Depending on the exact analytic details, connectivity analysis may lead to results that contain very large amounts of data, for instance with all pairwise connectivity estimates in source space, resolved in time and frequency. If data reduction approaches are applied (e.g., the use of descriptive metrics from graph theory) to describe and display general patterns in the data, these approaches should be fully documented. Alternatively, a priori selection of connections-of-interest could limit the number of data points in the solution space. Either way, it needs to be clearly stated and justified how the results that are subjected to subsequent statistical evaluation have been derived.
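To illustrate what documenting a data reduction step can involve, the sketch below builds a functional (correlational) connectivity matrix from node time series and reduces it to a single descriptive graph metric, the node degree. The use of Pearson correlation, the absolute-value threshold of 0.9 in the test data and the function names are assumptions made purely for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation undefined for a constant signal")
    return cov / math.sqrt(var_x * var_y)

def connectivity_matrix(signals):
    """Pairwise correlation matrix; signals[node] is a time series."""
    n = len(signals)
    return [[1.0 if i == j else pearson(signals[i], signals[j])
             for j in range(n)] for i in range(n)]

def node_degree(matrix, threshold):
    """Number of connections per node with |r| at or above threshold."""
    return [sum(1 for j, r in enumerate(row) if j != i and abs(r) >= threshold)
            for i, row in enumerate(matrix)]
```

Each choice made here (the dependence measure, the threshold, the graph metric) changes the numbers that enter the statistics, which is why each needs to be stated and justified.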
For effective connectivity methods (generative/model based), such as Dynamic Causal Modelling (Kiebel et al., 2008; Daunizeau et al., 2011), details should be given on the neural model employed (e.g., Event Related Potential, canonical microcircuit), the full space of functional architectures considered and connectivity matrices present/modulated (forward, backward, lateral, if intrinsic), the vector of between-trial effects, the number of modes, the temporal window modelled, and the priors on source locations. A display of the distribution of the probability (expected and/or exceedance, if using RFX), or log likelihood ratios (when using FFX) over all models considered should be provided. Information about the connectivity parameters estimated and their variability should be noted (e.g., confidence intervals, p-values if comparing groups, etc.). Also, information regarding how parameters were computed at the group level (e.g., Bayesian Model Averaging (BMA) over all models versus conditioning on the winning family/model) should be provided. When several models are compared, it is essential to ensure that the selected model not only performs better than the other alternatives but also adequately explains (fits) the experimental data.
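As a minimal sketch of the fixed-effects (FFX) quantities mentioned above, the following Python example sums log model evidences over subjects and converts them to log Bayes factors and posterior model probabilities. The log evidence values and model names are hypothetical, and the sketch assumes equal prior probabilities over models and that all subjects share the same model; it is not the implementation of any DCM software.

```python
import math

# Hypothetical log model evidences (e.g., free-energy approximations):
# one row per subject, one column per model.
model_names = ["forward", "forward_backward", "forward_backward_lateral"]
log_evidence = [
    [-1210.4, -1205.1, -1208.9],
    [-1302.7, -1299.0, -1301.2],
    [-1150.3, -1149.8, -1147.5],
]


def group_log_evidence(log_evidence):
    """FFX group log evidence: sum of subject log evidences per model."""
    n_models = len(log_evidence[0])
    return [sum(subject[m] for subject in log_evidence) for m in range(n_models)]


def posterior_model_probabilities(group_le):
    """Posterior model probabilities under a uniform prior over models."""
    max_le = max(group_le)  # subtract the maximum for numerical stability
    weights = [math.exp(le - max_le) for le in group_le]
    total = sum(weights)
    return [w / total for w in weights]


group_le = group_log_evidence(log_evidence)
posteriors = posterior_model_probabilities(group_le)

# Log Bayes factor of each model relative to the best model.
best_index = max(range(len(group_le)), key=lambda m: group_le[m])
best_model = model_names[best_index]
log_bayes_factors = [le - group_le[best_index] for le in group_le]

for name, lbf, p in zip(model_names, log_bayes_factors, posteriors):
    print(f"{name}: log BF vs best = {lbf:.1f}, posterior = {p:.4f}")
```

Reporting the full vector of log Bayes factors or posterior probabilities, rather than only the winning model, lets readers judge how decisively the data favour it. Random-effects inference (expected and exceedance probabilities) requires a different, hierarchical computation that is not shown here.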

7. Replicability and Data sharing
7.1. Replicable MEEG
The recommendations made in Sections 2 to 6 correspond to current best practice as of 2019. Reporting the data using these criteria should allow the derivation of reproducible results, as well as allowing any studies to be replicated. As analysis pipelines become more complex, more details need to be reported. This often conflicts with good writing recommendations (e.g. be concise) and journal policies (limited number of words or pages). Our recommendation is thus (i) to use the COBIDAS (MRI and MEEG alike) tables in the Appendices to prepare supplementary materials where details of methods with parameters are described; and (ii) to share the analysis code using dedicated repositories such as GitHub (see Eglen et al., 2017 for simple steps to follow).
In addition to these recommendations, we encourage the MEEG community to share the raw and derived data (see Section 7.2) together with the scripts used to process the data. Sharing of the data and scripts fosters reproducibility, and reuse of scripts allows replicability across laboratories. One of the challenges in replicability for MEEG studies is the large data space and variety of methods. In that respect, sharing of derived data is essential as it allows the comparison of effect sizes rather than binary results, very much akin to fMRI data where statistical maps are shared, allowing direct comparisons of results. In an era of electronic journal articles, it is relatively easy to share the data that generated figures. For instance, grand average ERP data between two conditions consist of a file of a few kilobytes that can easily be added as supplementary material or posted in a data repository. While we appreciate the complexity of sharing raw data (see below), sharing data behind figures will allow more direct comparisons, replications and aggregations of results across studies (e.g., meta-analysis).
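As an illustration of how small the data behind a figure can be, the following minimal Python sketch writes a grand-average ERP for two conditions to a tab-separated file with a JSON description of its columns. The file names, column names, condition labels, channel, and amplitude values are all hypothetical and chosen only for this example; they do not follow any particular standard.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_figure_data(out_dir, times_ms, erp_by_condition, channel, unit="uV"):
    """Write grand-average ERP data as a .tsv file plus a .json description."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    tsv_path = out_dir / "grand_average_erp.tsv"    # hypothetical file name
    json_path = out_dir / "grand_average_erp.json"  # hypothetical file name
    conditions = list(erp_by_condition)

    # One row per time point, one column per condition.
    with tsv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(["time_ms"] + conditions)
        for i, t in enumerate(times_ms):
            writer.writerow([t] + [erp_by_condition[c][i] for c in conditions])

    # Human- and machine-readable description of what each column contains.
    columns = {"time_ms": "Time relative to stimulus onset, in milliseconds"}
    for c in conditions:
        columns[c] = f"Grand-average amplitude for condition '{c}', in {unit}"
    sidecar = {
        "description": "Grand-average ERP underlying a figure (hypothetical example)",
        "channel": channel,
        "columns": columns,
    }
    json_path.write_text(json.dumps(sidecar, indent=2), encoding="utf-8")
    return tsv_path, json_path


# Hypothetical grand-average values at five time points.
times_ms = [-100, 0, 100, 200, 300]
erp = {
    "standard": [0.1, 0.0, -1.2, 2.3, 0.8],
    "deviant": [0.0, 0.1, -1.9, 3.4, 1.1],
}
out_dir = tempfile.mkdtemp()
tsv_path, json_path = write_figure_data(out_dir, times_ms, erp, channel="Cz")
```

Plain-text formats of this kind can be opened without specialised software, which makes them convenient for supplementary material and for later aggregation across studies.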
Sharing data may not always be feasible since it requires having obtained consent from participants to do so. This can be particularly problematic for clinical samples where issues of confidentiality may be a concern. Along these lines, datasets containing whole head anatomical MRI data have implications for both subject privacy and reproducibility (e.g. the head model cannot be reconstructed if the T1-weighted image is defaced and skull stripped). Issues of confidentiality are currently handled or are being evaluated by various countries, so cross-continental data-sharing initiatives may encounter some challenges (Open brain consent working group). Hence, it is critical to seek ethical clearance and consent from subjects regarding data sharing before embarking on the study; initiatives like the open brain consent provide easy-to-follow templates.

BIDS MEG and EEG
The brain imaging data structure (BIDS) is a "simple" way to share neuroimaging data using generally agreed standards in the neuroimaging community. The BIDS initiative started with a meeting at Stanford in Spring 2015, followed by meetings at the OHBM and INCF annual meetings in the same year, with a first release candidate and public call for comments in September 2015. Initially, BIDS focused on MRI data (anatomical, diffusion and functional, see Gorgolewski et al., 2016), but now encompasses many modalities including MEEG. BIDS offers a systematic way to organize data into folders using dedicated names, in association with text files, either tab-separated value (.tsv) files or JavaScript Object Notation (.json) files, to store metadata. MEG BIDS was created first (Niso et al., 2018) and EEG BIDS mostly follows the same structure, with differences mainly relating to metadata. We encourage the MEEG community to share their data using this data structure as it facilitates communication, increases reproducibility and makes it easier to develop data analysis pipelines. The validity of a dataset according to the standard can be checked with an online validator ( http://incf.github.io/bids-validator/ ).

Appendix
Workflow -Indicate in detail the exact order in which preprocessing steps took place.
Software -Which software and version was/were used for preprocessing and processing, and analysis platform?
-In-house code should be shared/made public.
Generic preprocessing -Indicate any downsampling of the data.
-If electrodes/sensors were removed, indicate which identification method was used and which ones were deleted; if missing channel interpolation was performed, indicate which method.
-Specify detrending method (typically polynomial order) for baseline correction.
-Specify noise normalization method (typically used in multivariate analyses).
-If data segmentation is performed, indicate the number of epochs per subject per condition.
-Indicate the spectral decomposition algorithm and parameters, and if applied before/after segmentation.
Detection/rejection/correction of artifacts -Indicate what types of artifact are present in the data.
-For automatic artifact detection, describe algorithms used and their respective parameters (e.g., amplitude thresholds).
-For manual detection, indicate the criteria used with as much detail as needed for reproducibility.
-Indicate if trials with artifacts were rejected or corrected. If using correction, indicate method(s) and parameters.
-If trials/segments of data with artifacts have been removed, indicate the average number of remaining trials per condition across participants (include minimum and maximum number of trials across participants).
-For resting state data, specify the length of time of the artifact-free data.
Correction of artifacts using BSS/ICA -Indicate how many total components were generated, what type of artifact was identified and how, and how many components were removed (on average across participants).
-Display example topographies of the ICs that were removed.
-Justify choice of the re-reference scheme.
Source modelling -Method of co-registration of measurement sensors to anatomical MRI scan of the participant's head or MRI template (for EEG in particular)?
-Volume conductor model (e.g., BEM/FEM) and tissue conductivity values (for EEG), procedure for anatomical image segmentation?
-Source model details (e.g., dipole, distributed, dipole scanning, volumetric or surface based), number of source points and their average distance?
-Report parameters used for source estimation (i.e., regularization of the data covariance matrix; constraints used for source model).
-Detail exact variables that have been analysed (which of the data was partialised, marginalised, or conditioned).
-For model based approach, indicate model parameters.
-Specify metrics of coupling.

Table 4. Statistical analyses
ROIs -How were these determined, i.e. what was the mode of selection (e.g., a priori from literature or independent data)?
-Report specific sensors/regions of interest, peaks, components, time and/or frequency window, source.
Summary measures -Report how these were obtained.
-Justify how the selection of dependent variables is unbiased (especially how the temporal and spatial ROIs were chosen).
Statistical analysis/modelling -Software and version used, and analysis platform?
-Report model used including all regressors (and covariates of no interest).
-Check and report statistical assumptions (e.g., normality, sphericity).
-Provide model details when complex designs are used.
-Provide details on classification method and validation procedure.
-Note method used for multiple comparisons correction and chosen level of statistical significance.
-Report classifier used, the distance metric used and the parameters.
-How was chance level determined?
-Detail cross-validation scheme.
-Report/justify data reduction method and parameters if used (PCA, SVD, etc.).
Source modelling -Indicate quality of the model (goodness of fit, percentage of variance explained, residual mean squares).
-Report spatial uncertainty for sources.

Connectivity analyses
-Sensor or source space?
-Report epoch length.
-Software and version, and analysis platform?
-Domain, type of connectivity and measure(s) used?
-Definitions of nodes/regions of interest.

DCM
-Specify type of neuronal model.
-Ensure fit of model to data before comparing different models.
-Define all connectivity architectures tested and connectivity matrices present and modulated.
-Describe statistics used for model/family inference (Random vs. Fixed effects) and parameter inference (Frequentist vs. Bayesian).
Spatial analyses -Show all source results, including time courses. For distributed models, show the full non-thresholded map along with the thresholded one.
-Topography: include information related to the spatial layout of sensors.
-Tables: present contrast/effect tested, anatomical region, X-Y-Z coordinate, T/Z/F statistic, and the P-value (or Bayes Factor) on which inference is based.
Connectivity analysis -Explicitly report type of data reduction performed and/or space selection.
-Report all metrics tested and associated values.
-If using a null model for statistical comparison, report how this was generated.
-For DCM, report the distribution of probability (expected and/or exceedance) over models considered and the statistics on the connectivity parameters.