osl_ephys.preprocessing.batch

Tools for batch preprocessing.

Attributes

logger

Functions

print_custom_func_info(func)

Prints info for user-specified functions.

import_data(infile[, preload])

Imports data from a file.

find_func(method[, target, extra_funcs])

Find a preprocessing function.

load_config(config)

Load config.

check_config_versions(config)

Check installed package versions against those specified in a config.

get_config_from_fif(inst)

Get config from a preprocessed fif file.

append_preproc_info(dataset, config[, extra_funcs])

Append the config to inst.info['description'] of already preprocessed data.

write_dataset(dataset, outbase, run_id[, ftype, ...])

Write preprocessed data to a file.

read_dataset(fif[, preload, ftype])

Reads fif/npy/yml files associated with a dataset.

plot_preproc_flowchart(config[, outname, show, ...])

Make a summary flowchart of a preprocessing chain.

run_proc_chain(config, infile[, subject, ftype, ...])

Run preprocessing for a single file.

run_proc_batch(config, files[, subjects, ftype, ...])

Run batched preprocessing.

main([argv])

Main function for command line interface.

Module Contents

osl_ephys.preprocessing.batch.logger = None

osl_ephys.preprocessing.batch.print_custom_func_info(func)

Prints info for user-specified functions.

Parameters:

func (function) – Function to wrap.

Returns:

Wrapped function.

Return type:

function

osl_ephys.preprocessing.batch.import_data(infile, preload=True)

Imports data from a file.

Parameters:
  • infile (str) – Path to file to read. File can be bti, fif, ds, meg4 or vhdr.

  • preload (bool) – Should we load the data in the file?

Returns:

raw – Data as an MNE Raw object.

Return type:

mne.io.Raw

osl_ephys.preprocessing.batch.find_func(method, target='raw', extra_funcs=None)

Find a preprocessing function.

Function priority:

  1. User custom function

  2. MNE/osl-ephys wrapper

  3. MNE method on Raw or Epochs (specified by target)

Parameters:
  • method (str) – Function name.

  • target (str) – Type of MNE object to preprocess. Can be 'raw', 'epochs', 'evoked', 'power' or 'itc'.

  • extra_funcs (list) – List of user-defined functions.

Returns:

Function to preprocess an MNE object.

Return type:

function
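The priority order above can be illustrated with a minimal sketch. This is not the library's implementation: `find_func_sketch`, the `wrappers` registry and the `FakeRaw` class are hypothetical names introduced for illustration, and the sketch takes the target object directly rather than a target string.

```python
def find_func_sketch(method, target_obj, wrappers=None, extra_funcs=None):
    """Resolve `method` using a user > wrapper > object-method priority."""
    # 1. User custom functions take priority, matched by function name.
    for func in extra_funcs or []:
        if func.__name__ == method:
            return func
    # 2. Library-provided wrappers (hypothetical registry: name -> callable).
    if wrappers and method in wrappers:
        return wrappers[method]
    # 3. Fall back to a method defined on the target object itself.
    candidate = getattr(target_obj, method, None)
    if callable(candidate):
        return candidate
    return None


class FakeRaw:
    """Hypothetical stand-in for an MNE object."""

    def crop(self):
        return "method on object"


def crop(dataset, userargs):
    """Hypothetical user-defined function with the same name."""
    return "user function"


raw = FakeRaw()
user_choice = find_func_sketch("crop", raw, extra_funcs=[crop])
fallback_choice = find_func_sketch("crop", raw)
```

With a user function named `crop` supplied, the sketch returns it; without one, it falls back to the object's own `crop` method.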

osl_ephys.preprocessing.batch.load_config(config)

Load config.

Parameters:

config (str or dict) – Path to a YAML file, a YAML string to convert to a dict, or a dict.

Returns:

Preprocessing config.

Return type:

dict

osl_ephys.preprocessing.batch.check_config_versions(config)

Check installed package versions against those specified in a config.

Parameters:

config (dictionary or yaml string) – Preprocessing configuration to check.

Raises:
  • AssertionError – Raised if a package version mismatch is found in 'version_assert'.

  • Warning – Raised if a package version mismatch is found in 'version_warn'.
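The difference between the two failure modes can be sketched as follows. This is an illustration under assumptions, not the library's code: `check_versions_sketch`, the exact-match comparison, the `(package, version)` pair format and the version numbers are all hypothetical.

```python
import warnings


def check_versions_sketch(installed, version_assert=(), version_warn=()):
    """Compare installed versions against exact (package, version) pairs."""
    # Soft requirements: report every mismatch as a warning and carry on.
    for package, required in version_warn:
        found = installed.get(package)
        if found != required:
            warnings.warn(f"{package}: expected {required}, found {found}")
    # Hard requirements: stop at the first mismatch.
    for package, required in version_assert:
        found = installed.get(package)
        assert found == required, f"{package}: expected {required}, found {found}"


# Hypothetical installed versions, for illustration only.
installed = {"mne": "1.6.0", "numpy": "1.26.4"}
check_versions_sketch(installed, version_assert=[("mne", "1.6.0")])  # passes silently
```

A mismatch listed under `version_warn` lets preprocessing continue, while one listed under `version_assert` stops it.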

osl_ephys.preprocessing.batch.get_config_from_fif(inst)

Get config from a preprocessed fif file.

Reads the inst.info['description'] field of a fif file to get the preprocessing config.

Parameters:

inst (mne.io.Raw, mne.Epochs, mne.Evoked) – Preprocessed MNE object.

Returns:

Preprocessing config.

Return type:

dict

osl_ephys.preprocessing.batch.append_preproc_info(dataset, config, extra_funcs=None)

Append the config to inst.info['description'] of already preprocessed data.

Parameters:
  • dataset (dict) – Preprocessed dataset.

  • config (dict) – Preprocessing config.

  • extra_funcs (list) – User-defined functions.

Returns:

Dataset dict containing the preprocessed data edited in place.

Return type:

dict

osl_ephys.preprocessing.batch.write_dataset(dataset, outbase, run_id, ftype='preproc-raw', overwrite=False, skip=None)

Write preprocessed data to a file.

Will write all keys in the dataset dict to disk with corresponding extensions.

Parameters:
  • dataset (dict) – Preprocessed dataset.

  • outbase (str) – Path to directory to write to.

  • run_id (str) – ID for the output file.

  • ftype (str) – Extension for the fif file (default 'preproc-raw').

  • overwrite (bool) – Should we overwrite if the file already exists?

  • skip (list or None) – List of keys to skip writing to disk. If None, we don’t skip any keys.

Returns:

fif_outname – The saved fif file name.

Return type:

str

osl_ephys.preprocessing.batch.read_dataset(fif, preload=False, ftype=None)

Reads fif/npy/yml files associated with a dataset.

Parameters:
  • fif (str) – Path to raw fif file (can be preprocessed).

  • preload (bool) – Should we load the raw fif data?

  • ftype (str) – Extension for the fif file (will be replaced for e.g. '_events.npy' or '_ica.fif'). If None, we assume the fif file is preprocessed with osl-ephys and has the extension '_preproc-raw'. If this fails, we guess the extension as whatever comes after the last '_'.

Returns:

dataset – Contains keys: 'raw', 'events', 'event_id', 'epochs', 'ica'.

Return type:

dict
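The ftype fallback described above can be sketched with plain path handling. This is a rough illustration, not the library's implementation: `sibling_paths_sketch` and the example file names are hypothetical; only the '_preproc-raw', '_events.npy' and '_ica.fif' suffixes come from the parameter description.

```python
from pathlib import Path


def sibling_paths_sketch(fif, ftype=None):
    """Derive the names of files that sit alongside a fif file."""
    path = Path(fif)
    stem = path.stem  # file name without the '.fif' suffix
    if ftype is None:
        if stem.endswith("_preproc-raw"):
            # First assumption: the file was written with the default ftype.
            ftype = "preproc-raw"
        else:
            # Otherwise guess whatever comes after the last '_'.
            ftype = stem.rsplit("_", 1)[-1]
    suffix = "_" + ftype
    base = stem[: -len(suffix)] if stem.endswith(suffix) else stem
    # Replace the ftype suffix to get the associated file names.
    return {
        "events": path.with_name(base + "_events.npy"),
        "ica": path.with_name(base + "_ica.fif"),
    }


default_case = sibling_paths_sketch("sub-01_preproc-raw.fif")
guessed_case = sibling_paths_sketch("sub-01_tsss.fif")
```

Passing ftype explicitly avoids the guess, which matters when the base file name itself contains underscores.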

osl_ephys.preprocessing.batch.plot_preproc_flowchart(config, outname=None, show=False, stagecol='wheat', startcol='red', fig=None, ax=None, title=None)

Make a summary flowchart of a preprocessing chain.

Parameters:
  • config (dict) – Preprocessing config to plot.

  • outname (str) – Output filename.

  • show (bool) – Should we show the plot?

  • stagecol (str) – Stage colour.

  • startcol (str) – Start colour.

  • fig (matplotlib.figure) – Matplotlib figure to plot on.

  • ax (matplotlib.axes) – Matplotlib axes to plot on.

  • title (str) – Title for the plot.

Returns:

  • fig (matplotlib.figure)

  • ax (matplotlib.axes)

osl_ephys.preprocessing.batch.run_proc_chain(config, infile, subject=None, ftype='preproc-raw', outdir=None, logsdir=None, reportdir=None, ret_dataset=True, gen_report=None, overwrite=False, skip_save=None, extra_funcs=None, random_seed='auto', verbose='INFO', mneverbose='WARNING')

Run preprocessing for a single file.

Parameters:
  • config (str or dict) – Preprocessing config.

  • infile (str) – Path to input file.

  • subject (str) – Subject ID. This will be the sub-directory in outdir.

  • ftype (str) – Extension for the fif file (default 'preproc-raw').

  • outdir (str) – Output directory.

  • logsdir (str) – Directory to save log files to.

  • reportdir (str) – Directory to save report files to.

  • ret_dataset (bool) – Should we return a dataset dict?

  • gen_report (bool) – Should we generate a report?

  • overwrite (bool) – Should we overwrite the output file if it already exists?

  • skip_save (list or None (default)) – List of keys to skip writing to disk. If None, we don’t skip any keys.

  • extra_funcs (list) – User-defined functions.

  • random_seed ('auto' (default), int or None) – Random seed to set. If ‘auto’, a random seed will be generated. Random seeds are set for both Python and NumPy. If None, no random seed is set.

  • verbose (str) – Level of info to print. Can be: 'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG' or 'NOTSET'.

  • mneverbose (str) – Level of info from MNE to print. Can be: 'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG' or 'NOTSET'.

Returns:

If ret_dataset=True, a dict containing the preprocessed dataset with the following keys: raw, ica, epochs, events, event_id. An empty dict is returned if preprocessing fails. If ret_dataset=False, we return a flag indicating whether preprocessing was successful.

Return type:

dict or bool
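The three random_seed modes can be sketched as below. This is a simplified illustration, not the library's code: `set_seed_sketch` is a hypothetical name, the seed range is an assumption, and only Python's own generator is seeded so that the example stays dependency-free.

```python
import random


def set_seed_sketch(random_seed="auto"):
    """Return the seed that was set, or None if seeding was skipped."""
    if random_seed is None:
        # No seed is set; results may differ between runs.
        return None
    if random_seed == "auto":
        # Generate a fresh seed (assumed range) so it can be logged and reused.
        seed = random.SystemRandom().randrange(2**32)
    else:
        seed = int(random_seed)
    random.seed(seed)
    # The documented behaviour also seeds NumPy; np.random.seed(seed) would go here.
    return seed


set_seed_sketch(42)
first_draw = random.random()
set_seed_sketch(42)
second_draw = random.random()  # same value as first_draw
```

Passing the same integer seed on a later run is what makes stochastic steps repeatable.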

osl_ephys.preprocessing.batch.run_proc_batch(config, files, subjects=None, ftype='preproc-raw', outdir=None, logsdir=None, reportdir=None, gen_report=True, overwrite=False, skip_save=None, extra_funcs=None, covs=None, random_seed='auto', verbose='INFO', mneverbose='WARNING', strictrun=False, dask_client=False)

Run batched preprocessing.

This function will write output to disk (i.e. will not return the preprocessed data).

Parameters:
  • config (str or dict) – Preprocessing config.

  • files (str or list or mne.Raw) – Can be a list of Raw objects, a list of filenames (or .ds directory names for CTF data), or a path to a text file listing filenames (or .ds directory names for CTF data).

  • subjects (list of str) – Subject directory names. These are sub-directories in outdir.

  • ftype (None or str) – Extension of the preprocessed fif files. Default is 'preproc-raw'.

  • outdir (str) – Output directory.

  • logsdir (str) – Directory to save log files to.

  • reportdir (str) – Directory to save report files to.

  • gen_report (bool) – Should we generate a report?

  • overwrite (bool) – Should we overwrite the output file if it exists?

  • skip_save (list or None (default)) – List of keys to skip writing to disk. If None, we don’t skip any keys.

  • extra_funcs (list) – User-defined functions.

  • covs (dict or pd.DataFrame) – Covariates to use for building the GLM design.

  • random_seed ('auto' (default), int or None) – Random seed to set. If ‘auto’, a random seed will be generated. Random seeds are set for both Python and NumPy. If None, no random seed is set.

  • verbose (str) – Level of info to print. Can be: 'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG' or 'NOTSET'.

  • mneverbose (str) – Level of info from MNE to print. Can be: 'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG' or 'NOTSET'.

  • strictrun (bool) – Should we ask for confirmation of user inputs before starting?

  • dask_client (bool) – Indicate whether to use a previously initialised dask.distributed.Client instance.

Returns:

Flags indicating whether preprocessing was successful for each input file.

Return type:

list of bool
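Because only success flags are returned, it can help to pair them with the inputs to see which files need attention. A minimal sketch, assuming the flags come back in the same order as the input files; `summarise_flags` and the file names are hypothetical.

```python
def summarise_flags(files, flags):
    """Pair each input file with its success flag and collect the failures."""
    failed = [name for name, ok in zip(files, flags) if not ok]
    return {
        "n_total": len(flags),
        "n_ok": sum(bool(ok) for ok in flags),
        "failed": failed,
    }


# Hypothetical inputs and flags, standing in for a real batch run.
files = ["sub-01_raw.fif", "sub-02_raw.fif", "sub-03_raw.fif"]
flags = [True, False, True]
summary = summarise_flags(files, flags)
```

The failed entries can then be passed to a second run after checking the corresponding log files.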

Notes

If you are using a dask.distributed.Client instance, you must initialise it before calling this function. For example:

>>> from dask.distributed import Client
>>> client = Client(threads_per_worker=1, n_workers=4)

osl_ephys.preprocessing.batch.main(argv=None)

Main function for command line interface.

Parameters:

argv (list) – Command line arguments.