osl_ephys.preprocessing.batch

Tools for batch preprocessing.

Attributes

logger

Functions

`print_custom_func_info`(func)	Prints info for user-specified functions.
`import_data`(infile[, preload])	Imports data from a file.
`find_func`(method[, target, extra_funcs])	Find a preprocessing function.
`load_config`(config)	Load config.
`check_config_versions`(config)	Check package versions against those specified in a config.
`get_config_from_fif`(inst)	Get config from a preprocessed fif file.
`append_preproc_info`(dataset, config[, extra_funcs])	Append the config to `inst.info['description']` of already preprocessed data.
`write_dataset`(dataset, outbase, run_id[, ftype, ...])	Write preprocessed data to a file.
`read_dataset`(fif[, preload, ftype])	Reads `fif`/`npy`/`yml` files associated with a dataset.
`plot_preproc_flowchart`(config[, outname, show, ...])	Make a summary flowchart of a preprocessing chain.
`run_proc_chain`(config, infile[, subject, ftype, ...])	Run preprocessing for a single file.
`run_proc_batch`(config, files[, subjects, ftype, ...])	Run batched preprocessing.
`main`([argv])	Main function for command line interface.

Module Contents

osl_ephys.preprocessing.batch.logger = None

osl_ephys.preprocessing.batch.print_custom_func_info(func)

Prints info for user-specified functions.

Parameters: func (function) – Function to wrap.
Returns: Wrapped function.
Return type: function

osl_ephys.preprocessing.batch.import_data(infile, preload=True)

Imports data from a file.

Parameters:

infile (str) – Path to file to read. File can be bti, fif, ds, meg4 or vhdr.
preload (bool) – Should we load the data in the file?

Returns:

raw – Data as an MNE Raw object.

Return type:

mne.io.Raw

osl_ephys.preprocessing.batch.find_func(method, target='raw', extra_funcs=None)

Find a preprocessing function.

Function priority:

1. User custom function
2. MNE/osl-ephys wrapper
3. MNE method on Raw or Epochs (specified by target)

Parameters:

method (str) – Function name.
target (str) – Type of MNE object to preprocess. Can be 'raw', 'epochs', 'evoked', 'power' or 'itc'.
extra_funcs (list) – List of user-defined functions.

Returns:

Function to preprocess an MNE object.

Return type:

function
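
The lookup priority described for this function can be illustrated with a minimal, self-contained sketch. It does not reproduce the osl-ephys internals: `resolve_function`, the `wrappers` dict and `DummyRaw` are hypothetical names introduced here, and the final step simply looks up a method on the target object with `getattr`.

```python
def resolve_function(method, target_obj, wrappers=None, extra_funcs=None):
    """Return a callable for `method`, trying three sources in priority order."""
    # 1. User-supplied functions take priority, matched by their __name__.
    for func in extra_funcs or []:
        if func.__name__ == method:
            return func
    # 2. Library-provided wrappers (hypothetical dict of name -> function).
    if wrappers and method in wrappers:
        return wrappers[method]
    # 3. Fall back to a method defined on the target object itself.
    func = getattr(target_obj, method, None)
    if callable(func):
        return func
    return None


class DummyRaw:
    """Stand-in for an MNE object, used only for this illustration."""

    def crop(self):
        return "method on object"


def crop(dataset, userargs):
    """Hypothetical user-defined function sharing a name with a method."""
    return dataset


def crop_wrapper(dataset, userargs):
    """Hypothetical library wrapper."""
    return dataset


raw = DummyRaw()
chosen = resolve_function("crop", raw, wrappers={"crop": crop_wrapper}, extra_funcs=[crop])
print(chosen.__name__)
```

Because the user function is checked first, it shadows both the wrapper and the object's own method whenever the names collide.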

osl_ephys.preprocessing.batch.load_config(config)

Load config.

Parameters: config (str or dict) – Path to a YAML file, a YAML string to convert to a dict, or a dict.
Returns: Preprocessing config.
Return type: dict

osl_ephys.preprocessing.batch.check_config_versions(config)

Check package versions against those specified in a config.

Parameters:

config (dictionary or yaml string) – Preprocessing configuration to check.

Raises:

AssertionError – Raised if package version mismatch found in ‘version_assert’
Warning – Raised if package version mismatch found in ‘version_warn’
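
The difference between the two failure modes can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the `name>=X.Y` requirement format, the `check_versions` helper and the explicit `installed` dict are all hypothetical, chosen so the example runs without inspecting the real environment.

```python
import warnings


def parse_version(text):
    """Turn '1.26.4' into (1, 26, 4) for simple numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def check_versions(meta, installed):
    """Raise for 'version_assert' mismatches, warn for 'version_warn' ones."""
    for key, strict in (("version_assert", True), ("version_warn", False)):
        for requirement in meta.get(key, []):
            # Assumed requirement format: '<package>>=<minimum version>'.
            name, minimum = requirement.split(">=")
            if parse_version(installed[name]) >= parse_version(minimum):
                continue
            message = f"{name} {installed[name]} does not satisfy {requirement}"
            if strict:
                raise AssertionError(message)
            warnings.warn(message)


# Hypothetical installed versions, supplied by hand for the illustration.
installed = {"numpy": "1.26.4", "scipy": "1.9.0"}

# Passes silently: the installed numpy is new enough.
check_versions({"version_assert": ["numpy>=1.20"]}, installed)
```

A mismatch listed under `version_warn` lets processing continue, whereas the same mismatch under `version_assert` stops it.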

osl_ephys.preprocessing.batch.get_config_from_fif(inst)

Get config from a preprocessed fif file.

Reads the inst.info['description'] field of a fif file to get the preprocessing config.

Parameters: inst (mne.io.Raw, mne.Epochs, mne.Evoked) – Preprocessed MNE object.
Returns: Preprocessing config.
Return type: dict

osl_ephys.preprocessing.batch.append_preproc_info(dataset, config, extra_funcs=None)

Append the config to inst.info['description'] of already preprocessed data.

Parameters:

dataset (dict) – Preprocessed dataset.
config (dict) – Preprocessing config.
extra_funcs (list) – List of user-defined functions.

Returns:

Dataset dict containing the preprocessed data edited in place.

Return type:

dict

osl_ephys.preprocessing.batch.write_dataset(dataset, outbase, run_id, ftype='preproc-raw', overwrite=False, skip=None)

Write preprocessed data to a file.

Will write all keys in the dataset dict to disk with corresponding extensions.

Parameters:

dataset (dict) – Preprocessed dataset.
outbase (str) – Path to directory to write to.
run_id (str) – ID for the output file.
ftype (str) – Extension for the fif file (default preproc-raw)
overwrite (bool) – Should we overwrite if the file already exists?
skip (list or None) – List of keys to skip writing to disk. If None, we don’t skip any keys.

Returns:

fif_outname – The saved fif file name.

Return type:

str

osl_ephys.preprocessing.batch.read_dataset(fif, preload=False, ftype=None)

Reads fif/npy/yml files associated with a dataset.

Parameters:

fif (str) – Path to raw fif file (can be preprocessed).
preload (bool) – Should we load the raw fif data?
ftype (str) – Extension for the fif file (will be replaced for e.g. '_events.npy' or '_ica.fif'). If None, we assume the fif file is preprocessed with osl-ephys and has the extension '_preproc-raw'. If this fails, we guess the extension as whatever comes after the last '_'.

Returns:

dataset – Contains keys: 'raw', 'events', 'event_id', 'epochs', 'ica'.

Return type:

dict
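
The way `ftype` is used to locate companion files can be sketched with the standard library alone. This is an illustration of the naming rule described for `ftype`, not the library's code: `companion_paths` is a hypothetical helper, and it assumes the fif file name ends in `_<ftype>.fif`.

```python
from pathlib import Path


def companion_paths(fif, ftype=None):
    """Derive the events and ICA file paths that sit next to a fif file."""
    path = Path(fif)
    name = path.name
    if ftype is None:
        if name.endswith("_preproc-raw.fif"):
            # Default assumption: the file was written with '_preproc-raw'.
            ftype = "preproc-raw"
        else:
            # Otherwise guess: whatever follows the last underscore.
            ftype = name.rsplit("_", 1)[-1].removesuffix(".fif")
    suffix = f"_{ftype}.fif"
    if not name.endswith(suffix):
        raise ValueError(f"{name!r} does not end with {suffix!r}")
    base = name[: -len(suffix)]
    # Swap the fif suffix for the companion suffixes named in the docs.
    return {
        "events": path.with_name(base + "_events.npy"),
        "ica": path.with_name(base + "_ica.fif"),
    }


paths = companion_paths("derivatives/sub-01_preproc-raw.fif")
print(paths["events"].name, paths["ica"].name)
```

Passing `ftype` explicitly avoids the guess when a file name contains several underscores and the last segment is not the intended extension.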

osl_ephys.preprocessing.batch.plot_preproc_flowchart(config, outname=None, show=False, stagecol='wheat', startcol='red', fig=None, ax=None, title=None)

Make a summary flowchart of a preprocessing chain.

Parameters:

config (dict) – Preprocessing config to plot.
outname (str) – Output filename.
show (bool) – Should we show the plot?
stagecol (str) – Stage colour.
startcol (str) – Start colour.
fig (matplotlib.figure) – Matplotlib figure to plot on.
ax (matplotlib.axes) – Matplotlib axes to plot on.
title (str) – Title for the plot.

Returns:

fig (matplotlib.figure)
ax (matplotlib.axes)

osl_ephys.preprocessing.batch.run_proc_chain(config, infile, subject=None, ftype='preproc-raw', outdir=None, logsdir=None, reportdir=None, ret_dataset=True, gen_report=None, overwrite=False, skip_save=None, extra_funcs=None, random_seed='auto', verbose='INFO', mneverbose='WARNING')

Run preprocessing for a single file.

Parameters:

config (str or dict) – Preprocessing config.
infile (str) – Path to input file.
subject (str) – Subject ID. This will be the sub-directory in outdir.
ftype (str) – Extension for the fif file (default preproc-raw)
outdir (str) – Output directory.
logsdir (str) – Directory to save log files to.
reportdir (str) – Directory to save report files to.
ret_dataset (bool) – Should we return a dataset dict?
gen_report (bool) – Should we generate a report?
overwrite (bool) – Should we overwrite the output file if it already exists?
skip_save (list or None (default)) – List of keys to skip writing to disk. If None, we don’t skip any keys.
extra_funcs (list) – User-defined functions.
random_seed ('auto' (default), int or None) – Random seed to set. If ‘auto’, a random seed will be generated. Random seeds are set for both Python and NumPy. If None, no random seed is set.
verbose (str) – Level of info to print. Can be: 'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG' or 'NOTSET'.
mneverbose (str) – Level of info from MNE to print. Can be: 'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG' or 'NOTSET'.

Returns:

If ret_dataset=True, a dict containing the preprocessed dataset with the following keys: raw, ica, epochs, events, event_id. An empty dict is returned if preprocessing fails. If ret_dataset=False, we return a flag indicating whether preprocessing was successful.

Return type:

dict or bool
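
A minimal usage sketch follows, showing how the return value might be handled. The config layout (a `meta` section plus a `preproc` list of single-key steps) is an assumption not specified on this page, and `preprocess_one` is a hypothetical helper. The step names `filter` and `resample` refer to the MNE Raw methods of the same name.

```python
# Assumed config layout: each step maps a method name to its keyword arguments.
config = {
    "meta": {},
    "preproc": [
        {"filter": {"l_freq": 0.5, "h_freq": 125}},
        {"resample": {"sfreq": 250}},
    ],
}


def preprocess_one(infile, outdir):
    """Run the chain on one file and return the Raw object, or None on failure."""
    # Imported lazily so this sketch can be loaded without osl-ephys installed.
    from osl_ephys.preprocessing.batch import run_proc_chain

    dataset = run_proc_chain(
        config,
        infile,
        outdir=outdir,
        overwrite=True,
        random_seed=42,  # fixed integer seed for repeatable runs
    )
    if not dataset:
        # With ret_dataset=True (the default), an empty dict signals failure.
        return None
    return dataset["raw"]
```

Checking for an empty dict before indexing avoids a `KeyError` when preprocessing fails.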

osl_ephys.preprocessing.batch.run_proc_batch(config, files, subjects=None, ftype='preproc-raw', outdir=None, logsdir=None, reportdir=None, gen_report=True, overwrite=False, skip_save=None, extra_funcs=None, covs=None, random_seed='auto', verbose='INFO', mneverbose='WARNING', strictrun=False, dask_client=False)

Run batched preprocessing.

This function will write output to disk (i.e. will not return the preprocessed data).

Parameters:

config (str or dict) – Preprocessing config.
files (str or list or mne.Raw) – A list of Raw objects, a list of filenames (or .ds directory names for CTF data), or a path to a text file listing filenames (or .ds directory names for CTF data).
subjects (list of str) – Subject directory names. These are sub-directories in outdir.
ftype (None or str) – Extension of the preprocessed fif files (default preproc-raw).
outdir (str) – Output directory.
logsdir (str) – Directory to save log files to.
reportdir (str) – Directory to save report files to.
gen_report (bool) – Should we generate a report?
overwrite (bool) – Should we overwrite the output file if it exists?
skip_save (list or None (default)) – List of keys to skip writing to disk. If None, we don’t skip any keys.
extra_funcs (list) – User-defined functions.
covs (dict or pd.DataFrame) – Covariates to use for building the GLM design.
random_seed ('auto' (default), int or None) – Random seed to set. If ‘auto’, a random seed will be generated. Random seeds are set for both Python and NumPy. If None, no random seed is set.
verbose (str) – Level of info to print. Can be: 'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG' or 'NOTSET'.
mneverbose (str) – Level of info from MNE to print. Can be: 'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG' or 'NOTSET'.
strictrun (bool) – Should we ask for confirmation of user inputs before starting?
dask_client (bool) – Indicate whether to use a previously initialised dask.distributed.Client instance.

Returns:

Flags indicating whether preprocessing was successful for each input file.

Return type:

list of bool

Notes

If you are using a dask.distributed.Client instance, you must initialise it before calling this function. For example:

>>> from dask.distributed import Client
>>> client = Client(threads_per_worker=1, n_workers=4)
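
Putting the client setup and the batch call together might look like the sketch below. `preprocess_all` and `failed_inputs` are hypothetical helpers introduced for illustration; the keyword arguments passed to `run_proc_batch` are taken from the signature above, and the client settings mirror the example in these notes.

```python
def failed_inputs(files, flags):
    """Pair each input with its success flag and keep the ones that failed."""
    return [infile for infile, ok in zip(files, flags) if not ok]


def preprocess_all(config, files, subjects, outdir):
    """Run batched preprocessing on a dask client and report failed inputs."""
    # Imported lazily so this sketch can be loaded without dask or osl-ephys.
    from dask.distributed import Client
    from osl_ephys.preprocessing.batch import run_proc_batch

    # The client must exist before run_proc_batch is called.
    client = Client(threads_per_worker=1, n_workers=4)
    try:
        flags = run_proc_batch(
            config,
            files,
            subjects=subjects,
            outdir=outdir,
            dask_client=True,
        )
    finally:
        client.close()
    return failed_inputs(files, flags)
```

When run as a script, a call such as `preprocess_all(...)` is usually placed under an `if __name__ == "__main__":` guard, since dask workers may start as separate processes.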

osl_ephys.preprocessing.batch.main(argv=None)

Main function for command line interface.

Parameters: argv (list) – Command line arguments.