osl_ephys.preprocessing.batch
Tools for batch preprocessing.
Attributes#
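Functions
| print_custom_func_info | Prints info for user-specified functions. |
| import_data | Imports data from a file. |
| find_func | Find a preprocessing function. |
| load_config | Load config. |
| check_config_versions | Check the package versions specified in a preprocessing config. |
| get_config_from_fif | Get config from a preprocessed fif file. |
| append_preproc_info | Add to the config of already preprocessed data to inst.info['description']. |
| write_dataset | Write preprocessed data to a file. |
| read_dataset | Reads fif/npy/yml files associated with a dataset. |
| plot_preproc_flowchart | Make a summary flowchart of a preprocessing chain. |
| run_proc_chain | Run preprocessing for a single file. |
| run_proc_batch | Run batched preprocessing. |
| | Main function for command line interface. |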
Module Contents
- osl_ephys.preprocessing.batch.print_custom_func_info(func)
Prints info for user-specified functions.
- Parameters:
func (function) – Function to wrap.
- Returns:
Wrapped function.
- Return type:
function
- osl_ephys.preprocessing.batch.import_data(infile, preload=True)
Imports data from a file.
- Parameters:
infile (str) – Path to file to read. File can be bti, fif, ds, meg4 or vhdr.
preload (bool) – Should we load the data in the file?
- Returns:
raw – Data as an MNE Raw object.
- Return type:
mne.io.Raw
- osl_ephys.preprocessing.batch.find_func(method, target='raw', extra_funcs=None)
Find a preprocessing function.
Function priority:
User custom function
MNE/osl-ephys wrapper
MNE method on Raw or Epochs (specified by target)
- Parameters:
method (str) – Function name.
target (str) – Type of MNE object to preprocess. Can be 'raw', 'epochs', 'evoked', 'power' or 'itc'.
extra_funcs (list) – List of user-defined functions.
- Returns:
Function to preprocess an MNE object.
- Return type:
function
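The priority order above can be illustrated with a small stand-alone resolver. This is a sketch only, not the library's implementation: `resolve_func`, `WRAPPERS`, `FakeRaw` and the `(dataset, userargs)` signature of the custom function are all assumptions made for illustration.

```python
def resolve_func(method, target_obj, extra_funcs=None, wrappers=None):
    """Hypothetical resolver that mirrors the documented priority order."""
    # 1. User custom function: match on the function's own name.
    for func in extra_funcs or []:
        if func.__name__ == method:
            return func
    # 2. Wrapper registered under this name (stand-in for MNE/osl-ephys wrappers).
    if wrappers and method in wrappers:
        return wrappers[method]
    # 3. Method defined on the target object itself (e.g. Raw or Epochs).
    candidate = getattr(target_obj, method, None)
    if callable(candidate):
        return candidate
    return None


class FakeRaw:
    """Stand-in for an MNE object; not a real MNE class."""

    def crop(self):
        return "method on object"


def crop(dataset, userargs):
    """Hypothetical user-defined function with an assumed signature."""
    return "user function"


# Hypothetical wrapper registry keyed by method name.
WRAPPERS = {"crop": lambda dataset, userargs: "wrapper"}

raw = FakeRaw()
print(resolve_func("crop", raw, extra_funcs=[crop], wrappers=WRAPPERS)(None, None))
print(resolve_func("crop", raw, wrappers=WRAPPERS)(None, None))
print(resolve_func("crop", raw)())
```

Each call removes one level of the hierarchy, so the three lines print the user function, then the wrapper, then the object's own method.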
- osl_ephys.preprocessing.batch.load_config(config)
Load config.
- Parameters:
config (str or dict) – Path to yaml file or string to convert to dict or a dict.
- Returns:
Preprocessing config.
- Return type:
dict
- osl_ephys.preprocessing.batch.check_config_versions(config)
Check the package versions specified in a preprocessing config.
- Parameters:
config (dictionary or yaml string) – Preprocessing configuration to check.
- Raises:
AssertionError – Raised if package version mismatch found in ‘version_assert’
Warning – Raised if package version mismatch found in ‘version_warn’
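The difference between the two failure modes can be sketched as follows. This is a hedged illustration, not the library's code: `check_versions` is a hypothetical helper, and both the top-level placement of `version_assert` / `version_warn` and their `{package: version}` layout are assumptions, since the real entry format is not described here.

```python
import warnings


def check_versions(config, installed):
    """Hypothetical check of pinned versions against installed ones.

    Assumes `config` maps 'version_assert' and 'version_warn' to
    {package_name: expected_version} dicts, and that `installed` is a
    {package_name: installed_version} dict supplied by the caller.
    """
    # A mismatch in 'version_assert' is fatal.
    for name, wanted in config.get("version_assert", {}).items():
        found = installed.get(name)
        if found != wanted:
            raise AssertionError(f"{name}: need {wanted}, found {found}")
    # A mismatch in 'version_warn' is only reported.
    for name, wanted in config.get("version_warn", {}).items():
        found = installed.get(name)
        if found != wanted:
            warnings.warn(f"{name}: expected {wanted}, found {found}")


installed = {"mne": "1.6.0", "numpy": "1.26.0"}  # made-up version numbers
try:
    check_versions({"version_assert": {"mne": "1.5.0"}}, installed)
except AssertionError as err:
    print("stopped:", err)
```

A caller that wants to continue despite a mismatch would list the package under `version_warn` instead, and could capture the message with `warnings.catch_warnings`.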
- osl_ephys.preprocessing.batch.get_config_from_fif(inst)
Get config from a preprocessed fif file.
Reads the inst.info['description'] field of a fif file to get the preprocessing config.
- Parameters:
inst (mne.io.Raw, mne.Epochs, mne.Evoked) – Preprocessed MNE object.
- Returns:
Preprocessing config.
- Return type:
dict
- osl_ephys.preprocessing.batch.append_preproc_info(dataset, config, extra_funcs=None)
Add to the config of already preprocessed data to inst.info['description'].
- Parameters:
dataset (dict) – Preprocessed dataset.
config (dict) – Preprocessing config.
- Returns:
Dataset dict containing the preprocessed data edited in place.
- Return type:
dict
- osl_ephys.preprocessing.batch.write_dataset(dataset, outbase, run_id, ftype='preproc-raw', overwrite=False, skip=None)
Write preprocessed data to a file.
Will write all keys in the dataset dict to disk with corresponding extensions.
- Parameters:
dataset (dict) – Preprocessed dataset.
outbase (str) – Path to directory to write to.
run_id (str) – ID for the output file.
ftype (str) – Extension for the fif file (default preproc-raw).
overwrite (bool) – Should we overwrite if the file already exists?
skip (list or None) – List of keys to skip writing to disk. If None, we don’t skip any keys.
- Returns:
fif_outname – The saved fif file name.
- Return type:
str
- osl_ephys.preprocessing.batch.read_dataset(fif, preload=False, ftype=None)
Reads fif/npy/yml files associated with a dataset.
- Parameters:
fif (str) – Path to raw fif file (can be preprocessed).
preload (bool) – Should we load the raw fif data?
ftype (str) – Extension for the fif file (will be replaced for e.g. '_events.npy' or '_ica.fif'). If None, we assume the fif file is preprocessed with osl-ephys and has the extension '_preproc-raw'. If this fails, we guess the extension as whatever comes after the last '_'.
- Returns:
dataset – Contains keys: 'raw', 'events', 'event_id', 'epochs', 'ica'.
- Return type:
dict
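The extension replacement described for ftype can be sketched with plain path handling. This is an illustrative helper under stated assumptions, not the library's code: `sibling_path` and the example file names are hypothetical, and the fallback simply follows the "whatever comes after the last '_'" rule quoted above.

```python
from pathlib import Path


def sibling_path(fif, suffix, ftype=None):
    """Hypothetical helper: derive the path of a file stored next to `fif`.

    `suffix` is the replacement extension, e.g. '_events.npy' or '_ica.fif'.
    """
    path = Path(fif)
    name = path.name
    # With ftype=None, assume the osl-ephys default extension.
    ext = "_" + (ftype or "preproc-raw") + ".fif"
    if name.endswith(ext):
        base = name[: -len(ext)]
    else:
        # Fallback: treat whatever follows the last '_' as the extension.
        stem = name[:-4] if name.endswith(".fif") else name
        base = stem.rsplit("_", 1)[0]
    return path.with_name(base + suffix)


print(sibling_path("out/sub-01_preproc-raw.fif", "_events.npy").name)
print(sibling_path("out/sub-01_tsss.fif", "_ica.fif").name)
```

The first call matches the default extension directly; the second does not, so the helper falls back to stripping the text after the last underscore.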
- osl_ephys.preprocessing.batch.plot_preproc_flowchart(config, outname=None, show=False, stagecol='wheat', startcol='red', fig=None, ax=None, title=None)
Make a summary flowchart of a preprocessing chain.
- Parameters:
config (dict) – Preprocessing config to plot.
outname (str) – Output filename.
show (bool) – Should we show the plot?
stagecol (str) – Stage colour.
startcol (str) – Start colour.
fig (matplotlib.figure) – Matplotlib figure to plot on.
ax (matplotlib.axes) – Matplotlib axes to plot on.
title (str) – Title for the plot.
- Returns:
fig (matplotlib.figure)
ax (matplotlib.axes)
- osl_ephys.preprocessing.batch.run_proc_chain(config, infile, subject=None, ftype='preproc-raw', outdir=None, logsdir=None, reportdir=None, ret_dataset=True, gen_report=None, overwrite=False, skip_save=None, extra_funcs=None, random_seed='auto', verbose='INFO', mneverbose='WARNING')
Run preprocessing for a single file.
- Parameters:
config (str or dict) – Preprocessing config.
infile (str) – Path to input file.
subject (str) – Subject ID. This will be the sub-directory in outdir.
ftype (str) – Extension for the fif file (default preproc-raw).
outdir (str) – Output directory.
logsdir (str) – Directory to save log files to.
reportdir (str) – Directory to save report files to.
ret_dataset (bool) – Should we return a dataset dict?
gen_report (bool) – Should we generate a report?
overwrite (bool) – Should we overwrite the output file if it already exists?
skip_save (list or None (default)) – List of keys to skip writing to disk. If None, we don’t skip any keys.
extra_funcs (list) – User-defined functions.
random_seed ('auto' (default), int or None) – Random seed to set. If ‘auto’, a random seed will be generated. Random seeds are set for both Python and NumPy. If None, no random seed is set.
verbose (str) – Level of info to print. Can be: 'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG' or 'NOTSET'.
mneverbose (str) – Level of info from MNE to print. Can be: 'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG' or 'NOTSET'.
- Returns:
If ret_dataset=True, a dict containing the preprocessed dataset with the following keys: raw, ica, epochs, events, event_id. An empty dict is returned if preprocessing fails. If ret_dataset=False, we return a flag indicating whether preprocessing was successful.
- Return type:
dict or bool
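A minimal usage sketch for a single file follows. The config layout (a `meta` section plus a `preproc` list of single-key steps) is an assumption about the config format, which is not specified on this page; `preprocess_one` and the example step arguments are hypothetical. The function relies only on the documented return contract: an empty dict signals failure when ret_dataset=True.

```python
# Assumed config layout: each 'preproc' entry maps a method name to its arguments.
config = {
    "meta": {"event_codes": {}},
    "preproc": [
        {"filter": {"l_freq": 0.5, "h_freq": 125}},
        {"resample": {"sfreq": 250}},
    ],
}


def preprocess_one(infile, outdir):
    """Hypothetical wrapper: run the chain and return the preprocessed Raw."""
    # Imported here so the sketch can be read without osl-ephys installed.
    from osl_ephys.preprocessing.batch import run_proc_chain

    dataset = run_proc_chain(config, infile, outdir=outdir, overwrite=True)
    # With ret_dataset=True (the default), an empty dict means preprocessing failed.
    if not dataset:
        raise RuntimeError(f"Preprocessing failed for {infile}")
    return dataset["raw"]
```

Checking for the empty dict before indexing avoids a less informative KeyError when a file fails.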
- osl_ephys.preprocessing.batch.run_proc_batch(config, files, subjects=None, ftype='preproc-raw', outdir=None, logsdir=None, reportdir=None, gen_report=True, overwrite=False, skip_save=None, extra_funcs=None, covs=None, random_seed='auto', verbose='INFO', mneverbose='WARNING', strictrun=False, dask_client=False)
Run batched preprocessing.
This function will write output to disk (i.e. will not return the preprocessed data).
- Parameters:
config (str or dict) – Preprocessing config.
files (str or list or mne.Raw) – Can be a list of Raw objects or a list of filenames (or .ds dir names if CTF data) or a path to a textfile list of filenames (or .ds dir names if CTF data).
subjects (list of str) – Subject directory names. These are sub-directories in outdir.
ftype (None or str) – Extension of the preprocessed fif files. Default is 'preproc-raw'.
outdir (str) – Output directory.
logsdir (str) – Directory to save log files to.
reportdir (str) – Directory to save report files to.
gen_report (bool) – Should we generate a report?
overwrite (bool) – Should we overwrite the output file if it exists?
skip_save (list or None (default)) – List of keys to skip writing to disk. If None, we don’t skip any keys.
extra_funcs (list) – User-defined functions.
covs (dict or pd.DataFrame) – Covariates to use for building the GLM design.
random_seed ('auto' (default), int or None) – Random seed to set. If ‘auto’, a random seed will be generated. Random seeds are set for both Python and NumPy. If None, no random seed is set.
verbose (str) – Level of info to print. Can be: 'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG' or 'NOTSET'.
mneverbose (str) – Level of info from MNE to print. Can be: 'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG' or 'NOTSET'.
strictrun (bool) – Should we ask for confirmation of user inputs before starting?
dask_client (bool) – Indicate whether to use a previously initialised dask.distributed.Client instance.
- Returns:
Flags indicating whether preprocessing was successful for each input file.
- Return type:
list of bool
Notes
If you are using a dask.distributed.Client instance, you must initialise it before calling this function. For example:
>>> from dask.distributed import Client
>>> client = Client(threads_per_worker=1, n_workers=4)
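Because this function returns only success flags, a caller usually wants to map them back to the inputs. The sketch below assumes the flags come back in the same order as `files`, which the page does not state explicitly; `failed_inputs` and `preprocess_all` are hypothetical helper names.

```python
def failed_inputs(files, flags):
    """Return the inputs whose success flag is False (assumes matching order)."""
    return [f for f, ok in zip(files, flags) if not ok]


def preprocess_all(config, files, outdir):
    """Hypothetical wrapper: run the batch on a dask client and list failures."""
    # Imported here so the sketch can be read without dask or osl-ephys installed.
    from dask.distributed import Client
    from osl_ephys.preprocessing.batch import run_proc_batch

    # The client must exist before run_proc_batch is called with dask_client=True.
    client = Client(threads_per_worker=1, n_workers=4)
    try:
        flags = run_proc_batch(config, files, outdir=outdir, dask_client=True)
    finally:
        client.close()
    return failed_inputs(files, flags)


print(failed_inputs(["sub-01.fif", "sub-02.fif"], [True, False]))
```

When running as a script, calling `preprocess_all` from under an `if __name__ == "__main__":` guard is the usual precaution with dask worker processes.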