one.alf.files

Module for identifying and parsing ALF file names.

An ALF file has the following components (those in brackets are optional):

(_namespace_)object.attribute(_timescale)(.extra.parts).ext

Note the following:

  • Object attributes may not contain an underscore unless followed by ‘times’ or ‘intervals’.

  • A namespace must not contain extra underscores (i.e. name_space and __namespace__ are not valid).

  • ALF files must always have an extension.

For more information, see the following documentation:

https://int-brain-lab.github.io/ONE/alf_intro.html
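The naming rules above can be approximated with a single regular expression. The following is a minimal sketch for illustration only: the pattern ALF_FILENAME and the helper parse_alf_filename are hypothetical names, and the expression is a simplification, not the one used internally by one.alf.files.

```python
import re

# Hypothetical pattern written for illustration; NOT the expression used by one.alf.files.
ALF_FILENAME = re.compile(
    r'^(?:_(?P<namespace>[a-zA-Z0-9]+)_)?'                   # optional _namespace_
    r'(?P<object>\w+)'                                       # object
    r'\.(?P<attribute>[a-zA-Z0-9]+(?:_times|_intervals)?)'   # attribute, optional _times/_intervals
    r'(?:_(?P<timescale>\w+))?'                              # optional _timescale
    r'(?:\.(?P<extra>[\w.-]+))?'                             # optional .extra.parts
    r'\.(?P<extension>\w+)$'                                 # mandatory extension
)


def parse_alf_filename(filename):
    """Return (namespace, object, attribute, timescale, extra, extension), or None if no match."""
    match = ALF_FILENAME.match(filename)
    return match.groups() if match else None


print(parse_alf_filename('_ibl_trials.goCue_times_bpod.csv'))
# -> ('ibl', 'trials', 'goCue_times', 'bpod', None, 'csv')
print(parse_alf_filename('spikes.times'))  # fewer than three dot-separated parts
# -> None
```

For real use, prefer filename_parts, which handles edge cases this simplified pattern does not attempt to cover.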

Functions

add_uuid_string

Add a UUID to the filename of an ALF path.

filename_parts

Return the parsed elements of a given ALF filename.

folder_parts

Parse all folder parts, including session, collection and revision.

full_path_parts

Parse all filename and folder parts.

get_alf_path

Returns the ALF part of a path or filename.

get_session_path

Returns the session path from any filepath if the date/number pattern is found, including the root directory.

padded_sequence

Ensures a file path contains a zero-padded experiment sequence folder.

rel_path_parts

Parse a relative path into the relevant parts.

remove_uuid_string

Remove UUID from a filename of an ALF path.

session_path_parts

Parse a session path into the relevant parts.

rel_path_parts(rel_path, as_dict=False, assert_valid=True)

Parse a relative path into the relevant parts.

A relative path follows the pattern

(collection/)(#revision#/)_namespace_object.attribute_timescale.extra.extension

Parameters:
  • rel_path (str, pathlib.Path) – A relative path string.

  • as_dict (bool) – If true, an OrderedDict of parts is returned with the keys (‘collection’, ‘revision’, ‘namespace’, ‘object’, ‘attribute’, ‘timescale’, ‘extra’, ‘extension’), otherwise a tuple of values is returned.

  • assert_valid (bool) – If true a ValueError is raised when the relative path cannot be parsed, otherwise an empty dict or tuple of Nones is returned.

Returns:

A dict if as_dict is true, or a tuple of parsed values.

Return type:

OrderedDict, tuple
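To make the relative path layout concrete, the sketch below separates the optional collection and #revision# folders from the filename using only pathlib. The helper split_relative_path is a hypothetical name introduced here, and it assumes POSIX-style separators; it does not validate the filename itself and is not how rel_path_parts is implemented.

```python
from pathlib import PurePosixPath


def split_relative_path(rel_path):
    """Split 'collection/#revision#/filename' into (collection, revision, filename).

    Hypothetical helper for illustration; assumes a non-empty, POSIX-style path.
    """
    *folders, filename = PurePosixPath(rel_path).parts
    revision = None
    # A revision folder is the last folder, wrapped in pound signs
    if folders and len(folders[-1]) > 2 and folders[-1][0] == '#' and folders[-1][-1] == '#':
        revision = folders.pop()[1:-1]
    # Whatever folders remain make up the (possibly nested) collection
    collection = '/'.join(folders) or None
    return collection, revision, filename


print(split_relative_path('alf/probe00/#2021-01-01#/spikes.times.npy'))
# -> ('alf/probe00', '2021-01-01', 'spikes.times.npy')
print(split_relative_path('spikes.times.npy'))
# -> (None, None, 'spikes.times.npy')
```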

session_path_parts(session_path, as_dict=False, assert_valid=True)

Parse a session path into the relevant parts.

Return keys:
  • lab

  • subject

  • date

  • number

Parameters:
  • session_path (str, pathlib.Path) – A session path string.

  • as_dict (bool) – If true, an OrderedDict of parts is returned with the keys (‘lab’, ‘subject’, ‘date’, ‘number’), otherwise a tuple of values is returned.

  • assert_valid (bool) – If true a ValueError is raised when the session cannot be parsed, otherwise an empty dict or tuple of Nones is returned.

Returns:

A dict if as_dict is true, or a tuple of parsed values.

Return type:

OrderedDict, tuple

Raises:

ValueError – Invalid ALF session path (assert_valid is True).
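The interplay of as_dict and assert_valid described above can be sketched as follows. This is an illustrative approximation under stated assumptions: SESSION_PATTERN and parse_session_path are hypothetical names, and the regular expression (an optional lab/Subjects/ prefix, then subject/date/number) is a simplification rather than the library's own specification.

```python
import re
from collections import OrderedDict

# Hypothetical, simplified pattern; NOT the specification used by one.alf.files.
SESSION_PATTERN = re.compile(
    r'(?:(?P<lab>\w+)/Subjects/)?'      # optional lab/Subjects/ prefix
    r'(?P<subject>[\w.-]+)/'            # subject folder
    r'(?P<date>\d{4}-\d{2}-\d{2})/'     # ISO date folder
    r'(?P<number>\d{1,3})/?$'           # experiment sequence number
)
KEYS = ('lab', 'subject', 'date', 'number')


def parse_session_path(session_path, as_dict=False, assert_valid=True):
    """Mimic the documented return conventions of session_path_parts."""
    match = SESSION_PATTERN.search(str(session_path).replace('\\', '/'))
    if match is None:
        if assert_valid:
            raise ValueError(f'Invalid session path: {session_path!r}')
        # Invalid input without assertion: empty dict or tuple of Nones
        return OrderedDict() if as_dict else (None,) * len(KEYS)
    parts = OrderedDict((key, match.group(key)) for key in KEYS)
    return parts if as_dict else tuple(parts.values())


print(parse_session_path('lab/Subjects/subject/2020-01-01/001'))
# -> ('lab', 'subject', '2020-01-01', '001')
print(parse_session_path('not/a/session', assert_valid=False))
# -> (None, None, None, None)
```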

filename_parts(filename, as_dict=False, assert_valid=True) -> dict | tuple

Return the parsed elements of a given ALF filename.

Parameters:
  • filename (str) – The name of the file.

  • as_dict (bool) – When true a dict of matches is returned.

  • assert_valid (bool) – When true an exception is raised when the filename cannot be parsed.

Returns:

  • namespace (str) – The _namespace_ or None if not present.

  • object (str) – ALF object.

  • attribute (str) – The ALF attribute.

  • timescale (str) – The ALF _timescale or None if not present.

  • extra (str) – Any extra parts to the filename, or None if not present.

  • extension (str) – The file extension.

Examples

>>> filename_parts('_namespace_obj.times_timescale.extra.foo.ext')
('namespace', 'obj', 'times', 'timescale', 'extra.foo', 'ext')
>>> filename_parts('spikes.clusters.npy', as_dict=True)
{'namespace': None,
 'object': 'spikes',
 'attribute': 'clusters',
 'timescale': None,
 'extra': None,
 'extension': 'npy'}
>>> filename_parts('spikes.times_ephysClock.npy')
(None, 'spikes', 'times', 'ephysClock', None, 'npy')
>>> filename_parts('_iblmic_audioSpectrogram.frequencies.npy')
('iblmic', 'audioSpectrogram', 'frequencies', None, None, 'npy')
>>> filename_parts('_spikeglx_ephysData_g0_t0.imec.wiring.json')
('spikeglx', 'ephysData_g0_t0', 'imec', None, 'wiring', 'json')
>>> filename_parts('_spikeglx_ephysData_g0_t0.imec0.lf.bin')
('spikeglx', 'ephysData_g0_t0', 'imec0', None, 'lf', 'bin')
>>> filename_parts('_ibl_trials.goCue_times_bpod.csv')
('ibl', 'trials', 'goCue_times', 'bpod', None, 'csv')
Raises:

ValueError – Invalid ALF dataset (assert_valid is True).

full_path_parts(path, as_dict=False, assert_valid=True) -> dict | tuple

Parse all filename and folder parts.

Parameters:
  • path (str, pathlib.Path) – The ALF path.

  • as_dict (bool) – When true a dict of matches is returned.

  • assert_valid (bool) – When true an exception is raised when the path cannot be parsed.

Returns:

A dict if as_dict is true, or a tuple of parsed values.

Return type:

OrderedDict, tuple

Examples

>>> full_path_parts(
...    'lab/Subjects/subject/2020-01-01/001/collection/#revision#/'
...    '_namespace_obj.times_timescale.extra.foo.ext')
('lab', 'subject', '2020-01-01', '001', 'collection', 'revision',
 'namespace', 'obj', 'times', 'timescale', 'extra.foo', 'ext')
>>> full_path_parts('spikes.clusters.npy', as_dict=True)
{'lab': None,
 'subject': None,
 'date': None,
 'number': None,
 'collection': None,
 'revision': None,
 'namespace': None,
 'object': 'spikes',
 'attribute': 'clusters',
 'timescale': None,
 'extra': None,
 'extension': 'npy'}
Raises:

ValueError – Invalid ALF path (assert_valid is True).

folder_parts(folder_path, as_dict=False, assert_valid=True) -> dict | tuple

Parse all folder parts, including session, collection and revision.

Parameters:
  • folder_path (str, pathlib.Path) – The ALF folder path.

  • as_dict (bool) – When true a dict of matches is returned.

  • assert_valid (bool) – When true an exception is raised when the folder path cannot be parsed.

Returns:

A dict if as_dict is true, or a tuple of parsed values.

Return type:

OrderedDict, tuple

Examples

>>> folder_parts('lab/Subjects/subject/2020-01-01/001/collection/#revision#')
('lab', 'subject', '2020-01-01', '001', 'collection', 'revision')
>>> folder_parts(Path('lab/Subjects/subject/2020-01-01/001'), as_dict=True)
{'lab': 'lab',
 'subject': 'subject',
 'date': '2020-01-01',
 'number': '001',
 'collection': None,
 'revision': None}
Raises:

ValueError – Invalid ALF path (assert_valid is True).

get_session_path(path: str | Path) -> Path | None

Returns the session path from any filepath if the date/number pattern is found, including the root directory.

Returns:

The session path part of the input path or None if path invalid.

Return type:

pathlib.Path or None

Examples

>>> get_session_path('/mnt/sd0/Data/lab/Subjects/subject/2020-01-01/001')
Path('/mnt/sd0/Data/lab/Subjects/subject/2020-01-01/001')
>>> get_session_path(r'C:\Data\subject\2020-01-01\1\trials.intervals.npy')
Path('C:/Data/subject/2020-01-01/1')
get_alf_path(path: str | Path) -> str

Returns the ALF part of a path or filename. Attempts to return the first valid part of the path, first searching for a session path, then relative path (collection/revision/filename), then just the filename. If all invalid, None is returned.

Parameters:

path (str, pathlib.Path) – A path to parse.

Returns:

A string containing the full ALF path, session path, relative path or filename.

Return type:

str

Examples

>>> get_alf_path('etc/etc/lab/Subjects/subj/2021-01-21/001/collection/file.attr.ext')
'lab/Subjects/subj/2021-01-21/001/collection/file.attr.ext'
>>> get_alf_path('etc/etc/subj/2021-01-21/001/collection/file.attr.ext')
'subj/2021-01-21/001/collection/file.attr.ext'
>>> get_alf_path('collection/file.attr.ext')
'collection/file.attr.ext'
add_uuid_string(file_path, uuid)

Add a UUID to the filename of an ALF path.

Adds a UUID to an ALF filename as an extra part, e.g. ‘obj.attr.ext’ -> ‘obj.attr.a976e418-c8b8-4d24-be47-d05120b18341.ext’.

Parameters:
  • file_path (str, pathlib.Path, pathlib.PurePath) – An ALF path to add the UUID to.

  • uuid (str, uuid.UUID) – The UUID to add.

Returns:

A new Path or PurePath object with a UUID in the filename.

Return type:

pathlib.Path, pathlib.PurePath

Examples

>>> add_uuid_string('/path/to/trials.intervals.npy', 'a976e418-c8b8-4d24-be47-d05120b18341')
Path('/path/to/trials.intervals.a976e418-c8b8-4d24-be47-d05120b18341.npy')
Raises:

ValueError – uuid must be a valid hyphen-separated hexadecimal UUID.

remove_uuid_string(file_path)

Remove UUID from a filename of an ALF path.

Parameters:

file_path (str, pathlib.Path, pathlib.PurePath) – An ALF path to remove the UUID from.

Returns:

A new Path or PurePath object without a UUID in the filename.

Return type:

pathlib.Path, pathlib.PurePath

Examples

>>> remove_uuid_string('/path/to/trials.intervals.a976e418-c8b8-4d24-be47-d05120b18341.npy')
Path('/path/to/trials.intervals.npy')
>>> remove_uuid_string('/path/to/trials.intervals.npy')
Path('/path/to/trials.intervals.npy')
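The UUID is stored as the last extra part of the filename, directly before the extension, so adding and removing it are inverse operations. The round trip can be sketched with the standard library alone. The helpers with_uuid and without_uuid are hypothetical names used for illustration and assume POSIX-style paths; they are not the library's implementation.

```python
import re
import uuid
from pathlib import PurePosixPath

# Matches a trailing '.<hyphen-separated hexadecimal UUID>' at the end of a file stem
UUID_SUFFIX = re.compile(
    r'\.[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$', re.IGNORECASE
)


def with_uuid(file_path, uid):
    """Insert the UUID before the file extension (hypothetical helper)."""
    uid = str(uuid.UUID(str(uid)))  # normalizes the value; raises ValueError if not a valid UUID
    path = PurePosixPath(file_path)
    if path.stem.endswith('.' + uid):
        return path  # UUID already present, nothing to do
    return path.with_name(f'{path.stem}.{uid}{path.suffix}')


def without_uuid(file_path):
    """Strip a trailing UUID part from the filename, if there is one (hypothetical helper)."""
    path = PurePosixPath(file_path)
    return path.with_name(UUID_SUFFIX.sub('', path.stem) + path.suffix)


tagged = with_uuid('/path/to/trials.intervals.npy', 'a976e418-c8b8-4d24-be47-d05120b18341')
print(tagged)                # -> /path/to/trials.intervals.a976e418-c8b8-4d24-be47-d05120b18341.npy
print(without_uuid(tagged))  # -> /path/to/trials.intervals.npy
```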
padded_sequence(filepath)

Ensures a file path contains a zero-padded experiment sequence folder.

Parameters:

filepath (str, pathlib.Path, pathlib.PurePath) – A session or file path to convert.

Returns:

The same path but with the experiment sequence folder zero-padded. If a PurePath was passed, a PurePath will be returned, otherwise a Path object is returned.

Return type:

pathlib.Path, pathlib.PurePath

Examples

>>> filepath = '/iblrigdata/subject/2023-01-01/1/_ibl_experiment.description.yaml'
>>> padded_sequence(filepath)
pathlib.Path('/iblrigdata/subject/2023-01-01/001/_ibl_experiment.description.yaml')

Supports folders and will not affect already padded paths:

>>> session_path = pathlib.PurePosixPath('subject/2023-01-01/001')
>>> padded_sequence(session_path)
pathlib.PurePosixPath('subject/2023-01-01/001')
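The padding step itself amounts to locating the folder that follows the date folder and zero-filling it to three digits. A minimal sketch, assuming POSIX-style paths: the helper pad_sequence is a hypothetical name, and returning the path unchanged when no date/number pair is found is a choice made for this sketch, not documented behaviour of padded_sequence.

```python
import re
from pathlib import PurePosixPath

DATE = re.compile(r'\d{4}-\d{2}-\d{2}')
NUMBER = re.compile(r'\d{1,3}')


def pad_sequence(filepath):
    """Zero-pad the experiment sequence folder to three digits (hypothetical helper)."""
    parts = list(PurePosixPath(filepath).parts)
    for i in range(len(parts) - 1):
        # The sequence folder is the numeric folder directly after the date folder
        if DATE.fullmatch(parts[i]) and NUMBER.fullmatch(parts[i + 1]):
            parts[i + 1] = parts[i + 1].zfill(3)
            break
    # Sketch-only choice: paths without a date/number pair are returned unchanged
    return PurePosixPath(*parts)


print(pad_sequence('/iblrigdata/subject/2023-01-01/1/_ibl_experiment.description.yaml'))
# -> /iblrigdata/subject/2023-01-01/001/_ibl_experiment.description.yaml
print(pad_sequence('subject/2023-01-01/001'))
# -> subject/2023-01-01/001
```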