ALF Path objects

ONE methods such as eid2path and load_dataset return objects of the type one.alf.path.ALFPath. These are similar to pathlib.Path objects, but with some extra methods for parsing ALF-specific paths.

Converting paths

You can directly instantiate an ALFPath object in the same way as a pathlib.Path object. Paths can also be converted to ALFPath objects using the one.alf.path.ensure_alf_path function. This function ensures the path entered is cast to an ALFPath instance. If the input class is PureALFPath or pathlib.PurePath, a PureALFPath instance is returned; otherwise an ALFPath instance is returned.
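The casting rule can be pictured with a standard-library analogue. The sketch below is not the ONE implementation: `ensure_path` is a hypothetical helper that applies the same pure-versus-concrete rule using only pathlib classes.

```python
from pathlib import Path, PurePath, PurePosixPath


def ensure_path(p):
    """Hypothetical stdlib analogue of the casting rule described above.

    Pure inputs stay pure (no filesystem access); anything else is
    returned as a concrete Path.
    """
    if isinstance(p, PurePath):
        # Covers both pure paths and concrete paths (Path subclasses PurePath)
        return p
    return Path(p)  # strings and other os.PathLike inputs


pure = ensure_path(PurePosixPath('NYU-001/2019-10-01/001'))
concrete = ensure_path('NYU-001/2019-10-01/001')
print(type(pure).__name__)      # stays a pure path class
print(type(concrete).__name__)  # a concrete path class for the current OS
```

The point of keeping pure inputs pure is that they never touch the filesystem, so they remain safe to use for paths that belong to another machine.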

Iterating through session datasets

The ALFPath.iter_datasets method is a generator that yields the valid datasets within the path. Note that this method is not present in PureALFPath instances, as pure paths do not access the filesystem.
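To give a feel for what such a generator does, here is a minimal standard-library sketch. It is an illustration only: `iter_dataset_files` and the `DATASET_NAME` pattern are hypothetical, and the pattern is a deliberately simplified guess at what counts as a valid dataset name, not the rule ONE applies.

```python
import re
import tempfile
from pathlib import Path

# Simplified, assumed pattern: optional _namespace_, then object.attribute,
# any number of extra dot-separated parts, then an extension
DATASET_NAME = re.compile(
    r'^(?:_[A-Za-z0-9]+_)?[A-Za-z0-9]+\.[A-Za-z0-9_]+(?:\.[\w-]+)*\.[A-Za-z0-9]+$'
)


def iter_dataset_files(folder):
    """Yield files directly inside `folder` whose names look like ALF datasets."""
    for p in sorted(Path(folder).iterdir()):
        if p.is_file() and DATASET_NAME.match(p.name):
            yield p


# Build a throwaway folder containing two dataset-like files and one other file
with tempfile.TemporaryDirectory() as tmp:
    alf = Path(tmp, 'alf')
    alf.mkdir()
    for name in ('_ibl_trials.table.pqt', 'spikes.times.npy', 'README'):
        alf.joinpath(name).touch()
    found = [p.name for p in iter_dataset_files(alf)]

print(found)  # ['_ibl_trials.table.pqt', 'spikes.times.npy']
```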

Properties

In addition to the Path properties of stem, suffix, and name, parts of an ALF path can be readily referenced using various properties. These properties will return an empty string if the particular ALF part is not present in the path.

[1]:
from one.alf.path import ALFPath
path = ALFPath('/data/cortexlab/Subjects/NYU-001/2019-10-01/001/alf/task_00/#2020-01-01#/_ibl_trials.table.pqt')

Session parts

[2]:
print(path)
print(f'The session lab is "{path.lab}"')  # cortexlab
print(f'The session subject is "{path.subject}"')  # NYU-001
print(f'The session date is "{path.date}"')  # 2019-10-01
print(f'The session sequence is "{path.sequence}"')  # 001
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt
The session lab is "cortexlab"
The session subject is "NYU-001"
The session date is "2019-10-01"
The session sequence is "001"
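For readers curious how session parts can be pulled out of a path, the following is a rough sketch using a regular expression. The `SESSION` pattern and `session_parts` function are hypothetical and assume the layout `[lab/Subjects/]subject/YYYY-MM-DD/number`; the pattern ONE uses may differ.

```python
import re
from pathlib import PurePosixPath

# Assumed layout: [lab/Subjects/]subject/YYYY-MM-DD/number
SESSION = re.compile(
    r'(?:(?P<lab>[\w-]+)/Subjects/)?'
    r'(?P<subject>[\w-]+)/'
    r'(?P<date>\d{4}-\d{2}-\d{2})/'
    r'(?P<number>\d{1,3})(?:/|$)'
)


def session_parts(path):
    """Return (lab, subject, date, number), using '' for absent parts."""
    match = SESSION.search(PurePosixPath(path).as_posix())
    if match is None:
        return ('', '', '', '')
    return tuple(value or '' for value in match.groups())


full = '/data/cortexlab/Subjects/NYU-001/2019-10-01/001/alf/_ibl_trials.table.pqt'
print(session_parts(full))                    # ('cortexlab', 'NYU-001', '2019-10-01', '001')
print(session_parts('NYU-001/2019-10-01/1'))  # ('', 'NYU-001', '2019-10-01', '1')
```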

Collection parts

[3]:
print(f'The session collection is "{path.collection}"')  # alf/task_00
print(f'The session revision is "{path.revision}"')  # 2020-01-01
The session collection is "alf/task_00"
The session revision is "2020-01-01"
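The relationship between collection and revision can be sketched as follows. This is an assumption-laden illustration: `collection_and_revision` is a hypothetical helper that takes a session-relative path and treats a folder wrapped in pound signs directly above the file as the revision.

```python
from pathlib import PurePosixPath


def collection_and_revision(relative_path):
    """Split a session-relative dataset path into (collection, revision).

    Assumes the revision, if any, is the folder directly above the file
    and is wrapped in pound signs, e.g. '#2020-01-01#'.
    """
    folders = list(PurePosixPath(relative_path).parent.parts)
    revision = ''
    if folders:
        last = folders[-1]
        if len(last) > 2 and last.startswith('#') and last.endswith('#'):
            revision = folders.pop()[1:-1]  # strip the surrounding pound signs
    return '/'.join(folders), revision


print(collection_and_revision('alf/task_00/#2020-01-01#/_ibl_trials.table.pqt'))
# ('alf/task_00', '2020-01-01')
```

Both values fall back to an empty string when absent, mirroring the behaviour of the properties shown above.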

Filename parts

Filename properties include namespace, object, attribute, timescale, and extra.

[4]:
print(f'The ALF name is "{path.name}"')  # _ibl_trials.table.pqt
print(f'The ALF object is "{path.object}"')  # trials
print(f'The ALF attribute is "{path.attribute}"')  # table
# NB: To get the extension, use the pathlib.Path suffix property
print(f'The ALF file extension is "{path.suffix}"')  # .pqt
print(f'The ALF namespace is "{path.namespace}"')  # ibl
print(f"There are {'some' if path.extra else 'no'} extra components in \"{path.stem}\"")
The ALF name is "_ibl_trials.table.pqt"
The ALF object is "trials"
The ALF attribute is "table"
The ALF file extension is ".pqt"
The ALF namespace is "ibl"
There are no extra components in "_ibl_trials.table"
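The filename parts follow a regular structure, which the sketch below tries to capture with a single regular expression. The `ALF_NAME` pattern and `name_parts` function are hypothetical and simplified (for instance, the only attribute suffixes recognised are `_times` and `_intervals`), so treat this as an approximation rather than the pattern ONE uses.

```python
import re

# Simplified, assumed form:
#   [_namespace_]object.attribute[_timescale][.extra].extension
ALF_NAME = re.compile(
    r'^(?:_(?P<namespace>[A-Za-z0-9]+)_)?'
    r'(?P<object>[A-Za-z0-9]+)\.'
    r'(?P<attribute>[A-Za-z0-9]+(?:_times|_intervals)?)'
    r'(?:_(?P<timescale>[A-Za-z0-9]+))?'
    r'(?:\.(?P<extra>[\w.-]+))?'
    r'\.(?P<extension>[A-Za-z0-9]+)$'
)


def name_parts(filename):
    """Return a dict of filename parts ('' if absent), or None if no match."""
    match = ALF_NAME.match(filename)
    if match is None:
        return None
    return {key: value or '' for key, value in match.groupdict().items()}


print(name_parts('_ibl_trials.table.pqt'))
print(name_parts('spikes.times_ephysClock.npy')['timescale'])  # ephysClock
```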

Part tuples

In addition to Path.parts, you can parse out the path in one go with the alf_parts, session_parts, and dataset_name_parts properties. Un-parsed parts are returned as an empty string.

[5]:
print(path)
print('Path parts:')
print(path.parts)
print('ALF path parts:')
print(path.alf_parts)
print('ALF session parts:')
print(path.session_parts)
print('ALF filename parts:')
print(path.dataset_name_parts)
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt
Path parts:
('\\', 'data', 'cortexlab', 'Subjects', 'NYU-001', '2019-10-01', '001', 'alf', 'task_00', '#2020-01-01#', '_ibl_trials.table.pqt')
ALF path parts:
('cortexlab', 'NYU-001', '2019-10-01', '001', 'alf/task_00', '2020-01-01', 'ibl', 'trials', 'table', '', '', 'pqt')
ALF session parts:
('cortexlab', 'NYU-001', '2019-10-01', '001')
ALF filename parts:
('ibl', 'trials', 'table', '', '', 'pqt')

Parsing methods

As with the parts properties, there are several parsing methods. The key difference is that these methods can optionally return a dict of results, and instead of empty strings, absent or invalid parts are returned as None.

[6]:
print(path.parse_alf_path())
print(path.parse_alf_name())
OrderedDict([('lab', 'cortexlab'), ('subject', 'NYU-001'), ('date', '2019-10-01'), ('number', '001'), ('collection', 'alf/task_00'), ('revision', '2020-01-01'), ('namespace', 'ibl'), ('object', 'trials'), ('attribute', 'table'), ('timescale', None), ('extra', None), ('extension', 'pqt')])
OrderedDict([('namespace', 'ibl'), ('object', 'trials'), ('attribute', 'table'), ('timescale', None), ('extra', None), ('extension', 'pqt')])
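The two conventions (None in the parsed dicts, empty strings in the part tuples) are easy to convert between. The snippet below is a small illustration that works on a plain dict copied from the output above, so it does not call ONE at all.

```python
from collections import OrderedDict

# Parsed filename parts, copied from the parse_alf_name output above
parsed = OrderedDict([
    ('namespace', 'ibl'), ('object', 'trials'), ('attribute', 'table'),
    ('timescale', None), ('extra', None), ('extension', 'pqt'),
])

# Tuple form: absent parts become empty strings, as with the part tuples
as_tuple = tuple(value or '' for value in parsed.values())

# Keep only the parts that are actually present in the name
present = {key: value for key, value in parsed.items() if value is not None}

print(as_tuple)  # ('ibl', 'trials', 'table', '', '', 'pqt')
print(present)
```

Using None makes it straightforward to distinguish "absent" from "present but empty" when filtering, which an empty string cannot express.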

With methods

In addition to the pathlib.Path methods with_name, with_stem, and with_suffix, ALFPath objects contain multiple methods for adding/replacing ALF parts in a path.

Adding/changing parts

[7]:
print('The original path:')
print(path, end='\n\n')

# Changing the subject
print('With a different subject:')
print(path.with_subject('new_subject'), end='\n\n')
# Changing the sequence
print('With a different sequence:')
print(path.with_sequence(5), end='\n\n')
# Changing the revision
print('With a different revision:')
print(path.with_revision('2025-03-03'), end='\n\n')
# Changing the object
print('With a different object:')
print(path.with_object('new_object'), end='\n\n')
# Changing the lab
print('With a different lab:')
print(path.with_lab('mainenlab'), end='\n\n')
# Adding a lab (note that this also adds the required 'Subjects' subfolder)
print('Adding in a lab:')
without_lab = path.without_lab()
print(without_lab)
print(without_lab.with_lab('mainenlab'), end='\n\n')
# A session path without a padded sequence can be modified such that e.g. '1', becomes '001'
print('Padding a session sequence:')
unpadded_path = ALFPath('NYU-001/2019-10-01/1')
print(f'{unpadded_path} - > {unpadded_path.with_padded_sequence()}')
The original path:
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt

With a different subject:
\data\cortexlab\Subjects\new_subject\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt

With a different sequence:
\data\cortexlab\Subjects\NYU-001\2019-10-01\005\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt

With a different revision:
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2025-03-03#\_ibl_trials.table.pqt

With a different object:
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_new_object.table.pqt

With a different lab:
\data\mainenlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt

Adding in a lab:
\data\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt
\data\mainenlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt

Padding a session sequence:
NYU-001\2019-10-01\1 - > NYU-001\2019-10-01\001
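As an illustration of what sequence padding involves, here is a standard-library sketch. `pad_sequence` is a hypothetical helper, and it assumes the path ends at the sequence folder, as in the example above.

```python
from pathlib import PurePosixPath


def pad_sequence(session_path):
    """Zero-pad the final (sequence) folder to three digits, e.g. '1' -> '001'.

    Assumes the last path component is the numeric sequence folder;
    int() raises ValueError otherwise.
    """
    session_path = PurePosixPath(session_path)
    return session_path.with_name(f'{int(session_path.name):03d}')


print(pad_sequence('NYU-001/2019-10-01/1'))  # NYU-001/2019-10-01/001
```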

Sometimes ALF names contain an extra UUID part. This can be added with the with_uuid method.

[8]:
uid_path = path.with_uuid('87f60a7c-581c-4fc6-b304-183ac423312c')
print(uid_path.name)
_ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt
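The effect on the filename can be reproduced with pathlib and the uuid module. This is only a sketch of the idea: `add_uuid` and `has_uuid` are hypothetical helpers that assume the UUID sits immediately before the file extension, as in the output above.

```python
import uuid
from pathlib import PurePosixPath


def add_uuid(path, uid):
    """Insert a UUID between the stem and the extension of the filename."""
    path = PurePosixPath(path)
    uid = str(uuid.UUID(str(uid)))  # raises ValueError if not a valid UUID
    return path.with_name(f'{path.stem}.{uid}{path.suffix}')


def has_uuid(path):
    """True if the last dot-separated part before the extension is a UUID."""
    stem_parts = PurePosixPath(path).stem.split('.')
    if len(stem_parts) < 2:
        return False
    try:
        uuid.UUID(stem_parts[-1])
    except ValueError:
        return False
    return True


named = add_uuid('alf/_ibl_trials.table.pqt', '87f60a7c-581c-4fc6-b304-183ac423312c')
print(named.name)  # _ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt
print(has_uuid(named), has_uuid('alf/_ibl_trials.table.pqt'))  # True False
```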

Removing parts

Unlike the with methods, there are only three without methods: without_lab, without_uuid, and without_revision:

[9]:
print('The full ALF path:')
print(uid_path, end='\n\n')

print('Without the lab:')
print(uid_path := uid_path.without_lab(), end='\n\n')

print('Without the revision:')
print(uid_path := uid_path.without_revision(), end='\n\n')

print('Without the UUID:')
print(uid_path := uid_path.without_uuid())
The full ALF path:
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt

Without the lab:
\data\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt

Without the revision:
\data\NYU-001\2019-10-01\001\alf\task_00\_ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt

Without the UUID:
\data\NYU-001\2019-10-01\001\alf\task_00\_ibl_trials.table.pqt
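Removing the lab means dropping two folders, the lab name and the 'Subjects' folder that follows it. A rough standard-library sketch of that operation is given below; `drop_lab` is a hypothetical helper that assumes the lab is whichever folder directly precedes 'Subjects'.

```python
from pathlib import PurePosixPath


def drop_lab(path):
    """Remove the 'lab/Subjects' pair of folders from a path, if present."""
    path = PurePosixPath(path)
    parts = list(path.parts)
    if 'Subjects' in parts:
        i = parts.index('Subjects')
        # Only drop the preceding folder if it exists and is not the root
        if i > 0 and parts[i - 1] != path.anchor:
            del parts[i - 1:i + 1]
    return PurePosixPath(*parts)


print(drop_lab('/data/cortexlab/Subjects/NYU-001/2019-10-01/001'))
# /data/NYU-001/2019-10-01/001
```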

Relative methods

Similar to Path.relative_to, ALFPath objects have several methods for returning the path relative to various ALF parts.

[10]:
print('The full ALF path:')
print(path, end='\n\n')
print(f'Just the session path: "{path.session_path()}"')
print(f'Just the subject, date, and sequence part: "{path.session_path_short()}"')
print(f'Relative to lab: "{path.relative_to_lab()}"')
print(f'Relative to session: "{path.relative_to_session()}"')
The full ALF path:
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt

Just the session path: "\data\cortexlab\Subjects\NYU-001\2019-10-01\001"
Just the subject, date, and sequence part: "NYU-001/2019-10-01/001"
Relative to lab: "NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt"
Relative to session: "alf\task_00\#2020-01-01#\_ibl_trials.table.pqt"
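Splitting a path at the session folder can be approximated with a regular expression and PurePath.relative_to. In this sketch, `split_at_session` and the `SESSION_END` pattern are hypothetical, and the session is assumed to be the first `subject/YYYY-MM-DD/number` run of folders in the path.

```python
import re
from pathlib import PurePosixPath

# Assumed session layout: subject/YYYY-MM-DD/number
SESSION_END = re.compile(r'[\w-]+/\d{4}-\d{2}-\d{2}/\d{1,3}(?=/|$)')


def split_at_session(path):
    """Return (session path, remainder relative to the session)."""
    path = PurePosixPath(path)
    text = path.as_posix()
    match = SESSION_END.search(text)
    if match is None:
        raise ValueError(f'no session folders found in "{text}"')
    session = PurePosixPath(text[:match.end()])
    return session, path.relative_to(session)


session, relative = split_at_session(
    '/data/cortexlab/Subjects/NYU-001/2019-10-01/001/alf/task_00/#2020-01-01#/_ibl_trials.table.pqt'
)
print(session)   # /data/cortexlab/Subjects/NYU-001/2019-10-01/001
print(relative)  # alf/task_00/#2020-01-01#/_ibl_trials.table.pqt
```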

Validation methods

Paths can be validated with several methods.

[!CAUTION] Be aware that PureALFPath.is_valid_alf will return True if any part of the path follows an ALF pattern as some pure paths are ambiguous (‘foo.bar’ is a valid collection folder but not a valid dataset filename). ALFPath.is_valid_alf is much stricter as it can take into account whether the path is a file or a directory.

[11]:
path = path.relative_to_lab()
print(f'Is "{path}" a session path? {path.is_session_path()}')  # False
print(f'Is "{path.session_path()}" a session path? {path.session_path().is_session_path()}')  # True
print(f'Is "{path}" an ALF path? {path.is_valid_alf()}')  # True
print(f'Is "{path.parent}" an ALF dataset? {path.parent.is_dataset()}')  # False
print(f'Is "{path}" an ALF dataset? {path.is_dataset()}')  # True
Is "NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt" a session path? False
Is "NYU-001\2019-10-01\001" a session path? True
Is "NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt" an ALF path? True
Is "NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#" an ALF dataset? False
Is "NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt" an ALF dataset? True
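A session-path check of this kind can be approximated by testing the last three folders of a path. The sketch below is illustrative only: `looks_like_session_path` is a hypothetical helper with simplified patterns for subject, date, and sequence, and it may accept or reject paths differently from ONE.

```python
import re
from pathlib import PurePosixPath


def looks_like_session_path(path):
    """True if the path ends with subject/YYYY-MM-DD/number folders."""
    parts = PurePosixPath(path).parts
    if len(parts) < 3:
        return False
    subject, date, number = parts[-3:]
    return bool(
        re.fullmatch(r'[\w-]+', subject)
        and re.fullmatch(r'\d{4}-\d{2}-\d{2}', date)
        and re.fullmatch(r'\d{1,3}', number)
    )


print(looks_like_session_path('NYU-001/2019-10-01/001'))  # True
print(looks_like_session_path('NYU-001/2019-10-01/001/alf/_ibl_trials.table.pqt'))  # False
```

Because it inspects only the path string, this check shares the ambiguity noted in the caution above: it cannot tell a file from a directory.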