ALF Path objects
ONE methods such as eid2path and load_dataset return objects of the type one.alf.path.ALFPath. These are similar to pathlib.Path objects, but with some extra methods for parsing ALF-specific paths.
Converting paths
You can directly instantiate an ALFPath object in the same way as a pathlib.Path object. Paths can also be converted to ALFPath objects using the one.alf.path.ensure_alf_path function. This function ensures that the path entered is cast to an ALFPath instance. If the input class is PureALFPath or pathlib.PurePath, a PureALFPath instance is returned; otherwise an ALFPath instance is returned.
Iterating through session datasets
The ALFPath.iter_datasets method is a generator that yields the valid datasets within the path. Note that this method is not present in PureALFPath instances.
Properties
In addition to the Path properties of stem, suffix, and name, parts of an ALF path can be readily referenced using various properties. These properties will return an empty string if the particular ALF part is not present in the path.
[1]:
from one.alf.path import ALFPath
path = ALFPath('/data/cortexlab/Subjects/NYU-001/2019-10-01/001/alf/task_00/#2020-01-01#/_ibl_trials.table.pqt')
Session parts
[2]:
print(path)
print(f'The session lab is "{path.lab}"') # cortexlab
print(f'The session subject is "{path.subject}"') # NYU-001
print(f'The session date is "{path.date}"') # 2019-10-01
print(f'The session sequence is "{path.sequence}"') # 001
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt
The session lab is "cortexlab"
The session subject is "NYU-001"
The session date is "2019-10-01"
The session sequence is "001"
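To illustrate how these session parts map onto the folder layout, the sketch below pulls them out with a regular expression using only the standard library. The pattern and the session_parts_sketch helper are assumptions made for this example and are not the library's implementation; as with the properties above, absent parts come back as empty strings.

```python
import re
from pathlib import PurePosixPath

# Hypothetical, simplified pattern: [lab/Subjects/]subject/yyyy-mm-dd/number
SESSION_PATTERN = re.compile(
    r'(?:(?P<lab>[^/]+)/Subjects/)?'   # optional lab folder
    r'(?P<subject>[^/]+)/'             # subject name
    r'(?P<date>\d{4}-\d{2}-\d{2})/'    # ISO 8601 date
    r'(?P<sequence>\d{1,3})(?:/|$)'    # session number
)


def session_parts_sketch(path):
    """Return (lab, subject, date, sequence); absent parts are empty strings."""
    match = SESSION_PATTERN.search(PurePosixPath(path).as_posix())
    if match is None:
        return ('', '', '', '')
    return tuple(part or '' for part in match.groups())


print(session_parts_sketch('/data/cortexlab/Subjects/NYU-001/2019-10-01/001/alf'))
print(session_parts_sketch('NYU-001/2019-10-01/001'))  # no lab folder present
```

The second call shows the empty-string convention: without a lab folder, the first element of the tuple is ''.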
Collection parts
[3]:
print(f'The session collection is "{path.collection}"') # alf/task_00
print(f'The session revision is "{path.revision}"') # 2020-01-01
The session collection is "alf/task_00"
The session revision is "2020-01-01"
Filename parts
Filename properties include namespace, object, attribute, timescale, and extra.
[4]:
print(f'The ALF name is "{path.name}"') # _ibl_trials.table.pqt
print(f'The ALF object is "{path.object}"') # trials
print(f'The ALF attribute is "{path.attribute}"') # table
# NB: To get the extension, use the pathlib.Path suffix property (this includes the leading dot)
print(f'The ALF file extension is "{path.suffix}"') # .pqt
print(f'The ALF namespace is "{path.namespace}"') # ibl
print(f"There are {'' if path.extra else 'no'} extra components in \"{path.stem}\"")
The ALF name is "_ibl_trials.table.pqt"
The ALF object is "trials"
The ALF attribute is "table"
The ALF file extension is ".pqt"
The ALF namespace is "ibl"
There are no extra components in "_ibl_trials.table"
Part tuples
In addition to Path.parts, you can parse out the path in one go with the alf_parts, session_parts, and dataset_name_parts properties. Parts that are absent from the path are returned as empty strings.
[5]:
print(path)
print('Path parts:')
print(path.parts)
print('ALF path parts:')
print(path.alf_parts)
print('ALF session parts:')
print(path.session_parts)
print('ALF filename parts:')
print(path.dataset_name_parts)
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt
Path parts:
('\\', 'data', 'cortexlab', 'Subjects', 'NYU-001', '2019-10-01', '001', 'alf', 'task_00', '#2020-01-01#', '_ibl_trials.table.pqt')
ALF path parts:
('cortexlab', 'NYU-001', '2019-10-01', '001', 'alf/task_00', '2020-01-01', 'ibl', 'trials', 'table', '', '', 'pqt')
ALF session parts:
('cortexlab', 'NYU-001', '2019-10-01', '001')
ALF filename parts:
('ibl', 'trials', 'table', '', '', 'pqt')
Parsing methods
As with the parts properties, there are several parsing methods. The key difference is that these methods can optionally return a dict of results, and instead of empty strings, absent or invalid parts are returned as None.
[6]:
print(path.parse_alf_path())
print(path.parse_alf_name())
OrderedDict([('lab', 'cortexlab'), ('subject', 'NYU-001'), ('date', '2019-10-01'), ('number', '001'), ('collection', 'alf/task_00'), ('revision', '2020-01-01'), ('namespace', 'ibl'), ('object', 'trials'), ('attribute', 'table'), ('timescale', None), ('extra', None), ('extension', 'pqt')])
OrderedDict([('namespace', 'ibl'), ('object', 'trials'), ('attribute', 'table'), ('timescale', None), ('extra', None), ('extension', 'pqt')])
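The None-for-absent-parts convention can be illustrated with a small standard-library sketch that parses an ALF filename. The pattern below is a deliberately simplified assumption (for instance, it does not allow underscores in the object name) and parse_name_sketch is a hypothetical helper, not the parser used by the library.

```python
import re

# Simplified, hypothetical pattern for
# [_namespace_]object.attribute[_timescale][.extra].extension
ALF_NAME_PATTERN = re.compile(
    r'^(?:_(?P<namespace>[^_.]+)_)?'   # optional _namespace_ prefix
    r'(?P<object>[^_.]+)'              # object, e.g. "trials"
    r'\.(?P<attribute>[^_.]+)'         # attribute, e.g. "table"
    r'(?:_(?P<timescale>[^.]+))?'      # optional _timescale suffix
    r'(?:\.(?P<extra>.+))?'            # optional extra part(s), e.g. a UUID
    r'\.(?P<extension>[^.]+)$'         # file extension without the dot
)


def parse_name_sketch(filename):
    """Return a dict of filename parts (None where absent), or None if no match."""
    match = ALF_NAME_PATTERN.match(filename)
    return match.groupdict() if match else None


print(parse_name_sketch('_ibl_trials.table.pqt'))
print(parse_name_sketch('trials.intervals_bpod.npy'))
```

Returning None for absent parts lets a caller distinguish "not present" from an explicitly empty value, which an empty string cannot do.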
With methods
In addition to the pathlib.Path methods with_name, with_stem, and with_suffix, ALFPath objects contain multiple methods for adding/replacing ALF parts in a path.
Adding/changing parts
[7]:
print('The original path:')
print(path, end='\n\n')
# Changing the subject
print('With a different subject:')
print(path.with_subject('new_subject'), end='\n\n')
# Changing the sequence
print('With a different sequence:')
print(path.with_sequence(5), end='\n\n')
# Changing the revision
print('With a different revision:')
print(path.with_revision('2025-03-03'), end='\n\n')
# Changing the object
print('With a different object:')
print(path.with_object('new_object'), end='\n\n')
# Changing the lab
print('With a different lab:')
print(path.with_lab('mainenlab'), end='\n\n')
# Adding a lab (note that this also adds the required 'Subjects' subfolder)
print('Adding in a lab:')
without_lab = path.without_lab()
print(without_lab)
print(without_lab.with_lab('mainenlab'), end='\n\n')
# A session path without a padded sequence can be modified such that e.g. '1', becomes '001'
print('Padding a session sequence:')
unpadded_path = ALFPath('NYU-001/2019-10-01/1')
print(f'{unpadded_path} - > {unpadded_path.with_padded_sequence()}')
The original path:
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt
With a different subject:
\data\cortexlab\Subjects\new_subject\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt
With a different sequence:
\data\cortexlab\Subjects\NYU-001\2019-10-01\005\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt
With a different revision:
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2025-03-03#\_ibl_trials.table.pqt
With a different object:
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_new_object.table.pqt
With a different lab:
\data\mainenlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt
Adding in a lab:
\data\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt
\data\mainenlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt
Padding a session sequence:
NYU-001\2019-10-01\1 - > NYU-001\2019-10-01\001
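The padding step amounts to zero-filling the final folder name to three digits. A minimal standard-library sketch, assuming the path ends in the session number (pad_sequence_sketch is a hypothetical helper, not the library method):

```python
from pathlib import PurePosixPath


def pad_sequence_sketch(session_path):
    """Zero-pad the final (session number) folder to three digits."""
    path = PurePosixPath(session_path)
    # int() raises ValueError if the last folder is not a number
    return path.with_name(f'{int(path.name):03d}')


print(pad_sequence_sketch('NYU-001/2019-10-01/1'))  # NYU-001/2019-10-01/001
```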
Sometimes ALF names contain an extra UUID part. This can be added with the with_uuid method.
[8]:
uid_path = path.with_uuid('87f60a7c-581c-4fc6-b304-183ac423312c')
print(uid_path.name)
_ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt
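As the output shows, the UUID sits between the rest of the filename and the extension. The sketch below reproduces that layout with pathlib and the uuid module; add_uuid_sketch and remove_uuid_sketch are hypothetical helpers written for this example and may differ from how the library handles edge cases.

```python
from pathlib import PurePosixPath
from uuid import UUID


def add_uuid_sketch(path, uuid):
    """Insert a UUID between the filename stem and the extension."""
    path = PurePosixPath(path)
    uuid = UUID(str(uuid))  # raises ValueError for a malformed UUID
    return path.with_name(f'{path.stem}.{uuid}{path.suffix}')


def remove_uuid_sketch(path):
    """Drop the last stem component if, and only if, it is a valid UUID."""
    path = PurePosixPath(path)
    *head, last = path.stem.split('.')
    try:
        UUID(last)
    except ValueError:
        return path  # no UUID part present; return the path unchanged
    return path.with_name('.'.join(head) + path.suffix)


with_uuid = add_uuid_sketch('alf/_ibl_trials.table.pqt',
                            '87f60a7c-581c-4fc6-b304-183ac423312c')
print(with_uuid.name)
print(remove_uuid_sketch(with_uuid).name)
```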
Removing parts
Unlike the with methods, there are only three without methods: without_lab, without_uuid, and without_revision:
[9]:
print('The full ALF path:')
print(uid_path, end='\n\n')
print('Without the lab:')
print(uid_path := uid_path.without_lab(), end='\n\n')
print('Without the revision:')
print(uid_path := uid_path.without_revision(), end='\n\n')
print('Without the UUID:')
print(uid_path := uid_path.without_uuid())
The full ALF path:
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt
Without the lab:
\data\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt
Without the revision:
\data\NYU-001\2019-10-01\001\alf\task_00\_ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt
Without the UUID:
\data\NYU-001\2019-10-01\001\alf\task_00\_ibl_trials.table.pqt
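Removing the revision corresponds to dropping the #...# folder that sits directly above the dataset file. A rough standard-library sketch of that idea, assuming a revision is any parent folder name wrapped in pound signs (strip_revision_sketch is a hypothetical helper, not the library method):

```python
import re
from pathlib import PurePosixPath


def strip_revision_sketch(path):
    """Remove a '#revision#' folder directly above the file, if there is one."""
    path = PurePosixPath(path)
    if re.fullmatch(r'#[^#]+#', path.parent.name):
        # Skip over the revision folder and re-attach the filename
        return path.parent.parent / path.name
    return path


print(strip_revision_sketch('alf/task_00/#2020-01-01#/_ibl_trials.table.pqt'))
print(strip_revision_sketch('alf/task_00/_ibl_trials.table.pqt'))  # unchanged
```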
Relative methods
Similar to Path.relative_to, ALFPath objects have several methods for returning the path relative to various ALF parts.
[10]:
print('The full ALF path:')
print(path, end='\n\n')
print(f'Just the session path: "{path.session_path()}"')
print(f'Just the subject, date, and sequence part: "{path.session_path_short()}"')
print(f'Relative to lab: "{path.relative_to_lab()}"')
print(f'Relative to session: "{path.relative_to_session()}"')
The full ALF path:
\data\cortexlab\Subjects\NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt
Just the session path: "\data\cortexlab\Subjects\NYU-001\2019-10-01\001"
Just the subject, date, and sequence part: "NYU-001/2019-10-01/001"
Relative to lab: "NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt"
Relative to session: "alf\task_00\#2020-01-01#\_ibl_trials.table.pqt"
Validation methods
Paths can be validated with several methods.
[!CAUTION] Be aware that PureALFPath.is_valid_alf will return True if any part of the path follows an ALF pattern, because some pure paths are ambiguous (‘foo.bar’ is a valid collection folder but not a valid dataset filename). ALFPath.is_valid_alf is much stricter, as it can take into account whether the path is a file or a directory.
[11]:
path = path.relative_to_lab()
print(f'Is "{path}" a session path? {path.is_session_path()}') # False
print(f'Is "{path.session_path()}" a session path? {path.session_path().is_session_path()}') # True
print(f'Is "{path}" an ALF path? {path.is_valid_alf()}') # True
print(f'Is "{path.parent}" an ALF dataset? {path.parent.is_dataset()}') # False
print(f'Is "{path}" an ALF dataset? {path.is_dataset()}') # True
Is "NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt" a session path? False
Is "NYU-001\2019-10-01\001" a session path? True
Is "NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt" an ALF path? True
Is "NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#" an ALF dataset? False
Is "NYU-001\2019-10-01\001\alf\task_00\#2020-01-01#\_ibl_trials.table.pqt" an ALF dataset? True