{ "cells": [ { "cell_type": "markdown", "id": "01dacb82", "metadata": {}, "source": [ "# ALF Path objects" ] }, { "cell_type": "markdown", "id": "6fb1bb60", "metadata": {}, "source": [ "ONE methods such as `eid2path` and `load_dataset` return objects of the type `one.alf.path.ALFPath`. These are similar to `pathlib.Path` objects, but with some extra methods for parsing ALF-specific paths." ] }, { "cell_type": "markdown", "id": "dd1f8dcb", "metadata": {}, "source": [ "### Converting paths\n", "You can directly instantiate an ALFPath object in the same way as a pathlib.Path object. Paths can also be converted to ALFPath objects using the `one.alf.path.ensure_alf_path` function. This funciton ensures the path entered is cast to an ALFPath instance. If the input class is PureALFPath or pathlib.PurePath, a PureALFPath instance is returned, otherwise an ALFPath instance is returned." ] }, { "cell_type": "markdown", "id": "2799e238", "metadata": {}, "source": [ "### Iterating through session datasets\n", "The `ALFPath.iter_datasets` method is a generator that returns valid datasets within the path. Note that this method is not present in `PureALFPath` instances." ] }, { "cell_type": "markdown", "id": "bbf61f77", "metadata": {}, "source": [ "## Properties\n", "\n", "In addition to the Path properties of `stem`, `suffix`, and `name`, parts of an ALF path can be readily referenced using various properties. These properties will return an empty string if the particular ALF part is not present in the path." ] }, { "cell_type": "code", "execution_count": 1, "id": "72b0113b", "metadata": {}, "outputs": [], "source": [ "from one.alf.path import ALFPath\n", "path = ALFPath('/data/cortexlab/Subjects/NYU-001/2019-10-01/001/alf/task_00/#2020-01-01#/_ibl_trials.table.pqt')" ] }, { "cell_type": "markdown", "id": "fd3fcd50", "metadata": {}, "source": [ "### Session parts" ] }, { "cell_type": "code", "execution_count": 2, "id": "6b214115", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\\data\\cortexlab\\Subjects\\NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\n", "The session lab is \"cortexlab\"\n", "The session subject is \"NYU-001\"\n", "The session date is \"2019-10-01\"\n", "The session sequence is \"001\"\n" ] } ], "source": [ "print(path)\n", "print(f'The session lab is \"{path.lab}\"') # cortexlab\n", "print(f'The session subject is \"{path.subject}\"') # NYU-001\n", "print(f'The session date is \"{path.date}\"') # 2019-10-01\n", "print(f'The session sequence is \"{path.sequence}\"') # 001" ] }, { "cell_type": "markdown", "id": "bb9062ec", "metadata": {}, "source": [ "### Collection parts" ] }, { "cell_type": "code", "execution_count": 3, "id": "d45c1987", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The session collection is \"alf/task_00\"\n", "The session revision is \"2020-01-01\"\n" ] } ], "source": [ "print(f'The session collection is \"{path.collection}\"') # alf/task_00\n", "print(f'The session revision is \"{path.revision}\"') # _ibl_trials.table.pqt" ] }, { "cell_type": "markdown", "id": "e02b85ed", "metadata": {}, "source": [ "### Filename parts\n", "Filename properties include `namespace`, `object`, `attribute`, `timescale`, and `extra`." ] }, { "cell_type": "code", "execution_count": 4, "id": "ed5453bb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The ALF name is \"_ibl_trials.table.pqt\"\n", "The ALF object is \"trials\"\n", "The ALF attribute is \"table\"\n", "The ALF file extension is \".pqt\"\n", "The ALF namespace is \"ibl\"\n", "There are no extra components in \"_ibl_trials.table\"\n" ] } ], "source": [ "print(f'The ALF name is \"{path.name}\"') # _ibl_trials.table.pqt\n", "print(f'The ALF object is \"{path.object}\"') # trials\n", "print(f'The ALF attribute is \"{path.attribute}\"') # table\n", "# NB: To get the extension, use the pathlib.Path suffix property\n", "print(f'The ALF file extension is \"{path.suffix}\"') # pqt\n", "print(f'The ALF namespace is \"{path.namespace}\"') # ibl\n", "print(f\"There are {'' if path.extra else 'no'} extra components in \\\"{path.stem}\\\"\")" ] }, { "cell_type": "markdown", "id": "98d22c29", "metadata": {}, "source": [ "### Part tuples\n", "In addition to `Path.parts`, you can parse out the path in one go with `alf_parts`, `session_parts`, and `dataset_name_parts` methods. Un-parsed parts are returned as an empty string." ] }, { "cell_type": "code", "execution_count": 5, "id": "5351ea93", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\\data\\cortexlab\\Subjects\\NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\n", "Path parts:\n", "('\\\\', 'data', 'cortexlab', 'Subjects', 'NYU-001', '2019-10-01', '001', 'alf', 'task_00', '#2020-01-01#', '_ibl_trials.table.pqt')\n", "ALF path parts:\n", "('cortexlab', 'NYU-001', '2019-10-01', '001', 'alf/task_00', '2020-01-01', 'ibl', 'trials', 'table', '', '', 'pqt')\n", "ALF session parts:\n", "('cortexlab', 'NYU-001', '2019-10-01', '001')\n", "ALF filename parts:\n", "('ibl', 'trials', 'table', '', '', 'pqt')\n" ] } ], "source": [ "print(path)\n", "print('Path parts:')\n", "print(path.parts)\n", "print('ALF path parts:')\n", "print(path.alf_parts)\n", "print('ALF session parts:')\n", "print(path.session_parts)\n", "print('ALF filename parts:')\n", "print(path.dataset_name_parts)" ] }, { "cell_type": "markdown", "id": "280fcc81", "metadata": {}, "source": [ "## Parsing methods\n", "As with the parts properties, there are several parsing methods. The key difference is that these methods can optionally return a dict of results, and instead of empty strings, absent or invalid parts are returned as None." ] }, { "cell_type": "code", "execution_count": 6, "id": "de170155", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "OrderedDict([('lab', 'cortexlab'), ('subject', 'NYU-001'), ('date', '2019-10-01'), ('number', '001'), ('collection', 'alf/task_00'), ('revision', '2020-01-01'), ('namespace', 'ibl'), ('object', 'trials'), ('attribute', 'table'), ('timescale', None), ('extra', None), ('extension', 'pqt')])\n", "OrderedDict([('namespace', 'ibl'), ('object', 'trials'), ('attribute', 'table'), ('timescale', None), ('extra', None), ('extension', 'pqt')])\n" ] } ], "source": [ "print(path.parse_alf_path())\n", "print(path.parse_alf_name())" ] }, { "cell_type": "markdown", "id": "48dcb2cb", "metadata": {}, "source": [ "## With methods\n", "In addition to the pathlib.Path methods `with_name`, `with_stem`, and `with_suffix`, ALFPath objects contain multiple methods for adding/replacing ALF parts in a path.\n", "\n", "### Adding/changing parts" ] }, { "cell_type": "code", "execution_count": 7, "id": "dfe36c35", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The original path:\n", "\\data\\cortexlab\\Subjects\\NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\n", "\n", "With a different subject:\n", "\\data\\cortexlab\\Subjects\\new_subject\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\n", "\n", "With a different sequence:\n", "\\data\\cortexlab\\Subjects\\NYU-001\\2019-10-01\\005\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\n", "\n", "With a different revision:\n", "\\data\\cortexlab\\Subjects\\NYU-001\\2019-10-01\\001\\alf\\task_00\\#2025-03-03#\\_ibl_trials.table.pqt\n", "\n", "With a different object:\n", "\\data\\cortexlab\\Subjects\\NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_new_object.table.pqt\n", "\n", "With a different lab:\n", "\\data\\mainenlab\\Subjects\\NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\n", "\n", "Adding in a lab:\n", "\\data\\NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\n", "\\data\\mainenlab\\Subjects\\NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\n", "\n", "Padding a session sequence:\n", "NYU-001\\2019-10-01\\1 - > NYU-001\\2019-10-01\\001\n" ] } ], "source": [ "print('The original path:')\n", "print(path, end='\\n\\n')\n", "\n", "# Changing the subject\n", "print('With a different subject:')\n", "print(path.with_subject('new_subject'), end='\\n\\n')\n", "# Changing the sequence\n", "print('With a different sequence:')\n", "print(path.with_sequence(5), end='\\n\\n')\n", "# Changing the revision\n", "print('With a different revision:')\n", "print(path.with_revision('2025-03-03'), end='\\n\\n')\n", "# Changing the object\n", "print('With a different object:')\n", "print(path.with_object('new_object'), end='\\n\\n')\n", "# Changing the lab\n", "print('With a different lab:')\n", "print(path.with_lab('mainenlab'), end='\\n\\n')\n", "# Adding a lab (note that this also adds the required 'Subjects' subfolder)\n", "print('Adding in a lab:')\n", "without_lab = path.without_lab()\n", "print(path.without_lab())\n", "print(without_lab.with_lab('mainenlab'), end='\\n\\n')\n", "# A session path without a padded sequence can be modified such that e.g. '1', becomes '001'\n", "print('Padding a session sequence:')\n", "unpadded_path = ALFPath('NYU-001/2019-10-01/1')\n", "print(f'{unpadded_path} - > {unpadded_path.with_padded_sequence()}')" ] }, { "cell_type": "markdown", "id": "b57c0520", "metadata": {}, "source": [ "Sometimes ALF names contain an extra UUID part. This can be added with the `with_uuid` method." ] }, { "cell_type": "code", "execution_count": 8, "id": "91d87775", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "_ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt\n" ] } ], "source": [ "uid_path = path.with_uuid('87f60a7c-581c-4fc6-b304-183ac423312c')\n", "print(uid_path.name)" ] }, { "cell_type": "markdown", "id": "70e6d811", "metadata": {}, "source": [ "### Removing parts\n", "Unlike for the with methods, there are only three without methods: `without_lab`, `without_uuid`, and `without_revision`:" ] }, { "cell_type": "code", "execution_count": 9, "id": "b6cf95d9", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The full ALF path:\n", "\\data\\cortexlab\\Subjects\\NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt\n", "\n", "Without the lab:\n", "\\data\\NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt\n", "\n", "Without the revision:\n", "\\data\\NYU-001\\2019-10-01\\001\\alf\\task_00\\_ibl_trials.table.87f60a7c-581c-4fc6-b304-183ac423312c.pqt\n", "\n", "Without the UUID:\n", "\\data\\NYU-001\\2019-10-01\\001\\alf\\task_00\\_ibl_trials.table.pqt\n" ] } ], "source": [ "print('The full ALF path:')\n", "print(uid_path, end='\\n\\n')\n", "\n", "print('Without the lab:')\n", "print(uid_path := uid_path.without_lab(), end='\\n\\n')\n", "\n", "print('Without the revision:')\n", "print(uid_path := uid_path.without_revision(), end='\\n\\n')\n", "\n", "print('Without the UUID:')\n", "print(uid_path := uid_path.without_uuid())" ] }, { "cell_type": "markdown", "id": "1b2ccf06", "metadata": {}, "source": [ "## Relative methods\n", "Similar to `Path.relative_to`, ALFPath objects have several methods for returning the path relative to various ALF parts." ] }, { "cell_type": "code", "execution_count": 10, "id": "968ac60e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The full ALF path:\n", "\\data\\cortexlab\\Subjects\\NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\n", "\n", "Just the session path: \"\\data\\cortexlab\\Subjects\\NYU-001\\2019-10-01\\001\"\n", "Just the subject, date, and sequence part: \"NYU-001/2019-10-01/001\"\n", "Relative to lab: \"NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\"\n", "Relative to session: \"alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\"\n" ] } ], "source": [ "print('The full ALF path:')\n", "print(path, end='\\n\\n')\n", "print(f'Just the session path: \"{path.session_path()}\"')\n", "print(f'Just the subject, date, and sequence part: \"{path.session_path_short()}\"')\n", "print(f'Relative to lab: \"{path.relative_to_lab()}\"')\n", "print(f'Relative to session: \"{path.relative_to_session()}\"')" ] }, { "cell_type": "markdown", "id": "6e1920a0", "metadata": {}, "source": [ "## Validation methods\n", "Paths can be validated with several methods. \n", "\n", "> [!CAUTION]\n", "> Be aware that `PureALFPath.is_valid_alf` will return True if any part of the path follows an ALF pattern as some pure paths are ambiguous ('foo.bar' is a valid collection folder but not a valid dataset filename).\n", "> `ALFPath.is_valid_alf` is much stricter as it can take into account whether the path is a file or a directory." ] }, { "cell_type": "code", "execution_count": 11, "id": "ada465f0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Is \"NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\" a session path? False\n", "Is \"NYU-001\\2019-10-01\\001\" a session path? True\n", "Is \"NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\" an ALF path? True\n", "Is \"NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\" an ALF dataset? False\n", "Is \"NYU-001\\2019-10-01\\001\\alf\\task_00\\#2020-01-01#\\_ibl_trials.table.pqt\" an ALF dataset? True\n" ] } ], "source": [ "path = path.relative_to_lab()\n", "print(f'Is \"{path}\" a session path? {path.is_session_path()}') # False\n", "print(f'Is \"{path.session_path()}\" a session path? {path.session_path().is_session_path()}') # True\n", "print(f'Is \"{path}\" an ALF path? {path.is_valid_alf()}') # True\n", "print(f'Is \"{path.parent}\" an ALF dataset? {path.parent.is_dataset()}') # False\n", "print(f'Is \"{path}\" an ALF dataset? {path.is_dataset()}') # True" ] } ], "metadata": { "kernelspec": { "display_name": "ibl", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 5 }