ibllib.oneibl.patcher

A module for ad-hoc dataset modification and registration.

Unlike the DataHandler class in oneibl.data_handlers, the Patcher class allows one to fully remove datasets (delete them from the database and repositories), and to overwrite datasets on both the main repositories and the local servers. Additionally, the Patcher can handle datasets from multiple sessions at once.

Examples

Delete a dataset from Alyx and all associated repositories.

>>> from one.webclient import AlyxClient
>>> from ibllib.oneibl.patcher import IBLGlobusPatcher
>>> dataset_id = 'f4aafe6c-a7ab-4390-82cd-2c0e245322a5'
>>> task_ids, files_by_repo = IBLGlobusPatcher(AlyxClient(), 'admin').delete_dataset(dataset_id)

Patch some local datasets using Globus.

>>> from one.api import ONE
>>> from ibllib.oneibl.patcher import GlobusPatcher
>>> patcher = GlobusPatcher('admin', ONE(), label='UCLA audio times patch')
>>> responses = patcher.patch_datasets(file_paths)  # register the new datasets to Alyx
>>> patcher.launch_transfers(local_servers=True)  # transfer to all remote repositories

Functions

globus_path_from_dataset

Returns the local ONE file path from a dset record or a list of dset records from REST

sdsc_globus_path_from_dataset

sdsc_path_from_dataset

Returns sdsc file path from a dset record or a list of dsets records from REST

Classes

FTPPatcher

This is used to register from anywhere without write access to FlatIron

GlobusPatcher

Requires GLOBUS keys access

IBLGlobusPatcher

This is a replacement for the GlobusPatcher class, utilizing the ONE Globus class.

Patcher

S3Patcher

SDSCPatcher

This is used to patch data on the SDSC server

SSHPatcher

Requires SSH keys access on the FlatIron

sdsc_globus_path_from_dataset(dset)[source]
Parameters:

dset – dset dictionary or list of dictionaries from Alyx REST endpoint

Returns SDSC globus file path from a dset record or a list of dsets records from REST

sdsc_path_from_dataset(dset, root_path=PurePosixPath('/mnt/ibl'))[source]

Returns sdsc file path from a dset record or a list of dsets records from REST

Parameters:
  • dset – dset dictionary or list of dictionaries from Alyx REST endpoint

  • root_path – (optional) the prefix path such as one download directory or SDSC root

globus_path_from_dataset(dset, repository=None, uuid=False)[source]

Returns the local ONE file path from a dset record or a list of dset records from REST

Parameters:
  • dset – dset dictionary or list of dictionaries from Alyx REST endpoint

  • repository – (optional) repository name of the file record (if None, will take the first filerecord with a URL)
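The repository fallback described above can be sketched as follows. This is only an illustration of the selection rule, not the ibllib implementation: the helper names are hypothetical, and the record keys ('file_records', 'data_repository', 'data_url') are assumptions about the shape of a dataset record returned by the REST endpoint.

```python
from typing import Optional


def pick_file_record(dset: dict, repository: Optional[str] = None) -> Optional[dict]:
    """Return the file record for `repository`, or the first one with a URL.

    Hypothetical helper: the keys used here are assumed, not confirmed by the docs.
    """
    for record in dset.get('file_records', []):
        if repository is None:
            # No repository requested: take the first file record that has a URL
            if record.get('data_url'):
                return record
        elif record.get('data_repository') == repository:
            return record
    return None


def pick_file_records(dset, repository=None):
    """Accept a single dataset record or a list of them, like the documented input."""
    if isinstance(dset, list):
        return [pick_file_record(d, repository) for d in dset]
    return pick_file_record(dset, repository)


# Made-up record used only to exercise the sketch
dset = {
    'file_records': [
        {'data_repository': 'local_server', 'data_url': None},
        {'data_repository': 'main_repo', 'data_url': 'https://example.org/a/b.npy'},
    ]
}
print(pick_file_record(dset))
print(pick_file_record(dset, repository='local_server'))
```

Returning None when nothing matches is a choice made for this sketch; the real function may behave differently in that case.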

class Patcher(one=None)[source]

Bases: ABC

register_dataset(file_list, **kwargs)[source]

Registers a set of files belonging to a session only on the server

Parameters:
  • file_list – (list of pathlib.Path)

  • created_by – (string) name of user in Alyx (defaults to ‘root’)

  • repository – optional: (string) name of the server repository in Alyx

  • versions – optional (list of strings): version tags (defaults to ibllib version)

  • dry – (bool) False by default

Returns:

register_datasets(file_list, **kwargs)[source]

Same as register_dataset but works with files belonging to different sessions
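Handling files from different sessions amounts to grouping them by session folder first. The following is a minimal sketch of that idea, assuming the subject/date/number layout mentioned for ALF session folders, a YYYY-MM-DD date folder, and POSIX-style paths; the function names are hypothetical and this is not the code used by register_datasets.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed layout: .../<subject>/<YYYY-MM-DD>/<number>/.../<file>
DATE_PATTERN = re.compile(r'^\d{4}-\d{2}-\d{2}$')


def session_root(path):
    """Return the subject/date/number folder containing `path`, or None."""
    parts = PurePosixPath(path).parts
    # Find a date folder that has a folder before it (the subject), a numeric
    # folder after it (the session number), and at least one more part below
    for i in range(1, len(parts) - 2):
        if DATE_PATTERN.match(parts[i]) and parts[i + 1].isdigit():
            return PurePosixPath(*parts[:i + 2])
    return None


def group_by_session(file_list):
    """Map each session folder to the files it contains."""
    groups = defaultdict(list)
    for path in file_list:
        root = session_root(path)
        if root is None:
            raise ValueError(f'{path} is not within a subject/date/number folder')
        groups[str(root)].append(str(path))
    return dict(groups)


# Made-up paths used only to exercise the sketch
files = [
    '/data/Subjects/KS005/2019-04-01/001/alf/spikes.times.npy',
    '/data/Subjects/KS005/2019-04-01/001/alf/spikes.clusters.npy',
    '/data/Subjects/KS005/2019-04-02/001/alf/spikes.times.npy',
]
for session, paths in group_by_session(files).items():
    print(session, len(paths))
```

Each group could then be passed to a single-session call; raising on paths outside a session folder keeps malformed inputs from being registered silently.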

patch_dataset(file_list, dry=False, ftp=False, **kwargs)[source]

Creates a new dataset on FlatIron and uploads it from arbitrary location. Rules for creation/patching are the same that apply for registration via Alyx as this uses the registration endpoint to get the dataset. An existing file (same session and path relative to session) will be patched.

Parameters:
  • path – full file path. Must be within an ALF session folder (subject/date/number). Can also be a list of full file paths belonging to the same session.

  • server_repository – Alyx server repository name

  • created_by – Alyx username for the dataset (optional, defaults to root)

  • ftp – flag for the case when using FTPPatcher. Don’t adjust Windows path in _patch_dataset when ftp=True

Returns:

the registrations response, a list of dataset records

patch_datasets(file_list, **kwargs)[source]

Same as the patch_dataset method but works with files from several sessions.

class GlobusPatcher(client_name='default', one=None, label='ibllib patch')[source]

Bases: Patcher, Globus

Requires GLOBUS keys access

patch_datasets(file_list, **kwargs)[source]

Calls the super method that registers and updates the current computer to Python transfer. Then creates individual transfer items for each local server so that, after the update on FlatIron, local server files are also updated.

Parameters:
  • file_list

  • kwargs

Returns:

launch_transfers()[source]

Launches the Globus transfer and delete from the local patch computer to the FlatIron

Parameters:

local_servers – (False) if True, sync the local servers after the main transfer

Returns:

None

launch_transfers_secondary()[source]

Launches the Globus transfers from FlatIron to third-party repositories (local servers). This should run after the main transfer from the patch computer to the FlatIron.

Returns:

None

class IBLGlobusPatcher(alyx=None, client_name='default')[source]

Bases: Patcher, Globus

This is a replacement for the GlobusPatcher class, utilizing the ONE Globus class.

The GlobusPatcher class is more complicated but has the advantage of being able to launch transfers independently to registration, although it remains to be seen whether this is useful.

delete_dataset(dataset, dry=False)[source]

Delete a dataset off Alyx and remove file record from all Globus repositories.

Parameters:
  • dataset (uuid.UUID, str, dict) – The dataset record or ID to delete.

  • dry (bool) – If true, dataset is not deleted and file paths that would be removed are returned.

Returns:

  • list of uuid.UUID – A list of Globus delete task IDs if dry is false.

  • dict of str – A map of data repository names and relative paths of the deleted files.
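Because deletion removes the dataset from Alyx and from the repositories, a dry run before the real call is a natural safeguard. The sketch below wraps that two-step pattern. It assumes that delete_dataset returns a (task IDs, files by repository) pair in both modes, as in the example at the top of this page; `delete_with_preview` and `StubPatcher` are hypothetical names, and the stub only stands in for an IBLGlobusPatcher so the example can run without Alyx or Globus access.

```python
def delete_with_preview(patcher, dataset_id, approve):
    """Dry-run first; delete for real only if `approve(files_by_repo)` is true."""
    _, files_by_repo = patcher.delete_dataset(dataset_id, dry=True)
    if not approve(files_by_repo):
        return None, files_by_repo
    return patcher.delete_dataset(dataset_id, dry=False)


class StubPatcher:
    """Stand-in for IBLGlobusPatcher (not part of ibllib), for illustration only."""

    def __init__(self):
        self.deleted = []

    def delete_dataset(self, dataset, dry=False):
        # Made-up repository name and relative path
        files_by_repo = {'example_repo': ['subject/2020-01-01/001/alf/obj.attr.npy']}
        if dry:
            return [], files_by_repo
        self.deleted.append(dataset)
        return ['task-1'], files_by_repo


stub = StubPatcher()
dataset_id = 'f4aafe6c-a7ab-4390-82cd-2c0e245322a5'
# Approve only when exactly one repository would be affected
task_ids, files = delete_with_preview(stub, dataset_id, lambda f: len(f) == 1)
print(task_ids, files)
```

With a real patcher, the `approve` callback could prompt the user or compare the listed paths against an expected set before anything is removed.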

class SSHPatcher(one=None)[source]

Bases: Patcher

Requires SSH keys access on the FlatIron

class FTPPatcher(one=None)[source]

Bases: Patcher

This is used to register from anywhere without write access to FlatIron

static setup(par=None, silent=False)[source]

Set up (and save) FTP login parameters

Parameters:

  • par – A parameters object to modify, if None the default Webclient parameters are loaded

  • silent – If true, the defaults are used with no user input prompt

Returns:

the modified parameters object

create_dataset(path, created_by='root', dry=False, repository='ibl_patcher', **kwargs)[source]

mktree(remote_path)[source]

Browse to the tree on the ftp server, making directories on the way
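Walking into a remote tree and creating the missing folders along the way can be sketched with the standard library's ftplib. This is an illustration of the technique under the assumption that a failed cwd means the folder is missing; `make_remote_tree` is a hypothetical name and not the mktree implementation.

```python
import ftplib
from pathlib import PurePosixPath


def make_remote_tree(ftp, remote_path):
    """Change into `remote_path` one folder at a time, creating missing folders.

    `ftp` is expected to behave like a connected ftplib.FTP instance.
    """
    for part in PurePosixPath(remote_path).parts:
        if part == '/':
            # Absolute path: start from the server root
            ftp.cwd('/')
            continue
        try:
            ftp.cwd(part)
        except ftplib.error_perm:
            # Assume the folder does not exist: create it, then enter it
            ftp.mkd(part)
            ftp.cwd(part)
```

A call such as `make_remote_tree(ftp, '/subject/2020-01-01/001/alf')` would leave the connection in the deepest folder, ready for an upload. A permission error that is not caused by a missing folder would surface from the mkd call.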

class SDSCPatcher(one=None)[source]

Bases: Patcher

This is used to patch data on the SDSC server

patch_datasets(file_list, **kwargs)[source]

Same as the patch_dataset method but works with files from several sessions.

class S3Patcher(one=None)[source]

Bases: Patcher

check_datasets(file_list)[source]

patch_dataset(file_list, dry=False, ftp=False, force=False, **kwargs)[source]

Creates a new dataset on FlatIron and uploads it from arbitrary location. Rules for creation/patching are the same that apply for registration via Alyx as this uses the registration endpoint to get the dataset. An existing file (same session and path relative to session) will be patched.

Parameters:
  • path – full file path. Must be within an ALF session folder (subject/date/number). Can also be a list of full file paths belonging to the same session.

  • server_repository – Alyx server repository name

  • created_by – Alyx username for the dataset (optional, defaults to root)

  • ftp – flag for the case when using FTPPatcher. Don’t adjust Windows path in _patch_dataset when ftp=True

Returns:

the registrations response, a list of dataset records