one.api

Classes for searching, listing and (down)loading Alyx files.

  • TODO Document

  • TODO Add sig to ONE Light uuids

  • TODO Save changes to cache

  • TODO Fix update cache in AlyxONE - save parquet table

  • TODO Save parquet in update_filesystem

Points of discussion:
  • Module structure: oneibl is too restrictive; naming the module one means the object should have a different name

  • Download datasets timeout

  • Support for pids?

  • Need to check performance of 1. (re)setting index, 2. converting object array to 2D int array

  • NB: Sessions table date ordered. Indexing by eid is therefore O(N) but not done in code. Datasets table has sorted index.

  • Conceivably you could have a subclass for Figshare, etc., not just Alyx
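The indexing point above can be illustrated with a small standard-library sketch. It is not the ONE implementation: the `sessions` and `datasets` lists and both lookup helpers are hypothetical stand-ins for the cache tables. Looking up an eid in a date-ordered table needs a linear scan, whereas a sorted index permits binary search.

```python
import bisect

# Hypothetical stand-in for a sessions table ordered by date, not by eid.
sessions = [
    ("eid-c", "2020-01-01"),
    ("eid-a", "2020-01-02"),
    ("eid-b", "2020-01-03"),
]

# Hypothetical stand-in for a datasets table whose index (the id) is sorted.
datasets = sorted(["dset-01", "dset-04", "dset-07", "dset-09"])


def find_session(eid):
    """Linear scan: O(N) because the table is not ordered by eid."""
    for row_eid, date in sessions:
        if row_eid == eid:
            return date
    return None


def has_dataset(dset_id):
    """Binary search: O(log N) because the index is sorted."""
    i = bisect.bisect_left(datasets, dset_id)
    return i < len(datasets) and datasets[i] == dset_id
```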

Functions

ONE

ONE API factory. Determines which class to instantiate depending on the parameters passed.

Classes

One

OneAlyx

class One(cache_dir=None, mode='auto', wildcards=True)[source]

Bases: one.converters.ConversionMixin

property offline
search_terms(query_type=None)[source]
refresh_cache(mode='auto')[source]

Check and reload cache tables

Parameters

mode (str) – Options are ‘local’ (don’t reload); ‘refresh’ (reload); ‘auto’ (reload if expired); ‘remote’ (don’t reload)

Returns

Loaded timestamp
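The four mode options can be read as a small decision rule. The sketch below is an illustration only, not the library code: `should_reload` is a hypothetical helper and the 24-hour `CACHE_EXPIRY` window is an assumed value, since this reference does not state when the cache expires.

```python
from datetime import datetime, timedelta

# Assumed expiry window; the real value is not stated in this reference.
CACHE_EXPIRY = timedelta(hours=24)


def should_reload(mode, loaded_time, now=None):
    """Return True if the cache tables should be reloaded for this mode."""
    now = now or datetime.now()
    if mode in ("local", "remote"):
        return False  # neither mode reloads the tables
    if mode == "refresh":
        return True  # always reload
    if mode == "auto":
        # Reload only if never loaded or older than the expiry window
        return loaded_time is None or now - loaded_time > CACHE_EXPIRY
    raise ValueError(f"Unknown mode: {mode!r}")
```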

search(details=False, query_type=None, **kwargs)[source]

Searches sessions matching the given criteria and returns a list of matching eids

For a list of search terms, use the method

one.search_terms()

For all of the search parameters, a single value or list may be provided. For dataset, the sessions returned will contain all listed datasets. For the other parameters, the session must contain at least one of the entries. NB: Wildcards are not permitted; however, if the wildcards property is False, regular expressions may be used for all but number and date_range.

Parameters
  • dataset (str, list) – list of dataset names. Returns sessions containing all these datasets. A dataset matches if it contains the search string e.g. ‘wheel.position’ matches ‘_ibl_wheel.position.npy’

  • date_range (str, list, datetime.datetime, datetime.date, pandas.timestamp) – A single date to search or a list of 2 dates that define the range (inclusive). To define only the upper or lower date bound, set the other element to None.

  • lab (str) – A str or list of lab names, returns sessions from any of these labs

  • number (str, int) – Number of session to be returned, i.e. number in sequence for a given date

  • subject (str, list) – A list of subject nicknames, returns sessions for any of these subjects

  • task_protocol (str) – The task protocol name (can be partial, i.e. any task protocol containing that str will be found)

  • project (str) – The project name (can be partial, i.e. any project containing that str will be found)

  • details (bool) – If true also returns a dict of dataset details

  • query_type (str, None) – Query cache (‘local’) or Alyx database (‘remote’)

Returns

List of eids; if details is True, also returns a list of dictionaries, each entry corresponding to a matching session
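The matching rules described for search (all listed datasets, any of the other entries, inclusive date range with open bounds) can be sketched as follows. This is a simplified illustration over a hypothetical list of dicts named `SESSIONS`; the real cache tables and the function `search_sketch` shown here are not taken from the library.

```python
from datetime import date

# Hypothetical session records standing in for the cache tables.
SESSIONS = [
    {"eid": "e1", "lab": "labA", "subject": "mouse1", "date": date(2020, 1, 1),
     "datasets": ["_ibl_wheel.position.npy", "_ibl_trials.intervals.npy"]},
    {"eid": "e2", "lab": "labB", "subject": "mouse2", "date": date(2020, 2, 1),
     "datasets": ["_ibl_trials.intervals.npy"]},
    {"eid": "e3", "lab": "labA", "subject": "mouse3", "date": date(2020, 3, 1),
     "datasets": ["_ibl_wheel.position.npy"]},
]


def as_list(value):
    """Allow a single value or a list for any search term."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def search_sketch(dataset=None, lab=None, subject=None, date_range=None):
    eids = []
    for ses in SESSIONS:
        # dataset: the session must contain ALL listed datasets (substring match)
        if dataset is not None and not all(
                any(d in name for name in ses["datasets"]) for d in as_list(dataset)):
            continue
        # lab / subject: the session must match AT LEAST ONE listed entry
        if lab is not None and ses["lab"] not in as_list(lab):
            continue
        if subject is not None and ses["subject"] not in as_list(subject):
            continue
        # date_range: inclusive bounds; None leaves that bound open
        if date_range is not None:
            bounds = as_list(date_range)
            lower, upper = bounds if len(bounds) == 2 else (bounds[0], bounds[0])
            if lower is not None and ses["date"] < lower:
                continue
            if upper is not None and ses["date"] > upper:
                continue
        eids.append(ses["eid"])
    return eids
```

For instance, `search_sketch(dataset=['wheel.position', 'trials.intervals'])` keeps only sessions holding both datasets, while `search_sketch(lab=['labA', 'labB'])` keeps sessions from either lab.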

get_details(eid: Union[str, pathlib.Path, uuid.UUID], full: bool = False)[source]
list_subjects() → List[str][source]

List all subjects in database

Returns

Sorted list of subject names

list_datasets(eid=None, collection=None, details=False, query_type=None) → Union[numpy.ndarray, pandas.core.frame.DataFrame][source]

Given an eid, return the datasets for those sessions. If no eid is provided, a list of all datasets is returned. When details is false, a sorted array of unique datasets is returned (their relative paths).

Parameters
  • eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.

  • collection (str) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.

  • details (bool) – When true, a pandas DataFrame is returned, otherwise a numpy array of relative paths (collection/revision/filename) - see one.alf.spec.describe for details.

  • query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)

Returns

Slice of datasets table or numpy array if details is False

list_collections(eid=None, details=False) → Union[numpy.ndarray, dict][source]

List the collections for a given experiment. If no experiment id is given, all collections are returned.

Parameters
  • eid ([str, UUID, Path, dict]) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path

  • details (bool) – If true a dict of pandas datasets tables is returned with collections as keys, otherwise a numpy array of unique collections

Returns

A numpy array of unique collections or dict of datasets tables
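As a rough illustration of the two return shapes, the sketch below groups relative dataset paths by their parent folder. It assumes paths of the form collection/filename (revision folders are left out), uses plain lists and dicts instead of numpy arrays and pandas tables, and the helper name `collections_sketch` is hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical relative dataset paths of the form collection/filename.
REL_PATHS = [
    "alf/_ibl_trials.intervals.npy",
    "alf/probe01/spikes.times.npy",
    "alf/probe01/spikes.clusters.npy",
    "raw_behavior_data/_iblrig_taskData.raw.jsonable",
]


def collections_sketch(rel_paths, details=False):
    """Group dataset paths by collection (the parent folder of each file)."""
    grouped = {}
    for path in rel_paths:
        collection = PurePosixPath(path).parent.as_posix()
        grouped.setdefault(collection, []).append(path)
    if details:
        return grouped  # collections as keys, their datasets as values
    return sorted(grouped)  # unique collection names only
```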

list_revisions(eid=None, dataset=None, collection=None, details=False)[source]

List the revisions for a given experiment. If no experiment id is given, all revisions are returned.

Parameters
  • eid ([str, UUID, Path, dict]) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path

  • details (bool) – If true a dict of pandas datasets tables is returned with revisions as keys, otherwise a numpy array of unique revisions

Returns

A numpy array of unique revisions or dict of datasets tables

load_object(eid: Union[str, pathlib.Path, uuid.UUID], obj: str, collection: Optional[str] = None, revision: Optional[str] = None, query_type: Optional[str] = None, download_only: bool = False, **kwargs) → Union[one.alf.io.AlfBunch, List[pathlib.Path]][source]

Load all attributes of an ALF object from a Session ID and an object name. Any datasets with matching object name will be loaded.

Parameters
  • eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.

  • obj (str) – The ALF object to load. Supports asterisks as wildcards.

  • collection (str) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.

  • revision (str) – The dataset revision (typically an ISO date). If no exact match, the previous revision (ordered lexicographically) is returned. If None, the default revision is returned (usually the most recent revision). Regular expressions/wildcards not permitted.

  • query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)

  • download_only (bool) – When true the data are downloaded and the file path is returned.

  • kwargs (dict) – Additional filters for datasets, including namespace and timescale. For full list see the one.alf.spec.describe function.

Returns

An ALF bunch or, if download_only is True, a list of Path objects

Examples

load_object(eid, 'moves')
load_object(eid, 'trials')
load_object(eid, 'spikes', collection='probe01')  # wildcards is True
load_object(eid, 'spikes', collection='.*probe01')  # wildcards is False
load_object(eid, 'spikes', namespace='ibl')
load_object(eid, 'spikes', timescale='ephysClock')
# Load specific attributes
load_object(eid, 'spikes', attribute=['times', 'clusters'])
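The difference between the two matching modes in the collection examples can be sketched with the standard library. This is an assumption about how such matching might be done, not the library's code: `match_collections` and `COLLECTIONS` are hypothetical, shell-style patterns are handled with `fnmatch`, and regular expressions are assumed to match the whole name.

```python
import fnmatch
import re

# Hypothetical collections present in one session.
COLLECTIONS = ["alf", "alf/probe00", "alf/probe01"]


def match_collections(pattern, wildcards=True):
    """Filter collections by a shell-style pattern or a regular expression."""
    if wildcards:
        # Unix shell-style matching: '*' matches any run of characters
        return [c for c in COLLECTIONS if fnmatch.fnmatchcase(c, pattern)]
    # Regular expression assumed to match the whole collection name
    return [c for c in COLLECTIONS if re.fullmatch(pattern, c)]
```

With wildcards enabled, `'*probe01'` selects `'alf/probe01'`; with wildcards disabled, the equivalent regular expression is `'.*probe01'`.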

load_dataset(eid: Union[str, pathlib.Path, uuid.UUID], dataset: str, collection: Optional[str] = None, revision: Optional[str] = None, query_type: Optional[str] = None, download_only: bool = False, **kwargs) → Any[source]

Load a single dataset for a given session id and dataset name

Parameters
  • eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.

  • dataset (str, dict) – The ALF dataset to load. May be a string or dict of ALF parts. Supports asterisks as wildcards.

  • collection (str) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.

  • revision (str) – The dataset revision (typically an ISO date). If no exact match, the previous revision (ordered lexicographically) is returned. If None, the default revision is returned (usually the most recent revision). Regular expressions/wildcards not permitted.

  • query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)

  • download_only (bool) – When true the data are downloaded and the file path is returned.

Returns

Dataset or a Path object if download_only is true.

Examples

intervals = one.load_dataset(eid, '_ibl_trials.intervals.npy')
# Load dataset without specifying extension
intervals = one.load_dataset(eid, 'trials.intervals')  # wildcard mode only
filepath = one.load_dataset(eid, '_ibl_trials.intervals.npy', download_only=True)
spike_times = one.load_dataset(eid, 'spikes.times.npy', collection='alf/probe01')
old_spikes = one.load_dataset(eid, 'spikes.times.npy',
                              collection='alf/probe01', revision='2020-08-31')
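The revision rule (exact match, else the previous revision in lexicographic order, else the default) can be sketched as below. The helper `pick_revision` is hypothetical, and treating the most recent revision as the default is an assumption based on the "usually the most recent revision" wording.

```python
import bisect


def pick_revision(available, requested=None):
    """Choose a revision following the rules described for `revision`."""
    ordered = sorted(available)  # lexicographic order; ISO dates sort by time
    if not ordered:
        return None
    if requested is None:
        # Assumption: the default revision is the most recent one
        return ordered[-1]
    # Index just past the last revision <= requested (exact or previous one)
    i = bisect.bisect_right(ordered, requested)
    return ordered[i - 1] if i else None
```

Given revisions `'2020-01-08'`, `'2020-08-31'` and `'2021-03-15'`, a request for `'2020-12-01'` falls back to `'2020-08-31'`, and a request earlier than every revision finds nothing.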

load_datasets(eid: Union[str, pathlib.Path, uuid.UUID], datasets: List[str], collections: Optional[str] = None, revisions: Optional[str] = None, query_type: Optional[str] = None, assert_present=True, download_only: bool = False, **kwargs) → Any[source]

Load datasets for a given session id. Returns two lists the length of datasets. The first is the data (or file paths if download_only is true), the second is a list of metadata Bunches. If assert_present is false, missing data will be returned as None.

Parameters
  • eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.

  • datasets (list of strings) – The ALF datasets to load. May be a string or dict of ALF parts. Supports asterisks as wildcards.

  • collections (str, list) – The collection(s) to which the object(s) belong, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.

  • revisions (str, list) – The dataset revision (typically an ISO date). If no exact match, the previous revision (ordered lexicographically) is returned. If None, the default revision is returned (usually the most recent revision). Regular expressions/wildcards not permitted.

  • query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)

  • assert_present (bool) – If true, missing datasets raise an error, otherwise None is returned

  • download_only (bool) – When true the data are downloaded and the file path is returned.

Returns

Returns a list of data (or file paths) the length of datasets, and a list of metadata Bunches. If assert_present is False, missing data will be None
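The shape of the return value and the effect of assert_present can be sketched with an in-memory stand-in. Everything here is hypothetical: `STORE` replaces the session files, plain dicts replace the metadata Bunches, `FileNotFoundError` stands in for whatever error the library raises, and returning None for the metadata of a missing dataset is an assumption.

```python
# Hypothetical in-memory store standing in for the session's files.
STORE = {
    "_ibl_trials.intervals.npy": [[0.0, 1.5], [2.0, 3.5]],
    "spikes.times.npy": [0.01, 0.02, 0.05],
}


def load_datasets_sketch(datasets, assert_present=True):
    """Return (data, meta), two lists the same length as `datasets`."""
    data, meta = [], []
    for name in datasets:
        if name in STORE:
            data.append(STORE[name])
            meta.append({"name": name})  # stand-in for a metadata Bunch
        elif assert_present:
            raise FileNotFoundError(f"Dataset not found: {name}")
        else:
            data.append(None)  # keep positions aligned with the request
            meta.append(None)
    return data, meta
```

Keeping a None placeholder for each missing entry means the i-th result always corresponds to the i-th requested dataset.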

load_dataset_from_id(dset_id: Union[str, uuid.UUID], download_only: bool = False, details: bool = False) → Any[source]

Load a dataset given a dataset UUID

Parameters
  • dset_id (uuid.UUID, str) – A dataset UUID to load

  • download_only (bool) – If true the dataset is downloaded (if necessary) and the filepath returned

  • details (bool) – If true a pandas Series is returned in addition to the data

Returns

Dataset data (or filepath if download_only) and dataset record if details is True

static setup(cache_dir, **kwargs)[source]

Interactive command tool that populates parameter file for ONE IBL. FIXME See subclass

ONE(*, mode='auto', wildcards=True, **kwargs)[source]

ONE API factory. Determines which class to instantiate depending on the parameters passed.

Parameters
  • mode (str) – Query mode, options include ‘auto’, ‘local’ (offline) and ‘remote’ (online only). Most methods have a query_type parameter that can override the class mode.

  • wildcards (bool) – If true, all methods use Unix shell style pattern matching, otherwise regular expressions are used.

  • cache_dir (str, Path) – Path to the data files. If Alyx parameters have been set up for this location, a OneAlyx instance is returned. If data_dir and base_url are None, the default location is used.

  • base_url (str) – An Alyx database URL. The URL must start with ‘http’.

  • username (str) – An Alyx database login username.

  • password (str) – An Alyx database password.

  • cache_rest (str) – If not in ‘local’ mode, this determines which http request types to cache. Default is ‘GET’. Use None to deactivate cache (not recommended).

Returns

A One instance if mode is ‘local’, otherwise a OneAlyx instance.
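The dispatch performed by the factory can be sketched with stub classes. This is a simplified illustration of the rule stated under Returns, not the library code: `OneStub`, `OneAlyxStub` and `make_one` are hypothetical names, and the check of whether Alyx parameters exist for cache_dir is omitted.

```python
class OneStub:
    """Stub for the local (cache only) class."""

    def __init__(self, cache_dir=None, mode="auto", wildcards=True):
        self.cache_dir, self.mode, self.wildcards = cache_dir, mode, wildcards


class OneAlyxStub(OneStub):
    """Stub for the class that also talks to an Alyx database."""

    def __init__(self, username=None, password=None, base_url=None, **kwargs):
        super().__init__(**kwargs)
        self.username, self.base_url = username, base_url


def make_one(*, mode="auto", wildcards=True, **kwargs):
    """Hypothetical factory: 'local' mode gives the stub for One, else OneAlyx."""
    if mode == "local":
        # Alyx-only arguments are not needed offline
        return OneStub(cache_dir=kwargs.get("cache_dir"), mode=mode, wildcards=wildcards)
    base_url = kwargs.get("base_url")
    if base_url is not None and not base_url.startswith("http"):
        raise ValueError("base_url must start with 'http'")
    return OneAlyxStub(mode=mode, wildcards=wildcards, **kwargs)
```

Making the arguments keyword-only, as in the documented signature, keeps calls explicit about which settings select the class.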

class OneAlyx(username=None, password=None, base_url=None, cache_dir=None, mode='auto', wildcards=True, **kwargs)[source]

Bases: one.api.One

property alyx
search_terms(query_type=None)[source]

Returns a list of search terms to be passed as kwargs to the search method

Parameters

query_type (str) – If ‘remote’, the search terms are largely determined by the REST endpoint used

Returns

Tuple of search strings

describe_dataset(dataset_type=None)[source]
list_datasets(eid=None, collection=None, details=False, query_type=None) → Union[numpy.ndarray, pandas.core.frame.DataFrame][source]

Given an eid, return the datasets for those sessions. If no eid is provided, a list of all datasets is returned. When details is false, a sorted array of unique datasets is returned (their relative paths).

Parameters
  • eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.

  • collection (str) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.

  • details (bool) – When true, a pandas DataFrame is returned, otherwise a numpy array of relative paths (collection/revision/filename) - see one.alf.spec.describe for details.

  • query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)

Returns

Slice of datasets table or numpy array if details is False

load_collection(eid, collection)[source]
pid2eid(pid: str, query_type=None) → (str, str)[source]

Given an Alyx probe UUID string, returns the session id string and the probe label (i.e. the ALF collection)

Parameters
  • pid (str, uuid.UUID) – A probe UUID

  • query_type (str) – Query mode - options include ‘remote’, and ‘refresh’

Returns

Experiment ID, probe label

search(details=False, query_type=None, **kwargs)[source]

Searches sessions matching the given criteria and returns a list of matching eids

For a list of search terms, use the method

one.search_terms(query_type=’remote’)

For all of the search parameters, a single value or list may be provided. For dataset, the sessions returned will contain all listed datasets. For the other parameters, the session must contain at least one of the entries. NB: Wildcards are not permitted; however, if the wildcards property is False, regular expressions may be used for all but number and date_range.

Parameters
  • dataset (str, list) – list of dataset names. Returns sessions containing all these datasets. A dataset matches if it contains the search string e.g. ‘wheel.position’ matches ‘_ibl_wheel.position.npy’

  • date_range (str, list, datetime.datetime, datetime.date, pandas.timestamp) – A single date to search or a list of 2 dates that define the range (inclusive). To define only the upper or lower date bound, set the other element to None.

  • lab (str, list) – A str or list of lab names, returns sessions from any of these labs

  • number (str, int) – Number of session to be returned, i.e. number in sequence for a given date

  • subject (str, list) – A list of subject nicknames, returns sessions for any of these subjects

  • task_protocol (str, list) – The task protocol name (can be partial, i.e. any task protocol containing that str will be found)

  • project (str, list) – The project name (can be partial, i.e. any project containing that str will be found)

  • performance_lte / performance_gte – Search only for sessions whose performance is less than or equal to (performance_lte) or greater than or equal to (performance_gte) a pre-defined threshold, given as a percentage (0-100)

  • users (str, list) – A list of users

  • location (str, list) – A str or list of lab location names (as per the Alyx definition). Note: this corresponds to the specific rig, not the geographical location of the lab per se

  • dataset_types (str, list) – One or more of dataset_types

  • details (bool) – If true also returns a dict of dataset details

  • query_type (str, None) – Query cache (‘local’) or Alyx database (‘remote’)

  • limit (int) – The number of results to fetch in one go (if pagination enabled on server)

Returns

List of eids and, if details is True, a list of dictionaries, each entry corresponding to a matching session

static setup(**kwargs)[source]

TODO Interactive command tool that sets up cache for ONE.

eid2path(eid: str, query_type=None) → Union[pathlib.Path, Sequence[pathlib.Path]][source]

From an experiment ID gets the local session path

Parameters
  • eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.

  • query_type (str) – If set to ‘remote’, will force database connection

Returns

A session path or list of session paths

path2eid(path_obj: Union[str, pathlib.Path], query_type=None) → Union[pathlib.Path, Sequence[pathlib.Path]][source]

From a local path, gets the experiment ID

Parameters
  • path_obj (str, pathlib.Path, list) – Local path or list of local paths

  • query_type (str) – If set to ‘remote’, will force database connection

Returns

An eid or list of eids

path2url(filepath, query_type=None)[source]

Given a local file path, returns the URL of the remote file.

Parameters
  • filepath (str, pathlib.Path) – A local file path

  • query_type (str) – If set to ‘remote’, will force database connection

Returns

A URL string

type2datasets(eid, dataset_type, details=False)[source]

Get list of datasets belonging to a given dataset type for a given session

Parameters
  • eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.

  • dataset_type (str, list) – An Alyx dataset type, e.g. camera.times or a list of dtypes

  • details (bool) – If True, a datasets DataFrame is returned

Returns

A numpy array of data, or DataFrame if details is true

dataset2type(dset)[source]

Return dataset type from dataset

describe_revision(revision)[source]
get_details(eid: str, full: bool = False, query_type=None)[source]

Returns details of the eid, in the same form as those returned by one.search; optionally returns the full session details.