one.api
Classes for searching, listing and (down)loading Alyx files.
Module attributes
The number of download threads.
Functions
ONE API factory.
Classes
An API for searching and loading data on a local filesystem.
An API for searching and loading data through the Alyx database.
- ONE(*, mode='remote', wildcards=True, **kwargs)[source]
ONE API factory.
Determine which class to instantiate depending on parameters passed.
- Parameters:
mode (str) – Query mode, options include ‘auto’, ‘local’ (offline) and ‘remote’ (online only). Most methods have a query_type parameter that can override the class mode.
wildcards (bool) – If true all methods use unix shell style pattern matching, otherwise regular expressions are used.
cache_dir (str, pathlib.Path) – Path to the data files. If Alyx parameters have been set up for this location, an OneAlyx instance is returned. If data_dir and base_url are None, the default location is used.
tables_dir (str, pathlib.Path) – An optional location of the cache tables. If None, the tables are assumed to be in the cache_dir.
base_url (str) – An Alyx database URL. The URL must start with ‘http’.
username (str) – An Alyx database login username.
password (str) – An Alyx database password.
cache_rest (str) – If not in ‘local’ mode, this determines which http request types to cache. Default is ‘GET’. Use None to deactivate cache (not recommended).
- Returns:
A One instance if mode is ‘local’, otherwise a OneAlyx instance.
- Return type:
One, OneAlyx
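To make the wildcards flag concrete, here is a standalone sketch (standard library only, not the ONE implementation) of how the same query behaves under unix shell-style matching versus a regular expression:

```python
import fnmatch
import re

filenames = ['_ibl_wheel.position.npy', '_ibl_wheel.timestamps.npy', 'spikes.times.npy']

# wildcards=True: unix shell-style patterns, e.g. '*wheel*'
shell_hits = [f for f in filenames if fnmatch.fnmatch(f, '*wheel*')]

# wildcards=False: the same query written as a regular expression
regex_hits = [f for f in filenames if re.search(r'.*wheel.*', f)]

assert shell_hits == regex_hits == ['_ibl_wheel.position.npy', '_ibl_wheel.timestamps.npy']
```

The shell-style form is more forgiving (no escaping of dots), which is why it is the default.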
- class One(cache_dir=None, mode='local', wildcards=True, tables_dir=None)[source]
Bases:
ConversionMixin
An API for searching and loading data on a local filesystem.
- uuid_filenames = None
Whether datasets on disk have a UUID in their filename.
- Type:
bool
- property offline
True if mode is local or no Web client set.
- Type:
bool
- search_terms(query_type=None) → tuple [source]
List the search term keyword args for use in the search method.
- load_cache(tables_dir=None, **kwargs)[source]
Load parquet cache files from a local directory.
- Parameters:
tables_dir (str, pathlib.Path) – An optional directory location of the parquet files, defaults to One._tables_dir.
- save_cache(save_dir=None, force=False)[source]
Save One._cache attribute into parquet tables if recently modified.
- Parameters:
save_dir (str, pathlib.Path) – The directory path into which the tables are saved. Defaults to cache directory.
force (bool) – If True, the cache is saved regardless of modification time.
- refresh_cache(mode='auto')[source]
Check and reload cache tables.
- Parameters:
mode ({'local', 'refresh', 'auto', 'remote'}) – Options are ‘local’ (don’t reload); ‘refresh’ (reload); ‘auto’ (reload if expired); ‘remote’ (don’t reload).
- Returns:
Loaded timestamp.
- Return type:
datetime.datetime
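The ‘auto’ option reloads the tables only when they have expired. A minimal sketch of that decision, assuming a stored load timestamp and a hypothetical 24-hour expiry (the actual expiry is determined by the cache metadata):

```python
from datetime import datetime, timedelta

def cache_expired(loaded_time, expiry=timedelta(hours=24), now=None):
    """True if the cache tables are older than the expiry period (illustrative only)."""
    now = now or datetime.now()
    return (now - loaded_time) > expiry

loaded = datetime(2024, 1, 1, 12, 0)
assert cache_expired(loaded, now=datetime(2024, 1, 3))          # two days old -> reload
assert not cache_expired(loaded, now=datetime(2024, 1, 1, 13))  # one hour old -> keep
```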
- save_loaded_ids(sessions_only=False, clear_list=True)[source]
Save list of UUIDs corresponding to datasets or sessions where datasets were loaded.
- Parameters:
sessions_only (bool) – If true, save list of experiment IDs, otherwise the full list of dataset IDs.
clear_list (bool) – If true, clear the current list of loaded dataset IDs after saving.
- Returns:
list of str – List of UUIDs.
pathlib.Path – The file path of the saved list.
- search(details=False, query_type=None, **kwargs)[source]
Search for sessions matching the given criteria and return a list of matching eids.
For a list of search terms, use the method
one.search_terms()
For all search parameters, a single value or list may be provided. For dataset, the sessions returned will contain all listed datasets. For the other parameters, the session must contain at least one of the entries.
For all but date_range and number, any field that contains the search string is returned. Wildcards are not permitted; however, if the wildcards property is True, regular expressions may be used (see notes and examples).
- Parameters:
dataset (str, list) – One or more dataset names. Returns sessions containing all these datasets. A dataset matches if it contains the search string e.g. ‘wheel.position’ matches ‘_ibl_wheel.position.npy’.
dataset_qc_lte (str, int, one.alf.spec.QC) – A dataset QC value, returns sessions with datasets at or below this QC value, including those with no QC set. If dataset not passed, sessions with any passing QC datasets are returned, otherwise all matching datasets must have the QC value or below.
date_range (str, list, datetime.datetime, datetime.date, pandas.timestamp) – A single date to search or a list of 2 dates that define the range (inclusive). To define only the upper or lower date bound, set the other element to None.
lab (str) – A str or list of lab names, returns sessions from any of these labs.
number (str, int) – The number of the session to be returned, i.e. its sequence number for a given date.
subject (str, list) – A list of subject nicknames, returns sessions for any of these subjects.
task_protocol (str) – The task protocol name (can be partial, i.e. any task protocol containing that str will be found).
projects (str, list) – The project name(s) (can be partial, i.e. any project containing that str will be found).
details (bool) – If true also returns a dict of dataset details.
query_type (str, None) – Query cache (‘local’) or Alyx database (‘remote’).
- Returns:
list – A list of eids.
(list) – (If details is True) a list of dictionaries, each entry corresponding to a matching session.
Examples
Search for sessions with ‘training’ in the task protocol.
>>> eids = one.search(task='training')
Search for sessions by subject ‘MFD_04’.
>>> eids = one.search(subject='MFD_04')
Do an exact search for sessions by subject ‘FD_04’.
>>> assert one.wildcards is True, 'the wildcards flag must be True for regex expressions'
>>> eids = one.search(subject='^FD_04$')
Search for sessions on a given date, in a given lab, containing trials and spike data.
>>> eids = one.search(date='2023-01-01', lab='churchlandlab', dataset=['trials', 'spikes'])
Search for sessions containing trials and spike data where QC for both are WARNING or less.
>>> eids = one.search(dataset_qc_lte='WARNING', dataset=['trials', 'spikes'])
Search for sessions with any datasets that have a QC of PASS or NOT_SET.
>>> eids = one.search(dataset_qc_lte='PASS')
Notes
In default and local mode, most queries are case-sensitive partial matches. When lists are provided, the search is a logical OR, except for datasets, which is a logical AND.
If dataset_qc_lte and dataset are defined, the QC criterion applies only to the provided datasets, and all must pass for a session to be returned.
All search terms are true for a session to be returned, i.e. subject matches AND project matches, etc.
In remote mode most queries are case-insensitive partial matches.
In default and local mode, when the one.wildcards flag is True (default), queries are interpreted as regular expressions. To turn this off set one.wildcards to False.
In remote mode regular expressions are only supported using the django argument.
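The OR/AND semantics described in these notes can be illustrated with a toy, pure-Python filter (an illustration only; the real search operates on the parquet cache tables):

```python
# Toy sessions: each has a subject and a set of dataset names.
sessions = [
    {'eid': 'a', 'subject': 'MFD_04', 'datasets': {'trials.intervals', 'spikes.times'}},
    {'eid': 'b', 'subject': 'MFD_05', 'datasets': {'trials.intervals'}},
    {'eid': 'c', 'subject': 'NYU_01', 'datasets': {'spikes.times'}},
]

def search(sessions, subject=None, dataset=None):
    """Subjects are a logical OR (any match); datasets are a logical AND (all present)."""
    hits = []
    for s in sessions:
        if subject and not any(q in s['subject'] for q in subject):
            continue  # none of the subject queries matched (partial, case-sensitive)
        if dataset and not all(any(q in d for d in s['datasets']) for q in dataset):
            continue  # at least one requested dataset is missing
        hits.append(s['eid'])
    return hits

assert search(sessions, subject=['MFD']) == ['a', 'b']          # OR over the subject list
assert search(sessions, dataset=['trials', 'spikes']) == ['a']  # AND over datasets
```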
- get_details(eid: str | Path | UUID, full: bool = False)[source]
Return session details for a given session ID.
- Parameters:
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
full (bool) – If True, returns a DataFrame of session and dataset info
- Returns:
A session record or full DataFrame with dataset information if full is True
- Return type:
pd.Series, pd.DataFrame
- list_subjects() → List[str] [source]
List all subjects in the database.
- Returns:
Sorted list of subject names
- Return type:
list
- list_datasets(eid=None, filename=None, collection=None, revision=None, qc=QC.FAIL, ignore_qc_not_set=False, details=False, query_type=None, default_revisions_only=False, keep_eid_index=False) → ndarray | DataFrame [source]
Given an eid, return the datasets for that session.
If no eid is provided, a list of all datasets is returned. When details is false, a sorted array of unique datasets is returned (their relative paths).
- Parameters:
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
filename (str, dict, list) – Filters datasets and returns only the ones matching the filename. Supports lists and asterisks as wildcards. May be a dict of ALF parts.
collection (str, list) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.
revision (str) – Filters datasets and returns only the ones matching the revision. Supports asterisks as wildcards.
qc (str, int, one.alf.spec.QC) – Returns datasets at or below this QC level. Integer values should correspond to the QC enumeration NOT the qc category column codes in the pandas table.
ignore_qc_not_set (bool) – When true, do not return datasets for which QC is NOT_SET.
details (bool) – When true, a pandas DataFrame is returned, otherwise a numpy array of relative paths (collection/revision/filename) - see one.alf.spec.describe for details.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’).
default_revisions_only (bool) – When true, only matching datasets that are considered default revisions are returned. If no ‘default_revision’ column is present, an ALFError is raised.
keep_eid_index (bool) – If details is true, this determines whether the returned data frame contains the eid in the index. When false (default) the returned data frame index is the dataset id only, otherwise the index is a MultiIndex with levels (eid, id).
- Returns:
Slice of datasets table or numpy array if details is False.
- Return type:
np.ndarray, pd.DataFrame
Examples
List all unique datasets in ONE cache
>>> datasets = one.list_datasets()
List all datasets for a given experiment
>>> datasets = one.list_datasets(eid)
List all datasets for an experiment that match a collection name
>>> probe_datasets = one.list_datasets(eid, collection='*probe*')
List datasets for an experiment that have ‘wheel’ in the filename
>>> datasets = one.list_datasets(eid, filename='*wheel*')
List datasets for an experiment that are part of a ‘wheel’ or ‘trial(s)’ object
>>> datasets = one.list_datasets(eid, {'object': ['wheel', 'trial?']})
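The qc and ignore_qc_not_set filters keep datasets whose QC outcome is at or below the requested level. A sketch using a stand-in enumeration (the member values here are illustrative, not necessarily those of one.alf.spec.QC):

```python
from enum import IntEnum

class QC(IntEnum):
    # Stand-in for one.alf.spec.QC; the ordering matters more than the exact values.
    NOT_SET = 0
    PASS = 10
    WARNING = 30
    FAIL = 40
    CRITICAL = 50

datasets = {
    'trials.intervals.npy': QC.PASS,
    'spikes.times.npy': QC.WARNING,
    'camera.times.npy': QC.CRITICAL,
    'wheel.position.npy': QC.NOT_SET,
}

def filter_qc(datasets, qc=QC.FAIL, ignore_qc_not_set=False):
    """Keep datasets at or below the given QC level, optionally dropping NOT_SET."""
    keep = {k: v for k, v in datasets.items() if v <= qc}
    if ignore_qc_not_set:
        keep = {k: v for k, v in keep.items() if v != QC.NOT_SET}
    return sorted(keep)

assert filter_qc(datasets, qc=QC.WARNING) == [
    'spikes.times.npy', 'trials.intervals.npy', 'wheel.position.npy']
assert filter_qc(datasets, qc=QC.WARNING, ignore_qc_not_set=True) == [
    'spikes.times.npy', 'trials.intervals.npy']
```

Note that NOT_SET sorts below every passing level, which is why it is included unless explicitly ignored.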
- list_collections(eid=None, filename=None, collection=None, revision=None, details=False, query_type=None) → ndarray | dict [source]
List the collections for a given experiment.
If no experiment ID is given, all collections are returned.
- Parameters:
eid (str, UUID, Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
filename (str, dict, list) – Filters datasets and returns only the collections containing matching datasets. Supports lists and asterisks as wildcards. May be a dict of ALF parts.
collection (str, list) – Filter by a given pattern. Supports asterisks as wildcards.
revision (str) – Filters collections and returns only the ones with the matching revision. Supports asterisks as wildcards.
details (bool) – If true a dict of pandas datasets tables is returned with collections as keys, otherwise a numpy array of unique collections.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’).
- Returns:
A list of unique collections or dict of datasets tables
- Return type:
list, dict
Examples
List all unique collections in ONE cache
>>> collections = one.list_collections()
List all collections for a given experiment
>>> collections = one.list_collections(eid)
List all collections for a given experiment and revision
>>> revised = one.list_collections(eid, revision='2020-01-01')
List all collections that have ‘probe’ in the name.
>>> collections = one.list_collections(eid, collection='*probe*')
List collections for an experiment that have datasets with ‘wheel’ in the name
>>> collections = one.list_collections(eid, filename='*wheel*')
List collections for an experiment that contain numpy datasets
>>> collections = one.list_collections(eid, {'extension': 'npy'})
- list_revisions(eid=None, filename=None, collection=None, revision=None, details=False, query_type=None)[source]
List the revisions for a given experiment.
If no experiment ID is given, all revisions are returned.
- Parameters:
eid (str, UUID, Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
filename (str, dict, list) – Filters datasets and returns only the revisions containing matching datasets. Supports lists and asterisks as wildcards. May be a dict of ALF parts.
collection (str, list) – Filter by a given collection. Supports asterisks as wildcards.
revision (str, list) – Filter by a given pattern. Supports asterisks as wildcards.
details (bool) – If true a dict of pandas datasets tables is returned with revisions as keys, otherwise a numpy array of unique revisions.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’).
- Returns:
A list of unique revisions or dict of datasets tables.
- Return type:
list, dict
Examples
List all revisions in ONE cache
>>> revisions = one.list_revisions()
List all revisions for a given experiment
>>> revisions = one.list_revisions(eid)
List all revisions for a given experiment that contain the trials object
>>> revisions = one.list_revisions(eid, filename={'object': 'trials'})
List all revisions for a given experiment that start with 2020 or 2021
>>> revisions = one.list_revisions(eid, revision=['202[01]*'])
- load_object(eid: str | Path | UUID, obj: str, collection: str | None = None, revision: str | None = None, query_type: str | None = None, download_only: bool = False, check_hash: bool = True, **kwargs) → AlfBunch | List[ALFPath] [source]
Load all attributes of an ALF object from a Session ID and an object name.
Any datasets with matching object name will be loaded.
- Parameters:
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
obj (str) – The ALF object to load. Supports asterisks as wildcards.
collection (str) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.
revision (str) – The dataset revision (typically an ISO date). If no exact match, the previous revision (ordered lexicographically) is returned. If None, the default revision is returned (usually the most recent revision). Regular expressions/wildcards not permitted.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’).
download_only (bool) – When true the data are downloaded and the file path is returned. NB: The order of the file path list is undefined.
check_hash (bool) – Consider dataset missing if local file hash does not match. In online mode, the dataset will be re-downloaded.
kwargs – Additional filters for datasets, including namespace and timescale. For full list see the
one.alf.spec.describe()
function.
- Returns:
An ALF bunch or if download_only is True, a list of one.alf.path.ALFPath objects.
- Return type:
one.alf.io.AlfBunch, list
Examples
>>> load_object(eid, 'moves')
>>> load_object(eid, 'trials')
>>> load_object(eid, 'spikes', collection='*probe01')   # wildcards is True
>>> load_object(eid, 'spikes', collection='.*probe01')  # wildcards is False
>>> load_object(eid, 'spikes', namespace='ibl')
>>> load_object(eid, 'spikes', timescale='ephysClock')
Load specific attributes:
>>> load_object(eid, 'spikes', attribute=['times*', 'clusters'])
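The revision fallback rule above (exact match, otherwise the previous revision in lexicographic order) can be sketched as follows (an illustration, not the library's implementation):

```python
def resolve_revision(available, requested):
    """Return the requested revision if present, otherwise the latest revision
    that sorts lexicographically before it; None if nothing precedes it."""
    candidates = [r for r in sorted(available) if r <= requested]
    return candidates[-1] if candidates else None

revisions = ['2020-01-08', '2020-08-31', '2021-07-06']
assert resolve_revision(revisions, '2020-08-31') == '2020-08-31'  # exact match
assert resolve_revision(revisions, '2021-01-01') == '2020-08-31'  # previous revision
assert resolve_revision(revisions, '2019-12-31') is None          # nothing earlier
```

Because revisions are typically ISO dates, lexicographic order coincides with chronological order.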
- load_dataset(eid: str | Path | UUID, dataset: str, collection: str | None = None, revision: str | None = None, query_type: str | None = None, download_only: bool = False, check_hash: bool = True) → Any [source]
Load a single dataset for a given session id and dataset name.
- Parameters:
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
dataset (str, dict) – The ALF dataset to load. May be a string or dict of ALF parts. Supports asterisks as wildcards.
collection (str) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.
revision (str) – The dataset revision (typically an ISO date). If no exact match, the previous revision (ordered lexicographically) is returned. If None, the default revision is returned (usually the most recent revision). Regular expressions/wildcards not permitted.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)
download_only (bool) – When true the data are downloaded and the file path is returned.
check_hash (bool) – Consider dataset missing if local file hash does not match. In online mode, the dataset will be re-downloaded.
- Returns:
Dataset or an ALFPath object if download_only is true.
- Return type:
np.ndarray, one.alf.path.ALFPath
Examples
>>> intervals = one.load_dataset(eid, '_ibl_trials.intervals.npy')
Load dataset without specifying extension
>>> intervals = one.load_dataset(eid, 'trials.intervals')  # wildcard mode only
>>> intervals = one.load_dataset(eid, '.*trials.intervals.*')  # regex mode only
>>> intervals = one.load_dataset(eid, dict(object='trials', attribute='intervals'))
>>> filepath = one.load_dataset(eid, '_ibl_trials.intervals.npy', download_only=True)
>>> spike_times = one.load_dataset(eid, 'spikes.times.npy', collection='alf/probe01')
>>> old_spikes = one.load_dataset(eid, 'spikes.times.npy',
...                               collection='alf/probe01', revision='2020-08-31')
>>> old_spikes = one.load_dataset(eid, 'alf/probe01/#2020-08-31#/spikes.times.npy')
- Raises:
ValueError – When a relative path is provided (e.g. ‘collection/#revision#/object.attribute.ext’), the collection and revision keyword arguments must be None.
one.alf.exceptions.ALFObjectNotFound – The dataset was not found in the cache or on disk.
one.alf.exceptions.ALFMultipleCollectionsFound – The dataset provided exists in multiple collections or matched multiple different files. Provide a specific collection to load, and make sure any wildcard/regular expressions are specific enough.
Warning
- UserWarning
When a relative path is provided (e.g. ‘collection/#revision#/object.attribute.ext’), wildcards/regular expressions must not be used. To use wildcards, pass the collection and revision as separate keyword arguments.
- load_datasets(eid: str | Path | UUID, datasets: List[str], collections: str | None = None, revisions: str | None = None, query_type: str | None = None, assert_present=True, download_only: bool = False, check_hash: bool = True) → Any [source]
Load datasets for a given session id.
Returns two lists the length of datasets. The first is the data (or file paths if download_only is true), the second is a list of metadata Bunches. If assert_present is false, missing data will be returned as None.
- Parameters:
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
datasets (list of strings) – The ALF datasets to load. May be a string or dict of ALF parts. Supports asterisks as wildcards.
collections (str, list) – The collection(s) to which the object(s) belong, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.
revisions (str, list) – The dataset revision (typically an ISO date). If no exact match, the previous revision (ordered lexicographically) is returned. If None, the default revision is returned (usually the most recent revision). Regular expressions/wildcards not permitted.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)
assert_present (bool) – If true, missing datasets raise an error, otherwise None is returned.
download_only (bool) – When true the data are downloaded and the file path is returned.
check_hash (bool) – Consider dataset missing if local file hash does not match. In online mode, the dataset will be re-downloaded.
- Returns:
list – A list of data (or file paths) the length of datasets.
list – A list of meta data Bunches. If assert_present is False, missing data will be None.
Notes
There are four ways the datasets may be formatted: the object.attribute; the file name (including namespace and extension); the ALF components as a dict; the dataset path relative to the session path, e.g. collection/object.attribute.ext.
When relative paths are provided (e.g. ‘collection/#revision#/object.attribute.ext’), wildcards/regular expressions must not be used. To use wildcards, pass the collection and revision as separate keyword arguments.
To ensure you are loading the correct revision, use the revisions kwarg instead of relative paths.
To load an exact revision (i.e. not the last revision before a given date), pass in a list of relative paths or a data frame.
- Raises:
ValueError – When a relative path is provided (e.g. ‘collection/#revision#/object.attribute.ext’), the collection and revision keyword arguments must be None.
ValueError – If a list of collections or revisions is provided, it must match the number of datasets passed in.
TypeError – The datasets argument must be a non-string iterable.
one.alf.exceptions.ALFObjectNotFound – One or more of the datasets was not found in the cache or on disk. To suppress this error and return None for missing datasets, use assert_present=False.
one.alf.exceptions.ALFMultipleCollectionsFound – One or more of the dataset(s) provided exist in multiple collections. Provide the specific collections to load, and if using wildcards/regular expressions, make sure the expression is specific enough.
Warning
- UserWarning
When providing a list of relative dataset paths, this warning occurs if one or more of the datasets are not marked as default revisions. Avoid such warnings by explicitly passing in the required revisions with the revisions keyword argument.
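The relative dataset path format described in the notes above (‘collection/#revision#/object.attribute.ext’) can be decomposed with a short, illustrative parser (a sketch; the library's actual parsing lives in one.alf.path and one.alf.spec):

```python
def parse_relative_path(path):
    """Split a session-relative dataset path into collection, revision and filename.
    Both the collection and the '#revision#' folder are optional."""
    *dirs, filename = path.split('/')  # last component is always the file name
    revision = None
    if dirs and dirs[-1].startswith('#') and dirs[-1].endswith('#'):
        revision = dirs.pop().strip('#')  # revision folders are wrapped in hashes
    return {'collection': '/'.join(dirs) or None,
            'revision': revision,
            'filename': filename}

assert parse_relative_path('alf/probe01/#2020-08-31#/spikes.times.npy') == {
    'collection': 'alf/probe01', 'revision': '2020-08-31',
    'filename': 'spikes.times.npy'}
assert parse_relative_path('trials.intervals.npy') == {
    'collection': None, 'revision': None, 'filename': 'trials.intervals.npy'}
```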
- load_dataset_from_id(dset_id: str | UUID, download_only: bool = False, details: bool = False, check_hash: bool = True) → Any [source]
Load a dataset given a dataset UUID.
- Parameters:
dset_id (uuid.UUID, str) – A dataset UUID to load.
download_only (bool) – If true the dataset is downloaded (if necessary) and the filepath returned.
details (bool) – If true a pandas Series is returned in addition to the data.
check_hash (bool) – Consider dataset missing if local file hash does not match. In online mode, the dataset will be re-downloaded.
- Returns:
Dataset data (or filepath if download_only) and dataset record if details is True.
- Return type:
np.ndarray, one.alf.path.ALFPath
- load_collection(eid: str | Path | UUID, collection: str, object: str | None = None, revision: str | None = None, query_type: str | None = None, download_only: bool = False, check_hash: bool = True, **kwargs) → Bunch | List[ALFPath] [source]
Load all objects in an ALF collection from a Session ID. Any datasets with matching object name(s) will be loaded. Returns a bunch of objects.
- Parameters:
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
collection (str) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.
object (str) – The ALF object to load. Supports asterisks as wildcards.
revision (str) – The dataset revision (typically an ISO date). If no exact match, the previous revision (ordered lexicographically) is returned. If None, the default revision is returned (usually the most recent revision). Regular expressions/wildcards not permitted.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)
download_only (bool) – When true the data are downloaded and the file path is returned.
check_hash (bool) – Consider dataset missing if local file hash does not match. In online mode, the dataset will be re-downloaded.
kwargs – Additional filters for datasets, including namespace and timescale. For full list see the one.alf.spec.describe function.
- Returns:
A Bunch of objects or if download_only is True, a list of ALFPath objects.
- Return type:
Bunch of one.alf.io.AlfBunch, list of one.alf.path.ALFPath
Examples
>>> alf_collection = load_collection(eid, 'alf')
>>> load_collection(eid, '*probe01', object=['spikes', 'clusters'])  # wildcards is True
>>> files = load_collection(eid, '', download_only=True)  # Base session dir
- Raises:
alferr.ALFError – No datasets exist for the provided session collection.
alferr.ALFObjectNotFound – No datasets match the object, attribute or revision filters for this collection.
- static setup(cache_dir=None, silent=False, **kwargs)[source]
Set up One cache tables for a given data directory.
- Parameters:
cache_dir (pathlib.Path, str) – A path to the ALF data directory.
silent (bool) – When False (the default), prompts for cache_dir if cache_dir is None and asks before overwriting an existing cache. When True, uses the cwd for cache_dir if cache_dir is None and does not prompt.
kwargs – Optional arguments to pass to one.alf.cache.make_parquet_db.
- Returns:
An instance of One for the provided cache directory.
- Return type:
One
- class OneAlyx(username=None, password=None, base_url=None, cache_dir=None, mode='remote', wildcards=True, tables_dir=None, **kwargs)[source]
Bases:
One
An API for searching and loading data through the Alyx database.
- load_cache(tables_dir=None, clobber=False, tag=None)[source]
Load parquet cache files. If the local cache is sufficiently old, this method will query the database for the location and creation date of the remote cache. If newer, it will be downloaded and loaded.
Note: Unlike refresh_cache, this will always reload the local files at least once.
- Parameters:
tables_dir (str, pathlib.Path) – An optional directory location of the parquet files, defaults to One._tables_dir.
clobber (bool) – If True, query Alyx for a newer cache even if current (local) cache is recent.
tag (str) – An optional Alyx dataset tag for loading cache tables containing a subset of datasets.
Examples
To load the cache tables for a given release tag
>>> one.load_cache(tag='2022_Q2_IBL_et_al_RepeatedSite')
To reset the cache tables after loading a tag
>>> ONE.cache_clear()
>>> one = ONE()
- property alyx
The Alyx Web client
- Type:
- property cache_dir
The location of the downloaded file cache
- Type:
pathlib.Path
- search_terms(query_type=None, endpoint=None)[source]
Returns a list of search terms to be passed as kwargs to the search method.
- Parameters:
query_type (str) – If ‘remote’, the search terms are largely determined by the REST endpoint used.
endpoint (str) – If ‘remote’, specify the endpoint to search terms for.
- Returns:
Tuple of search strings.
- Return type:
tuple
- describe_dataset(dataset_type=None)[source]
Print a dataset type description.
NB: This requires an Alyx database connection.
- Parameters:
dataset_type (str) – A dataset type or dataset name.
- Returns:
The Alyx dataset type record.
- Return type:
dict
- list_datasets(eid=None, filename=None, collection=None, revision=None, qc=QC.FAIL, ignore_qc_not_set=False, details=False, query_type=None, default_revisions_only=False, keep_eid_index=False) → ndarray | DataFrame [source]
Given an eid, return the datasets for that session.
If no eid is provided, a list of all datasets is returned. When details is false, a sorted array of unique datasets is returned (their relative paths).
- Parameters:
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
filename (str, dict, list) – Filters datasets and returns only the ones matching the filename. Supports lists and asterisks as wildcards. May be a dict of ALF parts.
collection (str, list) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.
revision (str) – Filters datasets and returns only the ones matching the revision. Supports asterisks as wildcards.
qc (str, int, one.alf.spec.QC) – Returns datasets at or below this QC level. Integer values should correspond to the QC enumeration NOT the qc category column codes in the pandas table.
ignore_qc_not_set (bool) – When true, do not return datasets for which QC is NOT_SET.
details (bool) – When true, a pandas DataFrame is returned, otherwise a numpy array of relative paths (collection/revision/filename) - see one.alf.spec.describe for details.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’).
default_revisions_only (bool) – When true, only matching datasets that are considered default revisions are returned. If no ‘default_revision’ column is present, an ALFError is raised.
keep_eid_index (bool) – If details is true, this determines whether the returned data frame contains the eid in the index. When false (default) the returned data frame index is the dataset id only, otherwise the index is a MultiIndex with levels (eid, id).
- Returns:
Slice of datasets table or numpy array if details is False.
- Return type:
np.ndarray, pd.DataFrame
Examples
List all unique datasets in ONE cache
>>> datasets = one.list_datasets()
List all datasets for a given experiment
>>> datasets = one.list_datasets(eid)
List all datasets for an experiment that match a collection name
>>> probe_datasets = one.list_datasets(eid, collection='*probe*')
List datasets for an experiment that have ‘wheel’ in the filename
>>> datasets = one.list_datasets(eid, filename='*wheel*')
List datasets for an experiment that are part of a ‘wheel’ or ‘trial(s)’ object
>>> datasets = one.list_datasets(eid, {'object': ['wheel', 'trial?']})
- list_aggregates(relation: str, identifier: str = None, dataset=None, revision=None, assert_unique=False)[source]
List datasets aggregated over a given relation.
- Parameters:
relation (str) – The thing over which the data were aggregated, e.g. ‘subjects’ or ‘tags’.
identifier (str) – The ID of the datasets, e.g. for data over subjects this would be lab/subject.
dataset (str, dict, list) – Filters datasets and returns only the ones matching the filename. Supports lists and asterisks as wildcards. May be a dict of ALF parts.
revision (str) – Filters datasets and returns only the ones matching the revision. Supports asterisks as wildcards.
assert_unique (bool) – When true an error is raised if multiple collections or datasets are found.
- Returns:
The matching aggregate dataset records.
- Return type:
pandas.DataFrame
Examples
List datasets aggregated over a specific subject’s sessions
>>> trials = one.list_aggregates('subjects', 'SP026')
- load_aggregate(relation: str, identifier: str, dataset=None, revision=None, download_only=False)[source]
Load a single aggregated dataset for a given string identifier.
Loads data aggregated over a relation such as subject, project or tag.
- Parameters:
relation (str) – The thing over which the data were aggregated, e.g. ‘subjects’ or ‘tags’.
identifier (str) – The ID of the datasets, e.g. for data over subjects this would be lab/subject.
dataset (str, dict, list) – Filters datasets and returns only the ones matching the filename. Supports lists and asterisks as wildcards. May be a dict of ALF parts.
revision (str) – Filters datasets and returns only the ones matching the revision. Supports asterisks as wildcards.
download_only (bool) – When true the data are downloaded and the file path is returned.
- Returns:
Dataset or an ALFPath object if download_only is true.
- Return type:
pandas.DataFrame, one.alf.path.ALFPath
- Raises:
alferr.ALFObjectNotFound – No datasets match the object, attribute or revision filters for this relation and identifier, or the matching dataset was not found on disk (neither on the remote repository nor locally).
Examples
Load a dataset aggregated over a specific subject’s sessions
>>> trials = one.load_aggregate('subjects', 'SP026', '_ibl_subjectTraining.table')
- pid2eid(pid: str, query_type=None) → (str, str) [source]
Given an Alyx probe UUID string, returns the session id string and the probe label (i.e. the ALF collection).
NB: Requires a connection to the Alyx database.
- Parameters:
pid (str, uuid.UUID) – A probe UUID.
query_type (str) – Query mode - options include ‘remote’, and ‘refresh’.
- Returns:
str – Experiment ID (eid).
str – Probe label.
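Since pid2eid requires a database connection, it can be worth validating the probe UUID string locally before querying. A stdlib-only sketch (the UUID below is a made-up example, not a real probe ID):

```python
import uuid

# Validate a probe ID string locally before passing it to one.pid2eid.
# The UUID below is a made-up example, not a real probe ID.
def is_valid_pid(pid: str) -> bool:
    try:
        return str(uuid.UUID(pid)) == pid.lower()
    except ValueError:
        return False

print(is_valid_pid("b749446c-18e3-4987-820a-50649ab0f826"))  # True
print(is_valid_pid("not-a-uuid"))  # False
```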
- eid2pid(eid, query_type=None, details=False)[source]
Given an experiment UUID (eID), returns the probe IDs and the probe labels (i.e. the ALF collection).
NB: Requires a connection to the Alyx database.
- Parameters:
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
query_type (str) – Query mode - options include ‘remote’, and ‘refresh’.
details (bool) – Additionally return the complete Alyx records from insertions endpoint.
- Returns:
list of str – Probe UUIDs (pID).
list of str – Probe labels, e.g. ‘probe00’.
list of dict (optional) – If details is true, returns the Alyx records from insertions endpoint.
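The first two return values are parallel lists, so a common follow-up is to zip them into a label-to-pid mapping. A sketch with hypothetical return values:

```python
# eid2pid returns parallel lists of probe UUIDs and labels; pairing them
# gives a convenient lookup table.  The values below are hypothetical
# stand-ins for real return values.
pids = ["3675290c-8134-4598-b924-83edb7940269",
        "27bac116-ea57-4512-ad35-714a62d259cd"]
labels = ["probe00", "probe01"]

by_label = dict(zip(labels, pids))
print(by_label["probe00"])  # 3675290c-8134-4598-b924-83edb7940269
```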
- search_insertions(details=False, query_type=None, **kwargs)[source]
Searches insertions matching the given criteria and returns a list of matching probe IDs.
For a list of search terms, use the method
one.search_terms(query_type='remote', endpoint='insertions')
All of the search parameters, apart from dataset and dataset type, require a single value. For dataset and dataset type, a single value or a list can be provided. Insertions returned will contain all listed datasets.
- Parameters:
session (str) – A session eid, returns insertions associated with the session.
name (str) – An insertion label, returns insertions with specified name.
lab (str) – A lab name, returns insertions associated with the lab.
subject (str) – A subject nickname, returns insertions associated with the subject.
task_protocol (str) – A task protocol name (can be partial, i.e. any task protocol containing that str will be found).
project(s) (str) – The project name (can be partial, i.e. any project name containing that str will be found).
dataset (str) – A (partial) dataset name. Returns sessions containing matching datasets. A dataset matches if it contains the search string e.g. ‘wheel.position’ matches ‘_ibl_wheel.position.npy’. C.f. datasets argument.
datasets (str, list) – One or more exact dataset names. Returns insertions containing all these datasets.
dataset_qc_lte (int, str, one.alf.spec.QC) – The maximum QC value for associated datasets.
dataset_types (str, list) – One or more dataset_types (exact matching).
details (bool) – If true also returns a dict of dataset details.
query_type (str, None) – Query cache (‘local’) or Alyx database (‘remote’).
limit (int) – The number of results to fetch in one go (if pagination enabled on server).
- Returns:
list – List of probe IDs (pids).
(list of dicts) – If details is True, also returns a list of dictionaries, each entry corresponding to a matching insertion.
Notes
This method does not use the local cache and therefore cannot work in ‘local’ mode.
Examples
List the insertions associated with a given data release
>>> tag = '2022_Q2_IBL_et_al_RepeatedSite'
>>> ins = one.search_insertions(django='datasets__tags__name,' + tag)
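The django argument in the example above takes a comma-separated 'field,value' filter. A small helper for composing such strings (a sketch; only the 'datasets__tags__name' field path is taken from the example above, and other field paths are assumed to follow the same double-underscore convention):

```python
# Compose a Django-style filter string as used in the example above.
# The 'datasets__tags__name' field path comes from that example; other
# field paths are assumed to follow the same double-underscore convention.
def django_filter(field: str, value: str) -> str:
    return f"{field},{value}"

tag = "2022_Q2_IBL_et_al_RepeatedSite"
print(django_filter("datasets__tags__name", tag))
# datasets__tags__name,2022_Q2_IBL_et_al_RepeatedSite
```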
- search(details=False, query_type=None, **kwargs)[source]
Searches sessions matching the given criteria and returns a list of matching eids.
For a list of search terms, use the method
one.search_terms(query_type='remote')
For all search parameters, a single value or list may be provided. For dataset, the sessions returned will contain all listed datasets. For the other parameters, the session must contain at least one of the entries.
For all but date_range and number, any field that contains the search string is returned. Wildcards are not permitted; however, if the wildcards property is True, regular expressions may be used (see notes and examples).
- Parameters:
dataset (str) – A (partial) dataset name. Returns sessions containing matching datasets. A dataset matches if it contains the search string e.g. ‘wheel.position’ matches ‘_ibl_wheel.position.npy’. C.f. datasets argument.
date_range (str, list, datetime.datetime, datetime.date, pandas.timestamp) – A single date to search or a list of 2 dates that define the range (inclusive). To define only the upper or lower date bound, set the other element to None.
lab (str, list) – A lab name or list of lab names; returns sessions from any of these labs (can be partial, i.e. any lab name containing that str will be found).
number (str, int) – The number of the session to be returned, i.e. the number in sequence for a given date.
subject (str, list) – A subject nickname or list of nicknames; returns sessions for any of these subjects (can be partial, i.e. any subject nickname containing that str will be found).
task_protocol (str, list) – The task protocol name (can be partial, i.e. any task protocol containing that str will be found).
project(s) (str, list) – The project name (can be partial, i.e. any project name containing that str will be found).
performance_lte, performance_gte (float) – Search only for sessions whose performance is less than or equal / greater than or equal to a pre-defined threshold as a percentage (0-100).
users (str, list) – A list of users.
location (str, list) – A lab location name or list of names (as per the Alyx definition). Note: this corresponds to the specific rig, not the lab’s geographical location per se.
dataset_types (str, list) – One or more dataset types.
datasets (str, list) – One or more (exact) dataset names. Returns sessions containing all of these datasets.
dataset_qc_lte (int, str, one.alf.spec.QC) – The maximum QC value for associated datasets.
details (bool) – If true also returns a dict of dataset details.
query_type (str, None) – Query cache (‘local’) or Alyx database (‘remote’).
limit (int) – The number of results to fetch in one go (if pagination enabled on server).
- Returns:
list – List of eids.
(list of dicts) – If details is True, also returns a list of dictionaries, each entry corresponding to a matching session.
Examples
Search for sessions with ‘training’ in the task protocol.
>>> eids = one.search(task='training')
Search for sessions by subject ‘MFD_04’.
>>> eids = one.search(subject='MFD_04')
Do an exact search for sessions by subject ‘FD_04’.
>>> assert one.wildcards is True, 'the wildcards flag must be True for regex expressions'
>>> eids = one.search(subject='^FD_04$', query_type='local')
Search for sessions on a given date, in a given lab, containing trials and spike data.
>>> eids = one.search(date='2023-01-01', lab='churchlandlab', dataset=['trials', 'spikes'])
Notes
In default and local mode, most queries are case-sensitive partial matches. When lists are provided, the search is a logical OR, except for datasets, which is a logical AND.
All search terms are true for a session to be returned, i.e. subject matches AND project matches, etc.
In remote mode most queries are case-insensitive partial matches.
In default and local mode, when the one.wildcards flag is True (default), queries are interpreted as regular expressions. To turn this off set one.wildcards to False.
In remote mode regular expressions are only supported using the django argument.
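The partial dataset matching described above is a plain substring test: a dataset matches when it contains the search string. A stdlib sketch of that rule (per the notes above, matching is case-sensitive in default and local mode):

```python
# Partial dataset matching, as described above: a dataset matches when it
# contains the search string as a substring (case-sensitive in local mode).
def dataset_matches(search: str, dataset_name: str) -> bool:
    return search in dataset_name

print(dataset_matches("wheel.position", "_ibl_wheel.position.npy"))  # True
print(dataset_matches("wheel.position", "_ibl_trials.table.pqt"))    # False
```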
- static setup(base_url=None, **kwargs)[source]
Set up OneAlyx for a given database
- Parameters:
base_url (str) – An Alyx database URL. If None, the current default database is used.
kwargs – Optional arguments to pass to one.params.setup.
- Returns:
An instance of OneAlyx for the newly set up database URL.
- Return type:
OneAlyx
- eid2path(eid, query_type=None) ALFPath | Sequence[ALFPath] [source]
From an experiment ID, gets the local session path
- Parameters:
eid (str, UUID, pathlib.Path, dict, list) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
query_type (str) – If set to ‘remote’, will force database connection.
- Returns:
A session path or list of session paths.
- Return type:
one.alf.path.ALFPath, list
- path2eid(path_obj: str | Path, query_type=None) str | Sequence[str] [source]
From a local path, gets the experiment ID
- Parameters:
path_obj (str, pathlib.Path, list) – Local path or list of local paths.
query_type (str) – If set to ‘remote’, will force database connection.
- Returns:
An eid or list of eids.
- Return type:
str, list
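eid2path and path2eid convert between eids and local session paths. Assuming the standard ALF session-path layout (…/<subject>/<yyyy-mm-dd>/<nnn>), the session parts can be parsed locally; a stdlib sketch (the example path is hypothetical):

```python
import re

# Parse the trailing subject/date/number part of an ALF-style session path.
# The layout assumed here (.../<subject>/<yyyy-mm-dd>/<nnn>) follows the
# standard ALF convention; the example path below is hypothetical.
SESSION_RE = re.compile(
    r"(?P<subject>[^/]+)/(?P<date>\d{4}-\d{2}-\d{2})/(?P<number>\d{3})$")

def session_parts(path: str):
    m = SESSION_RE.search(path.replace("\\", "/"))
    return m.groupdict() if m else None

parts = session_parts("/data/cortexlab/Subjects/KS005/2019-04-02/001")
print(parts)  # {'subject': 'KS005', 'date': '2019-04-02', 'number': '001'}
```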
- path2url(filepath, query_type=None) str [source]
Given a local file path, returns the URL of the remote file.
- Parameters:
filepath (str, pathlib.Path) – A local file path
query_type (str) – If set to ‘remote’, will force database connection
- Returns:
A URL string
- Return type:
str
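The real path2url resolves the remote URL through the Alyx database. Conceptually, however, it maps the cache-relative part of a local file path onto a data-repository base URL; a stdlib sketch of that mapping (the base URL and paths here are hypothetical):

```python
from pathlib import PurePath, PurePosixPath

# Conceptual sketch only: the real path2url resolves the remote URL through
# the Alyx database.  Here we simply join the cache-relative part of a local
# file path onto a hypothetical HTTP data-server base URL.
def local_to_url(filepath: str, cache_dir: str, base_url: str) -> str:
    rel = PurePath(filepath).relative_to(cache_dir)
    return base_url.rstrip("/") + "/" + PurePosixPath(*rel.parts).as_posix()

url = local_to_url(
    "/data/cortexlab/Subjects/KS005/2019-04-02/001/alf/_ibl_wheel.position.npy",
    "/data",
    "https://example.org/data",  # hypothetical data server
)
print(url)
```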
- type2datasets(eid, dataset_type, details=False)[source]
Get list of datasets belonging to a given dataset type for a given session
- Parameters:
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
dataset_type (str, list) – An Alyx dataset type, e.g. camera.times or a list of dtypes
details (bool) – If True, a datasets DataFrame is returned
- Returns:
A numpy array of data, or a DataFrame if details is true
- Return type:
np.ndarray, pd.DataFrame
- dataset2type(dset) str [source]
Return dataset type from dataset.
NB: Requires an Alyx database connection
- Parameters:
dset (str, np.ndarray, tuple) – A dataset name, dataset uuid or dataset integer id
- Returns:
The dataset type
- Return type:
str
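dataset2type performs a database lookup, but ALF filenames already encode an object.attribute pair that dataset names are matched against (e.g. ‘wheel.position’ in ‘_ibl_wheel.position.npy’, as described under search). A heuristic stdlib sketch of extracting that pair; this is not the database lookup the method performs:

```python
import re

# Heuristic sketch, not the database lookup that dataset2type performs:
# ALF filenames such as '_ibl_wheel.position.npy' embed an optional
# _namespace_ prefix, an object.attribute pair and a file extension.
ALF_RE = re.compile(r"^(?:_[a-zA-Z0-9]+_)?(?P<object>\w+)\.(?P<attribute>[\w.]+?)\.\w+$")

def object_attribute(filename: str):
    m = ALF_RE.match(filename)
    return f"{m['object']}.{m['attribute']}" if m else None

print(object_attribute("_ibl_wheel.position.npy"))  # wheel.position
print(object_attribute("spikes.times.npy"))         # spikes.times
```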
- describe_revision(revision, full=False)[source]
Print description of a revision
- Parameters:
revision (str) – The name of the revision (without ‘#’)
full (bool) – If true, returns the matching record
- Returns:
None if full is false or no record is found; otherwise the record as a dict
- Return type:
None, dict
- get_details(eid: str, full: bool = False, query_type=None)[source]
Return session details for a given session.
- Parameters:
eid (str, UUID, pathlib.Path, dict, list) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
full (bool) – If True, returns a DataFrame of session and dataset info.
query_type ({'local', 'refresh', 'auto', 'remote'}) – The query mode - if ‘local’ the details are taken from the cache tables; if ‘remote’ the details are returned from the sessions REST endpoint; if ‘auto’ uses whichever mode ONE is in; if ‘refresh’ reloads the cache before querying.
- Returns:
In local mode, a session record, or a full DataFrame with dataset information if full is True; in remote mode, a full or partial session dict.
- Return type:
pd.Series, pd.DataFrame, dict
- Raises:
ValueError – Invalid experiment ID (failed to parse into eid string).
requests.exceptions.HTTPError – [Errno 404] Remote session not found on Alyx.