one.api¶
Classes for searching, listing and (down)loading Alyx files.
TODO:
- Document
- Add sig to ONE Light uuids
- Save changes to cache
- Fix update cache in AlyxONE - save parquet table
- Save parquet in update_filesystem
Points of discussion:
- Module structure: oneibl is too restrictive; naming the module one means the object should have a different name
- Download datasets timeout
- Support for pids?
- Need to check performance of (1) (re)setting the index and (2) converting an object array to a 2D int array
- NB: The sessions table is ordered by date. Indexing by eid is therefore O(N), but this is not done in the code. The datasets table has a sorted index.
- Conceivably you could have a subclass for Figshare, etc., not just Alyx
Functions
ONE – ONE API factory. Determine which class to instantiate depending on the parameters passed.
Classes
-
class
One
(cache_dir=None, mode='auto', wildcards=True)[source]¶ Bases:
one.converters.ConversionMixin
-
property
offline
¶
-
refresh_cache
(mode='auto')[source]¶ Check and reload cache tables
- Parameters
mode (str) – Options are ‘local’ (don’t reload); ‘refresh’ (reload); ‘auto’ (reload if expired); ‘remote’ (don’t reload)
- Returns
- Return type
Loaded timestamp
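The mode options above can be read as a small decision rule. The sketch below is illustrative only and is not the library's implementation: `should_reload`, `loaded_at` and `expiry` are hypothetical names, and the 24-hour expiry is an assumed value that does not come from this page.

```python
from datetime import datetime, timedelta


def should_reload(mode, loaded_at, now, expiry=timedelta(hours=24)):
    """Return True if the cache tables should be reloaded (illustrative rule).

    `expiry` is an assumed value, not taken from the documentation.
    """
    if mode in ('local', 'remote'):
        return False  # documented as "don't reload"
    if mode == 'refresh':
        return True  # documented as "reload"
    if mode == 'auto':
        # documented as "reload if expired"
        return loaded_at is None or now - loaded_at > expiry
    raise ValueError(f'unknown mode: {mode!r}')


now = datetime(2021, 6, 1, 12, 0)
stale = now - timedelta(days=2)     # loaded two days ago
fresh = now - timedelta(minutes=5)  # loaded five minutes ago
```

With these values, only 'refresh' and an expired 'auto' lead to a reload.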
-
search
(details=False, query_type=None, **kwargs)[source]¶ Searches sessions matching the given criteria and returns a list of matching eids
For a list of search terms, use the method
one.search_terms()
For all of the search parameters, a single value or list may be provided. For dataset, the sessions returned will contain all listed datasets. For the other parameters, the session must contain at least one of the entries. NB: Wildcards are not permitted; however, if the wildcards property is False, regular expressions may be used for all but number and date_range.
- Parameters
dataset (str, list) – list of dataset names. Returns sessions containing all these datasets. A dataset matches if it contains the search string e.g. ‘wheel.position’ matches ‘_ibl_wheel.position.npy’
date_range (str, list, datetime.datetime, datetime.date, pandas.timestamp) – A single date to search or a list of 2 dates that define the range (inclusive). To define only the upper or lower date bound, set the other element to None.
lab (str) – A str or list of lab names, returns sessions from any of these labs
number (str, int) – Number of session to be returned, i.e. number in sequence for a given date
subject (str, list) – A list of subject nicknames, returns sessions for any of these subjects
task_protocol (str) – The task protocol name (can be partial, i.e. any task protocol containing that str will be found)
project (str) – The project name (can be partial, i.e. any project containing that str will be found)
details (bool) – If true also returns a dict of dataset details
query_type (str, None) – Query cache (‘local’) or Alyx database (‘remote’)
- Returns
List of eids and, if details is True, a list of dictionaries, each entry
corresponding to a matching session
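The matching rules described for search (all listed datasets must be present, any one of the other listed values may match, and date ranges are inclusive with optional open bounds) can be illustrated on plain dictionaries. This is a sketch under stated assumptions, not the library's code: the `matches` helper, the session records and the eids 'a' and 'b' are invented for the example.

```python
from datetime import date


def matches(session, dataset=None, lab=None, subject=None, date_range=None):
    """Return True if a session record satisfies the documented search rules."""
    def as_list(value):
        return [value] if isinstance(value, str) else list(value)

    if dataset is not None:
        # every listed dataset must be present; a substring match is enough
        for term in as_list(dataset):
            if not any(term in name for name in session['datasets']):
                return False
    # for the other terms, any one of the listed values may match
    if lab is not None and session['lab'] not in as_list(lab):
        return False
    if subject is not None and session['subject'] not in as_list(subject):
        return False
    if date_range is not None:
        lower, upper = date_range  # either bound may be None (open-ended)
        if lower is not None and session['date'] < lower:
            return False
        if upper is not None and session['date'] > upper:
            return False
    return True


# Invented session records, for illustration only
sessions = [
    {'eid': 'a', 'lab': 'lab1', 'subject': 'mouse1', 'date': date(2020, 1, 5),
     'datasets': ['_ibl_wheel.position.npy', '_ibl_trials.intervals.npy']},
    {'eid': 'b', 'lab': 'lab2', 'subject': 'mouse2', 'date': date(2020, 2, 1),
     'datasets': ['_ibl_wheel.position.npy']},
]

both = [s['eid'] for s in sessions if matches(s, dataset='wheel.position')]
strict = [s['eid'] for s in sessions
          if matches(s, dataset=['wheel.position', 'trials.intervals'])]
by_lab = [s['eid'] for s in sessions if matches(s, lab=['lab1', 'lab2'])]
early = [s['eid'] for s in sessions
         if matches(s, date_range=[None, date(2020, 1, 31)])]
```

Here `strict` keeps only session 'a', because session 'b' lacks the trials dataset, while `by_lab` keeps both sessions.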
-
list_subjects
() → List[str][source]¶ List all subjects in database
- Returns
- Return type
Sorted list of subject names
-
list_datasets
(eid=None, collection=None, details=False, query_type=None) → Union[numpy.ndarray, pandas.core.frame.DataFrame][source]¶ Given an eid, return the datasets for those sessions. If no eid is provided, a list of all datasets is returned. When details is false, a sorted array of unique datasets is returned (their relative paths).
- Parameters
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
collection (str) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.
details (bool) – When true, a pandas DataFrame is returned, otherwise a numpy array of relative paths (collection/revision/filename) - see one.alf.spec.describe for details.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)
- Returns
- Return type
Slice of datasets table or numpy array if details is False
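The collection filter and the "sorted array of unique relative paths" result can be pictured with the standard library alone. The records and the `list_relative_paths` helper below are hypothetical, and shell-style matching through `fnmatchcase` is an assumption about how asterisks behave, not a statement about the library's internals.

```python
from fnmatch import fnmatchcase

# Hypothetical (collection, filename) records for one session
records = [
    ('alf/probe01', 'spikes.times.npy'),
    ('alf/probe00', 'spikes.times.npy'),
    ('alf', '_ibl_trials.intervals.npy'),
    ('alf/probe01', 'spikes.times.npy'),  # duplicate entry
]


def list_relative_paths(records, collection=None):
    """Return sorted, unique 'collection/filename' strings.

    `collection` may contain asterisks, matched here with shell-style rules.
    """
    paths = {
        f'{coll}/{name}' for coll, name in records
        if collection is None or fnmatchcase(coll, collection)
    }
    return sorted(paths)


all_paths = list_relative_paths(records)
probe_paths = list_relative_paths(records, collection='alf/probe*')
```

The duplicate record collapses to one entry, and the 'alf/probe*' pattern excludes the plain 'alf' collection.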
-
list_collections
(eid=None, details=False) → Union[numpy.ndarray, dict][source]¶ List the collections for a given experiment. If no experiment id is given, all collections are returned.
- Parameters
eid ([str, UUID, Path, dict]) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path
details (bool) – If true a dict of pandas datasets tables is returned with collections as keys, otherwise a numpy array of unique collections
- Returns
- Return type
A numpy array of unique collections or dict of datasets tables
-
list_revisions
(eid=None, dataset=None, collection=None, details=False)[source]¶ List the revisions for a given experiment. If no experiment id is given, all revisions are returned.
- Parameters
eid ([str, UUID, Path, dict]) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path
details (bool) – If true a dict of pandas datasets tables is returned with revisions as keys, otherwise a numpy array of unique revisions
- Returns
- Return type
A numpy array of unique revisions or dict of datasets tables
-
load_object
(eid: Union[str, pathlib.Path, uuid.UUID], obj: str, collection: Optional[str] = None, revision: Optional[str] = None, query_type: Optional[str] = None, download_only: bool = False, **kwargs) → Union[one.alf.io.AlfBunch, List[pathlib.Path]][source]¶ Load all attributes of an ALF object from a Session ID and an object name. Any datasets with matching object name will be loaded.
- Parameters
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
obj (str) – The ALF object to load. Supports asterisks as wildcards.
collection (str) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.
revision (str) – The dataset revision (typically an ISO date). If no exact match, the previous revision (ordered lexicographically) is returned. If None, the default revision is returned (usually the most recent revision). Regular expressions/wildcards not permitted.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)
download_only (bool) – When true the data are downloaded and the file path is returned.
kwargs (dict) – Additional filters for datasets, including namespace and timescale. For full list see the one.alf.spec.describe function.
- Returns
- Return type
An ALF bunch or if download_only is True, a list of Paths objects
Examples
load_object(eid, 'moves')
load_object(eid, 'trials')
load_object(eid, 'spikes', collection='probe01')  # wildcards is True
load_object(eid, 'spikes', collection='.*probe01')  # wildcards is False
load_object(eid, 'spikes', namespace='ibl')
load_object(eid, 'spikes', timescale='ephysClock')
# Load specific attributes
load_object(eid, 'spikes', attribute=['times', 'clusters'])
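The revision rule described for the revision parameter (use an exact match, otherwise fall back to the previous revision in lexicographic order) can be sketched with `bisect`. The `pick_revision` helper is hypothetical. Returning None when no earlier revision exists is an assumption, since this page does not say what happens in that case, and the default-revision behaviour for `revision=None` is not modelled.

```python
from bisect import bisect_right


def pick_revision(available, requested):
    """Return the exact revision if present, else the previous one in
    lexicographic order, else None when nothing precedes the request."""
    ordered = sorted(available)
    # index of the first revision strictly greater than the request
    idx = bisect_right(ordered, requested)
    return ordered[idx - 1] if idx else None


revisions = ['2020-01-08', '2020-08-31', '2021-03-15']
exact = pick_revision(revisions, '2020-08-31')
previous = pick_revision(revisions, '2020-12-01')
nothing = pick_revision(revisions, '2019-01-01')
```

ISO dates sort correctly as strings, which is why lexicographic ordering suits date-like revision names.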
-
load_dataset
(eid: Union[str, pathlib.Path, uuid.UUID], dataset: str, collection: Optional[str] = None, revision: Optional[str] = None, query_type: Optional[str] = None, download_only: bool = False, **kwargs) → Any[source]¶ Load a single dataset for a given session id and dataset name
- Parameters
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
dataset (str, dict) – The ALF dataset to load. May be a string or dict of ALF parts. Supports asterisks as wildcards.
collection (str) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.
revision (str) – The dataset revision (typically an ISO date). If no exact match, the previous revision (ordered lexicographically) is returned. If None, the default revision is returned (usually the most recent revision). Regular expressions/wildcards not permitted.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)
download_only (bool) – When true the data are downloaded and the file path is returned.
- Returns
- Return type
Dataset or a Path object if download_only is true.
Examples
intervals = one.load_dataset(eid, '_ibl_trials.intervals.npy')
# Load dataset without specifying extension
intervals = one.load_dataset(eid, 'trials.intervals')  # wildcard mode only
filepath = one.load_dataset(eid, '_ibl_trials.intervals.npy', download_only=True)
spike_times = one.load_dataset(eid, 'spikes.times.npy', collection='alf/probe01')
old_spikes = one.load_dataset(eid, 'spikes.times.npy',
                              collection='alf/probe01', revision='2020-08-31')
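The difference between wildcard mode and regular-expression mode for dataset names can be shown with the standard library. This is a minimal sketch, not the library's matching code: `filter_names` is a hypothetical helper, and requiring the pattern to match the whole name is an assumption.

```python
import re
from fnmatch import translate

names = ['_ibl_trials.intervals.npy', '_ibl_trials.choice.npy', 'spikes.times.npy']


def filter_names(names, pattern, wildcards=True):
    """Filter names with a shell-style pattern or a regular expression."""
    # fnmatch.translate converts a shell-style pattern into a regex string
    regex = re.compile(translate(pattern) if wildcards else pattern)
    return [n for n in names if regex.fullmatch(n)]


shell_hits = filter_names(names, '*trials.*', wildcards=True)
regex_hits = filter_names(names, r'.*trials\..*', wildcards=False)
```

Both calls select the two trials datasets; only the pattern syntax differs.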
-
load_datasets
(eid: Union[str, pathlib.Path, uuid.UUID], datasets: List[str], collections: Optional[str] = None, revisions: Optional[str] = None, query_type: Optional[str] = None, assert_present=True, download_only: bool = False, **kwargs) → Any[source]¶ Load datasets for a given session id. Returns two lists, each the length of datasets. The first is the data (or file paths if download_only is true), the second is a list of meta data Bunches. If assert_present is false, missing data will be returned as None.
- Parameters
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
datasets (list of strings) – The ALF datasets to load. May be a string or dict of ALF parts. Supports asterisks as wildcards.
collections (str, list) – The collection(s) to which the object(s) belong, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.
revisions (str, list) – The dataset revision (typically an ISO date). If no exact match, the previous revision (ordered lexicographically) is returned. If None, the default revision is returned (usually the most recent revision). Regular expressions/wildcards not permitted.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)
assert_present (bool) – If true, missing datasets raise an error, otherwise None is returned
download_only (bool) – When true the data are downloaded and the file path is returned.
- Returns
Returns a list of data (or file paths) the length of datasets, and a list of
meta data Bunches. If assert_present is False, missing data will be None
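The return shape described here, two lists aligned with the requested datasets and None standing in for anything missing, can be sketched as follows. The `load_many` helper and the `available` mapping are hypothetical stand-ins. Raising KeyError for a missing dataset and using None for the missing metadata entry are assumptions, not documented behaviour.

```python
def load_many(datasets, available, assert_present=True):
    """Return (data, meta) lists, each the same length as `datasets`.

    `available` is a hypothetical mapping of dataset name -> (data, record).
    """
    data, meta = [], []
    for name in datasets:
        if name in available:
            values, record = available[name]
        elif assert_present:
            raise KeyError(f'dataset not found: {name}')
        else:
            values, record = None, None  # placeholder keeps the lists aligned
        data.append(values)
        meta.append(record)
    return data, meta


available = {
    'spikes.times.npy': ([0.1, 0.2], {'collection': 'alf/probe01'}),
}
wanted = ['spikes.times.npy', 'spikes.clusters.npy']
data, meta = load_many(wanted, available, assert_present=False)
```

Because the outputs keep the order of the input list, a caller can pair them with `zip(wanted, data, meta)`.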
-
load_dataset_from_id
(dset_id: Union[str, uuid.UUID], download_only: bool = False, details: bool = False) → Any[source]¶ Load a dataset given a dataset UUID
- Parameters
dset_id (uuid.UUID, str) – A dataset UUID to load
download_only (bool) – If true the dataset is downloaded (if necessary) and the filepath returned
details (bool) – If true a pandas Series is returned in addition to the data
- Returns
- Return type
Dataset data (or filepath if download_only) and dataset record if details is True
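Since dset_id may be given either as a string or as a uuid.UUID, a caller may want to validate and normalise it before use. The sketch below shows one way to do that with the standard library; `to_uuid_str` is a hypothetical helper and says nothing about how the library handles the argument internally.

```python
from uuid import UUID


def to_uuid_str(dset_id):
    """Return the canonical hyphenated form of a UUID given as str or UUID.

    Raises ValueError if the input is not a valid UUID.
    """
    return str(UUID(str(dset_id)))


from_str = to_uuid_str('12345678123456781234567812345678')
from_obj = to_uuid_str(UUID('12345678-1234-5678-1234-567812345678'))
```

Both inputs normalise to the same hyphenated string, so either form can be used as a lookup key.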
-
property
-
ONE
(*, mode='auto', wildcards=True, **kwargs)[source]¶ ONE API factory Determine which class to instantiate depending on parameters passed.
- Parameters
mode (str) – Query mode, options include ‘auto’, ‘local’ (offline) and ‘remote’ (online only). Most methods have a query_type parameter that can override the class mode.
wildcards (bool) – If true all methods use unix shell style pattern matching, otherwise regular expressions are used.
cache_dir (str, Path) – Path to the data files. If Alyx parameters have been set up for this location, a OneAlyx instance is returned. If cache_dir and base_url are None, the default location is used.
base_url (str) – An Alyx database URL. The URL must start with ‘http’.
username (str) – An Alyx database login username.
password (str) – An Alyx database password.
cache_rest (str) – If not in ‘local’ mode, this determines which http request types to cache. Default is ‘GET’. Use None to deactivate cache (not recommended).
- Returns
- Return type
A One instance if mode is ‘local’, otherwise a OneAlyx instance.
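The factory behaviour described above (return the offline class in 'local' mode, otherwise the database-backed class) follows a common pattern, sketched here with stand-in classes. `LocalOne`, `RemoteOne` and `make_one` are hypothetical names and are not the real one.api classes. Dropping database-only arguments in local mode is a choice made for this sketch, and the URL is a placeholder.

```python
class LocalOne:
    """Stand-in for the offline class; not the real one.api.One."""

    def __init__(self, cache_dir=None, mode='auto', wildcards=True):
        self.cache_dir, self.mode, self.wildcards = cache_dir, mode, wildcards


class RemoteOne(LocalOne):
    """Stand-in for the Alyx-backed class; not the real one.api.OneAlyx."""

    def __init__(self, base_url=None, **kwargs):
        # the documentation states that the URL must start with 'http'
        if base_url is not None and not base_url.startswith('http'):
            raise ValueError("base_url must start with 'http'")
        super().__init__(**kwargs)
        self.base_url = base_url


def make_one(*, mode='auto', wildcards=True, **kwargs):
    """Return a LocalOne in 'local' mode, otherwise a RemoteOne."""
    if mode == 'local':
        # keep only the argument the offline class understands (sketch choice)
        kwargs = {k: v for k, v in kwargs.items() if k == 'cache_dir'}
        return LocalOne(mode=mode, wildcards=wildcards, **kwargs)
    return RemoteOne(mode=mode, wildcards=wildcards, **kwargs)


offline = make_one(mode='local', cache_dir='/tmp/cache')
online = make_one(base_url='https://example.org', wildcards=False)
```

Keyword-only arguments, as in the documented signature, prevent a cache path from being passed positionally by mistake.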
-
class
OneAlyx
(username=None, password=None, base_url=None, cache_dir=None, mode='auto', wildcards=True, **kwargs)[source]¶ Bases:
one.api.One
-
property
alyx
¶
-
search_terms
(query_type=None)[source]¶ Returns a list of search terms to be passed as kwargs to the search method
- Parameters
query_type (str) – If ‘remote’, the search terms are largely determined by the REST endpoint used
- Returns
- Return type
Tuple of search strings
-
list_datasets
(eid=None, collection=None, details=False, query_type=None) → Union[numpy.ndarray, pandas.core.frame.DataFrame][source]¶ Given an eid, return the datasets for those sessions. If no eid is provided, a list of all datasets is returned. When details is false, a sorted array of unique datasets is returned (their relative paths).
- Parameters
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
collection (str) – The collection to which the object belongs, e.g. ‘alf/probe01’. This is the relative path of the file from the session root. Supports asterisks as wildcards.
details (bool) – When true, a pandas DataFrame is returned, otherwise a numpy array of relative paths (collection/revision/filename) - see one.alf.spec.describe for details.
query_type (str) – Query cache (‘local’) or Alyx database (‘remote’)
- Returns
- Return type
Slice of datasets table or numpy array if details is False
-
pid2eid
(pid: str, query_type=None) → (str, str)[source]¶ Given an Alyx probe UUID string, returns the session id string and the probe label (i.e. the ALF collection)
- Parameters
pid (str, uuid.UUID) – A probe UUID
query_type (str) – Query mode - options include ‘remote’, and ‘refresh’
- Returns
- Return type
experiment ID, probe label
-
search
(details=False, query_type=None, **kwargs)[source]¶ Searches sessions matching the given criteria and returns a list of matching eids
For a list of search terms, use the method
one.search_terms(query_type=’remote’)
For all of the search parameters, a single value or list may be provided. For dataset, the sessions returned will contain all listed datasets. For the other parameters, the session must contain at least one of the entries. NB: Wildcards are not permitted; however, if the wildcards property is False, regular expressions may be used for all but number and date_range.
- Parameters
dataset (str, list) – list of dataset names. Returns sessions containing all these datasets. A dataset matches if it contains the search string e.g. ‘wheel.position’ matches ‘_ibl_wheel.position.npy’
date_range (str, list, datetime.datetime, datetime.date, pandas.timestamp) – A single date to search or a list of 2 dates that define the range (inclusive). To define only the upper or lower date bound, set the other element to None.
lab (str, list) – A str or list of lab names, returns sessions from any of these labs
number (str, int) – Number of session to be returned, i.e. number in sequence for a given date
subject (str, list) – A list of subject nicknames, returns sessions for any of these subjects
task_protocol (str, list) – The task protocol name (can be partial, i.e. any task protocol containing that str will be found)
project (str, list) – The project name (can be partial, i.e. any project containing that str will be found)
performance_lte / performance_gte – Search only for sessions whose performance is less than or equal to (performance_lte) or greater than or equal to (performance_gte) a pre-defined threshold, given as a percentage (0-100)
users (str, list) – A list of users
location (str, list) – A str or list of lab location names (as per the Alyx definition). Note: this corresponds to the specific rig, not the geographical location of the lab.
dataset_types (str, list) – One or more of dataset_types
details (bool) – If true also returns a dict of dataset details
query_type (str, None) – Query cache (‘local’) or Alyx database (‘remote’)
limit (int) – The number of results to fetch in one go (if pagination enabled on server)
- Returns
List of eids and, if details is True, also returns a list of dictionaries, each entry
corresponding to a matching session
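The limit parameter sets how many results are fetched in one go when the server paginates. The general idea of collecting a full result set page by page can be sketched as below. `fetch_all`, `fetch_page` and the fake eids are hypothetical, and the offset-based scheme is an assumption, as this page does not describe how pagination is implemented.

```python
def fetch_all(fetch_page, limit=2):
    """Yield every record by requesting `limit` records at a time.

    `fetch_page(offset, limit)` is a hypothetical callable standing in for
    a paginated REST request; it returns a list, empty once exhausted.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        if not page:
            return
        yield from page
        offset += len(page)


fake_eids = [f'eid-{i}' for i in range(5)]  # invented identifiers
calls = []  # records each (offset, limit) request for inspection


def fake_fetch(offset, limit):
    calls.append((offset, limit))
    return fake_eids[offset:offset + limit]


results = list(fetch_all(fake_fetch, limit=2))
```

A smaller limit means more round trips but a smaller first response, which is the trade-off such a parameter usually controls.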
-
eid2path
(eid: str, query_type=None) → Union[pathlib.Path, Sequence[pathlib.Path]][source]¶ From an experiment ID gets the local session path
- Parameters
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
query_type (str) – If set to ‘remote’, will force database connection
- Returns
- Return type
A session path or list of session paths
-
path2eid
(path_obj: Union[str, pathlib.Path], query_type=None) → Union[pathlib.Path, Sequence[pathlib.Path]][source]¶ From a local path, gets the experiment ID
- Parameters
path_obj (str, pathlib.Path, list) – Local path or list of local paths
query_type (str) – If set to ‘remote’, will force database connection
- Returns
- Return type
An eid or list of eids
-
path2url
(filepath, query_type=None)[source]¶ Given a local file path, returns the URL of the remote file.
- Parameters
filepath (str, pathlib.Path) – A local file path
query_type (str) – If set to ‘remote’, will force database connection
- Returns
- Return type
A URL string
-
type2datasets
(eid, dataset_type, details=False)[source]¶ Get list of datasets belonging to a given dataset type for a given session
- Parameters
eid (str, UUID, pathlib.Path, dict) – Experiment session identifier; may be a UUID, URL, experiment reference string, details dict or Path.
dataset_type (str, list) – An Alyx dataset type, e.g. camera.times or a list of dtypes
details (bool) – If True, a datasets DataFrame is returned
- Returns
- Return type
A numpy array of data, or DataFrame if details is true
-
property