one.util
Decorators and small standalone functions for api module.
Module attributes
The cache table QC column data type.
Functions
Return a typing.Union of the input and a sequence of the input.
Validate search term and return complete name, e.g. autocomplete('subj') == 'subject'.
Convert int ids to str ids for cache table.
Extract datasets DataFrame from one or more Alyx dataset records.
Ensure input is a list.
Filter the datasets cache table by the relative path (dataset name, collection and revision).
Filter datasets by revision, returning the previous revision in an ordered list if the revision doesn't exactly match.
Return the index of the string that occurs directly before the provided revision string when lexicographically sorted.
Ensure the input experiment identifier is an experiment UUID string.
Reformat older cache tables to comply with this version of ONE.
Refresh cache depending on query_type kwarg.
Extract session cache record and datasets cache from a remote session data record.
Validate and arrange a date range into a two-element list.
Classes
Using a paginated response object or list of session records, extract the eid string when required.
- QC_TYPE = CategoricalDtype(categories=['NOT_SET', 'PASS', 'WARNING', 'FAIL', 'CRITICAL'], ordered=True, categories_dtype=object)
The cache table QC column data type.
- Type:
pandas.api.types.CategoricalDtype
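Because the categories are ordered, QC values can be compared with one another, which is what allows a filter such as "at or below WARNING". The following is a minimal standard-library sketch of that idea, not the pandas dtype itself. The QCLevel enum and the at_or_below helper are hypothetical names introduced for illustration, and the integer values here are simply positions in the category list, not the values of the one.alf.spec.QC enumeration.

```python
from enum import IntEnum


class QCLevel(IntEnum):
    """Hypothetical stand-in that mirrors the category order of QC_TYPE."""
    NOT_SET = 0
    PASS = 1
    WARNING = 2
    FAIL = 3
    CRITICAL = 4


def at_or_below(levels, threshold, ignore_not_set=False):
    """Keep the QC levels that are no worse than `threshold`."""
    # Accept either a level name or an integer position (assumption of this sketch)
    threshold = QCLevel[threshold] if isinstance(threshold, str) else QCLevel(threshold)
    kept = [lvl for lvl in levels if lvl <= threshold]
    if ignore_not_set:
        kept = [lvl for lvl in kept if lvl is not QCLevel.NOT_SET]
    return kept


levels = list(QCLevel)
print([lvl.name for lvl in at_or_below(levels, 'WARNING')])
print([lvl.name for lvl in at_or_below(levels, 'PASS', ignore_not_set=True)])
```

In this sketch NOT_SET sorts below PASS, so it is kept by default and has to be excluded explicitly.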
- ses2records(ses: dict)
Extract session cache record and datasets cache from a remote session data record.
- Parameters:
ses (dict) – Session dictionary from Alyx REST endpoint.
- Returns:
pd.Series – Session record.
pd.DataFrame – Datasets frame.
- datasets2records(datasets, additional=None) -> DataFrame
Extract datasets DataFrame from one or more Alyx dataset records.
- Parameters:
datasets (dict, list) – One or more records from the Alyx ‘datasets’ endpoint.
additional (list of str) – A set of optional fields to extract from dataset records.
- Returns:
Datasets frame.
- Return type:
pd.DataFrame
Examples
>>> datasets = ONE().alyx.rest('datasets', 'list', subject='foobar')
>>> df = datasets2records(datasets)
- parse_id(method)
Ensures the input experiment identifier is an experiment UUID string.
- Parameters:
method (function) – A ONE method whose second arg is an experiment ID.
- Returns:
A wrapper function that parses the ID to the expected string.
- Return type:
function
- Raises:
ValueError – Unable to convert input to a valid experiment ID.
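The decorator pattern described above can be sketched with the standard library alone. This is an illustrative sketch, not the ONE implementation: parse_id_sketch and the Client class are hypothetical names, and the sketch only accepts UUID objects and UUID strings, whereas the real decorator may accept other kinds of experiment identifier.

```python
import functools
import uuid


def parse_id_sketch(method):
    """Wrap `method` so that its second positional argument is a UUID string."""
    @functools.wraps(method)
    def wrapper(self, id, *args, **kwargs):
        try:
            # Accept UUID objects or anything whose str() is a valid UUID
            eid = str(uuid.UUID(str(id)))
        except ValueError as ex:
            raise ValueError(f'Cannot parse experiment ID: {id!r}') from ex
        return method(self, eid, *args, **kwargs)
    return wrapper


class Client:
    """Hypothetical class standing in for an object with ONE-like methods."""

    @parse_id_sketch
    def describe(self, eid):
        return f'session {eid}'


client = Client()
print(client.describe(uuid.UUID('b1c968ad-4874-468d-b2e4-5ffa9b9964e9')))
```

The wrapped method always receives a normalised string, so its body does not need to handle the different input types itself.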
- validate_date_range(date_range) -> (pd.Timestamp, pd.Timestamp)
Validate and arrange a date range into a two-element list.
- Parameters:
date_range (str, datetime.date, datetime.datetime, pd.Timestamp, np.datetime64, list, None) – A single date or tuple/list of two dates. None represents no bound.
- Returns:
The start and end timestamps.
- Return type:
tuple of pd.Timestamp
Examples
>>> validate_date_range('2020-01-01')  # On this day
>>> validate_date_range(datetime.date(2020, 1, 1))
>>> validate_date_range(np.array(['2022-01-30', '2022-01-30'], dtype='datetime64[D]'))
>>> validate_date_range(pd.Timestamp(2020, 1, 1))
>>> validate_date_range(np.datetime64('2021-03-11'))
>>> validate_date_range(['2020-01-01'])  # from date
>>> validate_date_range(['2020-01-01', None])  # from date
>>> validate_date_range([None, '2020-01-01'])  # up to date
- Raises:
ValueError – Size of date range tuple must be 1 or 2.
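The rules above (a single date covers that day, a one-element list is an open-ended "from" range, and None inside a list means no bound) can be sketched with the datetime module. This is a simplified illustration under stated assumptions, not the library code: date_range_sketch and to_datetime are hypothetical names, the sketch returns datetime objects rather than pd.Timestamp, it handles only ISO strings, date and datetime inputs, it does not handle a top-level None, and the exact end-of-day instant used for a single date is an assumption.

```python
from datetime import date, datetime, timedelta


def to_datetime(value):
    """Convert an ISO date string, date or datetime to a datetime."""
    if isinstance(value, datetime):
        return value
    if isinstance(value, date):
        return datetime(value.year, value.month, value.day)
    return datetime.fromisoformat(value)


def date_range_sketch(date_range):
    """Return a (start, end) pair from a single date or a 1- or 2-element list."""
    if not isinstance(date_range, (list, tuple)):
        # A single date: cover that whole day (assumed to end one microsecond
        # before the following midnight)
        start = to_datetime(date_range)
        return start, start + timedelta(days=1) - timedelta(microseconds=1)
    if len(date_range) not in (1, 2):
        raise ValueError('Size of date range tuple must be 1 or 2.')
    # Pad a one-element list with None, i.e. no upper bound
    bounds = list(date_range) + [None] * (2 - len(date_range))
    start = datetime.min if bounds[0] is None else to_datetime(bounds[0])
    end = datetime.max if bounds[1] is None else to_datetime(bounds[1])
    return start, end


print(date_range_sketch('2020-01-01'))
print(date_range_sketch(['2020-01-01']))
print(date_range_sketch([None, date(2020, 1, 1)]))
```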
- filter_datasets(all_datasets, filename=None, collection=None, revision=None, revision_last_before=True, qc=QC.FAIL, ignore_qc_not_set=False, assert_unique=True, wildcards=False)
Filter the datasets cache table by the relative path (dataset name, collection and revision). When None is passed, all values will match. To match on empty parts, use an empty string. When revision_last_before is true, a revision of None returns the default revision or, if no default is defined, the latest revision.
- Parameters:
all_datasets (pandas.DataFrame) – A datasets cache table.
filename (str, dict, None) – A filename str or a dict of alf parts. Regular expressions permitted.
collection (str, None) – A collection string. Regular expressions permitted.
revision (str, None) – A revision string to match. If revision_last_before is true, regular expressions are not permitted.
revision_last_before (bool) – When true and no exact match exists, the (lexicographically) previous revision is used instead. When false the revision string is matched like collection and filename, with regular expressions permitted. NB: When true and revision is None the default revision is returned which may not be the last revision. If no default is defined, the last revision is returned.
qc (str, int, one.alf.spec.QC) – Returns datasets at or below this QC level. Integer values should correspond to the QC enumeration NOT the qc category column codes in the pandas table.
ignore_qc_not_set (bool) – When true, do not return datasets for which QC is NOT_SET.
assert_unique (bool) – When true an error is raised if multiple collections or datasets are found.
wildcards (bool) – If true, use unix shell style matching instead of regular expressions.
- Returns:
A slice of all_datasets that match the filters.
- Return type:
pd.DataFrame
Examples
Filter by dataset name and collection
>>> datasets = filter_datasets(all_datasets, '.*spikes.times.*', 'alf/probe00')
Filter datasets not in a collection
>>> datasets = filter_datasets(all_datasets, collection='')
Filter by matching revision
>>> datasets = filter_datasets(all_datasets, 'spikes.times.npy',
...                            revision='2020-01-12', revision_last_before=False)
Filter by filename parts
>>> datasets = filter_datasets(all_datasets, dict(object='spikes', attribute='times'))
Filter by QC outcome - datasets with WARNING or better
>>> datasets = filter_datasets(all_datasets, qc='WARNING')
Filter by QC outcome and ignore datasets with unset QC - datasets with PASS only
>>> datasets = filter_datasets(all_datasets, qc='PASS', ignore_qc_not_set=True)
- Raises:
one.alf.exceptions.ALFMultipleCollectionsFound – The matching list of datasets has more than one unique collection and assert_unique is True.
one.alf.exceptions.ALFMultipleRevisionsFound – When revision_last_before is false, the matching list of datasets has more than one unique revision. When revision_last_before is true, a ‘default_revision’ column exists, and no revision is passed, this error means that one or more matching datasets have multiple revisions specified as the default. This is typically an error in the cache table itself as all datasets should have one and only one default revision specified.
one.alf.exceptions.ALFMultipleObjectsFound – The matching list of datasets has more than one unique filename and both assert_unique and revision_last_before are true.
one.alf.exceptions.ALFError – When both assert_unique and revision_last_before are true, and a ‘default_revision’ column exists but revision is None; one or more matching datasets have no default revision specified. This is typically an error in the cache table itself as all datasets should have one and only one default revision specified.
Notes
It is not possible to match datasets that are in a given collection OR NOT in ANY collection. e.g. filter_datasets(dsets, collection=['alf', '']) will not match the latter. For this you must use two separate queries.
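The difference between regular-expression matching and the unix shell style matching selected by the wildcards parameter can be illustrated with the standard library. This is a sketch of the general technique only: match_names is a hypothetical helper, and how filter_datasets anchors or combines its patterns internally is not covered here (the sketch assumes patterns are matched from the start of the name).

```python
import fnmatch
import re


def match_names(names, pattern, wildcards=False):
    """Return the names matching `pattern` as a regex or a shell-style wildcard."""
    if wildcards:
        # fnmatch.translate converts a shell-style pattern into an equivalent regex
        pattern = fnmatch.translate(pattern)
    regex = re.compile(pattern)
    return [name for name in names if regex.match(name)]


names = ['spikes.times.npy', 'spikes.clusters.npy', 'spikesXtimes.npy']
# As a regular expression, '.' matches any character
print(match_names(names, 'spikes.times.*'))
# As a shell-style wildcard, '.' is literal and '*' matches anything
print(match_names(names, 'spikes.times.*', wildcards=True))
```

With regular expressions an unescaped dot can match more than intended, which is one reason shell-style matching may be preferable for literal filenames.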
- filter_revision_last_before(datasets, revision=None, assert_unique=True, assert_consistent=False)
Filter datasets by revision, returning the previous revision in an ordered list if the revision doesn’t exactly match.
- Parameters:
datasets (pandas.DataFrame) – A datasets cache table.
revision (str) – A revision string to match (regular expressions not permitted).
assert_unique (bool) – When true an alferr.ALFMultipleRevisionsFound exception is raised when multiple default revisions are found; an alferr.ALFError when no default revision is found.
assert_consistent (bool) – Will raise alferr.ALFMultipleRevisionsFound if matching revision is different between datasets.
- Returns:
A datasets DataFrame with 0 or 1 row per unique dataset.
- Return type:
pd.DataFrame
- Raises:
one.alf.exceptions.ALFMultipleRevisionsFound – When the ‘default_revision’ column exists and no revision is passed, this error means that one or more matching datasets have multiple revisions specified as the default. This is typically an error in the cache table itself as all datasets should have one and only one default revision specified. When assert_consistent is True, this error may mean that the matching datasets have mixed revisions.
one.alf.exceptions.ALFMultipleObjectsFound – The matching list of datasets has more than one unique filename and both assert_unique and revision_last_before are true.
one.alf.exceptions.ALFError – When both assert_unique and revision_last_before are true, and a ‘default_revision’ column exists but revision is None; one or more matching datasets have no default revision specified. This is typically an error in the cache table itself as all datasets should have one and only one default revision specified.
Notes
When revision is not None, the default revision value is not used. If an older revision is the default one (uncommon), passing in a revision may lead to a newer revision being returned than if revision is None.
A view is returned if a revision column is present, otherwise a copy is returned.
- index_last_before(revisions: List[str], revision: str | None) -> int | None
Return the index of the string that occurs directly before the provided revision string when lexicographically sorted. If revision is None, the index of the most recent revision is returned.
- Parameters:
revisions (list of strings) – A list of revision strings.
revision (None, str) – The revision string to match on.
- Returns:
Index of revision before matching string in sorted list or None.
- Return type:
int, None
Examples
>>> idx = index_last_before([], '2020-08-01')
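The lookup described above can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the library code: index_last_before_sketch is a hypothetical name, and it assumes that an exact match counts as "before", which is consistent with the previous revision only being used when no exact match exists.

```python
from typing import List, Optional


def index_last_before_sketch(revisions: List[str], revision: Optional[str]) -> Optional[int]:
    """Return the index in `revisions` of the latest entry not after `revision`."""
    if not revisions:
        return None
    if revision is None:
        # No revision requested: use the most recent (lexicographically greatest) one
        return revisions.index(max(revisions))
    # Assumption: an exact match is included among the candidates
    candidates = [r for r in revisions if r <= revision]
    return revisions.index(max(candidates)) if candidates else None


revisions = ['2021-01-01', '2020-01-01', '2020-09-01']
print(index_last_before_sketch(revisions, '2020-08-01'))  # 1
print(index_last_before_sketch(revisions, None))          # 0
print(index_last_before_sketch([], '2020-08-01'))         # None
```

The returned index refers to the position in the original, unsorted list, so it can be used to look up the matching entry directly.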
- autocomplete(term, search_terms) -> str
Validate search term and return complete name, e.g. autocomplete('subj') == 'subject'.
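One way such prefix completion could work is sketched below. This is an assumption-laden illustration, not the ONE implementation: autocomplete_sketch is a hypothetical name, the list of search terms is made up for the example, and the choice to raise ValueError for unknown or ambiguous terms is an assumption.

```python
def autocomplete_sketch(term, search_terms):
    """Return the single search term that starts with `term`."""
    term = term.lower()
    if term in search_terms:
        return term  # an exact match is never ambiguous
    matches = [x for x in search_terms if x.lower().startswith(term)]
    if not matches:
        raise ValueError(f'Invalid search term "{term}"')
    if len(matches) > 1:
        raise ValueError(f'Ambiguous search term "{term}": {matches}')
    return matches[0]


# Hypothetical search terms for illustration
terms = ('subject', 'date_range', 'dataset', 'lab')
print(autocomplete_sketch('subj', terms))
```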
- class LazyId(pg, func=None)
Bases: Mapping
Using a paginated response object or list of session records, extracts eid string when required.
- cache_int2str(table: DataFrame) -> DataFrame
Convert int ids to str ids for cache table.
- Parameters:
table (pd.DataFrame) – A cache table (from One._cache).
- patch_cache(table: DataFrame, min_api_version=None, name=None) -> DataFrame
Reformat older cache tables to comply with this version of ONE.
Currently this function will:
1. convert integer UUIDs to string UUIDs;
2. rename the ‘project’ column to ‘projects’.
- Parameters:
table (pd.DataFrame) – A cache table (from One._cache).
min_api_version (str) – The minimum API version supported by this cache table.
name ({'dataset', 'session'} str) – The name of the table.