ONE Quick Start

This tutorial will get you started searching and loading IBL data using Open Neurophysiology Environment (ONE).

First we need to install ONE. If you don’t already have IBL libraries, the easiest way is to run pip install ONE-api.

Now we need to import the ONE library and open a connection to the IBL public data server. To do so, we create a ONE object and ask it to connect to the public server.

Info.

IBL internal users may use their Alyx credentials to access all IBL data.

[1]:
from one.api import ONE
ONE.setup(base_url='https://openalyx.internationalbrainlab.org', silent=True)
one = ONE(password='international')

Now we are going to search for an experiment to analyze. First let’s find out what we can search by:

[2]:
print(one.search_terms())
('dataset', 'date_range', 'laboratory', 'number', 'projects', 'subject', 'task_protocol', 'dataset_qc_lte')

Let’s search for sessions recorded in August 2020 that contain a ‘probes.description’ dataset, meaning that electrophysiology was recorded. By adding the argument details=True, we get two outputs: the experiment IDs (eids) that uniquely identify these sessions, and some information about each experiment.

[3]:
eids, info = one.search(date_range=['2020-08-01', '2020-08-31'], dataset='probes.description', details=True)

from pprint import pprint
pprint(eids)
pprint(info)
['ebe2efe3-e8a1-451a-8947-76ef42427cc9',
 'b69b86be-af7d-4ecf-8cbf-0cd356afa1bd',
 'edd22318-216c-44ff-bc24-49ce8be78374',
 '71e55bfe-5a3a-4cba-bdc7-f085140d798e',
 '626126d5-eecf-4e9b-900e-ec29a17ece07',
 '49e0ab27-827a-4c91-bcaa-97eea27a1b8d',
 '81a78eac-9d36-4f90-a73a-7eb3ad7f770b',
 '5adab0b7-dfd0-467d-b09d-43cb7ca5d59c',
 'e56541a5-a6d5-4750-b1fe-f6b5257bfe7c',
 '5d01d14e-aced-4465-8f8e-9a1c674f62ec',
 '7f6b86f9-879a-4ea2-8531-294a221af5d0',
 '8c33abef-3d3e-4d42-9f27-445e9def08f9',
 'c557324b-b95d-414c-888f-6ee1329a2329',
 '61e11a11-ab65-48fb-ae08-3cb80662e5d6',
 'c7248e09-8c0d-40f2-9eb4-700a8973d8c8',
 '280ee768-f7b8-4c6c-9ea0-48ca75d6b6f3',
 'ff48aa1d-ef30-4903-ac34-8c41b738c1b9',
 '03063955-2523-47bd-ae57-f7489dd40f15']
[{'date': datetime.date(2020, 8, 19),
  'lab': 'angelakilab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'NYU-21',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 19),
  'lab': 'zadorlab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'CSH_ZAD_026',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 19),
  'lab': 'zadorlab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'CSH_ZAD_019',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 18),
  'lab': 'angelakilab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'NYU-26',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 18),
  'lab': 'zadorlab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'CSH_ZAD_026',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 18),
  'lab': 'zadorlab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'CSH_ZAD_019',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 17),
  'lab': 'zadorlab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'CSH_ZAD_026',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 16),
  'lab': 'zadorlab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'CSH_ZAD_019',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 15),
  'lab': 'zadorlab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'CSH_ZAD_026',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 14),
  'lab': 'zadorlab',
  'number': 2,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'CSH_ZAD_026',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 14),
  'lab': 'zadorlab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'CSH_ZAD_019',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 13),
  'lab': 'angelakilab',
  'number': 2,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'NYU-21',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 12),
  'lab': 'zadorlab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'CSH_ZAD_025',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 10),
  'lab': 'angelakilab',
  'number': 2,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'NYU-21',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 5),
  'lab': 'mainenlab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'ZM_3001',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 4),
  'lab': 'zadorlab',
  'number': 2,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'CSH_ZAD_025',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 3),
  'lab': 'zadorlab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'CSH_ZAD_025',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'},
 {'date': datetime.date(2020, 8, 1),
  'lab': 'mrsicflogellab',
  'number': 1,
  'projects': 'ibl_neuropixel_brainwide_01',
  'subject': 'SWC_038',
  'task_protocol': '_iblrig_tasks_ephysChoiceWorld6.4.1'}]
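
The info output is a plain list of dicts, so it can be summarized with the standard library. The sketch below is only an illustration: sample_info is a hypothetical name holding three of the records printed above (trimmed to a few keys) so that the code runs without a server connection, and the two helper functions are not part of ONE. With a live connection you would pass info itself.

```python
import datetime
from collections import Counter

# Three records copied from the output above and trimmed to a few keys.
# With a live connection, use the `info` list returned by one.search instead.
sample_info = [
    {'date': datetime.date(2020, 8, 19), 'lab': 'angelakilab', 'number': 1, 'subject': 'NYU-21'},
    {'date': datetime.date(2020, 8, 19), 'lab': 'zadorlab', 'number': 1, 'subject': 'CSH_ZAD_026'},
    {'date': datetime.date(2020, 8, 5), 'lab': 'mainenlab', 'number': 1, 'subject': 'ZM_3001'},
]


def sessions_per_lab(records):
    """Count how many session records belong to each lab."""
    return Counter(rec['lab'] for rec in records)


def sessions_on_or_after(records, day):
    """Keep only the records dated on or after `day`."""
    return [rec for rec in records if rec['date'] >= day]


lab_counts = sessions_per_lab(sample_info)
recent = sessions_on_or_after(sample_info, datetime.date(2020, 8, 10))
recent_subjects = [rec['subject'] for rec in recent]

print(lab_counts)
print(recent_subjects)
```

Because eids and info are returned in the same order, the same index can be used to look up the eid of any record selected this way.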

So there were 18 experiments matching the criteria in the public database. Now let’s load the probe information for the first experiment. one.load_dataset returns a single named dataset.

[4]:
eid = eids[0]  # Select the first experiment
probe_insertions = one.load_dataset(eid, 'probes.description')

print(f'N probes = {len(probe_insertions)}')
pprint(probe_insertions[0])
N probes = 2
{'label': 'probe00',
 'model': '3B2',
 'raw_file_name': 'D:/iblrig_data/Subjects/NYU-21/2020-08-19/001/raw_ephys_data/_spikeglx_ephysData_g0/_spikeglx_ephysData_g0_imec0/_spikeglx_ephysData_g0_t0.imec0.ap.bin',
 'serial': 19051004302}

Now let’s see all the datasets associated with the first of these experiments. The command one.list_datasets returns the path of each dataset relative to the session root, including the collection name and the extension. The ‘alf’ collection contains the preprocessed data we usually want to work with, and the data for each probe are in labeled subdirectories. We use the wildcard * because the data can be saved in different subdirectories for different spike sorters.

[5]:
probe_label = probe_insertions[0]['label']
one.list_datasets(eid, collection=f'alf/{probe_label}*')
Out[5]:
['alf/probe00/electrodeSites.brainLocationIds_ccf_2017.npy',
 'alf/probe00/electrodeSites.localCoordinates.npy',
 'alf/probe00/electrodeSites.mlapdv.npy',
 'alf/probe00/pykilosort/#2024-05-06#/_ibl_log.info_pykilosort.log',
 'alf/probe00/pykilosort/#2024-05-06#/_kilosort_whitening.matrix.npy',
 'alf/probe00/pykilosort/#2024-05-06#/_phy_spikes_subset.channels.npy',
 'alf/probe00/pykilosort/#2024-05-06#/_phy_spikes_subset.spikes.npy',
 'alf/probe00/pykilosort/#2024-05-06#/_phy_spikes_subset.waveforms.npy',
 'alf/probe00/pykilosort/#2024-05-06#/channels.brainLocationIds_ccf_2017.npy',
 'alf/probe00/pykilosort/#2024-05-06#/channels.labels.npy',
 'alf/probe00/pykilosort/#2024-05-06#/channels.localCoordinates.npy',
 'alf/probe00/pykilosort/#2024-05-06#/channels.mlapdv.npy',
 'alf/probe00/pykilosort/#2024-05-06#/channels.rawInd.npy',
 'alf/probe00/pykilosort/#2024-05-06#/clusters.amps.npy',
 'alf/probe00/pykilosort/#2024-05-06#/clusters.channels.npy',
 'alf/probe00/pykilosort/#2024-05-06#/clusters.depths.npy',
 'alf/probe00/pykilosort/#2024-05-06#/clusters.metrics.pqt',
 'alf/probe00/pykilosort/#2024-05-06#/clusters.peakToTrough.npy',
 'alf/probe00/pykilosort/#2024-05-06#/clusters.uuids.csv',
 'alf/probe00/pykilosort/#2024-05-06#/clusters.waveforms.npy',
 'alf/probe00/pykilosort/#2024-05-06#/clusters.waveformsChannels.npy',
 'alf/probe00/pykilosort/#2024-05-06#/drift.times.npy',
 'alf/probe00/pykilosort/#2024-05-06#/drift.um.npy',
 'alf/probe00/pykilosort/#2024-05-06#/drift_depths.um.npy',
 'alf/probe00/pykilosort/#2024-05-06#/passingSpikes.table.pqt',
 'alf/probe00/pykilosort/#2024-05-06#/spikes.amps.npy',
 'alf/probe00/pykilosort/#2024-05-06#/spikes.clusters.npy',
 'alf/probe00/pykilosort/#2024-05-06#/spikes.depths.npy',
 'alf/probe00/pykilosort/#2024-05-06#/spikes.samples.npy',
 'alf/probe00/pykilosort/#2024-05-06#/spikes.templates.npy',
 'alf/probe00/pykilosort/#2024-05-06#/spikes.times.npy',
 'alf/probe00/pykilosort/#2024-05-06#/templates.amps.npy',
 'alf/probe00/pykilosort/#2024-05-06#/templates.waveforms.npy',
 'alf/probe00/pykilosort/#2024-05-06#/templates.waveformsChannels.npy',
 'alf/probe00/pykilosort/#2024-05-06#/waveforms.channels.npz',
 'alf/probe00/pykilosort/#2024-05-06#/waveforms.table.pqt',
 'alf/probe00/pykilosort/#2024-05-06#/waveforms.templates.npy',
 'alf/probe00/pykilosort/#2024-05-06#/waveforms.traces.npy',
 'alf/probe00/pykilosort/_ibl_log.info_pykilosort.log',
 'alf/probe00/pykilosort/_kilosort_whitening.matrix.npy',
 'alf/probe00/pykilosort/_phy_spikes_subset.channels.npy',
 'alf/probe00/pykilosort/_phy_spikes_subset.spikes.npy',
 'alf/probe00/pykilosort/_phy_spikes_subset.waveforms.npy',
 'alf/probe00/pykilosort/channels.brainLocationIds_ccf_2017.npy',
 'alf/probe00/pykilosort/channels.localCoordinates.npy',
 'alf/probe00/pykilosort/channels.mlapdv.npy',
 'alf/probe00/pykilosort/channels.rawInd.npy',
 'alf/probe00/pykilosort/clusters.amps.npy',
 'alf/probe00/pykilosort/clusters.channels.npy',
 'alf/probe00/pykilosort/clusters.depths.npy',
 'alf/probe00/pykilosort/clusters.metrics.pqt',
 'alf/probe00/pykilosort/clusters.peakToTrough.npy',
 'alf/probe00/pykilosort/clusters.uuids.csv',
 'alf/probe00/pykilosort/clusters.waveforms.npy',
 'alf/probe00/pykilosort/clusters.waveformsChannels.npy',
 'alf/probe00/pykilosort/drift.times.npy',
 'alf/probe00/pykilosort/drift.um.npy',
 'alf/probe00/pykilosort/drift_depths.um.npy',
 'alf/probe00/pykilosort/spikes.amps.npy',
 'alf/probe00/pykilosort/spikes.clusters.npy',
 'alf/probe00/pykilosort/spikes.depths.npy',
 'alf/probe00/pykilosort/spikes.samples.npy',
 'alf/probe00/pykilosort/spikes.templates.npy',
 'alf/probe00/pykilosort/spikes.times.npy',
 'alf/probe00/pykilosort/templates.amps.npy',
 'alf/probe00/pykilosort/templates.waveforms.npy',
 'alf/probe00/pykilosort/templates.waveformsChannels.npy']
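
Each path in this output has the form collection/revision/filename, where the optional revision folder is wrapped in # characters (here #2024-05-06#). The following sketch splits a few of the paths shown in this tutorial into those parts and checks which collections the 'alf/probe00*' pattern would select. It is an approximation for illustration only: split_dataset_path is a hypothetical helper, and Python's fnmatchcase stands in for ONE's own wildcard matching, which may differ in detail.

```python
from fnmatch import fnmatchcase


def split_dataset_path(rel_path):
    """Split 'collection/#revision#/filename' into (collection, revision, filename).

    The revision is None when the path contains no '#...#' folder.
    """
    *folders, filename = rel_path.split('/')
    revision = None
    last = folders[-1] if folders else ''
    if len(last) > 2 and last.startswith('#') and last.endswith('#'):
        revision = folders.pop().strip('#')
    return '/'.join(folders), revision, filename


# Relative paths copied from the list_datasets outputs in this tutorial
paths = [
    'alf/probe00/electrodeSites.mlapdv.npy',
    'alf/probe00/pykilosort/#2024-05-06#/spikes.times.npy',
    'alf/probe00/pykilosort/spikes.times.npy',
    'alf/_ibl_wheel.position.npy',
]
parts = [split_dataset_path(p) for p in paths]

# Collections selected by the wildcard pattern used in the cell above
pattern = 'alf/probe00*'
matched = sorted({coll for coll, _, _ in parts if fnmatchcase(coll, pattern)})

print(parts[1])
print(matched)
```

The trailing * is what lets the pattern reach the spike sorter subfolder as well as the probe folder itself, while the plain 'alf' collection is left out.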

We might be interested in the data of this session that is not specific to the probe recording, e.g. the behavioural data or video data. We can find that if we list datasets in the alf collection without specifying the probe.

[6]:
one.list_datasets(eid, collection='alf')
Out[6]:
['alf/_ibl_bodyCamera.dlc.pqt',
 'alf/_ibl_bodyCamera.times.npy',
 'alf/_ibl_leftCamera.dlc.pqt',
 'alf/_ibl_leftCamera.features.pqt',
 'alf/_ibl_leftCamera.times.npy',
 'alf/_ibl_passiveGabor.table.csv',
 'alf/_ibl_passivePeriods.intervalsTable.csv',
 'alf/_ibl_passiveRFM.times.npy',
 'alf/_ibl_passiveStims.table.csv',
 'alf/_ibl_rightCamera.dlc.pqt',
 'alf/_ibl_rightCamera.features.pqt',
 'alf/_ibl_rightCamera.times.npy',
 'alf/_ibl_trials.goCueTrigger_times.npy',
 'alf/_ibl_trials.stimOff_times.npy',
 'alf/_ibl_trials.table.pqt',
 'alf/_ibl_wheel.position.npy',
 'alf/_ibl_wheel.timestamps.npy',
 'alf/_ibl_wheelMoves.intervals.npy',
 'alf/_ibl_wheelMoves.peakAmplitude.npy',
 'alf/bodyCamera.ROIMotionEnergy.npy',
 'alf/bodyROIMotionEnergy.position.npy',
 'alf/leftCamera.ROIMotionEnergy.npy',
 'alf/leftROIMotionEnergy.position.npy',
 'alf/licks.times.npy',
 'alf/probes.description.json',
 'alf/rightCamera.ROIMotionEnergy.npy',
 'alf/rightROIMotionEnergy.position.npy']

Let’s load the preprocessed data associated with the left camera. There are two ways to do this: one.load_dataset, which returns a single named dataset, or one.load_object, which returns all the datasets that share the same object name as a single object. Let’s use the second method to load all left camera data in the alf folder.

[7]:
cam = one.load_object(eids[0], 'leftCamera', collection='alf')
cam.keys()
Out[7]:
dict_keys(['ROIMotionEnergy', 'times', 'dlc', 'features'])

To load only specific data associated with this object, we can use the attribute keyword. Let’s load the times and the DLC traces.

[8]:
cam = one.load_object(eids[0], 'leftCamera', collection='alf', attribute=['times', 'dlc'])
cam.keys()
Out[8]:
dict_keys(['times', 'dlc'])
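
The keys returned above correspond to the attribute part of each dataset filename: a name such as _ibl_leftCamera.times.npy reads as an optional _namespace_ prefix followed by object.attribute.extension. The sketch below shows that naming scheme on filenames taken from the alf listing earlier. It is a simplified illustration, not ONE's implementation: parse_name and attributes_of are hypothetical helpers, filenames with additional parts are not handled, and the attribute filter here is an exact match, which may be narrower than what load_object accepts.

```python
def parse_name(filename):
    """Split '[_namespace_]object.attribute.extension' into a dict of its parts."""
    namespace = None
    if filename.startswith('_'):
        # '_ibl_leftCamera.times.npy' -> namespace 'ibl', rest 'leftCamera.times.npy'
        namespace, filename = filename[1:].split('_', 1)
    obj, attribute, extension = filename.split('.', 2)
    return {'namespace': namespace, 'object': obj,
            'attribute': attribute, 'extension': extension}


def attributes_of(obj_name, filenames, attribute=None):
    """List the attributes found for one object, optionally restricted to `attribute`."""
    parsed = [parse_name(name) for name in filenames]
    found = [p['attribute'] for p in parsed if p['object'] == obj_name]
    if attribute is not None:
        found = [a for a in found if a in attribute]
    return sorted(found)


# Filenames copied from the 'alf' collection listing above
filenames = [
    '_ibl_leftCamera.dlc.pqt',
    '_ibl_leftCamera.features.pqt',
    '_ibl_leftCamera.times.npy',
    '_ibl_rightCamera.times.npy',
    'leftCamera.ROIMotionEnergy.npy',
]

all_attrs = attributes_of('leftCamera', filenames)
some_attrs = attributes_of('leftCamera', filenames, attribute=['times', 'dlc'])

print(all_attrs)
print(some_attrs)
```

Note that leftCamera.ROIMotionEnergy.npy has no namespace prefix but still belongs to the leftCamera object, which is why ROIMotionEnergy appears among the keys in Out[7].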

And that’s the end of the quick start tutorial! For more information on any of these commands you can use the standard help function:

[9]:
help(one.list_datasets)
Help on method list_datasets in module one.api:

list_datasets(eid=None, filename=None, collection=None, revision=None, qc=<QC.FAIL: 40>, ignore_qc_not_set=False, details=False, query_type=None) -> Union[numpy.ndarray, pandas.core.frame.DataFrame] method of one.api.OneAlyx instance
    Given an eid, return the datasets for those sessions.

    If no eid is provided, a list of all datasets is returned.  When details is false, a sorted
    array of unique datasets is returned (their relative paths).

    Parameters
    ----------
    eid : str, UUID, pathlib.Path, dict
        Experiment session identifier; may be a UUID, URL, experiment reference string
        details dict or Path.
    filename : str, dict, list
        Filters datasets and returns only the ones matching the filename.
        Supports lists asterisks as wildcards.  May be a dict of ALF parts.
    collection : str, list
        The collection to which the object belongs, e.g. 'alf/probe01'.
        This is the relative path of the file from the session root.
        Supports asterisks as wildcards.
    revision : str
        Filters datasets and returns only the ones matching the revision.
        Supports asterisks as wildcards.
    qc : str, int, one.alf.spec.QC
        Returns datasets at or below this QC level.  Integer values should correspond to the QC
        enumeration NOT the qc category column codes in the pandas table.
    ignore_qc_not_set : bool
        When true, do not return datasets for which QC is NOT_SET.
    details : bool
        When true, a pandas DataFrame is returned, otherwise a numpy array of
        relative paths (collection/revision/filename) - see one.alf.spec.describe for details.
    query_type : str
        Query cache ('local') or Alyx database ('remote').

    Returns
    -------
    np.ndarray, pd.DataFrame
        Slice of datasets table or numpy array if details is False.

    Examples
    --------
    List all unique datasets in ONE cache

    >>> datasets = one.list_datasets()

    List all datasets for a given experiment

    >>> datasets = one.list_datasets(eid)

    List all datasets for an experiment that match a collection name

    >>> probe_datasets = one.list_datasets(eid, collection='*probe*')

    List datasets for an experiment that have 'wheel' in the filename

    >>> datasets = one.list_datasets(eid, filename='*wheel*')

    List datasets for an experiment that are part of a 'wheel' or 'trial(s)' object

    >>> datasets = one.list_datasets(eid, {'object': ['wheel', 'trial?']})

For detailed tutorials, guides and examples, see the full ONE API documentation website.