Download the public datasets

To get a feel for the structure of the data, we recommend first downloading the alf data for a single repeated site session and exploring how the data is stored locally on disk. An example alf folder can be downloaded from here. Documentation explaining the data structure can be found here.

In the following sections, we explain how to use the ONE-api to search for and download datasets for any released session. The ONE-api is the recommended way to browse and download the available datasets.

Installation

Environment

To use IBL data you will need a Python environment with Python > 3.7. To create one from scratch, install Anaconda and follow the instructions below (more information can also be found here).

conda create --name ibl python=3.9

Make sure to always activate this environment before installing packages or working with the IBL data:

conda activate ibl

Install packages

To use IBL data you will need to install the ONE-api package. We also recommend installing ibllib. These can be installed via pip.

pip install ONE-api
pip install ibllib
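
After installing, it can help to confirm that both packages are visible from the active environment. The sketch below uses only the Python standard library. The helper name `installed_version` is hypothetical, and the distribution names are simply those used in the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Distribution names as used in the pip commands above
for name in ("ONE-api", "ibllib"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```

If either line reports "not installed", check that the ibl environment is activated before running pip.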

Setting up credentials

Credentials can be set up in a Python terminal as follows:

[1]:
from one.api import ONE
pw = 'international'
one = ONE(base_url='https://openalyx.internationalbrainlab.org', password=pw, silent=True)

Explore and download data using the ONE-api

Launch the ONE-api

Before searching for or downloading any data, you need to instantiate ONE:

[2]:
from one.api import ONE
one = ONE(base_url='https://openalyx.internationalbrainlab.org')

List all sessions available

Once ONE is instantiated, you can use the REST interface of the ONE-api to list all publicly available sessions:

[3]:
sessions = one.alyx.rest('sessions', 'list')

Each session is given a unique identifier (EID); this EID is what you will use to download data for a given session:

[4]:
# Take the first session
example_sess = sessions[0]
# Each session has a unique experiment id
eid = example_sess['id']
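
Each item returned by the sessions query is a plain dictionary, so it can be inspected with ordinary Python. The following is a minimal sketch of printing a readable label per session. The helper `summarize_session` and the example record are hypothetical; apart from `id` and `lab`, which are used in this document, the field names `subject` and `start_time` are assumptions about the record layout, which is why they are read with `.get()`.

```python
def summarize_session(session):
    """Build a one-line label for a session record (a plain dict)."""
    # .get() avoids a KeyError if an assumed field is missing from the record
    subject = session.get("subject", "?")
    start = session.get("start_time", "?")
    lab = session.get("lab", "?")
    return f"{session['id']}  {lab}/{subject}  {start}"


# Hypothetical record for illustration only; real records come from
# one.alyx.rest('sessions', 'list') and may contain other fields
example_record = {
    "id": "00000000-0000-0000-0000-000000000000",
    "subject": "subject_A",
    "start_time": "2022-08-23T00:00:00",
    "lab": "examplelab",
}
print(summarize_session(example_record))
```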

Find a session that has a dataset of interest

Not every session has every dataset available, so you may need to restrict your search to sessions that contain particular datasets of interest. The detailed list of datasets can be found in this document.

In the example below, we want to find all sessions that have spikes.times data:

[5]:
# Find sessions that have spikes.times datasets
sessions_with_spikes = one.alyx.rest('sessions', 'list', dataset_types='spikes.times')

Find data associated with a release or publication

Datasets are often associated with a publication, and are tagged as such to facilitate reproducibility of analysis. You can list all tags and their associated publications like this:

[6]:
# List and print all tags in the public database
tags = {t['name']: t['description'] for t in one.alyx.rest('tags', 'list') if t['public']}
for key, value in tags.items():
    print(f"{key}\n{value}\n")
2021_Q1_IBL_et_al_Behaviour
https://doi.org/10.7554/eLife.63711

2021_Q2_PreRelease
https://figshare.com/articles/online_resource/Spike_sorting_pipeline_for_the_International_Brain_Laboratory/19705522/3

2021_Q2_Varol_et_al
https://doi.org/10.1109/ICASSP39728.2021.9414145

2021_Q3_Whiteway_et_al
https://doi.org/10.1371/journal.pcbi.1009439

2022_Q2_IBL_et_al_RepeatedSite
https://doi.org/10.1101/2022.05.09.491042

2022_Q3_IBL_et_al_DAWG
https://doi.org/10.1101/827873

2022_Q4_IBL_et_al_BWM
https://figshare.com/articles/preprint/Data_release_-_Brainwide_map_-_Q4_2022/21400815

You can use the tag to filter when browsing the public database:

[7]:
# Note that tags are associated with datasets originally
# Find datasets that are tagged for the repeated site paper
datasets_rep_site = one.alyx.rest('datasets', 'list', tag='2022_Q2_IBL_et_al_RepeatedSite')

# Find sessions that have data and are tagged for the repeated site paper
# (you have access to the tag endpoint from the session list)
sessions_rep_site = one.alyx.rest('sessions', 'list', dataset_types='spikes.times', tag='2022_Q2_IBL_et_al_RepeatedSite')

# Find insertions that are tagged
# (you do not have access to the tag endpoint from the insertion list, so you need to create a django query)
ins_str_query = 'datasets__tags__name,2022_Q2_IBL_et_al_RepeatedSite'
insertions_rep_site = one.alyx.rest('insertions', 'list', django=ins_str_query)

However, if you are only interested in data with a specific tag, the cleanest approach is to follow these instructions to work with a tag-specific cache table.
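
The django filter used for the insertions query above is a string of the form `field_path,value`. As a small sketch, the string can be built by a helper so that the tag name is written only once. The function name `django_tag_filter` is hypothetical, and the comma check assumes that commas act as separators in this filter string.

```python
def django_tag_filter(tag, field_path="datasets__tags__name"):
    """Return a 'field_path,value' string for the django= argument of one.alyx.rest."""
    if "," in tag:
        # Assumption: a comma would be read as a separator in the filter string
        raise ValueError(f"tag must not contain a comma: {tag!r}")
    return f"{field_path},{tag}"


tag = "2022_Q2_IBL_et_al_RepeatedSite"
ins_str_query = django_tag_filter(tag)
print(ins_str_query)
```

The resulting string can be passed as `django=ins_str_query`, exactly as in the insertions query above.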

Downloading data using the ONE-api

Once sessions of interest are identified with the unique identifier (EID), we can download all files in the alf collection:

[8]:
# Download all data in alf collection
files = one.load_collection(eid, 'alf', download_only=True)

# Show where files have been downloaded to
print(f'Files downloaded to {files[0].parent}')
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_rightCamera.times.npy: 100%|██████████| 6.63M/6.63M [00:00<00:00, 14.0MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_trials.table.pqt: 100%|██████████| 68.7k/68.7k [00:00<00:00, 415kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/rightROIMotionEnergy.position.npy: 100%|██████████| 160/160 [00:00<00:00, 1.61kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_bodyCamera.dlc.pqt: 100%|██████████| 2.91M/2.91M [00:00<00:00, 15.1MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_wheel.position.npy: 100%|██████████| 10.2M/10.2M [00:00<00:00, 17.1MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/licks.times.npy: 100%|██████████| 414k/414k [00:00<00:00, 2.85MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_wheelMoves.peakAmplitude.npy: 100%|██████████| 15.8k/15.8k [00:00<00:00, 141kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/bodyCamera.ROIMotionEnergy.npy: 100%|██████████| 1.33M/1.33M [00:00<00:00, 7.63MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_leftCamera.dlc.pqt: 100%|██████████| 53.3M/53.3M [00:01<00:00, 41.6MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_rightCamera.features.pqt: 100%|██████████| 13.5M/13.5M [00:00<00:00, 24.7MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/bodyROIMotionEnergy.position.npy: 100%|██████████| 160/160 [00:00<00:00, 1.34kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_trials.goCueTrigger_times.npy: 100%|██████████| 7.21k/7.21k [00:00<00:00, 64.6kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/leftCamera.ROIMotionEnergy.npy: 100%|██████████| 2.65M/2.65M [00:00<00:00, 11.4MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_wheel.timestamps.npy: 100%|██████████| 10.2M/10.2M [00:00<00:00, 26.5MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_leftCamera.times.npy: 100%|██████████| 2.65M/2.65M [00:00<00:00, 7.57MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_bodyCamera.times.npy: 100%|██████████| 1.33M/1.33M [00:00<00:00, 7.35MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_leftCamera.features.pqt: 100%|██████████| 5.62M/5.62M [00:00<00:00, 22.5MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/rightCamera.ROIMotionEnergy.npy: 100%|██████████| 6.63M/6.63M [00:00<00:00, 20.3MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/leftROIMotionEnergy.position.npy: 100%|██████████| 160/160 [00:00<00:00, 1.22kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_rightCamera.dlc.pqt: 100%|██████████| 116M/116M [00:01<00:00, 115MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_trials.stimOff_times.npy: 100%|██████████| 7.21k/7.21k [00:00<00:00, 51.9kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probes.description.json: 100%|██████████| 468/468 [00:00<00:00, 2.97kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_wheelMoves.intervals.npy: 100%|██████████| 31.4k/31.4k [00:00<00:00, 219kB/s]
Files downloaded to /home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf
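
The file names in the download log above follow the ALF convention of `object.attribute.extension`, optionally prefixed by a `_namespace_`. To see at a glance which objects were downloaded, the names can be grouped by object. This is a sketch using a simplified regular expression written for this example; it is not the parser that ships with ONE, and the names `ALF_PATTERN` and `group_by_object` are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import PurePath

# Simplified pattern: optional _namespace_ prefix, then object.attribute.extension
ALF_PATTERN = re.compile(
    r"^(?:_(?P<namespace>[^_.]+)_)?"
    r"(?P<object>[^.]+)\.(?P<attribute>[^.]+)\.(?P<extension>[^.]+)$"
)


def group_by_object(paths):
    """Map each ALF object name to the sorted attributes found for it."""
    grouped = defaultdict(list)
    for path in paths:
        match = ALF_PATTERN.match(PurePath(path).name)
        if match is None:
            continue  # skip names that do not fit the simplified pattern
        grouped[match["object"]].append(match["attribute"])
    return {obj: sorted(attrs) for obj, attrs in grouped.items()}


# File names taken from the download log above
names = [
    "alf/_ibl_trials.table.pqt",
    "alf/_ibl_trials.stimOff_times.npy",
    "alf/_ibl_wheel.position.npy",
    "alf/_ibl_wheel.timestamps.npy",
    "alf/licks.times.npy",
]
print(group_by_object(names))
```

The same function can be applied to the `files` list returned by `one.load_collection`, since only the final path component is inspected.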

To download the spike sorting data, we need to find out which probe label (probeXX) was used for this session. This can be done by finding the probe insertion associated with this session:

[9]:
insertion = one.alyx.rest('insertions', 'list', session=eid)[0]
probe_label = insertion['name']
files = one.load_collection(eid, f'alf/{probe_label}/pykilosort', download_only=True)

# Show where files have been downloaded to
print(f'Files downloaded to {files[0].parent}')
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.depths.npy: 100%|██████████| 1.93k/1.93k [00:00<00:00, 15.3kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/spikes.templates.npy: 100%|██████████| 25.3M/25.3M [00:00<00:00, 26.7MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/spikes.clusters.npy: 100%|██████████| 25.3M/25.3M [00:00<00:00, 63.7MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/channels.mlapdv.npy: 100%|██████████| 4.74k/4.74k [00:00<00:00, 29.3kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.metrics.pqt: 100%|██████████| 59.7k/59.7k [00:00<00:00, 449kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/drift.um.npy: 100%|██████████| 182k/182k [00:00<00:00, 1.21MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/_phy_spikes_subset.waveforms.npy: 100%|██████████| 489M/489M [00:02<00:00, 186MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.channels.npy: 100%|██████████| 3.73k/3.73k [00:00<00:00, 34.4kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/spikes.amps.npy: 100%|██████████| 50.6M/50.6M [00:00<00:00, 122MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.uuids.csv: 100%|██████████| 16.7k/16.7k [00:00<00:00, 168kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/_kilosort_whitening.matrix.npy: 100%|██████████| 1.18M/1.18M [00:00<00:00, 6.47MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/_phy_spikes_subset.channels.npy: 100%|██████████| 2.98M/2.98M [00:00<00:00, 11.4MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.amps.npy: 100%|██████████| 3.73k/3.73k [00:00<00:00, 33.7kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/drift.times.npy: 100%|██████████| 20.3k/20.3k [00:00<00:00, 191kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.peakToTrough.npy: 100%|██████████| 3.73k/3.73k [00:00<00:00, 27.0kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/channels.brainLocationIds_ccf_2017.npy: 100%|██████████| 3.20k/3.20k [00:00<00:00, 25.5kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.waveformsChannels.npy: 100%|██████████| 57.7k/57.7k [00:00<00:00, 559kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/spikes.depths.npy: 100%|██████████| 50.6M/50.6M [00:00<00:00, 81.5MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/_ibl_log.info_pykilosort.log: 100%|██████████| 3.52k/3.52k [00:00<00:00, 22.8kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/templates.waveformsChannels.npy: 100%|██████████| 57.7k/57.7k [00:00<00:00, 598kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/templates.amps.npy: 100%|██████████| 3.73k/3.73k [00:00<00:00, 30.3kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/drift_depths.um.npy: 100%|██████████| 200/200 [00:00<00:00, 1.94kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/templates.waveforms.npy: 100%|██████████| 4.72M/4.72M [00:00<00:00, 15.3MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/spikes.times.npy: 100%|██████████| 50.6M/50.6M [00:00<00:00, 129MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.waveforms.npy: 100%|██████████| 4.72M/4.72M [00:00<00:00, 20.4MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/_phy_spikes_subset.spikes.npy: 100%|██████████| 186k/186k [00:00<00:00, 1.12MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/channels.rawInd.npy: 100%|██████████| 3.20k/3.20k [00:00<00:00, 24.6kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/spikes.samples.npy: 100%|██████████| 50.6M/50.6M [00:00<00:00, 117MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/channels.localCoordinates.npy: 100%|██████████| 3.20k/3.20k [00:00<00:00, 28.8kB/s]
Files downloaded to /home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort
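
The cell above takes only the first insertion returned for the session. A session may have more than one probe, so one possible extension is to loop over all insertions. The sketch below reuses the same two calls as above (`one.alyx.rest` and `one.load_collection`); the function name `download_spike_sorting` and its `sorter` parameter are hypothetical, and the sketch does not handle the case where a probe lacks the requested collection.

```python
def download_spike_sorting(one, eid, sorter="pykilosort"):
    """Download the spike sorting collection of every probe in a session.

    `one` is an instantiated ONE object. Returns {probe_label: [file paths]}.
    """
    downloaded = {}
    for insertion in one.alyx.rest("insertions", "list", session=eid):
        probe_label = insertion["name"]
        # Same collection layout as in the cell above: alf/<probe>/<sorter>
        collection = f"alf/{probe_label}/{sorter}"
        downloaded[probe_label] = one.load_collection(
            eid, collection, download_only=True
        )
    return downloaded
```

With the `one` and `eid` variables defined earlier, this would be called as `files_by_probe = download_spike_sorting(one, eid)`.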

To load the data, we can use loading methods such as the following:

[10]:
# Load in all trials datasets
trials = one.load_object(eid, 'trials', collection='alf')

# Load in a single wheel dataset
wheel_times = one.load_dataset(eid, '_ibl_wheel.timestamps.npy')

Loading different objects

Examples of loading different objects can be found in the tutorials here.

Advanced examples

Example 1: Searching for sessions from a specific lab

Let’s imagine you are interested in the data from a given lab that took part in the Reproducible Ephys data release. To use only the data associated with that lab, you could query for the whole release as shown above and then filter sessions_rep_site on the value of the “lab” key, for example:

[11]:
lab_name = 'mrsicflogellab'
sessions_lab = [item for item in sessions_rep_site if item['lab'] == lab_name]

However, if you want to query only the data for a given lab, it is preferable to first retrieve the list of all available labs, select a lab name from it, and then query the sessions for that lab directly.

To get this list, use one.alyx.rest:

[12]:
# List of labs (and all metadata information associated)
labs = one.alyx.rest('labs', 'list',
                     django='session__data_dataset_session_related__tags__name,2022_Q2_IBL_et_al_RepeatedSite')
# Note the change in the django filter compared to searching over 'sessions'

# Example lab name
lab_name = labs[0]['name']  # e.g. 'mrsicflogellab'

# Searching for RS sessions with specific lab name
sessions_lab = one.alyx.rest('sessions', 'list', dataset_types='spikes.times', lab=lab_name,
                             tag='2022_Q2_IBL_et_al_RepeatedSite')
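
Whichever way the session list is obtained, the "lab" key used in the list comprehension of this example can also be tallied to see how many sessions each lab contributed. The following is a minimal sketch with the standard library; the function name `sessions_per_lab` and the example records are hypothetical stand-ins for a real query result such as sessions_rep_site.

```python
from collections import Counter


def sessions_per_lab(sessions):
    """Count session records by their 'lab' field."""
    return Counter(item["lab"] for item in sessions)


# Hypothetical records standing in for sessions_rep_site
example_sessions = [
    {"id": "a", "lab": "mrsicflogellab"},
    {"id": "b", "lab": "examplelab"},
    {"id": "c", "lab": "mrsicflogellab"},
]
for lab, count in sessions_per_lab(example_sessions).most_common():
    print(lab, count)
```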