Download the public datasets
To get a feel for the structure of the data, we recommend first downloading the alf data for a single repeated site session and exploring how the data is stored locally on disk. An example alf folder can be downloaded from here. Documentation explaining the data structure can be found here.
In the following sections, we explain how to use the ONE-api to search for and download datasets for any session released. Using the ONE-api is the recommended method to browse through and download available datasets.
Installation
Environment
To use IBL data you will need a Python environment with Python > 3.7. To create a new environment from scratch, you can install Anaconda and follow the instructions below (more information can also be found here):
conda create --name ibl python=3.9
Always activate this environment before installing packages or working with the IBL data:
conda activate ibl
Install packages
To use IBL data you will need to install the ONE-api package. We also recommend installing ibllib. These can be installed via pip.
pip install ONE-api
pip install ibllib
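After installing, it can help to confirm that the active environment matches the requirements above. The following is a minimal sketch, not part of the IBL tooling: the function name check_environment is hypothetical, and it assumes that ONE-api is imported as one (as in the import statements on this page) and ibllib as ibllib.

```python
import importlib.util
import sys

# Minimum version implied by the "python > 3.7" requirement above.
MIN_PYTHON = (3, 8)
# Import names (assumed): ONE-api provides the module 'one', ibllib provides 'ibllib'.
PACKAGES = ("one", "ibllib")


def check_environment(min_python=MIN_PYTHON, packages=PACKAGES):
    """Return a dict reporting whether the interpreter and packages look usable."""
    report = {"python_ok": sys.version_info[:2] >= tuple(min_python)}
    for name in packages:
        # find_spec returns None when a top-level module cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for item, ok in check_environment().items():
    print(f"{item}: {'OK' if ok else 'missing'}")
```

If a package is reported as missing, the most likely cause is that the ibl environment was not activated before running pip.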
Setting up credentials
Credentials can be set up in a Python terminal in the following way:
[1]:
from one.api import ONE
pw = 'international'
one = ONE(base_url='https://openalyx.internationalbrainlab.org', password=pw, silent=True)
Explore and download data using the ONE-api
Useful links
To get a good understanding of the ONE-api and the various methods available we recommend working through these tutorials.
Quick-start examples are given below.
Launch the ONE-api
Before searching for or downloading any data, you need to instantiate ONE:
[2]:
from one.api import ONE
one = ONE(base_url='https://openalyx.internationalbrainlab.org')
List all sessions available
Once ONE is instantiated, you can use the REST ONE-api to list all sessions publicly available:
[3]:
sessions = one.alyx.rest('sessions', 'list')
Each session is given a unique identifier (EID); this EID is what you will use to download data for a given session:
[4]:
# Take the first session
example_sess = sessions[0]
# Each session has a unique experiment id
eid = example_sess['id']
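To choose a session by eye rather than taking the first one, it can be useful to build a short label for each record. The sketch below runs on mock records instead of a live query. Only the 'id' and 'lab' keys are confirmed by this page; the 'subject', 'start_time' and 'number' keys are assumptions about the record layout, the values are made up, and the helper describe is hypothetical.

```python
# Mock records standing in for the list returned by one.alyx.rest('sessions', 'list').
# Keys other than 'id' and 'lab' are assumptions; all values are invented.
mock_sessions = [
    {"id": "00000000-0000-0000-0000-000000000001", "lab": "lab_a",
     "subject": "subject_1", "start_time": "2022-08-23T10:15:00", "number": 1},
    {"id": "00000000-0000-0000-0000-000000000002", "lab": "lab_b",
     "subject": "subject_2", "start_time": "2022-09-01T14:30:00", "number": 2},
]


def describe(session):
    """Build a short human-readable label for one session record."""
    date = session.get("start_time", "")[:10] or "unknown-date"
    number = int(session.get("number", 0))
    return f"{session.get('lab', '?')}/{session.get('subject', '?')}/{date}/{number:03d}"


# Map each EID to its label so a session of interest can be picked out.
labels = {session["id"]: describe(session) for session in mock_sessions}
for session_id, label in labels.items():
    print(session_id, label)
```

With real query results, the same dictionary comprehension can be applied to the sessions list, provided the records carry the assumed keys.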
Find a session that has a dataset of interest
Not all sessions will have all the datasets available. As such, it may be important for you to filter and search for only sessions with particular datasets of interest. The detailed list of datasets can be found in this document.
In the example below, we want to find all sessions that have spikes.times data:
[5]:
# Find sessions that have spikes.times datasets
sessions_with_spikes = one.alyx.rest('sessions', 'list', dataset_types='spikes.times')
Find data associated with a release or publication
Datasets are often associated with a publication, and are tagged as such to facilitate reproducibility of analysis. You can list all tags and their associated publications like this:
[6]:
# List and print all tags in the public database
tags = {t['name']: t['description'] for t in one.alyx.rest('tags', 'list') if t['public']}
for key, value in tags.items():
    print(f"{key}\n{value}\n")
2021_Q1_IBL_et_al_Behaviour
https://doi.org/10.7554/eLife.63711
2021_Q2_PreRelease
https://figshare.com/articles/online_resource/Spike_sorting_pipeline_for_the_International_Brain_Laboratory/19705522/3
2021_Q2_Varol_et_al
https://doi.org/10.1109/ICASSP39728.2021.9414145
2021_Q3_Whiteway_et_al
https://doi.org/10.1371/journal.pcbi.1009439
2022_Q2_IBL_et_al_RepeatedSite
https://doi.org/10.1101/2022.05.09.491042
2022_Q3_IBL_et_al_DAWG
https://doi.org/10.1101/827873
2022_Q4_IBL_et_al_BWM
https://figshare.com/articles/preprint/Data_release_-_Brainwide_map_-_Q4_2022/21400815
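The tag names listed above appear to follow a year, quarter, label pattern. Assuming that convention holds (it is inferred from this output, not a documented guarantee), the sketch below splits a tag name into its parts and selects the tags from a given year. The helper parse_tag is hypothetical, and the dictionary is copied from the output above so the example runs without a database connection.

```python
import re

# Tag names and links copied from the output above.
tags = {
    "2021_Q1_IBL_et_al_Behaviour": "https://doi.org/10.7554/eLife.63711",
    "2021_Q2_Varol_et_al": "https://doi.org/10.1109/ICASSP39728.2021.9414145",
    "2021_Q3_Whiteway_et_al": "https://doi.org/10.1371/journal.pcbi.1009439",
    "2022_Q2_IBL_et_al_RepeatedSite": "https://doi.org/10.1101/2022.05.09.491042",
    "2022_Q3_IBL_et_al_DAWG": "https://doi.org/10.1101/827873",
}

# Assumed naming convention: <year>_Q<quarter>_<label>.
TAG_PATTERN = re.compile(r"^(?P<year>\d{4})_Q(?P<quarter>[1-4])_(?P<label>.+)$")


def parse_tag(name):
    """Return (year, quarter, label), or None if the name does not fit the pattern."""
    match = TAG_PATTERN.match(name)
    if match is None:
        return None
    return int(match["year"]), int(match["quarter"]), match["label"]


# Keep only the tags whose name starts with the year 2022.
tags_2022 = sorted(
    name for name in tags
    if parse_tag(name) is not None and parse_tag(name)[0] == 2022
)
print(tags_2022)
```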
You can use the tag to filter when browsing the public database:
[7]:
# Note that tags are associated with datasets originally
# Find datasets that are tagged for the repeated site paper
datasets_rep_site = one.alyx.rest('datasets', 'list', tag='2022_Q2_IBL_et_al_RepeatedSite')
# Find sessions that have data and are tagged for the repeated site paper
# (you have access to the tag endpoint from the session list)
sessions_rep_site = one.alyx.rest('sessions', 'list', dataset_types='spikes.times', tag='2022_Q2_IBL_et_al_RepeatedSite')
# Find insertions that are tagged
# (you do not have access to the tag endpoint from the insertion list, so you need to create a django query)
ins_str_query = 'datasets__tags__name,2022_Q2_IBL_et_al_RepeatedSite'
insertions_rep_site = one.alyx.rest('insertions', 'list', django=ins_str_query)
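The django argument used above is a plain string of the form 'field,value'. As a minimal sketch, the hypothetical helper below builds such a string from keyword arguments. Joining several lookups as alternating 'field1,value1,field2,value2' is an assumption about how the filter is parsed and should be checked against the Alyx documentation before relying on it.

```python
def django_query(**lookups):
    """Join field lookups into a comma-separated 'field,value' string."""
    parts = []
    for field, value in lookups.items():
        value = str(value)
        # A comma inside a field or value would be ambiguous in this format.
        if "," in field or "," in value:
            raise ValueError("commas cannot appear in a field name or value")
        parts.extend([field, value])
    return ",".join(parts)


# Reproduces the query string used for the insertions endpoint above.
ins_str_query = django_query(datasets__tags__name="2022_Q2_IBL_et_al_RepeatedSite")
print(ins_str_query)
```

Building the string this way mainly guards against typos in the separator; the resulting value is passed to one.alyx.rest exactly as in the cell above.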
However, if you are only interested in data with a specific tag, the cleanest approach is to follow these instructions to work with a tag-specific cache table.
Downloading data using the ONE-api
Once sessions of interest are identified with the unique identifier (EID), we can download all files in the alf collection:
[8]:
# Download all data in alf collection
files = one.load_collection(eid, 'alf', download_only=True)
# Show where files have been downloaded to
print(f'Files downloaded to {files[0].parent}')
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_rightCamera.times.npy: 100%|██████████| 6.63M/6.63M [00:00<00:00, 14.0MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_trials.table.pqt: 100%|██████████| 68.7k/68.7k [00:00<00:00, 415kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/rightROIMotionEnergy.position.npy: 100%|██████████| 160/160 [00:00<00:00, 1.61kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_bodyCamera.dlc.pqt: 100%|██████████| 2.91M/2.91M [00:00<00:00, 15.1MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_wheel.position.npy: 100%|██████████| 10.2M/10.2M [00:00<00:00, 17.1MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/licks.times.npy: 100%|██████████| 414k/414k [00:00<00:00, 2.85MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_wheelMoves.peakAmplitude.npy: 100%|██████████| 15.8k/15.8k [00:00<00:00, 141kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/bodyCamera.ROIMotionEnergy.npy: 100%|██████████| 1.33M/1.33M [00:00<00:00, 7.63MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_leftCamera.dlc.pqt: 100%|██████████| 53.3M/53.3M [00:01<00:00, 41.6MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_rightCamera.features.pqt: 100%|██████████| 13.5M/13.5M [00:00<00:00, 24.7MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/bodyROIMotionEnergy.position.npy: 100%|██████████| 160/160 [00:00<00:00, 1.34kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_trials.goCueTrigger_times.npy: 100%|██████████| 7.21k/7.21k [00:00<00:00, 64.6kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/leftCamera.ROIMotionEnergy.npy: 100%|██████████| 2.65M/2.65M [00:00<00:00, 11.4MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_wheel.timestamps.npy: 100%|██████████| 10.2M/10.2M [00:00<00:00, 26.5MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_leftCamera.times.npy: 100%|██████████| 2.65M/2.65M [00:00<00:00, 7.57MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_bodyCamera.times.npy: 100%|██████████| 1.33M/1.33M [00:00<00:00, 7.35MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_leftCamera.features.pqt: 100%|██████████| 5.62M/5.62M [00:00<00:00, 22.5MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/rightCamera.ROIMotionEnergy.npy: 100%|██████████| 6.63M/6.63M [00:00<00:00, 20.3MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/leftROIMotionEnergy.position.npy: 100%|██████████| 160/160 [00:00<00:00, 1.22kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_rightCamera.dlc.pqt: 100%|██████████| 116M/116M [00:01<00:00, 115MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_trials.stimOff_times.npy: 100%|██████████| 7.21k/7.21k [00:00<00:00, 51.9kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probes.description.json: 100%|██████████| 468/468 [00:00<00:00, 2.97kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/_ibl_wheelMoves.intervals.npy: 100%|██████████| 31.4k/31.4k [00:00<00:00, 219kB/s]
Files downloaded to /home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf
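The download paths printed above share a common layout: lab, then a Subjects folder, then subject, date, session number and collection. Assuming that layout (inferred from this output rather than taken from a specification), the sketch below recovers those fields from a path. The helper parse_session_path is hypothetical and is not part of the ONE-api.

```python
from pathlib import PurePosixPath


def parse_session_path(path):
    """Split a download path into lab, subject, date, number and collection."""
    parts = PurePosixPath(path).parts
    # The 'Subjects' folder anchors the layout; index() raises ValueError if absent.
    anchor = parts.index("Subjects")
    if anchor == 0 or len(parts) < anchor + 4:
        raise ValueError(f"path does not follow the expected layout: {path}")
    subject, date, number = parts[anchor + 1:anchor + 4]
    return {
        "lab": parts[anchor - 1],
        "subject": subject,
        "date": date,
        "number": number,
        # Anything after the session number is treated as the collection.
        "collection": "/".join(parts[anchor + 4:]),
    }


# Path copied from the output above.
example = ("/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/"
           "steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf")
info = parse_session_path(example)
print(info)
```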
To download the spike sorting data, we need to find out which probe label (probeXX) was used for this session. This can be done by finding the probe insertion associated with this session:
[9]:
insertion = one.alyx.rest('insertions', 'list', session=eid)[0]
probe_label = insertion['name']
files = one.load_collection(eid, f'alf/{probe_label}/pykilosort', download_only=True)
# Show where files have been downloaded to
print(f'Files downloaded to {files[0].parent}')
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.depths.npy: 100%|██████████| 1.93k/1.93k [00:00<00:00, 15.3kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/spikes.templates.npy: 100%|██████████| 25.3M/25.3M [00:00<00:00, 26.7MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/spikes.clusters.npy: 100%|██████████| 25.3M/25.3M [00:00<00:00, 63.7MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/channels.mlapdv.npy: 100%|██████████| 4.74k/4.74k [00:00<00:00, 29.3kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.metrics.pqt: 100%|██████████| 59.7k/59.7k [00:00<00:00, 449kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/drift.um.npy: 100%|██████████| 182k/182k [00:00<00:00, 1.21MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/_phy_spikes_subset.waveforms.npy: 100%|██████████| 489M/489M [00:02<00:00, 186MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.channels.npy: 100%|██████████| 3.73k/3.73k [00:00<00:00, 34.4kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/spikes.amps.npy: 100%|██████████| 50.6M/50.6M [00:00<00:00, 122MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.uuids.csv: 100%|██████████| 16.7k/16.7k [00:00<00:00, 168kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/_kilosort_whitening.matrix.npy: 100%|██████████| 1.18M/1.18M [00:00<00:00, 6.47MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/_phy_spikes_subset.channels.npy: 100%|██████████| 2.98M/2.98M [00:00<00:00, 11.4MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.amps.npy: 100%|██████████| 3.73k/3.73k [00:00<00:00, 33.7kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/drift.times.npy: 100%|██████████| 20.3k/20.3k [00:00<00:00, 191kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.peakToTrough.npy: 100%|██████████| 3.73k/3.73k [00:00<00:00, 27.0kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/channels.brainLocationIds_ccf_2017.npy: 100%|██████████| 3.20k/3.20k [00:00<00:00, 25.5kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.waveformsChannels.npy: 100%|██████████| 57.7k/57.7k [00:00<00:00, 559kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/spikes.depths.npy: 100%|██████████| 50.6M/50.6M [00:00<00:00, 81.5MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/_ibl_log.info_pykilosort.log: 100%|██████████| 3.52k/3.52k [00:00<00:00, 22.8kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/templates.waveformsChannels.npy: 100%|██████████| 57.7k/57.7k [00:00<00:00, 598kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/templates.amps.npy: 100%|██████████| 3.73k/3.73k [00:00<00:00, 30.3kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/drift_depths.um.npy: 100%|██████████| 200/200 [00:00<00:00, 1.94kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/templates.waveforms.npy: 100%|██████████| 4.72M/4.72M [00:00<00:00, 15.3MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/spikes.times.npy: 100%|██████████| 50.6M/50.6M [00:00<00:00, 129MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/clusters.waveforms.npy: 100%|██████████| 4.72M/4.72M [00:00<00:00, 20.4MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/_phy_spikes_subset.spikes.npy: 100%|██████████| 186k/186k [00:00<00:00, 1.12MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/channels.rawInd.npy: 100%|██████████| 3.20k/3.20k [00:00<00:00, 24.6kB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/spikes.samples.npy: 100%|██████████| 50.6M/50.6M [00:00<00:00, 117MB/s]
/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort/channels.localCoordinates.npy: 100%|██████████| 3.20k/3.20k [00:00<00:00, 28.8kB/s]
Files downloaded to /home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/steinmetzlab/Subjects/NR_0027/2022-08-23/001/alf/probe00/pykilosort
To load the data, we can use some of the following loading methods:
[10]:
# Load in all trials datasets
trials = one.load_object(eid, 'trials', collection='alf')
# Load in a single wheel dataset
wheel_times = one.load_dataset(eid, '_ibl_wheel.timestamps.npy')
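The object name passed to load_object ('trials') and the dataset name passed to load_dataset ('_ibl_wheel.timestamps.npy') both relate to how files are named on disk. The sketch below is a simplified illustration of that relationship and not the parser used by the ONE-api: it assumes file names of the form [_namespace_]object.attribute.extension, and the names ALF_PATTERN and group_by_object are hypothetical. The file names are taken from the download output above.

```python
import re
from collections import defaultdict

# Simplified pattern: optional _namespace_ prefix, then object.attribute.extension.
ALF_PATTERN = re.compile(
    r"^(?:_(?P<namespace>[^_.]+)_)?"
    r"(?P<object>[^.]+)\."
    r"(?P<attribute>[^.]+)\."
    r"(?P<extension>[^.]+)$"
)


def group_by_object(filenames):
    """Map each object name to the list of attributes found for it."""
    grouped = defaultdict(list)
    for name in filenames:
        match = ALF_PATTERN.match(name)
        if match is not None:
            grouped[match["object"]].append(match["attribute"])
    return dict(grouped)


# File names copied from the download output above.
files = [
    "_ibl_trials.table.pqt",
    "_ibl_trials.goCueTrigger_times.npy",
    "_ibl_trials.stimOff_times.npy",
    "_ibl_wheel.position.npy",
    "_ibl_wheel.timestamps.npy",
    "licks.times.npy",
]
print(group_by_object(files))
```

Under this reading, loading the 'trials' object gathers the files that share the object part trials, while load_dataset targets a single file by its full name.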
Loading different objects
Examples of loading different objects can be found in the tutorials here.
Advanced examples
Example 1: Searching for sessions from a specific lab
Suppose you are interested in the data from a given lab that was part of the Reproducible Ephys data release. If you want to use data associated with that lab only, you could query for the whole dataset as shown above and filter sessions_rep_site on the 'lab' key, for example:
[11]:
lab_name = 'mrsicflogellab'
sessions_lab = [item for item in sessions_rep_site if item['lab'] == lab_name]
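Before filtering on one lab, it can be informative to see how many sessions each lab contributes. The sketch below counts sessions per lab with collections.Counter. It runs on mock records that carry the 'lab' key used above; the lab names come from this page, while the counts and the 'id' values are invented for illustration.

```python
from collections import Counter

# Mock records standing in for sessions_rep_site; ids and counts are made up.
mock_sessions_rep_site = [
    {"id": "eid-1", "lab": "mrsicflogellab"},
    {"id": "eid-2", "lab": "steinmetzlab"},
    {"id": "eid-3", "lab": "mrsicflogellab"},
]

# Count how many sessions each lab contributed.
sessions_per_lab = Counter(item["lab"] for item in mock_sessions_rep_site)
for lab, count in sessions_per_lab.most_common():
    print(f"{lab}: {count}")

# The same list comprehension as above, applied to the mock records.
lab_name = "mrsicflogellab"
sessions_lab = [item for item in mock_sessions_rep_site if item["lab"] == lab_name]
```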
However, if you want to query only the data for a given lab, it is more efficient to first retrieve the list of available labs, select a lab name from it, and then query the sessions for that lab directly.
To get this list, use one.alyx.rest:
[12]:
# List of labs (and all metadata information associated)
labs = one.alyx.rest('labs', 'list',
                     django='session__data_dataset_session_related__tags__name,2022_Q2_IBL_et_al_RepeatedSite')
# Note the change in the django filter compared to searching over 'sessions'
# Example lab name
lab_name = labs[0]['name'] # e.g. 'mrsicflogellab'
# Searching for RS sessions with specific lab name
sessions_lab = one.alyx.rest('sessions', 'list', dataset_types='spikes.times', lab=lab_name,
                             tag='2022_Q2_IBL_et_al_RepeatedSite')