Data organisation

IBL data is structured using the ALyx File (ALF) name specification. Please see this section for a detailed introduction to the ALF format (especially the sections on Collections, Revisions and Namespaces).

Here we further describe some key ALF concepts in the context of the IBL data.

Session folder

As per the ALF convention, each experiment has a unique session folder with the following pattern:

lab/Subjects/subject/date/session_number
e.g. cortexlab/Subjects/KS023/2020-03-04/001
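
To make the pattern concrete, the following minimal sketch splits a session folder path into its components using only the Python standard library. The names SessionParts and parse_session_path are hypothetical, introduced here for illustration, and are not part of ONE or any IBL package. The sketch assumes forward-slash separators and a three-digit session number as in the example above.

```python
import datetime
from pathlib import PurePosixPath
from typing import NamedTuple


class SessionParts(NamedTuple):
    """Components of a session folder (hypothetical helper type)."""
    lab: str
    subject: str
    date: datetime.date
    number: int


def parse_session_path(path: str) -> SessionParts:
    """Split '<lab>/Subjects/<subject>/<date>/<number>' into its parts."""
    parts = PurePosixPath(path).parts
    # The last five parts must be lab, the literal 'Subjects', subject, date, number
    if len(parts) < 5 or parts[-4] != 'Subjects':
        raise ValueError(f'not a session folder: {path}')
    lab, _, subject, date, number = parts[-5:]
    # fromisoformat raises ValueError if the date is not YYYY-MM-DD
    return SessionParts(lab, subject, datetime.date.fromisoformat(date), int(number))


session = parse_session_path('cortexlab/Subjects/KS023/2020-03-04/001')
print(session.lab, session.subject, session.date, session.number)
```

Parsing the date and number into typed values, rather than keeping them as strings, makes it straightforward to sort or filter sessions chronologically.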

Datasets and data file structure

The following file structure shows a subset of the datasets used in IBL. For a full description of all datasets please see this document.

lab/
├─ Subjects/
│  ├─ subject/
│  │  ├─ 2021-06-30/
│  │  │  ├─ 001/
│  │  │  │  ├─ alf/
│  │  │  │  │  ├─ probe00/
│  │  │  │  │  │  ├─ spikes.clusters.npy
│  │  │  │  │  │  ├─ spikes.times.npy
│  │  │  │  │  ├─ probe01/
│  │  │  │  │  │  ├─ #2021-07-05#/
│  │  │  │  │  │  │  ├─ spikes.clusters.npy
│  │  │  │  │  │  │  ├─ spikes.times.npy
│  │  │  │  │  │  ├─ spikes.clusters.npy
│  │  │  │  │  │  ├─ spikes.times.npy
│  │  │  │  │  ├─ _ibl_trials.intervals.npy
│  │  │  │  │  ├─ probes.description.json

As per the ALF convention, a few things are worth noting:

  • spikes collected on two different probes are grouped into two separate collection folders:

alf/probe00
alf/probe01
  • the usage of the namespace _ibl_ in _ibl_trials.intervals.npy indicates that this dataset is specific to the IBL and not a standard in the broad scientific community.

  • the usage of a revision folder (a date enclosed in # characters) for storing a new spike sorting output alongside the original:

alf/probe01/#2021-07-05#/spikes.clusters.npy
alf/probe01/spikes.clusters.npy
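
The three concepts above (collection, namespace and revision) can be illustrated by splitting a dataset path, given relative to the session folder, into its parts. This is a simplified sketch using only the standard library: parse_dataset and FILENAME are hypothetical names, and the pattern covers only the filename forms shown in this section. The full ALF specification allows further optional filename parts that this sketch does not handle.

```python
import re
from pathlib import PurePosixPath

# Simplified filename pattern: [_namespace_]object.attribute.extension
FILENAME = re.compile(
    r'^(?:_(?P<namespace>[^_.]+)_)?'
    r'(?P<object>[^.]+)\.(?P<attribute>[^.]+)\.(?P<extension>[^.]+)$'
)


def parse_dataset(rel_path: str) -> dict:
    """Split a session-relative dataset path into its ALF parts (sketch)."""
    path = PurePosixPath(rel_path)
    folders = list(path.parts[:-1])
    revision = None
    # A revision folder is the last folder and is enclosed in # characters
    if folders and re.fullmatch(r'#.+#', folders[-1]):
        revision = folders.pop()[1:-1]
    match = FILENAME.match(path.name)
    if match is None:
        raise ValueError(f'unrecognised dataset name: {path.name}')
    # Whatever folders remain form the collection
    return {'collection': '/'.join(folders), 'revision': revision, **match.groupdict()}


for name in ('alf/probe01/#2021-07-05#/spikes.clusters.npy',
             'alf/probe01/spikes.clusters.npy',
             'alf/_ibl_trials.intervals.npy'):
    print(parse_dataset(name))
```

Both spikes.clusters.npy paths share the collection alf/probe01 and differ only in the revision field, which is how the original and the re-sorted output can coexist.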

Experiment IDs

Each experimental session is identified by a unique identifier known as an experiment ID (eid). The most common ways of representing eids within the IBL are:

(str) : An experiment UUID as a string
e.g. '
(str) : A session string of the form <subject>/<date>/<number>
e.g. 'KS023/2019-12-10/001'
(Path) : A pathlib ALF path of the form <lab>/Subjects/<subject>/<date>/<number>
e.g. Path(r'cortexlab/Subjects/KS023/2019-12-10/001')
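
As a rough illustration of how the three representations differ, the sketch below classifies a value and converts an ALF path to a session string. It uses only the standard library; describe_session_id and path_to_session_string are hypothetical helpers, not ONE functions, and the UUID used is randomly generated rather than a real eid.

```python
import re
import uuid
from pathlib import PurePath, PurePosixPath

# <subject>/<date>/<number>, assuming an ISO date and a three-digit number
SESSION_STRING = re.compile(r'^[^/]+/\d{4}-\d{2}-\d{2}/\d{3}$')


def describe_session_id(value) -> str:
    """Return which of the three representations `value` appears to be."""
    if isinstance(value, PurePath):
        return 'path'
    if isinstance(value, str):
        try:
            uuid.UUID(value)  # raises ValueError for anything that is not a UUID
            return 'uuid'
        except ValueError:
            pass
        if SESSION_STRING.match(value):
            return 'session string'
    raise ValueError(f'unrecognised session representation: {value!r}')


def path_to_session_string(path: PurePath) -> str:
    """Keep only the last three parts: <subject>/<date>/<number>."""
    return '/'.join(path.parts[-3:])


alf_path = PurePosixPath('cortexlab/Subjects/KS023/2019-12-10/001')
print(describe_session_id(str(uuid.uuid4())))       # random UUID, not a real eid
print(describe_session_id('KS023/2019-12-10/001'))
print(describe_session_id(alf_path))
print(path_to_session_string(alf_path))
```

Note that the reverse conversion, from session string to path, needs the lab name, which the session string does not carry, and that mapping either form to a UUID requires a lookup in the database.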

This unique identifier is extremely important, as it enables scientists to search for and download the data associated with a specific experimental session stored in the database.

For more information on data search and download, please visit the ONE section. Note: ONE also supports a range of other experimental session representations. Please see this tutorial for more details on session representation and for useful conversion tricks.