Introduction to IBL data

Data organisation

IBL data is structured using the ALyx File (ALF) name specification. For more details about ALF see this section. Here we describe some key ALF concepts in the context of the IBL data.

The following file structure shows a subset of the datasets used in IBL. For a full description of all datasets, please see this document.

lab/
├─ Subjects/
│  ├─ subject/
│  │  ├─ 2021-06-30/
│  │  │  ├─ 001/
│  │  │  │  ├─ alf/
│  │  │  │  │  ├─ probe00/
│  │  │  │  │  │  ├─ spikes.clusters.npy
│  │  │  │  │  │  ├─ spikes.times.npy
│  │  │  │  │  ├─ probe01/
│  │  │  │  │  │  ├─ #2021-07-05#/
│  │  │  │  │  │  │  ├─ spikes.clusters.npy
│  │  │  │  │  │  │  ├─ spikes.times.npy
│  │  │  │  │  │  ├─ spikes.clusters.npy
│  │  │  │  │  │  ├─ spikes.times.npy
│  │  │  │  │  ├─ _ibl_trials.intervals.npy
│  │  │  │  │  ├─ probes.description.json

Session Folder

Each experiment has a unique session folder, which follows this pattern:

lab/Subjects/subject/date/session_number
e.g. cortexlab/Subjects/KS023/2020-03-04/001
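
The session folder pattern can be split into its parts with a few lines of standard-library Python. The following is a minimal sketch, not part of any IBL library: the names SESSION_PATTERN and parse_session_path are hypothetical, and the three-digit session number is an assumption based on the example above.

```python
import re
from pathlib import PurePosixPath

# Hypothetical pattern for <lab>/Subjects/<subject>/<date>/<number>.
# Assumes an ISO date (YYYY-MM-DD) and a three-digit session number.
SESSION_PATTERN = re.compile(
    r"(?P<lab>[^/]+)/Subjects/(?P<subject>[^/]+)/"
    r"(?P<date>\d{4}-\d{2}-\d{2})/(?P<number>\d{3})$"
)


def parse_session_path(path):
    """Split a session folder path into lab, subject, date and number."""
    match = SESSION_PATTERN.search(PurePosixPath(path).as_posix())
    if match is None:
        raise ValueError(f"not a session folder: {path}")
    return match.groupdict()


parts = parse_session_path("cortexlab/Subjects/KS023/2020-03-04/001")
print(parts)
# {'lab': 'cortexlab', 'subject': 'KS023', 'date': '2020-03-04', 'number': '001'}
```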

Collections

Within a session folder, files are organised into a number of sub-folders, called collections. These are used to group datasets of the same type by device or preprocessing software. For example, spikes collected on two different probes are grouped into the collections:

alf/probe00
alf/probe01
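
As an illustration of how collections partition a session, the sketch below groups dataset paths (relative to the session folder) by their parent folder. The list of paths is copied from the example file structure; the variable names are hypothetical and revision folders are left out for simplicity.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Dataset paths relative to the session folder, taken from the example tree.
datasets = [
    "alf/probe00/spikes.clusters.npy",
    "alf/probe00/spikes.times.npy",
    "alf/probe01/spikes.clusters.npy",
    "alf/probe01/spikes.times.npy",
    "alf/_ibl_trials.intervals.npy",
]

# The collection is simply the folder part of the relative path.
by_collection = defaultdict(list)
for dataset in datasets:
    path = PurePosixPath(dataset)
    by_collection[path.parent.as_posix()].append(path.name)

for collection, names in by_collection.items():
    print(collection, names)
```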

Revisions

Revisions are used when multiple versions of the same dataset, with slightly different preprocessing, exist. Revision folders are denoted by a date surrounded by pound signs (#). In the example file structure shown above, the dataset spikes.times.npy in alf/probe01 has two versions:

alf/probe01/spikes.times.npy
alf/probe01/#2021-07-05#/spikes.times.npy
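
A revision folder can be recognised by its surrounding pound signs. The sketch below separates a relative dataset path into collection, revision and file name. The helper split_revision is hypothetical and assumes the revision folder, when present, is the last folder before the file name, as in the example above.

```python
import re
from pathlib import PurePosixPath

# A revision folder is a name wrapped in pound signs, e.g. #2021-07-05#.
REVISION_PATTERN = re.compile(r"^#(?P<revision>.+)#$")


def split_revision(relative_path):
    """Return (collection, revision, filename); revision is None if absent."""
    path = PurePosixPath(relative_path)
    folders = list(path.parts[:-1])
    revision = None
    if folders:
        match = REVISION_PATTERN.match(folders[-1])
        if match:
            revision = match.group("revision")
            folders.pop()  # the remaining folders form the collection
    return "/".join(folders), revision, path.name


datasets = [
    "alf/probe01/spikes.times.npy",
    "alf/probe01/#2021-07-05#/spikes.times.npy",
]
for dataset in datasets:
    print(split_revision(dataset))
# ('alf/probe01', None, 'spikes.times.npy')
# ('alf/probe01', '2021-07-05', 'spikes.times.npy')
```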

Filenames

File names contain at least two components: an object and an attribute, separated by a period. For example, a file called trials.intervals represents the trials object with an intervals attribute, while a file called probes.description represents the probes object with a description attribute. All attributes belonging to the same object have the same number of rows.

Namespaces

You will notice that some of the filenames begin with an underscore, for example _ibl_trials.intervals.npy. The pattern _ibl_ is referred to as a namespace and is used for datasets that are not intended to be standard in the community, but are specific to the IBL.
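
The file name components described above (optional namespace, object, attribute and file extension) can be pulled apart with a regular expression. This is a simplified sketch and not the full ALF grammar: the pattern and the parse_filename helper are hypothetical, and any parts between the attribute and the extension are lumped into a single "extra" field as an assumption.

```python
import re

# Assumed simplified form: [_namespace_]object.attribute[.extra].extension
FILENAME_PATTERN = re.compile(
    r"^(?:_(?P<namespace>[^_.]+)_)?"
    r"(?P<object>[^.]+)\."
    r"(?P<attribute>[^.]+)"
    r"(?:\.(?P<extra>.+))?"
    r"\.(?P<extension>[^.]+)$"
)


def parse_filename(name):
    """Split a dataset file name into its components (None when absent)."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    return match.groupdict()


for name in ("_ibl_trials.intervals.npy", "probes.description.json"):
    parsed = parse_filename(name)
    print(parsed["namespace"], parsed["object"], parsed["attribute"], parsed["extension"])
# ibl trials intervals npy
# None probes description json
```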

Experiment IDs

Each experimental session is identified by a unique identifier known as an experiment ID (eid). The most common ways of representing eids within the IBL are:

(str) : An experiment UUID as a string
e.g. '
(str) : A session string of the form <subject>/<date>/<number>
e.g. 'KS023/2019-12-10/001'
(Path) : A pathlib ALF path of the form <lab>/Subjects/<subject>/<date>/<number>
e.g. Path(r'cortexlab/Subjects/KS023/2019-12-10/001')

ONE also supports a range of other experimental session representations. Please see this tutorial for more details and for useful conversion tricks.
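
To make the three representations concrete, the sketch below uses only the Python standard library to tell a UUID string apart from a session string and to derive the session string from an ALF path. The helper names are hypothetical and this is not the ONE conversion API. The UUID is generated at random purely for illustration and does not correspond to a real session.

```python
from pathlib import PurePosixPath
from uuid import UUID, uuid4


def is_uuid_string(value):
    """Return True if value is a canonical hyphenated UUID string."""
    try:
        return str(UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def path_to_session_string(session_path):
    """<lab>/Subjects/<subject>/<date>/<number> -> <subject>/<date>/<number>."""
    return "/".join(PurePosixPath(session_path).parts[-3:])


session_path = PurePosixPath("cortexlab/Subjects/KS023/2019-12-10/001")
session_string = path_to_session_string(session_path)
random_id = str(uuid4())  # random placeholder, not a real eid

print(session_string)                  # KS023/2019-12-10/001
print(is_uuid_string(session_string))  # False
print(is_uuid_string(random_id))       # True
```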