Get to know the datasets and folder structure
What is the folder structure for experiment sessions
Generally speaking, all the data for a single experiment session is stored in one folder, whose path is built from: the lab name / a folder named Subjects / the name of the subject (i.e. the mouse nickname) / the date / the session number.
For example: mainenlab/Subjects/ZFM-01576/2020-12-01/001, whose data can be browsed here.
A lab can host multiple subjects, e.g. the Churchland lab hosts
A subject can have multiple sessions on the same day, in which case the session number is incremented from 001.
Sometimes the valuable data is found only in a later session of the day (after a restart, for example), so it is not uncommon to find days for which only, say, the 003 folder is saved.
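The path layout described above can be split back into its parts. The following is a minimal sketch using only the Python standard library; the function name parse_session_path and the dictionary keys are illustrative assumptions, not part of any IBL library.

```python
from pathlib import PurePosixPath


def parse_session_path(path):
    """Split 'lab/Subjects/subject/date/number' into its named parts.

    Hypothetical helper for illustration; not an IBL API.
    """
    parts = PurePosixPath(path).parts
    if "Subjects" not in parts:
        raise ValueError(f"no 'Subjects' folder in {path!r}")
    i = parts.index("Subjects")
    # Need the lab name before 'Subjects' and three parts after it.
    if i < 1 or len(parts) < i + 4:
        raise ValueError(f"incomplete session path: {path!r}")
    lab = parts[i - 1]
    subject, date, number = parts[i + 1:i + 4]
    return {"lab": lab, "subject": subject, "date": date, "number": number}


session = parse_session_path("mainenlab/Subjects/ZFM-01576/2020-12-01/001")
print(session)
```

Anything after the session number (for example a trailing alf/ folder) is ignored, so the same helper can be applied to the path of a file inside a session.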
How are data files organised within a session folder
Within a session folder (such as mainenlab/Subjects/ZFM-01576/2020-12-01/001/) there will be multiple folders containing different kinds of data as explained below. An example layout of the folder structure is shown here:
subject/
├─ 2020-12-01/
│  ├─ 001/
│  │  ├─ alf/
│  │  │  ├─ probeXX/
│  │  │  ├─ pykilosort/
│  │  ├─ raw_ephys_data
│  │  │  ├─ probeXX/
│  │  ├─ raw_video_data
Generally speaking, the following subfolders will contain:
alf/: The extracted data, to be used in analysis (notably the spike sorting, trials and DLC data).
raw_ephys_data/: The raw ephys data (in this case, Neuropixels data)
raw_video_data/: The raw video data
raw_passive_data/: The raw passive data (events that occur during the replay of task stimuli)
raw_behavior_data/: The raw behavior data (events that occur during trials)
spike_sorters/: The raw output data for each spike sorter used
logs/: logged information
Analysis is conducted mainly on the data contained in the first three subfolders, i.e. alf/, raw_ephys_data/ and raw_video_data/, whose content is detailed below.
Some data (e.g. raw_passive_data/) may not be available for all sessions.
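Because not every subfolder exists for every session, it can help to check a local copy of a session before analysing it. The sketch below uses only the standard library and builds a throwaway folder to demonstrate; the helper name available_subfolders is an assumption made for illustration, and the subfolder names are those listed above.

```python
import tempfile
from pathlib import Path

# Subfolder names taken from the list above.
KNOWN_SUBFOLDERS = (
    "alf", "raw_ephys_data", "raw_video_data", "raw_passive_data",
    "raw_behavior_data", "spike_sorters", "logs",
)


def available_subfolders(session_dir):
    """Return a dict mapping each known subfolder name to True or False."""
    session_dir = Path(session_dir)
    return {name: (session_dir / name).is_dir() for name in KNOWN_SUBFOLDERS}


# Build a temporary session folder that only has three of the subfolders.
with tempfile.TemporaryDirectory() as tmp:
    session_dir = Path(tmp) / "mainenlab/Subjects/ZFM-01576/2020-12-01/001"
    for name in ("alf", "raw_ephys_data", "raw_video_data"):
        (session_dir / name).mkdir(parents=True)
    present = available_subfolders(session_dir)

missing = [name for name, found in present.items() if not found]
print(missing)
```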
Processed data: alf folder
Everything contained within the alf folder is processed data, with all times synchronised to a common clock. This folder contains the data required for most analyses. All files in the alf folders follow the Alyx File naming convention, where related datasets are grouped into a common object, e.g. trials or wheel.
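To illustrate how file names group into objects, here is a minimal sketch that parses names of the simplified form [_namespace_]object.attribute.extension. The pattern is an assumption that covers only this simple form (the full convention may include further optional parts), the file names are examples, and group_by_object is a hypothetical helper rather than an IBL function.

```python
import re
from collections import defaultdict

# Simplified pattern: optional _namespace_ prefix, then object.attribute.extension.
# Assumption: further optional parts of the convention are not handled here.
ALF_PATTERN = re.compile(
    r"^(?:_(?P<namespace>[^_.]+)_)?"
    r"(?P<object>[^.]+)\.(?P<attribute>[^.]+)\.(?P<extension>.+)$"
)


def group_by_object(filenames):
    """Map each object name to the list of its attributes, in input order."""
    grouped = defaultdict(list)
    for name in filenames:
        match = ALF_PATTERN.match(name)
        if match is None:
            continue  # skip names that do not fit the simplified pattern
        grouped[match["object"]].append(match["attribute"])
    return dict(grouped)


example_files = [
    "_ibl_trials.intervals.npy",
    "_ibl_trials.choice.npy",
    "_ibl_wheel.position.npy",
    "_ibl_wheel.timestamps.npy",
]
objects = group_by_object(example_files)
print(objects)
```

All attributes that share an object name (here trials or wheel) end up together, which is the grouping the naming convention is designed to make possible.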
The alf/ folder notably contains:
the probe folder (e.g. probe01/), which holds the processed spike sorting output to be used for analysis
the processed behavior trials data
the processed DLC data (for each camera used)
the processed wheel data
the processed passive protocol data
Below is a breakdown of the dataset objects contained in each folder.
Download an example alf folder
An example of the files contained in a sample alf folder can be downloaded by clicking here
Datasets in alf
The alf folder contains the processed behaviour, wheel and video data. Browse the documentation detailing these datasets and how to load them:
Datasets in alf/probeXX/pykilosort
The alf/probeXX/ folder contains the spike sorted data. Typically, only one version of the spike sorting output is available, stored in a folder named pykilosort (see example).
alf/probeXX/ can also contain the output of multiple spike sorters.
In that case, the output of a first spike sorter sits directly in the folder itself (check, for example, whether the clusters datasets are stored directly in alf/probeXX/), and a second version sits in a named subfolder.
When multiple spike sorting versions are available, the data loading methods use the version from pykilosort by default (see loading example).
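The selection rule described above can be sketched for a local copy of a probe folder. This is an illustration of the described behaviour under stated assumptions, not the implementation used by the IBL loading methods: the helper names are hypothetical, a version is detected by the presence of files starting with "clusters.", and the dataset file name is only an example.

```python
import tempfile
from pathlib import Path


def spike_sorting_versions(probe_dir):
    """Map a version label to the folder holding its cluster datasets.

    The empty label '' stands for datasets stored directly in the probe folder.
    Hypothetical helper; not the IBL loader implementation.
    """
    probe_dir = Path(probe_dir)
    versions = {}
    if any(probe_dir.glob("clusters.*")):
        versions[""] = probe_dir
    for sub in sorted(p for p in probe_dir.iterdir() if p.is_dir()):
        if any(sub.glob("clusters.*")):
            versions[sub.name] = sub
    return versions


def pick_default(versions, preferred="pykilosort"):
    """Prefer the 'pykilosort' subfolder, else fall back to the probe folder."""
    if preferred in versions:
        return preferred
    if "" in versions:
        return ""
    raise LookupError("no spike sorting output found")


with tempfile.TemporaryDirectory() as tmp:
    probe_dir = Path(tmp) / "alf" / "probe01"
    (probe_dir / "pykilosort").mkdir(parents=True)
    # Empty placeholder files standing in for real datasets.
    (probe_dir / "clusters.channels.npy").touch()
    (probe_dir / "pykilosort" / "clusters.channels.npy").touch()
    versions = spike_sorting_versions(probe_dir)
    labels = sorted(versions)
    default = pick_default(versions)

print(labels, default)
```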
Browse the documentation detailing the spike sorting datasets and how to load them:
Folders with the prefix raw contain the original data collected from each recording device (e.g. Neuropixels probe or camera). Data in these folders are on the clock of the recording device and are not synchronised.
Notably, the raw electrophysiology data used to obtain the spike sorting and the raw video data used to extract DLC features are available for most sessions.
A summary of the data contained in each folder is given below.
Datasets in raw_ephys_data
The raw_ephys_data folder contains the synchronisation data recorded from the NIDAQ device via the SpikeGLX software.
For recordings obtained with 3A Neuropixels probes, this folder is empty and the synchronisation pulses are stored in the raw_ephys_data/probeXX folder.
Browse the documentation detailing the SpikeGLX datasets and how to load them:
Datasets in raw_ephys_data/probeXX
The raw_ephys_data/probeXX folder contains the raw electrophysiology data acquired on a given probe.
These datasets are large!
It is possible to download only (smaller-sized) chunks of the raw ephys data, rather than the whole file at once (cf. loading example below).
Browse the documentation detailing the raw ephys datasets and how to load them:
_spikeglx_ephysData*.ap, _spikeglx_ephysData*.lf -> raw ephys data in the AP and LFP bands, recorded using SpikeGLX
ephysTimeRmsAP, ephysTimeRmsLF -> RMS noise in the AP and LFP bands across the recording
ephysSpectralDensityAP, ephysSpectralDensityLF -> power spectrum in the AP and LFP bands
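The idea of reading only part of a large raw file, mentioned above, can be sketched on a toy file. This is not the IBL loading example: it assumes a flat, uncompressed file of 16-bit integers stored sample by sample (all channels of sample 0, then sample 1, and so on), whereas real recordings may be stored in a compressed format that this sketch does not handle. The function name read_chunk and the toy channel count are illustrative.

```python
import array
import os
import tempfile


def read_chunk(path, n_channels, first_sample, n_samples):
    """Read samples [first_sample, first_sample + n_samples) from a flat int16 file.

    Only the requested byte range is read; the rest of the file is never loaded.
    """
    values = array.array("h")  # signed 16-bit integers, native byte order
    offset = first_sample * n_channels * values.itemsize
    n_bytes = n_samples * n_channels * values.itemsize
    with open(path, "rb") as f:
        f.seek(offset)
        values.frombytes(f.read(n_bytes))
    # Regroup the flat values into one list of channel values per sample.
    return [list(values[i:i + n_channels]) for i in range(0, len(values), n_channels)]


# Toy recording: 10 samples of 4 channels, value = sample * 10 + channel.
n_channels = 4
data = array.array("h", [s * 10 + c for s in range(10) for c in range(n_channels)])
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.bin")
    with open(path, "wb") as f:
        data.tofile(f)
    chunk = read_chunk(path, n_channels, first_sample=3, n_samples=2)

print(chunk)
```

The byte offset is computed from the sample index, so the cost of the read depends on the chunk size and not on the length of the recording.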
Datasets in raw_video_data
The raw_video_data folder contains the raw camera data for each camera.
These datasets are large!
It is possible to download only selected frames of the raw video data, rather than the whole file at once (cf. loading example below).
You can view the raw video data in the browser by clicking on it, e.g. _iblrig_leftCamera.raw.
Browse the documentation detailing the raw video datasets and how to load them: