Data Release - Spike sorting benchmark datasets
Spike sorting is the process of extracting spike information (times, waveforms) from raw electrophysiology signals. In our Brainwide map dataset, these signals are heterogeneous and change from one brain region to another. By looking at recordings across widespread anatomical regions, we discovered that the current spike sorting method has a limited range of applicability. Specifically, we uncovered two major issues. First, there is no one-size-fits-all solution: an algorithm that is well tailored to extract signals for a given region or electrode type may perform poorly for another. Second, and most importantly, we had very limited means of assessing the performance of such algorithms beyond visual inspection of the data and qualitative judgements.
To facilitate the development of spike sorting algorithms, we aim to provide benchmark datasets (for a full explanation, see our proposal Spike Net). Here, we explain how to download these datasets, which were selected to represent various anatomical regions from our Brainwide map.
Overview of the Data
We have selected 13 recordings for our benchmarks.
The insertion IDs are:
pids = [
    '1a276285-8b0e-4cc9-9f0a-a3a002978724',
    '1e104bf4-7a24-4624-a5b2-c2c8289c0de7',
    '5d570bf6-a4c6-4bf1-a14b-2c878c84ef0e',
    '5f7766ce-8e2e-410c-9195-6bf089fea4fd',
    '6638cfb3-3831-4fc2-9327-194b76cf22e1',
    '749cb2b7-e57e-4453-a794-f6230e4d0226',
    'd7ec0892-0a6c-4f4f-9d8f-72083692af5c',
    'da8dfec1-d265-44e8-84ce-6ae9c109b8bd',
    'dab512bd-a02d-4c1f-8dbc-9155a163efc0',
    'dc7e9403-19f7-409f-9240-05ee57cb7aea',
    'e8f9fba4-d151-4b00-bee7-447f0f3e752c',
    'eebcaf65-7fa4-4118-869d-a084e84530e2',
    'fe380793-8035-414e-b000-09bfe5ece92a',
]
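Before querying the database, it can help to check that the insertion IDs were copied correctly and to map each one to its session. The sketch below is a minimal illustration, not part of the release tooling: `check_pids` and `resolve_sessions` are hypothetical helper names, and `one` is assumed to be an already connected ONE-api instance whose `pid2eid` method returns a session ID and probe name for an insertion ID.

```python
import uuid


def check_pids(pids):
    """Return pids unchanged if every entry is a canonical, unique UUID string."""
    for pid in pids:
        # uuid.UUID raises ValueError on malformed input; the round trip
        # also rejects non-canonical spellings (upper case, missing hyphens).
        if str(uuid.UUID(pid)) != pid:
            raise ValueError(f"non-canonical insertion ID: {pid}")
    if len(set(pids)) != len(pids):
        raise ValueError("duplicate insertion IDs")
    return pids


def resolve_sessions(one, pids):
    """Map each insertion ID to whatever one.pid2eid returns for it.

    `one` is assumed to be a connected ONE instance (hypothetical usage).
    """
    return {pid: one.pid2eid(pid) for pid in check_pids(pids)}


# Two IDs copied from the list above, used here only as a sample.
sample_pids = [
    "1a276285-8b0e-4cc9-9f0a-a3a002978724",
    "1e104bf4-7a24-4624-a5b2-c2c8289c0de7",
]
checked = check_pids(sample_pids)
```

Passing the ONE instance in as an argument keeps the helpers free of any connection details, so they can be tried offline with a stand-in object.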
(Coming soon) In 2023, we will provide 20 to 30 minute chunks of raw electrophysiology data, processed in a standardised fashion, with manual annotations of spikes and quality metrics for spike detection recall. These will serve as benchmarks for spike sorting algorithm development. In the meantime, you can familiarise yourself with the data heterogeneity by looking at the whole recordings.
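To make the idea of spike detection recall concrete, here is a minimal sketch of one common way to compute it: the fraction of manually annotated spikes that are matched by a detected spike within a small time tolerance. The release does not yet define its metric, so the function name `detection_recall`, the one-to-one greedy matching, and the 0.5 ms default tolerance are all assumptions made for illustration.

```python
from bisect import bisect_left


def detection_recall(annotated, detected, tol=0.0005):
    """Fraction of annotated spike times (s) matched by a detected spike.

    Each detected spike can match at most one annotated spike, and a match
    requires the two times to differ by no more than `tol` seconds.
    """
    if not annotated:
        return float("nan")
    detected = sorted(detected)
    used = [False] * len(detected)
    hits = 0
    for t in sorted(annotated):
        # First detected spike that could fall inside [t - tol, t + tol].
        i = bisect_left(detected, t - tol)
        while i < len(detected) and detected[i] <= t + tol:
            if not used[i]:
                used[i] = True
                hits += 1
                break
            i += 1
    return hits / len(annotated)


# Toy example with made-up spike times, in seconds.
annotated_times = [0.1, 0.2, 0.3, 0.4]
detected_times = [0.1001, 0.2004, 0.35, 0.4]
recall = detection_recall(annotated_times, detected_times)
```

In this toy example the annotated spike at 0.3 s has no detected spike within the tolerance, so three of the four annotated spikes are recovered.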
View the data
You can view the whole electrophysiology data:
Data structure and download
The organisation of the data follows the standard IBL data structure.
See these instructions to download an example dataset for one session and become familiar with the data structure.
See these instructions to learn how to use the ONE-api to search for and download the released datasets.
See these instructions to become familiar with specific data loading functions.
The tag associated with this release is (coming soon).
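As an illustration of the loading functions referred to above, the sketch below loads the spike sorting output for one insertion with `SpikeSortingLoader` from `brainbox.io.one` and counts spikes per cluster. It is a sketch under stated assumptions rather than the official recipe: `load_sorting` and `spikes_per_cluster` are hypothetical helper names, the `loader_cls` parameter exists only so the function can be exercised without network access, and the public server URL and password in the commented usage are assumptions that should be checked against the ONE-api instructions.

```python
from collections import Counter


def load_sorting(pid, one, loader_cls=None):
    """Load spikes, clusters and channels for one probe insertion.

    By default this uses brainbox.io.one.SpikeSortingLoader, which requires
    the ibllib package and a connected ONE instance.
    """
    if loader_cls is None:
        from brainbox.io.one import SpikeSortingLoader as loader_cls
    loader = loader_cls(pid=pid, one=one)
    spikes, clusters, channels = loader.load_spike_sorting()
    # Attach channel information and metrics to the clusters.
    clusters = loader.merge_clusters(spikes, clusters, channels)
    return spikes, clusters, channels


def spikes_per_cluster(spikes):
    """Count spikes for each cluster ID found in spikes['clusters']."""
    return Counter(int(c) for c in spikes["clusters"])


# Assumed usage against the public database (needs network access):
# from one.api import ONE
# one = ONE(base_url="https://openalyx.internationalbrainlab.org",
#           password="international", silent=True)
# spikes, clusters, channels = load_sorting(
#     "1a276285-8b0e-4cc9-9f0a-a3a002978724", one)
# counts = spikes_per_cluster(spikes)
```

Once the release tag is published, it should be possible to restrict searches to the tagged datasets; until then the insertion IDs listed in the overview are the way to reach these recordings.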