brainbox.population.decode

Population functions.

Code from https://github.com/cortex-lab/phylib/blob/master/phylib/stats/ccg.py by C. Rossant. Code for decoding by G. Meijer. Code for sigtest_pseudosessions and sigtest_linshift by B. Benson.

Functions

`classify`	Classify trial identity (e.g. stim left/right) from neural population activity.
`get_spike_counts_in_bins`	Return the number of spikes in a sequence of time intervals, for each neuron.
`lda_project`	Use linear discriminant analysis to project population vectors to the line that best separates the two groups.
`regress`	Perform linear regression to predict a continuous variable from neural data.
`sigtest_linshift`	Uses a provably conservative Linear Shift technique (Harris, Kenneth Arxiv 2021, https://arxiv.org/ftp/arxiv/papers/2012/2012.06862.pdf) to estimate significance level of a statistical measure.
`sigtest_pseudosessions`	Estimates significance level of any statistical measure following Harris, Arxiv, 2021 (https://www.biorxiv.org/content/10.1101/2020.11.29.402719v2).
`xcorr`	Compute all pairwise cross-correlograms among the clusters appearing in spike_clusters.

get_spike_counts_in_bins(spike_times, spike_clusters, intervals)[source]

Return the number of spikes in a sequence of time intervals, for each neuron.

Parameters:

spike_times (1D array) – spike times (in seconds)
spike_clusters (1D array) – cluster ids corresponding to each event in spikes
intervals (2D array of shape (n_events, 2)) – the start and end times of the events

Returns:

counts (2D array of shape (n_neurons, n_events)) – the spike counts of all neurons for all events; value (i, j) is the number of spikes of neuron neurons[i] in interval #j
cluster_ids (1D array) – list of cluster ids
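
The counting described above can be illustrated with a minimal NumPy sketch. This is not the brainbox implementation: the function name `count_spikes_in_intervals` is hypothetical, and treating each interval as half-open, [start, end), is an assumption of this example.

```python
import numpy as np


def count_spikes_in_intervals(spike_times, spike_clusters, intervals):
    """Hypothetical sketch: count spikes per cluster in each interval."""
    spike_times = np.asarray(spike_times)
    spike_clusters = np.asarray(spike_clusters)
    intervals = np.asarray(intervals)
    cluster_ids = np.unique(spike_clusters)  # sorted unique cluster ids
    counts = np.zeros((cluster_ids.size, intervals.shape[0]), dtype=int)
    for j, (t0, t1) in enumerate(intervals):
        # Assumption: intervals are half-open, [t0, t1)
        in_window = (spike_times >= t0) & (spike_times < t1)
        ids, n = np.unique(spike_clusters[in_window], return_counts=True)
        # Map the cluster ids found in this window to their row index
        counts[np.searchsorted(cluster_ids, ids), j] = n
    return counts, cluster_ids


spike_times = np.array([0.1, 0.2, 0.6, 1.1, 1.2, 1.3])
spike_clusters = np.array([3, 7, 3, 7, 7, 3])
intervals = np.array([[0.0, 0.5], [1.0, 1.5]])
counts, cluster_ids = count_spikes_in_intervals(spike_times, spike_clusters, intervals)
# counts has one row per cluster (3, then 7) and one column per interval
```

Row i of `counts` corresponds to `cluster_ids[i]`, matching the (n_neurons, n_events) layout of the documented return value.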

xcorr(spike_times, spike_clusters, bin_size=None, window_size=None)[source]

Compute all pairwise cross-correlograms among the clusters appearing in spike_clusters.

Parameters:

spike_times (array-like) – Spike times in seconds.
spike_clusters (array-like) – Spike-cluster mapping.
bin_size (float) – Size of the bin, in seconds.
window_size (float) – Size of the window, in seconds.

Returns:

An (n_clusters, n_clusters, winsize_samples) array with all pairwise cross-correlograms.
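
To make the shape of the output concrete, here is a naive sketch of a pairwise cross-correlogram computation. It is not the phylib-derived code used by the library: the name `naive_xcorr`, the rule for choosing an odd number of bins, and the exclusion of a spike's zero lag with itself are all assumptions of this example, and the real implementation is likely vectorised differently.

```python
import numpy as np


def naive_xcorr(spike_times, spike_clusters, bin_size, window_size):
    """Hypothetical O(n^2) sketch of pairwise cross-correlograms."""
    spike_times = np.asarray(spike_times, dtype=float)
    spike_clusters = np.asarray(spike_clusters)
    cluster_ids = np.unique(spike_clusters)
    # Assumption: use an odd number of bins so one bin is centred on zero lag
    n_bins = 2 * int(0.5 * window_size / bin_size) + 1
    half = n_bins * bin_size / 2
    edges = np.linspace(-half, half, n_bins + 1)
    ccg = np.zeros((cluster_ids.size, cluster_ids.size, n_bins), dtype=int)
    for i, ci in enumerate(cluster_ids):
        ti = spike_times[spike_clusters == ci]
        for j, cj in enumerate(cluster_ids):
            tj = spike_times[spike_clusters == cj]
            lags = tj[None, :] - ti[:, None]  # all pairwise time differences
            if i == j:
                # Drop each spike's lag with itself in the auto-correlogram
                lags = lags[~np.eye(ti.size, dtype=bool)]
            ccg[i, j] = np.histogram(lags.ravel(), bins=edges)[0]
    return ccg


spike_times = np.array([0.100, 0.102, 0.500, 0.512])
spike_clusters = np.array([0, 1, 0, 1])
ccg = naive_xcorr(spike_times, spike_clusters, bin_size=0.01, window_size=0.05)
```

In this sketch `ccg[i, j]` is the mirror image of `ccg[j, i]`, because the lags from cluster i to cluster j are the negated lags from j to i.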

classify(population_activity, trial_labels, classifier, cross_validation=None, return_training=False)[source]

Classify trial identity (e.g. stim left/right) from neural population activity.

Parameters:

population_activity (2D array (trials x neurons)) – population activity of all neurons in the population for each trial.
trial_labels (1D or 2D array) – identities of the trials, can be any number of groups, accepts integers and strings
classifier (scikit-learn object) – which decoder to use, for example naive Bayes with a multinomial likelihood:

    from sklearn.naive_bayes import MultinomialNB
    classifier = MultinomialNB()
cross_validation (None or scikit-learn object) – which cross-validation method to use, for example 5-fold:

    from sklearn.model_selection import KFold
    cross_validation = KFold(n_splits=5)
return_training (bool) – if set to True the classifier will also return the performance on the training set

Returns:

accuracy (float) – accuracy of the classifier
pred (1D array) – predictions of the classifier
prob (1D array) – probability of classification
training_accuracy (float) – accuracy of the classifier on the training set (only if return_training is True)
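
The following sketch shows the kind of cross-validated decoding that the parameters above describe, written directly against scikit-learn rather than by calling `classify`. The synthetic data, the seed, and the use of `cross_val_predict` are assumptions of this example and do not describe how brainbox implements the function.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.naive_bayes import MultinomialNB

rng = np.random.default_rng(0)
n_trials, n_neurons = 100, 20

# Synthetic spike counts (trials x neurons): "right" trials drive the
# first ten neurons harder than "left" trials.
trial_labels = np.repeat(["left", "right"], n_trials // 2)
rates = np.full((n_trials, n_neurons), 5.0)
rates[trial_labels == "right", :10] = 15.0
population_activity = rng.poisson(rates)

classifier = MultinomialNB()
cross_validation = KFold(n_splits=5, shuffle=True, random_state=0)

# Each trial is predicted by a model that never saw it during fitting
pred = cross_val_predict(classifier, population_activity, trial_labels,
                         cv=cross_validation)
prob = cross_val_predict(classifier, population_activity, trial_labels,
                         cv=cross_validation, method="predict_proba")
accuracy = np.mean(pred == trial_labels)
```

Note that `prob` here has shape (n_trials, n_classes), whereas the documented return value is a 1D array; which class probability the library reports is not stated in this page.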

regress(population_activity, trial_targets, regularization=None, cross_validation=None, return_training=False)[source]

Perform linear regression to predict a continuous variable from neural data.

Parameters:

population_activity (2D array (trials x neurons)) – population activity of all neurons in the population for each trial.
trial_targets (1D or 2D array) – the decoding target per trial as a continuous variable
regularization (None or string) – which regularization to use:
None = no regularization, using ordinary least squares linear regression
‘L1’ = L1 regularization using Lasso
‘L2’ = L2 regularization using Ridge regression
cross_validation (None or scikit-learn object) – which cross-validation method to use, for example 5-fold:

    from sklearn.model_selection import KFold
    cross_validation = KFold(n_splits=5)
return_training (bool) – if set to True the function will also return the predictions for the training set

Returns:

pred (1D array) – array with predictions
pred_training (1D array) – array with predictions for the training set (only if return_training is True)
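
As an illustration of the three regularization options, the sketch below maps them onto scikit-learn estimators and produces cross-validated predictions. The helper `make_regressor` is hypothetical, the estimators use scikit-learn's default penalty strengths, and the data are synthetic; none of this is taken from the brainbox source.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_predict


def make_regressor(regularization=None):
    """Hypothetical mapping from the documented options to estimators."""
    if regularization is None:
        return LinearRegression()  # ordinary least squares
    if regularization == "L1":
        return Lasso()             # default alpha, an assumption here
    if regularization == "L2":
        return Ridge()             # default alpha, an assumption here
    raise ValueError(f"unknown regularization: {regularization!r}")


rng = np.random.default_rng(1)
population_activity = rng.normal(size=(200, 10))  # trials x neurons
weights = rng.normal(size=10)
trial_targets = population_activity @ weights + 0.1 * rng.normal(size=200)

cross_validation = KFold(n_splits=5, shuffle=True, random_state=0)
pred = cross_val_predict(make_regressor("L2"), population_activity,
                         trial_targets, cv=cross_validation)
r = np.corrcoef(pred, trial_targets)[0, 1]  # held-out prediction quality
```

Because every prediction comes from a model fitted without that trial, the correlation `r` is an out-of-sample measure rather than a training fit.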

lda_project(spike_times, spike_clusters, event_times, event_groups, pre_time=0, post_time=0.5, cross_validation='kfold', num_splits=5, prob_left=None, custom_validation=None)[source]

Use linear discriminant analysis to project population vectors to the line that best separates the two groups. When cross-validation is used, the LDA projection is fitted on the training data, after which the test data are projected onto the fitted axis.

Parameters:

spike_times (1D array) – spike times (in seconds)
spike_clusters (1D array) – cluster ids corresponding to each event in spikes
event_times (1D array) – times (in seconds) of the events from the two groups
event_groups (1D array) – group identities of the events, can be any number of groups, accepts integers and strings
cross_validation (string) – which cross-validation method to use, options are:
‘none’ = No cross-validation
‘kfold’ = K-fold cross-validation
‘leave-one-out’ = Leave out the trial that is being decoded
‘block’ = Leave out the block the to-be-decoded trial is in
‘custom’ = Any custom cross-validation provided by the user
num_splits (integer) – (only for ‘kfold’ cross-validation) number of splits to use for k-fold cross-validation; a value of 5 means that the decoder will be trained on 4/5 of the data and used to predict the remaining 1/5. This process is repeated five times so that all data has been used as both training and test set.
prob_left (1D array) – (only for ‘block’ cross-validation) the probability of the stimulus appearing on the left for each trial in event_times
custom_validation (generator) – (only for ‘custom’ cross-validation) a generator object with the splits to be used for cross-validation, using this format:

    (
        (split1_train_idxs, split1_test_idxs),
        (split2_train_idxs, split2_test_idxs),
        (split3_train_idxs, split3_test_idxs),
        …)

n_neurons (int) – group size (number of neurons) to be sub-selected

Returns:

lda_projection (1D array) – the position along the LDA projection axis for the population vector of each trial
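
The split format expected for ‘custom’ cross-validation, and the fit-on-train, project-on-test procedure, can be sketched as follows. The helper `block_splits`, the synthetic population vectors, and the direct use of scikit-learn's `LinearDiscriminantAnalysis` are assumptions of this example; it does not call `lda_project` and is not the library's implementation.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis


def block_splits(block_ids):
    """Hypothetical generator yielding (train_idxs, test_idxs) per block."""
    block_ids = np.asarray(block_ids)
    all_idxs = np.arange(block_ids.size)
    for block in np.unique(block_ids):
        test_mask = block_ids == block
        yield all_idxs[~test_mask], all_idxs[test_mask]


rng = np.random.default_rng(2)
n_trials, n_neurons = 60, 8
event_groups = np.tile([0, 1], n_trials // 2)    # alternating group labels
block_ids = np.repeat([0, 1, 2], n_trials // 3)  # three blocks of 20 trials
pop_vectors = rng.normal(size=(n_trials, n_neurons))
pop_vectors[event_groups == 1, :4] += 2.0        # group 1 shifts 4 neurons

# A generator can only be consumed once; it is listed here for inspection
splits = list(block_splits(block_ids))

lda_projection = np.full(n_trials, np.nan)
for train_idxs, test_idxs in splits:
    lda = LinearDiscriminantAnalysis()
    lda.fit(pop_vectors[train_idxs], event_groups[train_idxs])
    # With two groups the projection has a single dimension
    lda_projection[test_idxs] = lda.transform(pop_vectors[test_idxs])[:, 0]
```

Each trial's position on the axis comes from a model fitted without that trial's block, which is the point of leaving whole blocks out.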

sigtest_pseudosessions(X, y, fStatMeas, genPseudo, npseuds=200)[source]

Estimates significance level of any statistical measure following Harris, Arxiv, 2021 (https://www.biorxiv.org/content/10.1101/2020.11.29.402719v2). fStatMeas computes a scalar statistical measure (e.g. R^2) between the data, X, and the decoded variable, y. Pseudosessions are generated npseuds times to create a null distribution of statistical measures. The significance level is reported relative to this null distribution.

Parameters:

X (2-d array) – data of size (elements, timetrials)
y (1-d array) – predicted variable of size (timetrials)
fStatMeas (function) – takes arguments (X, y) and returns a statistical measure relating how well X decodes y
genPseudo (function) – takes no arguments () and returns a pseudosession (same shape as y) drawn from the experimentally known null-distribution of y
npseuds (int) – the number of pseudosessions used to estimate the significance level

Returns:

alpha – p-value, e.g. at a significance level of b, if alpha <= b then reject the null hypothesis
statms_real – the value of the statistical measure evaluated on X and y
statms_pseuds – array of statistical measures evaluated on pseudosessions
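
A minimal sketch of the pseudosession procedure described above is given here. The function name `pseudosession_test`, the statistic `abs_corr`, and the choice to define the p-value as the fraction of pseudosession statistics greater than or equal to the real one are assumptions of this example, not a statement of how the library computes alpha.

```python
import numpy as np


def pseudosession_test(X, y, stat_fn, gen_pseudo, n_pseudo=200):
    """Hypothetical sketch of a pseudosession significance test."""
    stat_real = stat_fn(X, y)
    # Null distribution: same data X, freshly generated pseudo targets
    stat_pseudo = np.array([stat_fn(X, gen_pseudo()) for _ in range(n_pseudo)])
    # Assumption: p-value is the fraction of null values >= the real value
    alpha = np.mean(stat_pseudo >= stat_real)
    return alpha, stat_real, stat_pseudo


rng = np.random.default_rng(3)
n_trials = 300
y = rng.choice([-1.0, 1.0], size=n_trials)
# X has shape (elements, timetrials); only the first element carries signal
X = np.vstack([y + rng.normal(size=n_trials), rng.normal(size=n_trials)])


def abs_corr(X, y):
    """Absolute correlation between the mean over elements and y."""
    return abs(np.corrcoef(X.mean(axis=0), y)[0, 1])


def gen_pseudo():
    """Draw a pseudosession from the known generative process of y."""
    return rng.choice([-1.0, 1.0], size=n_trials)


alpha, stat_real, stat_pseudo = pseudosession_test(X, y, abs_corr, gen_pseudo)
```

The test is only as valid as `gen_pseudo`: it has to sample from the same process that generated the real y, including any block structure or trial-to-trial dependence.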

sigtest_linshift(X, y, fStatMeas, D=300)[source]

Uses a provably conservative Linear Shift technique (Harris, Kenneth Arxiv 2021, https://arxiv.org/ftp/arxiv/papers/2012/2012.06862.pdf) to estimate significance level of a statistical measure. fStatMeas computes a scalar statistical measure (e.g. R^2) from the data matrix, X, and the variable, y. A central window of X and y of size, D, is linearly shifted to generate a null distribution of statistical measures. Significance level is reported relative to this null distribution.

Parameters:

X (2-d array) – data of size (elements, timetrials)
y (1-d array) – predicted variable of size (timetrials)
fStatMeas (function) – takes arguments (X, y) and returns a scalar statistical measure of how well X decodes y
D (int) – the window length along the center of y used to compute the statistical measure. Must have room to shift both right and left: len(y) >= D+2

Returns:

alpha – conservative p-value, e.g. at a significance level of b, if alpha <= b then reject the null hypothesis
statms_real – the value of the statistical measure evaluated on X and y
statms_pseuds – a 1-d array of statistical measures evaluated on shifted versions of y
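
The shifting of a central window can be sketched as below. The name `linear_shift_test`, the exact window placement, and in particular the p-value formula (rank of the unshifted statistic among all 2N+1 shifts, divided by N+1) are this example's reading of the conservative bound and should be checked against the cited paper and the library source before use.

```python
import numpy as np


def linear_shift_test(X, y, stat_fn, D=300):
    """Hypothetical sketch of a linear shift significance test."""
    T = len(y)
    assert T >= D + 2, "need room to shift the window left and right"
    N = (T - D) // 2           # largest shift in either direction
    start = N                  # first sample of the central window
    X_win = X[:, start:start + D]
    stat_real = stat_fn(X_win, y[start:start + D])
    # Keep the data window fixed and slide the window over y
    shifts = [s for s in range(-N, N + 1) if s != 0]
    stat_shift = np.array(
        [stat_fn(X_win, y[start + s:start + s + D]) for s in shifts]
    )
    # Assumption: rank counts the unshifted value itself, bound uses N + 1
    M = 1 + np.sum(stat_shift >= stat_real)
    alpha = min(1.0, M / (N + 1))
    return alpha, stat_real, stat_shift


def abs_corr(X, y):
    """Absolute correlation between the first element of X and y."""
    return abs(np.corrcoef(X[0], y)[0, 1])


rng = np.random.default_rng(4)
T = 400
y = rng.normal(size=T)
X = (y + rng.normal(size=T))[None, :]  # shape (elements, timetrials)
alpha, stat_real, stat_shift = linear_shift_test(X, y, abs_corr, D=300)
```

Under this reading, the smallest attainable alpha is 1/(N+1), so a short recording relative to D limits how significant a result can be.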