brainbox.population.decode

Population functions.

Code from https://github.com/cortex-lab/phylib/blob/master/phylib/stats/ccg.py by C. Rossant. Code for decoding by G. Meijer. Code for sigtest_pseudosessions and sigtest_linshift by B. Benson.

Functions

classify

Classify trial identity (e.g. stim left/right) from neural population activity.

get_spike_counts_in_bins

Return the number of spikes in a sequence of time intervals, for each neuron.

lda_project

Use linear discriminant analysis to project population vectors to the line that best separates the two groups.

regress

Perform linear regression to predict a continuous variable from neural data.

sigtest_linshift

Uses a provably conservative Linear Shift technique (Harris, arXiv, 2021, https://arxiv.org/ftp/arxiv/papers/2012/2012.06862.pdf) to estimate the significance level of a statistical measure.

sigtest_pseudosessions

Estimates the significance level of any statistical measure following Harris, bioRxiv, 2021 (https://www.biorxiv.org/content/10.1101/2020.11.29.402719v2).

xcorr

Compute all pairwise cross-correlograms among the clusters appearing in spike_clusters.

get_spike_counts_in_bins(spike_times, spike_clusters, intervals)[source]

Return the number of spikes in a sequence of time intervals, for each neuron.

Parameters:
  • spike_times (1D array) – spike times (in seconds)

  • spike_clusters (1D array) – cluster ids corresponding to each event in spikes

  • intervals (2D array of shape (n_events, 2)) – the start and end times of the events

Returns:

  • counts (2D array of shape (n_neurons, n_events)) – the spike counts of all neurons for all events; value (i, j) is the number of spikes of neuron neurons[i] in interval #j

  • cluster_ids (1D array) – list of cluster ids
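The counting described above can be sketched in plain NumPy. This is not the library's implementation: the helper name count_spikes_in_intervals is hypothetical, and the half-open interval convention (start <= t < end) is an assumption made for this sketch.

```python
import numpy as np


def count_spikes_in_intervals(spike_times, spike_clusters, intervals):
    """Hypothetical helper: spike counts per cluster and per interval."""
    spike_times = np.asarray(spike_times, dtype=float)
    spike_clusters = np.asarray(spike_clusters)
    intervals = np.asarray(intervals, dtype=float)

    cluster_ids = np.unique(spike_clusters)
    counts = np.zeros((cluster_ids.size, intervals.shape[0]), dtype=int)
    for i, cid in enumerate(cluster_ids):
        t = np.sort(spike_times[spike_clusters == cid])
        # Assumed convention: a spike belongs to an interval if start <= t < end.
        first = np.searchsorted(t, intervals[:, 0], side="left")
        last = np.searchsorted(t, intervals[:, 1], side="left")
        counts[i] = last - first
    return counts, cluster_ids


spike_times = np.array([0.1, 0.2, 0.6, 1.1, 1.3, 1.4])
spike_clusters = np.array([0, 1, 0, 1, 1, 0])
intervals = np.array([[0.0, 0.5], [1.0, 1.5]])
counts, cluster_ids = count_spikes_in_intervals(spike_times, spike_clusters, intervals)
print(counts)       # rows follow cluster_ids, columns follow intervals
```

Row i of counts corresponds to cluster_ids[i], which mirrors the (n_neurons, n_events) layout documented above.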

xcorr(spike_times, spike_clusters, bin_size=None, window_size=None)[source]

Compute all pairwise cross-correlograms among the clusters appearing in spike_clusters.

Parameters:
  • spike_times (array-like) – Spike times in seconds.

  • spike_clusters (array-like) – Spike-cluster mapping.

  • bin_size (float) – Size of the bin, in seconds.

  • window_size (float) – Size of the window, in seconds.

Returns:

  • an (n_clusters, n_clusters, winsize_samples) array with all pairwise cross-correlograms.
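To make the shape of the output concrete, here is a brute-force NumPy sketch of pairwise cross-correlograms. It is not the phylib algorithm: the helper name pairwise_ccg is hypothetical, and the odd bin count centred on zero lag and the lag sign (target spike time minus reference spike time) are assumptions of this sketch.

```python
import numpy as np


def pairwise_ccg(spike_times, spike_clusters, bin_size, window_size):
    """Hypothetical brute-force cross-correlograms for every cluster pair."""
    spike_times = np.asarray(spike_times, dtype=float)
    spike_clusters = np.asarray(spike_clusters)
    cluster_ids = np.unique(spike_clusters)

    # Assumed convention: an odd number of bins, with one bin centred on zero lag.
    n_bins = 2 * int(0.5 * window_size / bin_size) + 1
    half = n_bins * bin_size / 2
    edges = np.linspace(-half, half, n_bins + 1)

    ccg = np.zeros((cluster_ids.size, cluster_ids.size, n_bins), dtype=int)
    for i, ci in enumerate(cluster_ids):
        ti = spike_times[spike_clusters == ci]
        for j, cj in enumerate(cluster_ids):
            tj = spike_times[spike_clusters == cj]
            # Lag of every spike in cluster j relative to every spike in cluster i.
            diffs = tj[None, :] - ti[:, None]
            if i == j:
                # Drop the pairing of each spike with itself.
                diffs = diffs[~np.eye(ti.size, dtype=bool)]
            ccg[i, j] = np.histogram(diffs.ravel(), bins=edges)[0]
    return ccg, cluster_ids


# Cluster 1 fires 10 ms after cluster 0 on both occasions.
spike_times = np.array([0.100, 0.110, 0.500, 0.510])
spike_clusters = np.array([0, 1, 0, 1])
ccg, cluster_ids = pairwise_ccg(spike_times, spike_clusters, bin_size=0.01, window_size=0.05)
print(ccg.shape)
print(ccg[0, 1], ccg[1, 0])
```

The all-pairs difference matrix is quadratic in the number of spikes, so this form is only suitable for small illustrations.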

classify(population_activity, trial_labels, classifier, cross_validation=None, return_training=False)[source]

Classify trial identity (e.g. stim left/right) from neural population activity.

Parameters:
  • population_activity (2D array (trials x neurons)) – population activity of all neurons in the population for each trial.

  • trial_labels (1D or 2D array) – identities of the trials, can be any number of groups, accepts integers and strings

  • classifier (scikit-learn object) –

    which decoder to use, for example naive Bayes with a multinomial likelihood:

    from sklearn.naive_bayes import MultinomialNB
    classifier = MultinomialNB()

  • cross_validation (None or scikit-learn object) –

    which cross-validation method to use, for example 5-fold:

    from sklearn.model_selection import KFold
    cross_validation = KFold(n_splits=5)

  • return_training (bool) – if set to True the classifier will also return the performance on the training set

Returns:

  • accuracy (float) – accuracy of the classifier

  • pred (1D array) – predictions of the classifier

  • prob (1D array) – probability of classification

  • training_accuracy (float) – accuracy of the classifier on the training set (only if return_training is True)
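The following sketch shows the kind of cross-validated decoding described above, written directly against scikit-learn rather than calling classify. The synthetic Poisson spike counts, the variable names, and the use of cross_val_predict are assumptions made for illustration; the library may compute its outputs differently.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.naive_bayes import MultinomialNB

rng = np.random.default_rng(0)
n_trials, n_neurons = 100, 20
trial_labels = np.repeat(["left", "right"], n_trials // 2)

# Synthetic spike counts (trials x neurons): half of the neurons fire
# more on 'right' trials. This data is made up for the example.
rates = np.full((n_trials, n_neurons), 5.0)
rates[trial_labels == "right", : n_neurons // 2] = 10.0
population_activity = rng.poisson(rates)

classifier = MultinomialNB()
cross_validation = KFold(n_splits=5, shuffle=True, random_state=0)

# Each trial is predicted by a model that was fitted without that trial's fold.
pred = cross_val_predict(classifier, population_activity, trial_labels, cv=cross_validation)
accuracy = np.mean(pred == trial_labels)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Shuffling inside KFold matters here because the synthetic labels are sorted; without it, some folds would contain a single class.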

regress(population_activity, trial_targets, regularization=None, cross_validation=None, return_training=False)[source]

Perform linear regression to predict a continuous variable from neural data.

Parameters:
  • population_activity (2D array (trials x neurons)) – population activity of all neurons in the population for each trial.

  • trial_targets (1D or 2D array) – the decoding target per trial as a continuous variable

  • regularization (None or string) –

    which regularization to use, options are:

    None: no regularization, using ordinary least squares linear regression
    ‘L1’: L1 regularization using Lasso
    ‘L2’: L2 regularization using Ridge regression

  • cross_validation (None or scikit-learn object) –

    which cross-validation method to use, for example 5-fold:

    from sklearn.model_selection import KFold
    cross_validation = KFold(n_splits=5)

  • return_training (bool) – if set to True the function will also return the predictions for the training set

Returns:

  • pred (1D array) – array with predictions

  • pred_training (1D array) – array with predictions for the training set (only if return_training is True)
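As an illustration of cross-validated L2-regularized regression, here is a NumPy-only sketch using the closed-form ridge solution. It does not call regress: the helper name kfold_ridge_predict, the contiguous fold assignment, and the synthetic data are assumptions of this example.

```python
import numpy as np


def kfold_ridge_predict(X, y, alpha=1.0, n_splits=5):
    """Hypothetical helper: out-of-fold predictions from ridge regression."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    all_idx = np.arange(len(y))
    pred = np.empty_like(y)
    for test_idx in np.array_split(all_idx, n_splits):
        train_idx = np.setdiff1d(all_idx, test_idx)
        # Centre with training statistics so the intercept is not penalised.
        x_mean = X[train_idx].mean(axis=0)
        y_mean = y[train_idx].mean()
        Xc = X[train_idx] - x_mean
        # Closed-form ridge weights: (X'X + alpha * I)^-1 X'y
        w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]),
                            Xc.T @ (y[train_idx] - y_mean))
        pred[test_idx] = (X[test_idx] - x_mean) @ w + y_mean
    return pred


rng = np.random.default_rng(0)
population_activity = rng.normal(size=(200, 10))      # trials x neurons
true_weights = rng.normal(size=10)
trial_targets = population_activity @ true_weights + 0.1 * rng.normal(size=200)

pred = kfold_ridge_predict(population_activity, trial_targets, alpha=1.0)
r = np.corrcoef(pred, trial_targets)[0, 1]
print(f"correlation between prediction and target: {r:.3f}")
```

Setting alpha to 0 reduces this to ordinary least squares, which corresponds to the no-regularization option in spirit.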

lda_project(spike_times, spike_clusters, event_times, event_groups, pre_time=0, post_time=0.5, cross_validation='kfold', num_splits=5, prob_left=None, custom_validation=None)[source]

Use linear discriminant analysis to project population vectors to the line that best separates the two groups. When cross-validation is used, the LDA projection is fitted on the training data, after which the test data is projected onto the fitted axis.

Parameters:
  • spike_times (1D array) – spike times (in seconds)

  • spike_clusters (1D array) – cluster ids corresponding to each event in spikes

  • event_times (1D array) – times (in seconds) of the events from the two groups

  • event_groups (1D array) – group identities of the events, can be any number of groups, accepts integers and strings

  • cross_validation (string) –

    which cross-validation method to use, options are:

    ‘none’: No cross-validation
    ‘kfold’: K-fold cross-validation
    ‘leave-one-out’: Leave out the trial that is being decoded
    ‘block’: Leave out the block the to-be-decoded trial is in
    ‘custom’: Any custom cross-validation provided by the user

  • num_splits (integer) – (only for ‘kfold’ cross-validation) Number of splits to use for k-fold cross-validation. A value of 5 means that the decoder will be trained on 4/5th of the data and used to predict the remaining 1/5th. This process is repeated five times so that all data has been used as both training and test set.

  • prob_left (1D array) – (only for ‘block’ cross-validation) the probability of the stimulus appearing on the left for each trial in event_times

  • custom_validation (generator) –

    (only for ‘custom’ cross-validation) a generator object with the splits to be used for cross-validation, using this format:

    (
        (split1_train_idxs, split1_test_idxs),
        (split2_train_idxs, split2_test_idxs),
        (split3_train_idxs, split3_test_idxs),
    …)

  • n_neurons (int) – Group size of number of neurons to be sub-selected

Returns:

lda_projection – the position along the LDA projection axis for the population vector of each trial

Return type:

1D array
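The idea of fitting the discriminant axis on training trials and projecting held-out trials onto it can be sketched with a two-class Fisher discriminant in NumPy. This is not the library's implementation: the helper name fisher_lda_fit, the small ridge term, the even/odd train-test split, and the synthetic population vectors are all assumptions made for illustration.

```python
import numpy as np


def fisher_lda_fit(X, groups):
    """Hypothetical helper: two-class Fisher discriminant axis and midpoint."""
    labels = np.unique(groups)
    assert labels.size == 2, "this sketch handles exactly two groups"
    X0, X1 = X[groups == labels[0]], X[groups == labels[1]]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter, with a small ridge for numerical stability.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    Sw += 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, mu1 - mu0)
    w /= np.linalg.norm(w)
    return w, (mu0 + mu1) / 2


rng = np.random.default_rng(0)
groups = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 8))           # trials x neurons, synthetic
X[groups == 1, :4] += 2.0               # group 1 shifts the first four neurons

# Fit on even-numbered trials, then project the held-out odd-numbered trials.
train = np.arange(100) % 2 == 0
w, centre = fisher_lda_fit(X[train], groups[train])
projection = (X[~train] - centre) @ w   # one position per held-out trial
held_out_groups = groups[~train]
print(projection[held_out_groups == 0].mean(), projection[held_out_groups == 1].mean())
```

Fitting the axis only on training trials keeps the held-out projections free of information from their own labels, which is the point of the cross-validated variant described above.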

sigtest_pseudosessions(X, y, fStatMeas, genPseudo, npseuds=200)[source]

Estimates the significance level of any statistical measure following Harris, bioRxiv, 2021 (https://www.biorxiv.org/content/10.1101/2020.11.29.402719v2). fStatMeas computes a scalar statistical measure (e.g. R^2) between the data, X, and the decoded variable, y. Pseudosessions are generated npseuds times to create a null distribution of statistical measures. The significance level is reported relative to this null distribution.

Parameters:
  • X (2-d array) – Data of size (elements, timetrials)

  • y (1-d array) – predicted variable of size (timetrials)

  • fStatMeas (function) – takes arguments (X, y) and returns a statistical measure relating how well X decodes y

  • genPseudo (function) – takes no arguments () and returns a pseudosession (same shape as y) drawn from the experimentally known null-distribution of y

  • npseuds (int) – the number of pseudosessions used to estimate the significance level

Returns:

  • alpha (p-value; e.g. at a significance level of b, if alpha <= b then reject the null hypothesis)

  • statms_real (the value of the statistical measure evaluated on X and y)

  • statms_pseuds (array of statistical measures evaluated on pseudosessions)
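The pseudosession procedure can be sketched as follows. This is a minimal illustration, not the library's code: the helper name pseudosession_test, the p-value defined as the fraction of null statistics at least as large as the real one, and the Gaussian generator standing in for the experimentally known null distribution are all assumptions.

```python
import numpy as np


def pseudosession_test(X, y, stat_fn, gen_pseudo, n_pseudo=200):
    """Hypothetical helper: compare a statistic against a pseudosession null."""
    stat_real = stat_fn(X, y)
    stat_pseudo = np.array([stat_fn(X, gen_pseudo()) for _ in range(n_pseudo)])
    # Assumed p-value: fraction of null statistics at least as large as the real one.
    alpha = np.mean(stat_pseudo >= stat_real)
    return alpha, stat_real, stat_pseudo


rng = np.random.default_rng(0)
n_trials = 300
y = rng.normal(size=n_trials)
# X has shape (elements, timetrials); the first row carries information about y.
X = np.vstack([y + 0.5 * rng.normal(size=n_trials), rng.normal(size=n_trials)])


def abs_corr(X, y):
    """Statistic: absolute correlation between the first row of X and y."""
    return abs(np.corrcoef(X[0], y)[0, 1])


def gen_pseudo():
    """Stand-in for the true generative process of y (an assumption here)."""
    return rng.normal(size=n_trials)


alpha, stat_real, stat_pseudo = pseudosession_test(X, y, abs_corr, gen_pseudo)
print(alpha, stat_real)
```

In practice the generator should reproduce the experiment's actual process for y, including any temporal structure; independent Gaussian draws are used here only to keep the example short.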

sigtest_linshift(X, y, fStatMeas, D=300)[source]

Uses a provably conservative Linear Shift technique (Harris, arXiv, 2021, https://arxiv.org/ftp/arxiv/papers/2012/2012.06862.pdf) to estimate the significance level of a statistical measure. fStatMeas computes a scalar statistical measure (e.g. R^2) from the data matrix, X, and the variable, y. A central window of X and y of size D is linearly shifted to generate a null distribution of statistical measures. The significance level is reported relative to this null distribution.

Parameters:
  • X (2-d array) – Data of size (elements, timetrials)

  • y (1-d array) – predicted variable of size (timetrials)

  • fStatMeas (function) – takes arguments (X, y) and returns a scalar statistical measure of how well X decodes y

  • D (int) – the window length along the center of y used to compute the statistical measure. There must be room to shift both right and left: len(y) >= D+2

Returns:

  • alpha (conservative p-value; e.g. at a significance level of b, if alpha <= b then reject the null hypothesis)

  • statms_real (the value of the statistical measure evaluated on X and y)

  • statms_pseuds (a 1-d array of statistical measures evaluated on shifted versions of y)
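A minimal sketch of the linear shift idea follows. It is written from the description above and is not the library's code: the helper name linear_shift_test is hypothetical, and the p-value m / (N + 1), where m counts the shifted statistics at least as large as the unshifted one and shifts run from -N to N, is this sketch's reading of the cited paper and should be checked against it.

```python
import numpy as np


def linear_shift_test(X, y, stat_fn, D):
    """Hypothetical helper: conservative p-value from linearly shifted windows."""
    T = len(y)
    assert T >= D + 2, "need room to shift both left and right"
    N = (T - D) // 2
    X_win = X[:, N:N + D]                       # central window of X stays fixed
    stat_real = stat_fn(X_win, y[N:N + D])
    # Slide the window over y by -N..N samples.
    stat_shift = np.array([stat_fn(X_win, y[N + s:N + s + D])
                           for s in range(-N, N + 1)])
    m = np.sum(stat_shift >= stat_real)         # includes the unshifted value (s = 0)
    alpha = min(1.0, m / (N + 1))               # assumed conservative bound
    return alpha, stat_real, stat_shift


def abs_corr(X, y):
    """Statistic: absolute correlation between the first row of X and y."""
    return abs(np.corrcoef(X[0], y)[0, 1])


rng = np.random.default_rng(0)
T = 500
y = rng.normal(size=T)
X = (y + 0.5 * rng.normal(size=T))[None, :]     # shape (elements, timetrials)

alpha, stat_real, stat_shift = linear_shift_test(X, y, abs_corr, D=300)
print(alpha, stat_real)
```

Because the unshifted statistic is always counted, the smallest achievable value in this sketch is 1 / (N + 1), so a longer recording relative to D allows smaller p-values.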