brainbox.population.decode
Population functions.
Code from https://github.com/cortex-lab/phylib/blob/master/phylib/stats/ccg.py by C. Rossant. Code for decoding by G. Meijer. Code for sigtest_pseudosessions and sigtest_linshift by B. Benson.
Functions
classify | Classify trial identity (e.g. stim left/right) from neural population activity.
get_spike_counts_in_bins | Return the number of spikes in a sequence of time intervals, for each neuron.
lda_project | Use linear discriminant analysis to project population vectors to the line that best separates the two groups.
regress | Perform linear regression to predict a continuous variable from neural data.
sigtest_linshift | Uses a provably conservative Linear Shift technique (Harris, Kenneth Arxiv 2021, https://arxiv.org/ftp/arxiv/papers/2012/2012.06862.pdf) to estimate significance level of a statistical measure.
sigtest_pseudosessions | Estimates significance level of any statistical measure following Harris, Arxiv, 2021 (https://www.biorxiv.org/content/10.1101/2020.11.29.402719v2).
xcorr | Compute all pairwise cross-correlograms among the clusters appearing in spike_clusters.
- get_spike_counts_in_bins(spike_times, spike_clusters, intervals)
Return the number of spikes in a sequence of time intervals, for each neuron.
- Parameters:
spike_times (1D array) – spike times (in seconds)
spike_clusters (1D array) – cluster ids corresponding to each event in spikes
intervals (2D array of shape (n_events, 2)) – the start and end times of the events
- Returns:
counts (2D array of shape (n_neurons, n_events)) – the spike counts of all neurons for all events; value (i, j) is the number of spikes of neuron neurons[i] in interval #j
cluster_ids (1D array) – list of cluster ids
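The input and output shapes described above can be illustrated with a minimal NumPy sketch. The helper count_spikes_in_intervals is hypothetical and is not the library's implementation; in particular, treating each interval as half-open [start, end) is an assumption made here for illustration.

```python
import numpy as np


def count_spikes_in_intervals(spike_times, spike_clusters, intervals):
    """Hypothetical helper: count spikes per cluster in each [start, end) interval."""
    spike_times = np.asarray(spike_times, dtype=float)
    spike_clusters = np.asarray(spike_clusters)
    intervals = np.asarray(intervals, dtype=float)

    cluster_ids = np.unique(spike_clusters)
    # Row index of each spike's cluster within the sorted cluster_ids array
    rows = np.searchsorted(cluster_ids, spike_clusters)

    counts = np.zeros((cluster_ids.size, intervals.shape[0]), dtype=int)
    for j, (start, end) in enumerate(intervals):
        # Assumed boundary convention: start inclusive, end exclusive
        in_window = (spike_times >= start) & (spike_times < end)
        counts[:, j] = np.bincount(rows[in_window], minlength=cluster_ids.size)
    return counts, cluster_ids


spike_times = np.array([0.1, 0.2, 0.35, 0.6, 0.9, 1.2])
spike_clusters = np.array([3, 7, 3, 3, 7, 7])
intervals = np.array([[0.0, 0.5], [0.5, 1.0]])
counts, cluster_ids = count_spikes_in_intervals(spike_times, spike_clusters, intervals)
```

Here counts has one row per cluster id (3 and 7) and one column per interval; the spike at 1.2 s falls outside both intervals and is not counted.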
- xcorr(spike_times, spike_clusters, bin_size=None, window_size=None)
Compute all pairwise cross-correlograms among the clusters appearing in spike_clusters.
- Parameters:
spike_times (array-like) – Spike times in seconds.
spike_clusters (array-like) – Spike-cluster mapping.
bin_size (float) – Size of the bin, in seconds.
window_size (float) – Size of the window, in seconds.
- Returns:
An (n_clusters, n_clusters, winsize_samples) array with all pairwise cross-correlograms.
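To make the shape of the result concrete, here is a naive NumPy sketch of pairwise cross-correlograms. The function pairwise_ccg is hypothetical and is not the phylib/brainbox implementation; the choice of an odd number of bins centred on zero lag, and the sign convention (target spike time minus reference spike time), are assumptions made for this illustration.

```python
import numpy as np


def pairwise_ccg(spike_times, spike_clusters, bin_size, window_size):
    """Hypothetical naive cross-correlogram for every ordered pair of clusters."""
    spike_times = np.asarray(spike_times, dtype=float)
    spike_clusters = np.asarray(spike_clusters)
    cluster_ids = np.unique(spike_clusters)

    # Assumed convention: odd number of bins so that one bin is centred on lag 0
    n_bins = 2 * int(0.5 * window_size / bin_size) + 1
    half_width = 0.5 * n_bins * bin_size
    edges = np.linspace(-half_width, half_width, n_bins + 1)

    ccg = np.zeros((cluster_ids.size, cluster_ids.size, n_bins), dtype=int)
    for i, ci in enumerate(cluster_ids):
        t_ref = spike_times[spike_clusters == ci]
        for j, cj in enumerate(cluster_ids):
            t_tgt = spike_times[spike_clusters == cj]
            lags = t_tgt[None, :] - t_ref[:, None]
            if i == j:
                # For autocorrelograms, drop each spike paired with itself
                lags = lags[~np.eye(t_ref.size, dtype=bool)]
            ccg[i, j], _ = np.histogram(lags.ravel(), bins=edges)
    return ccg


spike_times = np.array([0.100, 0.110, 0.498, 0.500])
spike_clusters = np.array([0, 1, 1, 0])
ccg = pairwise_ccg(spike_times, spike_clusters, bin_size=0.01, window_size=0.05)
```

This all-pairs approach scales quadratically with the number of spikes, so it is only suitable for small examples; ccg[i, j] is the mirror image of ccg[j, i] along the lag axis.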
- classify(population_activity, trial_labels, classifier, cross_validation=None, return_training=False)
Classify trial identity (e.g. stim left/right) from neural population activity.
- Parameters:
population_activity (2D array (trials x neurons)) – population activity of all neurons in the population for each trial.
trial_labels (1D or 2D array) – identities of the trials, can be any number of groups, accepts integers and strings
classifier (scikit-learn object) –
- which decoder to use, for example naive Bayes with a multinomial likelihood:
from sklearn.naive_bayes import MultinomialNB
classifier = MultinomialNB()
cross_validation (None or scikit-learn object) –
- which cross-validation method to use, for example 5-fold:
from sklearn.model_selection import KFold
cross_validation = KFold(n_splits=5)
return_training (bool) – if set to True the classifier will also return the performance on the training set
- Returns:
accuracy (float) – accuracy of the classifier
pred (1D array) – predictions of the classifier
prob (1D array) – probability of classification
training_accuracy (float) – accuracy of the classifier on the training set (only if return_training is True)
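The kind of cross-validated decoding described above can be sketched directly with scikit-learn. This is an illustration on synthetic spike counts, not the internals of classify; the data, the random seeds, and the use of cross_val_predict are all assumptions of this example.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.naive_bayes import MultinomialNB

rng = np.random.default_rng(0)
n_trials, n_neurons = 100, 20
trial_labels = np.repeat(["left", "right"], n_trials // 2)

# Synthetic spike counts (trials x neurons): the first 10 neurons fire more on "right" trials
rates = np.full((n_trials, n_neurons), 5.0)
rates[trial_labels == "right", :10] = 15.0
population_activity = rng.poisson(rates)

classifier = MultinomialNB()
cross_validation = KFold(n_splits=5, shuffle=True, random_state=0)

# Each trial is predicted by a model fitted on the folds that do not contain it
pred = cross_val_predict(classifier, population_activity, trial_labels, cv=cross_validation)
prob = cross_val_predict(
    classifier, population_activity, trial_labels, cv=cross_validation, method="predict_proba"
)
accuracy = np.mean(pred == trial_labels)
```

Shuffling in KFold matters here because the synthetic labels are ordered; without it, some training folds would be heavily unbalanced.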
- regress(population_activity, trial_targets, regularization=None, cross_validation=None, return_training=False)
Perform linear regression to predict a continuous variable from neural data
- Parameters:
population_activity (2D array (trials x neurons)) – population activity of all neurons in the population for each trial.
trial_targets (1D or 2D array) – the decoding target per trial as a continuous variable
regularization (None or string) –
None = no regularization, using ordinary least squares linear regression
‘L1’ = L1 regularization using Lasso
‘L2’ = L2 regularization using Ridge regression
cross_validation (None or scikit-learn object) –
- which cross-validation method to use, for example 5-fold:
from sklearn.model_selection import KFold
cross_validation = KFold(n_splits=5)
return_training (bool) – if set to True, the predictions for the training set are also returned
- Returns:
pred (1D array) – array with predictions
pred_training (1D array) – array with predictions for the training set (only if return_training is True)
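The three regularization options map naturally onto scikit-learn estimators. The sketch below shows one way that mapping could look; make_regressor is a hypothetical helper, the estimators use scikit-learn's default hyperparameters (which may differ from what the library uses), and the data are synthetic.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_predict


def make_regressor(regularization=None):
    """Hypothetical helper mapping the documented option to a scikit-learn estimator."""
    if regularization is None:
        return LinearRegression()  # ordinary least squares
    if regularization == "L1":
        return Lasso()  # L1 penalty, default strength (an assumption)
    if regularization == "L2":
        return Ridge()  # L2 penalty, default strength (an assumption)
    raise ValueError(f"unknown regularization: {regularization!r}")


rng = np.random.default_rng(0)
n_trials, n_neurons = 200, 10
population_activity = rng.normal(size=(n_trials, n_neurons))
weights = np.linspace(-2.0, 2.0, n_neurons)
trial_targets = population_activity @ weights + 0.1 * rng.normal(size=n_trials)

cross_validation = KFold(n_splits=5, shuffle=True, random_state=0)
# Cross-validated predictions: one value per trial, for each regularization option
predictions = {
    reg: cross_val_predict(
        make_regressor(reg), population_activity, trial_targets, cv=cross_validation
    )
    for reg in (None, "L1", "L2")
}
```

Comparing predictions[None], predictions["L1"] and predictions["L2"] against trial_targets (for example with a correlation or R^2) gives a quick feel for how much each penalty shrinks the fit.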
- lda_project(spike_times, spike_clusters, event_times, event_groups, pre_time=0, post_time=0.5, cross_validation='kfold', num_splits=5, prob_left=None, custom_validation=None)
Use linear discriminant analysis to project population vectors to the line that best separates the two groups. When cross-validation is used, the LDA projection is fitted on the training data, after which the test data are projected onto the fitted axis.
- Parameters:
spike_times (1D array) – spike times (in seconds)
spike_clusters (1D array) – cluster ids corresponding to each event in spikes
event_times (1D array) – times (in seconds) of the events from the two groups
event_groups (1D array) – group identities of the events, can be any number of groups, accepts integers and strings
cross_validation (string) –
- which cross-validation method to use, options are:
‘none’ No cross-validation
‘kfold’ K-fold cross-validation
‘leave-one-out’ Leave out the trial that is being decoded
‘block’ Leave out the block the to-be-decoded trial is in
‘custom’ Any custom cross-validation provided by the user
num_splits (integer) – (only for ‘kfold’ cross-validation) Number of splits to use for k-fold cross-validation; a value of 5 means that the decoder will be trained on 4/5 of the data and used to predict the remaining 1/5. This process is repeated five times so that all data has been used as both training and test set.
prob_left (1D array) – (only for ‘block’ cross-validation) the probability of the stimulus appearing on the left for each trial in event_times
custom_validation (generator) – (only for ‘custom’ cross-validation) a generator object with the splits to be used for cross-validation, using this format:
((split1_train_idxs, split1_test_idxs), (split2_train_idxs, split2_test_idxs), (split3_train_idxs, split3_test_idxs), …)
n_neurons (int) – Group size of number of neurons to be sub-selected
- Returns:
lda_projection – the position along the LDA projection axis for the population vector of each trial
- Return type:
1D array
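The cross-validated projection can be sketched with scikit-learn: LDA is fitted on the training trials of each split and the held-out trials are projected onto the fitted axis. This is an illustration on synthetic population vectors, not the internals of lda_project. Note that KFold(...).split(...) yields (train_idxs, test_idxs) pairs, which is the same format described for custom_validation.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)
n_trials, n_neurons = 80, 6
event_groups = np.repeat([0, 1], n_trials // 2)

# Synthetic population vectors (trials x neurons); neuron 0 separates the two groups
pop_vectors = rng.normal(size=(n_trials, n_neurons))
pop_vectors[event_groups == 1, 0] += 3.0

# Generator of (train_idxs, test_idxs) pairs
splits = KFold(n_splits=5, shuffle=True, random_state=0).split(pop_vectors)

lda_projection = np.full(n_trials, np.nan)
for train_idxs, test_idxs in splits:
    lda = LinearDiscriminantAnalysis()
    lda.fit(pop_vectors[train_idxs], event_groups[train_idxs])
    # With two groups, transform() returns a single discriminant axis (one column)
    lda_projection[test_idxs] = lda.transform(pop_vectors[test_idxs])[:, 0]
```

Because a separate model is fitted per split, the orientation of the axis is not guaranteed to match across folds; this sketch does not attempt to align them.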
- sigtest_pseudosessions(X, y, fStatMeas, genPseudo, npseuds=200)
Estimates significance level of any statistical measure following Harris, Arxiv, 2021 (https://www.biorxiv.org/content/10.1101/2020.11.29.402719v2). fStatMeas computes a scalar statistical measure (e.g. R^2) between the data, X, and the decoded variable, y. Pseudosessions are generated npseuds times to create a null distribution of statistical measures. The significance level is reported relative to this null distribution.
- Parameters:
X (2D array) – Data of size (elements, timetrials)
y (1D array) – predicted variable of size (timetrials)
fStatMeas (function) – takes arguments (X, y) and returns a statistical measure relating how well X decodes y
genPseudo (function) – takes no arguments () and returns a pseudosession (same shape as y) drawn from the experimentally known null-distribution of y
npseuds (int) – the number of pseudosessions used to estimate the significance level
- Returns:
alpha – p-value; e.g. at a significance level of b, if alpha <= b then reject the null hypothesis.
statms_real – the value of the statistical measure evaluated on X and y
statms_pseuds (array) – statistical measures evaluated on pseudosessions
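The pseudosession procedure can be sketched in a few lines of NumPy. The function pseudosession_pvalue, the statistic r_squared and the generator gen_pseudo are hypothetical names for this illustration, and the p-value convention used here (counting null values at least as large as the real one, with a +1 correction) is a common permutation-test choice that may differ from the library's exact formula.

```python
import numpy as np


def pseudosession_pvalue(X, y, f_stat, gen_pseudo, n_pseudo=200):
    """Hypothetical sketch: compare f_stat(X, y) to a null built from pseudosessions."""
    stat_real = f_stat(X, y)
    stat_pseudo = np.array([f_stat(X, gen_pseudo()) for _ in range(n_pseudo)])
    # Assumed convention: fraction of null values >= the real value, with +1 correction
    p = (1 + np.sum(stat_pseudo >= stat_real)) / (n_pseudo + 1)
    return p, stat_real, stat_pseudo


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the rows of X (plus an intercept)."""
    A = np.column_stack([X.T, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()


rng = np.random.default_rng(2)
n_trials = 300
y = rng.integers(0, 2, n_trials).astype(float)
# X has shape (elements, timetrials); the first row carries information about y
X = np.vstack([y + 0.5 * rng.normal(size=n_trials), rng.normal(size=n_trials)])


def gen_pseudo():
    # Draws a new session from the (here, known) generative process of y
    return rng.integers(0, 2, n_trials).astype(float)


p, stat_real, stat_pseudo = pseudosession_pvalue(X, y, r_squared, gen_pseudo)
```

The key requirement is that gen_pseudo samples from the same process that produced y in the experiment; here that process is a fair coin flip per trial.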
- sigtest_linshift(X, y, fStatMeas, D=300)
Uses a provably conservative Linear Shift technique (Harris, Kenneth Arxiv 2021, https://arxiv.org/ftp/arxiv/papers/2012/2012.06862.pdf) to estimate significance level of a statistical measure. fStatMeas computes a scalar statistical measure (e.g. R^2) from the data matrix, X, and the variable, y. A central window of X and y of size D is linearly shifted to generate a null distribution of statistical measures. The significance level is reported relative to this null distribution.
- Parameters:
X (2D array) – Data of size (elements, timetrials)
y (1D array) – predicted variable of size (timetrials)
fStatMeas (function) – takes arguments (X, y) and returns a scalar statistical measure of how well X decodes y
D (int) – the window length along the center of y used to compute the statistical measure. Must have room to shift both right and left: len(y) >= D+2
- Returns:
alpha – conservative p-value; e.g. at a significance level of b, if alpha <= b then reject the null hypothesis.
statms_real – the value of the statistical measure evaluated on X and y
statms_pseuds (1D array) – statistical measures evaluated on shifted versions of y
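A minimal NumPy sketch of the linear shift idea follows. The function linear_shift_pvalue is hypothetical; keeping the central window of X fixed while sliding the window over y, and computing the p-value as the count of shifted statistics at least as large as the unshifted one divided by (N + 1) and capped at 1, are assumptions of this sketch rather than a statement of the library's exact formula.

```python
import numpy as np


def linear_shift_pvalue(X, y, f_stat, D=300):
    """Hypothetical sketch of a linear shift significance test."""
    T = len(y)
    if T < D + 2:
        raise ValueError("need len(y) >= D + 2 to shift both left and right")
    N = (T - D) // 2  # largest shift in either direction
    X_win = X[:, N:N + D]  # central window of X, kept fixed
    stat_real = f_stat(X_win, y[N:N + D])

    shifts = np.arange(-N, N + 1)
    # Slide the window over y; shift 0 reproduces the unshifted statistic
    stat_shift = np.array([f_stat(X_win, y[N + s:N + s + D]) for s in shifts])

    # Assumed conservative convention, capped at 1
    p = min(1.0, np.sum(stat_shift >= stat_real) / (N + 1))
    return p, stat_real, stat_shift


def r_squared(X, y):
    """Squared Pearson correlation between the first row of X and y."""
    return np.corrcoef(X[0], y)[0, 1] ** 2


rng = np.random.default_rng(3)
T = 400
# Temporally autocorrelated target: white noise smoothed with a 5-sample boxcar
y = np.convolve(rng.normal(size=T), np.ones(5) / 5, mode="same")
X = (y + 0.1 * rng.normal(size=T))[None, :]  # shape (elements, timetrials)

p, stat_real, stat_shift = linear_shift_pvalue(X, y, r_squared, D=300)
```

Because the null is built from time-shifted copies of the same y, it preserves the autocorrelation of the variable, which is what makes this kind of test suitable for slowly varying signals.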