brainbox.modeling.glm

GLM fitting utilities based on NeuroGLM by Il Memming Park and Jonathan Pillow.

Berk Gercek, International Brain Lab, 2020

Functions

`convbasis`
`d_neglog`
`dd_neglog`
`denseconv`
`full_rcos`
`neglog`
`raised_cosine`

Classes

NeuralGLM

Generalized Linear Model which seeks to describe spiking activity as the output of a Poisson process.

class NeuralGLM(trialsdf, spk_times, spk_clu, vartypes, train=0.8, binwidth=0.02, mintrials=100)

Bases: object

Generalized Linear Model which seeks to describe spiking activity as the output of a Poisson process. Uses sklearn’s GLM methods under the hood while providing useful routines for dealing with neural data.

add_covariate_timing(covlabel, eventname, bases, offset=0, deltaval=None, cond=None, desc='')

Convenience wrapper for adding timing event regressors to the GLM. Automatically generates a one-hot vector for each trial as the regressor and adds the appropriate data structure to the model.

Parameters

covlabel (str) – Label for the covariate. Where possible, the covariate can be accessed as an attribute of the instance under this name.
eventname (str) – Label of the column in trialsdf which holds the event timing for each trial.
bases (numpy.array) – nTB x nB array, i.e. number of time bins for the basis functions by number of bases. The columns of the array are used together to describe the response of a unit to that timing event.
offset (float, seconds) – Offset of basis functions relative to timing event. Negative values will ensure that
deltaval (None, str, or pandas Series, optional) – Value of the Kronecker delta function peak used to encode the event. If a string, the column in trialsdf with that label will be used. If a pandas Series with indices matching trialsdf, the corresponding elements of the Series will be the delta function values. If None (default), the height is 1.
cond (None, list, or func, optional) – Condition under which to apply this covariate. Can be either a list of trial indices or a function which takes in rows of trialsdf and returns booleans.
desc (str, optional) – Additional information about the covariate, if desired, by default ‘’
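
The one-hot regressor described above can be pictured with a small NumPy sketch. This is only an illustration of the idea, not the library's code: the helper name `event_delta_vector`, its `trial_duration` argument, and the choice to place the peak in the bin that contains the event are all assumptions.

```python
import numpy as np


def event_delta_vector(event_time, trial_duration, binwidth=0.02, height=1.0):
    """Return a per-trial vector that is zero except in the bin holding the event.

    Hypothetical helper for illustration only; not part of brainbox.
    """
    n_bins = int(np.ceil(trial_duration / binwidth))
    vec = np.zeros(n_bins)
    # Assumption: the peak goes in the bin that contains the event time.
    event_bin = int(np.floor(event_time / binwidth))
    vec[event_bin] = height
    return vec


# An event 0.5 s into a 2 s trial, binned at 0.25 s.
vec = event_delta_vector(event_time=0.5, trial_duration=2.0, binwidth=0.25)
```

Passing a different `height` here plays the role that `deltaval` plays in the parameter list above.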

add_covariate_boxcar(covlabel, boxstart, boxend, cond=None, height=None, desc='')

add_covariate_raw(covlabel, raw, cond=None, desc='')

add_covariate(covlabel, regressor, bases, offset=0, cond=None, desc='')

Parent function to add covariates to model object. Takes a regressor in the form of a pandas Series object, a T x M array of M bases, and stores them for use in the design matrix generation.

Parameters

covlabel (str) – Label for the covariate being added. Will be exposed, if possible, through the (instance).(covlabel) attribute.
regressor (pandas.Series) – Series in which each element is the value(s) of a regressor for a trial at that index. These will be convolved with the basis functions (if provided) to produce the components of the design matrix. The regressor must be a (T / dt) x 1 array for each trial.
bases (numpy.array or None) – T x M array of M basis functions over T timesteps. Columns will be convolved with the elements of regressor to produce elements of the design matrix. If None, it is assumed a raw regressor is being used.
offset (int, optional) – Offset of the regressor from the bases during convolution. Negative values indicate that the firing of the unit will be , by default 0
cond (list or func, optional) – Condition for which to apply the covariate. Either a list of trials which the covariate applies to, or a function of the form f(dataframerow) which returns a boolean, by default None
desc (str, optional) – Description of the covariate for reference purposes, by default ‘’ (empty)
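
As a rough picture of how a per-trial regressor and a T x M bases array combine into design matrix columns, the sketch below convolves the regressor with each basis column and truncates the result to the trial length. The helper name `convolve_with_bases` is hypothetical, the `offset` argument is ignored, and the module's own `convbasis` and `denseconv` functions may behave differently.

```python
import numpy as np


def convolve_with_bases(regressor, bases):
    """Convolve a 1-D regressor with each column of a T x M bases array.

    Hypothetical helper for illustration; returns an array with one row per
    time bin of the regressor and one column per basis function.
    """
    n_bins = regressor.shape[0]
    n_bases = bases.shape[1]
    out = np.zeros((n_bins, n_bases))
    for j in range(n_bases):
        # Full convolution, truncated so the output stays aligned to the trial.
        out[:, j] = np.convolve(regressor, bases[:, j])[:n_bins]
    return out


# A delta at bin 1 of a six-bin trial, and two three-bin basis functions.
regressor = np.zeros(6)
regressor[1] = 1.0
bases = np.array([[1.0, 0.0],
                  [0.5, 1.0],
                  [0.0, 0.5]])
cols = convolve_with_bases(regressor, bases)
```

With a delta regressor, each output column is simply the corresponding basis function shifted to start at the event bin.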

bin_spike_trains()

Bins spike times passed to class at instantiation. Will not bin spike trains which did not meet the criteria for minimum number of spiking trials. Must be run before the NeuralGLM.fit() method is called.
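
Binning spike times into counts per time bin can be sketched with `numpy.histogram`. The helper name `bin_trial_spikes` and its arguments are hypothetical, and the method above may handle trial boundaries and bin edges differently.

```python
import numpy as np


def bin_trial_spikes(spike_times, trial_start, trial_end, binwidth=0.02):
    """Count spikes in consecutive bins of width binwidth within one trial.

    Hypothetical helper for illustration only; not part of brainbox.
    """
    n_bins = int(np.ceil((trial_end - trial_start) / binwidth))
    edges = trial_start + binwidth * np.arange(n_bins + 1)
    # Spikes outside [trial_start, last edge] are ignored by np.histogram.
    counts, _ = np.histogram(spike_times, bins=edges)
    return counts


spikes = np.array([0.1, 0.3, 0.35, 0.9, 1.6])
counts = bin_trial_spikes(spikes, trial_start=0.0, trial_end=1.0, binwidth=0.25)
```

The resulting count vector is the kind of per-bin response a Poisson GLM is fit to.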

compile_design_matrix(dense=True)

Compiles the design matrix for the current experiment based on the covariates which were added with the various NeuralGLM.add_covariate methods available. Can optionally compile a sparse design matrix using the scipy.sparse package. The sparse method may take longer, depending on the degree of sparseness.

Parameters: dense (bool, optional) – Whether or not to compute a dense design matrix or a sparse one, by default True
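
One way to picture a dense design matrix is as per-trial blocks laid out with time bins as rows and covariate columns side by side. The layout, the variable names, and the toy values below are assumptions for illustration and are not taken from the library.

```python
import numpy as np

# Hypothetical per-trial blocks: each trial contributes one
# (n_bins x n_columns) block per covariate.
# Trial 0 has 4 bins, trial 1 has 3 bins.
stim_blocks = [np.ones((4, 2)), np.ones((3, 2))]     # covariate with 2 basis columns
choice_blocks = [np.zeros((4, 1)), np.ones((3, 1))]  # raw covariate with 1 column

# Within a trial, place the covariates side by side; then stack trials in time.
per_trial = [np.hstack(blocks) for blocks in zip(stim_blocks, choice_blocks)]
design = np.vstack(per_trial)

# Fraction of nonzero entries: a low value suggests a sparse layout may pay off.
density = np.count_nonzero(design) / design.size
```

Checking the density of a dense matrix like this is a cheap way to judge whether the sparse option is worth trying.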

fit(method='sklearn', alpha=0)

Fit the current set of binned spikes as a function of the current design matrix. Requires NeuralGLM.bin_spike_trains and NeuralGLM.compile_design_matrix to be run first. Will store the fit weights to an internal variable. To access these fit weights in a pandas DataFrame use the NeuralGLM.combine_weights method.

Parameters

method (str, optional) – Fitting method used to obtain weights, either ‘sklearn’ or ‘minimize’. Scikit-learn uses weight normalization and regularization and will return significantly different results from ‘minimize’, which simply minimizes the negative log likelihood of the data given the covariates, by default ‘sklearn’
alpha (float, optional) – Regularization strength for the scikit-learn implementation of GLM fitting, where 0 is effectively unregularized weights. Has no effect with the ‘minimize’ method, by default 0

Returns

coefs (list) – List of coefficients fit. Not recommended to use these for interpretation. Use the .combine_weights() method instead.
intercepts (list) – List of intercepts (bias terms) fit. Not recommended to use these for interpretation.
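
For intuition about what the ‘minimize’ option optimizes, here is a minimal sketch of a Poisson negative log likelihood and its gradient, assuming an exponential link and dropping the constant log(y!) term. The function names are hypothetical, and the module's own `neglog` and `d_neglog` may differ in form or scaling.

```python
import numpy as np


def poisson_neglog(weights, x, y):
    """Negative log likelihood of counts y under rate exp(x @ weights).

    Assumption: exponential link; the constant log(y!) term is dropped.
    """
    eta = x @ weights
    return np.sum(np.exp(eta)) - y @ eta


def poisson_neglog_grad(weights, x, y):
    """Gradient of poisson_neglog with respect to weights."""
    return x.T @ (np.exp(x @ weights) - y)


# Synthetic data: 50 time bins, 3 design matrix columns.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 3))
true_w = np.array([0.5, -0.3, 0.1])
y = rng.poisson(np.exp(x @ true_w))

# Check the analytic gradient against a central finite difference.
w0 = np.zeros(3)
eps = 1e-6
numeric = np.array([
    (poisson_neglog(w0 + eps * e, x, y) - poisson_neglog(w0 - eps * e, x, y)) / (2 * eps)
    for e in np.eye(3)
])
analytic = poisson_neglog_grad(w0, x, y)
```

A general-purpose optimizer can be handed an objective and gradient of this shape; no regularization term appears, which is consistent with alpha having no effect in that mode.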

combine_weights()

Combines fit coefficients and intercepts to produce kernels, where appropriate, which describe activity.

Returns: DataFrame in which each row is the fit weights for a given spiking unit. Columns are individual covariates added during the construction process. Indices are the cluster IDs for each of the cells that were fit (NOT a simple range(start, stop) index.)
Return type: pandas.DataFrame

score()

Compute the D^2 (fraction of deviance explained) of the model, i.e. how much deviance the fitted model explains beyond the null model (a Poisson process with the same mean, defined by the intercept, at every time step). For a detailed explanation see https://bookdown.org/egarpor/PM-UC3M/glm-deviance.html

Returns: A Series whose index is the cluster IDs and in which each entry is the D^2 for the model fit to that cluster
Return type: pandas.Series
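
The D^2 statistic can be sketched from the standard Poisson deviance as one minus the ratio of model deviance to null deviance. The helper names below are hypothetical, and the null model is assumed here to predict the mean count in every bin; the method above may compute its null model differently.

```python
import numpy as np


def poisson_deviance(y, mu):
    """Poisson deviance of observed counts y against predicted rates mu (mu > 0)."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # The y * log(y / mu) term is taken to be 0 wherever y == 0.
    ratio = np.where(y > 0, y / mu, 1.0)
    return 2.0 * np.sum(y * np.log(ratio) - (y - mu))


def d_squared(y, mu):
    """Fraction of the null deviance explained by the predictions mu."""
    y = np.asarray(y, dtype=float)
    # Assumption: the null model predicts the mean count in every bin.
    null_mu = np.full(y.shape, y.mean())
    return 1.0 - poisson_deviance(y, mu) / poisson_deviance(y, null_mu)


y = np.array([0, 1, 3, 2])
score_null = d_squared(y, np.full(4, 1.5))               # predictions equal to the mean
score_good = d_squared(y, np.array([0.1, 1.0, 3.0, 2.0]))  # close to the data
```

Predictions no better than the mean give a score of 0, and predictions that track the counts closely approach 1.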

binf(t)

Bin function for a given time. Returns the index of the bin, counted from trial start, in which a given t falls.

Parameters: t (float) – Seconds after trial start
Returns: Number of bins corresponding to t using the binwidth of the model.
Return type: int
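
A time-to-bin conversion of this kind can be sketched in one line. The function name `bins_after_start` is hypothetical, and rounding up with a ceiling is an assumption; the method above may round differently.

```python
import math


def bins_after_start(t, binwidth=0.02):
    """Number of bins of width binwidth needed to reach t seconds after trial start.

    Hypothetical helper; assumes the count is rounded up.
    """
    return math.ceil(t / binwidth)


n_exact = bins_after_start(0.5, binwidth=0.25)    # t lies exactly on a bin edge
n_partial = bins_after_start(0.3, binwidth=0.25)  # t lies inside the second bin
```

Bin widths such as 0.02 are not exactly representable in binary floating point, so a t that sits on a bin edge can land one bin off after division and rounding.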

convbasis(stim, bases, offset=0)

denseconv(X, bases)

raised_cosine(duration, nbases, binfun)

full_rcos(duration, nbases, binfun, n_before=1)

neglog(weights, x, y)

d_neglog(weights, x, y)

dd_neglog(weights, x, y)