brainbox.processing.processing

Processes data from one form into another, e.g. binning spike times into non-overlapping bins or convolving spike times with a Gaussian kernel.

Functions

bin_spikes

Wrapper for bincount2D that takes a TimeSeries object of spike times and cluster identities and returns spike counts in bins of a specified width binsize, also as a TimeSeries object.

bincount2D

Computes a 2D histogram by aggregating values in a 2D array.

filter_units

Filters units according to some parameters. **kwargs are the keyword parameters used to filter the units.

get_units_bunch

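Returns a bunch, where the bunch keys are keys from spks with labels of spike information (e.g. unit IDs, times, features, etc.), and the values for each key are arrays with values for each unit.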

sync

Resamples one or more time series onto a single, evenly spaced time base with step dt between observations.

sync(dt, times=None, values=None, timeseries=None, offsets=None, interp='zero', fillval=nan)

Resamples one or more time series onto a single, evenly spaced time base with step dt between observations. Uses interpolation to find values.

Can be used on raw numpy arrays of timestamps and values using the ‘times’ and ‘values’ kwargs and/or on brainbox.core.TimeSeries objects passed to the ‘timeseries’ kwarg. If passing both TimeSeries objects and numpy arrays, the offsets passed should be for the TS objects first and then the numpy arrays.

Uses scipy’s interpolation library to perform interpolation. See scipy.interpolate.interp1d for more information regarding the interp and fillval parameters.

Parameters
  • dt (float) – Spacing between the points at which the output time series is sampled

  • timeseries (tuple of TimeSeries objects, or a single TimeSeries object.) – A group of time series to perform alignment or a single time series. Must have time stamps.

  • times (np.ndarray or list of np.ndarrays) – time stamps for the observations in ‘values’

  • values (np.ndarray or list of np.ndarrays) – observations corresponding to the timestamps in ‘times’

  • offsets (tuple of floats, optional) – tuple of offsets for time stamps of each time series. Offsets for passed TimeSeries objects first, then offsets for passed numpy arrays. defaults to None

  • interp (str) – Type of interpolation to use. Refer to scipy.interpolate.interp1d for possible values, defaults to 'zero'

  • fillval – Fill values to use when interpolating outside of range of data. See interp1d for possible values, defaults to np.nan

Returns

TimeSeries object with each row representing synchronized values of all input TimeSeries. Will carry column names from input time series if all of them have column names.
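The resampling idea can be sketched with plain numpy. This is a minimal illustration, not the brainbox implementation: the helper name resample_hold and its start/stop arguments are hypothetical, and a previous-value hold is used as a stand-in for what interp='zero' is assumed to do, with fillval applied outside the range of the data.

```python
import numpy as np


def resample_hold(times, values, dt, start, stop, fillval=np.nan):
    """Resample (times, values) onto an even grid using a previous-value hold.

    Hypothetical helper for illustration only; `times` must be sorted.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    # Evenly spaced output grid from start to stop (inclusive) with step dt
    n = int(np.floor((stop - start) / dt)) + 1
    grid = start + dt * np.arange(n)
    # Index of the last observation at or before each grid point (-1 if none)
    idx = np.searchsorted(times, grid, side="right") - 1
    out = np.full(grid.shape, fillval, dtype=float)
    # Only grid points inside the observed range get a held value
    valid = (idx >= 0) & (grid <= times[-1])
    out[valid] = values[idx[valid]]
    return grid, out


grid, out = resample_hold([0.0, 0.3, 1.0], [1.0, 2.0, 3.0],
                          dt=0.25, start=-0.25, stop=1.25)
# grid runs from -0.25 to 1.25 in steps of 0.25;
# out is NaN at both ends and holds 1, 1, 2, 2, 3 in between
```

Several series resampled this way onto the same grid can then be stacked column-wise, which is the shape of result that sync is documented to return.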

bincount2D(x, y, xbin=0, ybin=0, xlim=None, ylim=None, weights=None)

Computes a 2D histogram by aggregating values in a 2D array.

Parameters
  • x – values to bin along the 2nd dimension (c-contiguous)

  • y – values to bin along the 1st dimension

  • xbin – bin specification along the 2nd dimension. scalar: bin size; 0: aggregate according to unique values; array: aggregate according to exact values (count reduce operation)

  • ybin – bin specification along the 1st dimension. scalar: bin size; 0: aggregate according to unique values; array: aggregate according to exact values (count reduce operation)

  • xlim – (optional) 2 values (array or list) that restrict range along 2nd dimension

  • ylim – (optional) 2 values (array or list) that restrict range along 1st dimension

  • weights – (optional) defaults to None, weights to apply to each value for aggregation

Returns

3 numpy arrays: MAP [ny, nx] image, xscale [nx], yscale [ny]
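To make the axis convention concrete, here is a minimal numpy sketch of one case only: a scalar xbin combined with ybin=0 (one row per unique y value). The function name bincount2d_sketch is hypothetical, and the choice to start the bins at min(x) is an assumption; the library may place bin edges differently and also supports xlim, ylim and weights, which are left out here.

```python
import numpy as np


def bincount2d_sketch(x, y, xbin):
    """Count (x, y) pairs into a [ny, nx] map.

    Illustrative only: x uses fixed-width bins of size xbin starting at
    min(x) (an assumption), y is aggregated by unique value.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    # Column index: which fixed-width bin each x value falls into
    xind = np.floor((x - x.min()) / xbin).astype(int)
    xscale = x.min() + xbin * np.arange(xind.max() + 1)
    # Row index: position of each y value among the sorted unique values
    yscale, yind = np.unique(y, return_inverse=True)
    counts = np.zeros((yscale.size, xscale.size), dtype=int)
    # Unbuffered add so that repeated (row, column) pairs accumulate
    np.add.at(counts, (yind, xind), 1)
    return counts, xscale, yscale


counts, xscale, yscale = bincount2d_sketch(
    x=[0.0, 0.1, 0.7, 1.2, 1.3], y=[0, 1, 0, 1, 1], xbin=0.5)
# counts has one row per unique y value and one column per x bin
```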

bin_spikes(spikes, binsize, interval_indices=False)

Wrapper for bincount2D that takes a TimeSeries object of spike times and cluster identities and returns spike counts in bins of a specified width binsize, also as a TimeSeries object. The returned TS object labels each row either with the corresponding interval or with the value of the left edge of the bin.

Parameters
  • spikes (TimeSeries object with 'clusters' column and timestamps) – Spike times and cluster identities of sorted spikes

  • binsize (float) – Width of the non-overlapping bins in which to bin spikes

  • interval_indices (bool, optional) – Whether to use intervals as the time stamps for binned spikes, rather than the left edge value of the bins, defaults to False

Returns

Object with 2D array of shape T x N, for T timesteps and N clusters, and the associated time stamps.

Return type

TimeSeries object
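The T x N layout and the two time-stamp options can be illustrated without a TimeSeries object. The sketch below is an assumption-laden stand-in, not the wrapper itself: bin_spikes_sketch is a hypothetical name, it takes raw arrays instead of a TimeSeries, it returns a plain tuple, and it starts the bins at the first spike time.

```python
import numpy as np


def bin_spikes_sketch(times, clusters, binsize, interval_indices=False):
    """Return (stamps, counts, cluster_ids) with counts of shape T x N.

    Illustrative only; bins start at the earliest spike time (an assumption).
    """
    times = np.asarray(times, dtype=float)
    clusters = np.asarray(clusters)
    # Enough non-overlapping bins of width binsize to cover all spikes
    n_bins = int(np.floor((times.max() - times.min()) / binsize)) + 1
    edges = times.min() + binsize * np.arange(n_bins + 1)
    ids = np.unique(clusters)
    # One histogram per cluster, stacked as columns: rows are time bins
    counts = np.stack(
        [np.histogram(times[clusters == c], bins=edges)[0] for c in ids],
        axis=1,
    )
    if interval_indices:
        # Label each row with its (left, right) interval
        stamps = list(zip(edges[:-1].tolist(), edges[1:].tolist()))
    else:
        # Label each row with the left edge of its bin
        stamps = edges[:-1]
    return stamps, counts, ids


stamps, counts, ids = bin_spikes_sketch(
    times=[0.0, 0.1, 0.7, 1.2, 1.3], clusters=[5, 9, 5, 9, 9], binsize=0.5)
intervals, _, _ = bin_spikes_sketch(
    times=[0.0, 0.1, 0.7, 1.2, 1.3], clusters=[5, 9, 5, 9, 9], binsize=0.5,
    interval_indices=True)
```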

get_units_bunch(spks_b, *args)

Returns a bunch, where the bunch keys are keys from spks with labels of spike information (e.g. unit IDs, times, features, etc.), and the values for each key are arrays with values for each unit: these arrays are ordered and can be indexed by unit id.

Parameters
  • spks_b (bunch) – A spikes bunch containing fields with spike information (e.g. unit IDs, times, features, etc.) for all spikes.

  • features (list of strings (optional positional arg)) – A list of names of labels of spike information (which must be keys in spks) that specify which labels to return as keys in units. If not provided, all keys in spks are returned as keys in units.

Returns

units_b – A bunch with keys of labels of spike information (e.g. cluster IDs, times, features, etc.) whose values are arrays that hold values for each unit. The arrays for each key are ordered by unit ID.

Return type

bunch

Examples

1) Create a units bunch given a spikes bunch, and get the amps for unit #4 from the units bunch.

>>> import brainbox as bb
>>> import alf.io as aio
>>> import ibllib.ephys.spikes as e_spks
# Note: if there is no 'alf' directory, create one from the 'ks2' output directory:
>>> e_spks.ks2_to_alf(path_to_ks_out, path_to_alf_out)
>>> spks_b = aio.load_object(path_to_alf_out, 'spikes')
>>> units_b = bb.processing.get_units_bunch(spks_b)
# Get amplitudes for unit 4.
>>> amps = units_b['amps']['4']

TODO add computation time estimate?
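The regrouping from per-spike arrays to per-unit arrays can be sketched with plain dictionaries. This is a simplified illustration under stated assumptions, not the library code: units_bunch_sketch is a hypothetical name, ordinary dicts stand in for bunches, and unit IDs are stored as string keys to mirror the units_b['amps']['4'] access used in the example above.

```python
import numpy as np


def units_bunch_sketch(spks, features=None):
    """Regroup per-spike arrays into per-unit arrays.

    Illustrative only: `spks` is a dict of equal-length arrays that
    includes a 'clusters' entry giving the unit ID of each spike.
    """
    clusters = np.asarray(spks["clusters"])
    if features is None:
        # Default: return every key found in the spikes dict
        features = list(spks.keys())
    unit_ids = np.unique(clusters)  # sorted unit IDs
    units = {}
    for feat in features:
        data = np.asarray(spks[feat])
        # For each unit, keep only the entries belonging to its spikes
        units[feat] = {str(u): data[clusters == u] for u in unit_ids}
    return units


spks = {
    "times": np.array([0.1, 0.2, 0.4, 0.9]),
    "amps": np.array([60e-6, 80e-6, 70e-6, 90e-6]),
    "clusters": np.array([4, 7, 4, 4]),
}
units = units_bunch_sketch(spks, ["times", "amps"])
amps_4 = units["amps"]["4"]  # amplitudes of the spikes assigned to unit 4
```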

filter_units(units_b, t, **kwargs)

Filters units according to some parameters. **kwargs are the keyword parameters used to filter the units.

Parameters
  • units_b (bunch) – A bunch with keys of labels of spike information (e.g. cluster IDs, times, features, etc.) whose values are arrays that hold values for each unit. The arrays for each key are ordered by unit ID.

  • t (float) – Duration of time over which to calculate the firing rate and false positive rate.

Keyword Parameters

  • min_amp (float) – The minimum mean amplitude (in V) of the spikes in the unit. Default value is 50e-6.

  • min_fr (float) – The minimum firing rate (in Hz) of the unit. Default value is 0.5.

  • max_fpr (float) – The maximum false positive rate of the unit (using the fp formula in Hill et al. (2011) J Neurosci 31: 8699-8705). Default value is 0.2.

  • rp (float) – The refractory period (in s) of the unit. Used to calculate the false positive rate that is compared against max_fpr. Default value is 0.002.

Returns

filt_units – The ids of the filtered units.

Return type

ndarray

Examples

1) Filter units according to the default parameters.

>>> import brainbox as bb
>>> import alf.io as aio
>>> import ibllib.ephys.spikes as e_spks
# Note: if there is no 'alf' directory, create one from the 'ks2' output directory:
>>> e_spks.ks2_to_alf(path_to_ks_out, path_to_alf_out)
# Get a spikes bunch, units bunch, and filter the units.
>>> spks_b = aio.load_object(path_to_alf_out, 'spikes')
>>> units_b = bb.processing.get_units_bunch(spks_b, ['times', 'amps', 'clusters'])
>>> T = spks_b['times'][-1] - spks_b['times'][0]
>>> filtered_units = bb.processing.filter_units(units_b, T)

2) Filter units with no minimum amplitude, a minimum firing rate of 1 Hz, and a max false positive rate of 0.2, given a refractory period of 2 ms.

>>> filtered_units = bb.processing.filter_units(units_b, T, min_amp=0, min_fr=1)

TODO: units_b input arg could eventually be replaced by clstrs_b if the required metrics are in clstrs_b['metrics']
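Two of the filtering criteria, mean amplitude and firing rate, are simple enough to sketch directly. The code below is a partial illustration only: filter_units_sketch is a hypothetical name, plain dicts keyed by unit ID strings stand in for the units bunch, the comparisons are assumed to be inclusive, and the false positive rate check (max_fpr, rp) is deliberately left out because its formula is not given here.

```python
import numpy as np


def filter_units_sketch(units, t, min_amp=50e-6, min_fr=0.5):
    """Return IDs of units passing the amplitude and firing-rate thresholds.

    Illustrative only: covers min_amp and min_fr, not max_fpr or rp.
    """
    keep = []
    for unit_id, spike_times in units["times"].items():
        if len(spike_times) == 0:
            continue  # a unit with no spikes cannot pass either criterion
        firing_rate = len(spike_times) / t          # spikes per second (Hz)
        mean_amp = np.mean(units["amps"][unit_id])  # mean amplitude (V)
        if firing_rate >= min_fr and mean_amp >= min_amp:
            keep.append(int(unit_id))
    return np.array(keep, dtype=int)


units = {
    "times": {
        "0": np.arange(10.0),       # 10 spikes in 10 s, 1 Hz
        "1": np.array([1.0, 5.0]),  # 2 spikes in 10 s, 0.2 Hz
        "2": np.arange(10.0),       # 1 Hz but low amplitude
    },
    "amps": {
        "0": np.full(10, 80e-6),
        "1": np.full(2, 80e-6),
        "2": np.full(10, 20e-6),
    },
}
kept_default = filter_units_sketch(units, t=10.0)
kept_loose = filter_units_sketch(units, t=10.0, min_amp=0, min_fr=0.1)
```

With the default thresholds only unit 0 passes; relaxing min_amp to 0 and min_fr to 0.1 Hz admits all three units.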