ibllib.io.raw_data_loaders

Raw Data Loader functions for PyBpod rig.

Module contains one loader function per raw datafile.

Functions

get_port_events

Return all event timestamps from a Bpod raw data trial that match 'name'. Looks in trial['behavior_data']['Events timestamps'].

load_ambient_sensor

Load Ambient Sensor data from session.

load_bpod

Load both settings and data from bpod (.json and .jsonable)

load_bpod_fronts

Loads BNC1 and BNC2 bpod channels times and polarities from session_path

load_camera_frameData

Loads binary frame data from Bonsai camera recording workflow.

load_camera_frame_count

Load the embedded frame count for a given session.

load_camera_gpio

Load the GPIO for a given session.

load_camera_ssv_times

Load the bonsai frame and camera timestamps from Camera.timestamps.ssv

load_data

Load PyBpod data files (.jsonable).

load_embedded_frame_data

Load the embedded frame count and GPIO for a given session.

load_encoder_events

Load Rotary Encoder (RE) events raw data file.

load_encoder_positions

Load Rotary Encoder (RE) positions from raw data file within a session path.

load_encoder_trial_info

Load Rotary Encoder trial info from raw data file.

load_mic

Load Microphone wav file to np.array of len nSamples

load_settings

Load PyBpod Settings files (.json).

load_stim_position_screen

load_widefield_mmap

TODO Document this function

patch_settings

Modify various details in a settings file.

sync_trials_robust

Attempts to find matching timestamps in 2 time-series that have an offset, are drifting, and are most likely incomplete: sizes don't have to match, some pulses may be missing in any series.

trial_times_to_times

Parse and convert all trial timestamps to "absolute" time.

trial_times_to_times(raw_trial)[source]

Parse and convert all trial timestamps to “absolute” time. Float64 seconds from session start.

0---BpodStart---TrialStart0---------TrialEnd0-----TrialStart1---TrialEnd1...
0---ts0---ts1---tsN...

absTS = tsN + TrialStartN - BpodStart

Bpod timestamps are in microseconds (µs). PyBpod timestamps are in seconds (s).

Parameters:

raw_trial (dict) – raw trial data

Returns:

trial data with modified timestamps

Return type:

dict
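The conversion formula above (absTS = tsN + TrialStartN - BpodStart) can be illustrated with a minimal sketch. The function name `to_absolute` and the numeric values are hypothetical and chosen only for illustration; this is not the library's implementation, which operates on the full raw trial dict.

```python
def to_absolute(timestamps, trial_start, bpod_start):
    """Shift trial-relative timestamps onto a session-wide clock.

    Applies absTS = ts + trial_start - bpod_start to each timestamp.
    All arguments are assumed to be in seconds.
    """
    return [ts + trial_start - bpod_start for ts in timestamps]


# Hypothetical values: the trial began at 10 s on the Bpod clock,
# and Bpod itself started at 2 s.
trial_ts = [0.0, 0.25, 1.5]
abs_ts = to_absolute(trial_ts, trial_start=10.0, bpod_start=2.0)
print(abs_ts)
```

Because the same offset is added to every timestamp, intervals within a trial are unchanged; only the reference point moves.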

load_bpod(session_path, task_collection='raw_behavior_data')[source]

Load both settings and data from bpod (.json and .jsonable)

Parameters:
  • session_path – Absolute path of session folder

  • task_collection – Collection within session path with behavior data

Returns:

A settings dict and a list of trial data dicts

load_data(session_path: str | Path, task_collection='raw_behavior_data', time='absolute')[source]

Load PyBpod data files (.jsonable).

Bpod timestamps are in microseconds (µs). PyBpod timestamps are in seconds (s).

Parameters:

session_path (str, Path) – Absolute path of session folder

Returns:

A list of length ntrials, each trial being a dictionary

Return type:

list of dicts

load_camera_frameData(session_path, camera: str = 'left', raw: bool = False) → DataFrame[source]

Loads binary frame data from Bonsai camera recording workflow.

Parameters:
  • session_path (StrPath) – Path to session folder

  • camera (str, optional) – Load FrameData for specific camera. Defaults to ‘left’.

  • raw (bool, optional) – Whether to return raw or parsed data. Defaults to False.

Returns:

parsed (raw=False, default): pandas.DataFrame with 4 columns:

    Timestamp,             # float64 (seconds from session start)
    embeddedTimeStamp,     # float64 (seconds from session start)
    embeddedFrameCounter,  # int64 (frame number from session start)
    embeddedGPIOPinState   # object (state of each of the 4 GPIO pins as a
                           # list of numpy boolean arrays,
                           # e.g. np.array([True, False, False, False]))

raw (raw=True): pandas.DataFrame with 4 int64 columns:

    Timestamp,             # UTC ticks from BehaviorPC
                           # (100’s of ns since midnight 1/1/0001)
    embeddedTimeStamp,     # Camera timestamp (needs uncycling and conversion)
    embeddedFrameCounter,  # Frame counter (int)
    embeddedGPIOPinState   # GPIO pin state integer representation of 4 pins

Return type:

pandas.DataFrame
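The raw Timestamp column is described as UTC ticks, i.e. counts of 100 ns intervals since midnight 1/1/0001. A minimal sketch of converting such a tick count to a Python datetime follows. The helper name `ticks_to_datetime` is hypothetical and not part of this module; sub-microsecond precision is discarded because `datetime` resolves only to microseconds.

```python
from datetime import datetime, timedelta

TICKS_PER_MICROSECOND = 10  # one tick is 100 ns


def ticks_to_datetime(ticks: int) -> datetime:
    """Convert 100 ns ticks since 0001-01-01 00:00 to a naive datetime."""
    return datetime(1, 1, 1) + timedelta(microseconds=ticks // TICKS_PER_MICROSECOND)


# Build an example tick count from a known date, then convert it back.
known = datetime(2020, 1, 1, 12, 0, 0)
example_ticks = int((known - datetime(1, 1, 1)).total_seconds()) * 10_000_000
print(ticks_to_datetime(example_ticks))
```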

load_camera_ssv_times(session_path, camera: str)[source]

Load the bonsai frame and camera timestamps from Camera.timestamps.ssv

NB: For some sessions the frame times are in the first column, in others the order is reversed.

NB: If using the new bin file the bonsai_times is a float in seconds since first frame.

Parameters:
  • session_path – Absolute path of session folder

  • camera – Name of the camera to load, e.g. ‘left’

Returns:

array of datetimes, array of frame times in seconds

load_embedded_frame_data(session_path, label: str, raw=False)[source]

Load the embedded frame count and GPIO for a given session. If the file doesn’t exist, or is empty, None values are returned.

Parameters:
  • session_path – Absolute path of session folder

  • label – The specific video to load, one of (‘left’, ‘right’, ‘body’)

  • raw – If True the raw data are returned without preprocessing, otherwise frame count is returned starting from 0 and the GPIO is returned as a dict of indices

Returns:

The frame count, GPIO

load_camera_frame_count(session_path, label: str, raw=True)[source]

Load the embedded frame count for a given session. If the file doesn’t exist, or is empty, a None value is returned.

Parameters:
  • session_path – Absolute path of session folder

  • label – The specific video to load, one of (‘left’, ‘right’, ‘body’)

  • raw – If True the raw data are returned without preprocessing, otherwise frame count is returned starting from 0

Returns:

The frame count

load_camera_gpio(session_path, label: str, as_dicts=False)[source]

Load the GPIO for a given session. If the file doesn’t exist, or is empty, a None value is returned.

The raw binary file contains uint32 values (saved as doubles) where the first 4 bits represent the state of each of the 4 GPIO pins. The array is expanded to an n x 4 array by shifting each bit to the end and checking whether it is 0 (low state) or 1 (high state).

Parameters:
  • session_path – Absolute path of session folder

  • label – The specific video to load, one of (‘left’, ‘right’, ‘body’)

  • as_dicts – If False the raw data are returned as a boolean array with shape (n_frames, n_pins), otherwise the GPIO is returned as a list of dictionaries with keys (‘indices’, ‘polarities’).

Returns:

An nx4 boolean array where columns represent state of GPIO pins 1-4. If as_dicts is True, a list of dicts is returned with keys (‘indices’, ‘polarities’), or None if the dictionary is empty.
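The bit expansion described above can be sketched as follows. This is an illustration only, not the library's code: the function name `expand_gpio_bits` is hypothetical, and the sketch assumes that the "first 4 bits" are the four most significant bits of each 32-bit value (pin 1 at bit 31 down to pin 4 at bit 28).

```python
import numpy as np


def expand_gpio_bits(values, n_pins: int = 4) -> np.ndarray:
    """Expand packed GPIO values into an (n, n_pins) boolean array."""
    # The values are saved as doubles, so cast back to unsigned 32-bit ints.
    packed = np.asarray(values).astype(np.uint32)
    # ASSUMPTION: pin k is stored at bit (32 - k), i.e. bits 31, 30, 29, 28.
    shifts = (31 - np.arange(n_pins)).astype(np.uint32)
    # Shift each pin's bit to the end and test whether it is 1 (high state).
    return ((packed[:, None] >> shifts) & 1).astype(bool)


raw_values = np.array([2.0**31, 2.0**30 + 2.0**28, 0.0])
gpio = expand_gpio_bits(raw_values)
print(gpio)
```

Each row of the result corresponds to one frame and each column to one pin, matching the n x 4 layout in the Returns description.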

load_settings(session_path: str | Path, task_collection='raw_behavior_data')[source]

Load PyBpod Settings files (.json).

Parameters:

session_path (str, Path) – Absolute path of session folder

Returns:

Settings dictionary

Return type:

dict

load_stim_position_screen(session_path, task_collection='raw_behavior_data')[source]

load_encoder_events(session_path, task_collection='raw_behavior_data', settings=False)[source]

Load Rotary Encoder (RE) events raw data file.

Assumes that a folder called “raw_behavior_data” exists in folder.

Event numbers correspond to the following Bpod states:

  1: correct / hide_stim
  2: stim_on
  3: closed_loop
  4: freeze_error / freeze_correct

>>> data.columns
>>> ['re_ts',   # Rotary Encoder Timestamp (ms) 'numpy.int64'
     'sm_ev',   # State Machine Event           'numpy.int64'
     'bns_ts']  # Bonsai Timestamp (int)        'pandas.Timestamp'
    # pd.to_datetime(data.bns_ts) to work in datetimes

Parameters:

session_path (str) – Absolute path of session folder

Returns:

dataframe w/ 3 cols and (ntrials * 3) lines

Return type:

pandas.DataFrame

load_encoder_positions(session_path, task_collection='raw_behavior_data', settings=False)[source]

Load Rotary Encoder (RE) positions from raw data file within a session path.

Assumes that a folder called “raw_behavior_data” exists in folder.

Positions are RE ticks: [-512, 512] == [-180º, 180º]; 0 == trial stim init position. Positive numbers are rightwards movements (mouse) or RE CW (mouse).

Variable line number, depends on movements.

Raw datafile Columns:

Position, RE timestamp, RE Position, Bonsai Timestamp

Position is always equal to ‘Position’ so this column was dropped.

>>> data.columns
>>> ['re_ts',   # Rotary Encoder Timestamp (ms)     'numpy.int64'
     're_pos',  # Rotary Encoder position (ticks)   'numpy.int64'
     'bns_ts']  # Bonsai Timestamp                  'pandas.Timestamp'
    # pd.to_datetime(data.bns_ts) to work in datetimes

Parameters:

session_path (str) – Absolute path of session folder

Returns:

dataframe w/ 3 cols and N positions

Return type:

pandas.DataFrame
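The tick range stated for load_encoder_positions ([-512, 512] ticks == [-180º, 180º]) implies a simple linear conversion. The sketch below is illustrative only; `ticks_to_degrees` is a hypothetical helper, not a function of this module, and it assumes the stated mapping holds across the whole range.

```python
import numpy as np

TICKS_PER_HALF_TURN = 512  # 512 ticks correspond to 180 degrees


def ticks_to_degrees(re_pos):
    """Convert rotary encoder ticks to degrees (positive = rightwards / CW)."""
    return np.asarray(re_pos, dtype=float) * 180.0 / TICKS_PER_HALF_TURN


# Example positions as they might appear in the 're_pos' column.
degrees = ticks_to_degrees([0, 256, -512, 512])
print(degrees)
```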

load_encoder_trial_info(session_path, task_collection='raw_behavior_data')[source]

Load Rotary Encoder trial info from raw data file.

Assumes that a folder called “raw_behavior_data” exists in folder.

NOTE: The last trial probably has no data (trial info is sent on trial start and data is only saved on trial exit…). max(trialnum) should be N+1 if N is the number of trials with saved data.

Raw datafile Columns:

>>> data.columns
>>> ['trial_num',     # Trial Number                     'numpy.int64'
     'stim_pos_init', # Initial position of visual stim  'numpy.int64'
     'stim_contrast', # Contrast of visual stimulus      'numpy.float64'
     'stim_freq',     # Frequency of gabor patch         'numpy.float64'
     'stim_angle',    # Angle of Gabor 0 = Vertical      'numpy.float64'
     'stim_gain',     # Wheel gain (mm/º of stim)        'numpy.float64'
     'stim_sigma',    # Size of patch                    'numpy.float64'
     'stim_phase',    # Phase of gabor                    'numpy.float64'
     'bns_ts' ]       # Bonsai Timestamp                 'pandas.Timestamp'
    # pd.to_datetime(data.bns_ts) to work in datetimes

Parameters:

session_path (str) – Absolute path of session folder

Returns:

dataframe w/ 9 cols and ntrials lines

Return type:

pandas.DataFrame

load_ambient_sensor(session_path, task_collection='raw_behavior_data')[source]

Load Ambient Sensor data from session.

Probably could be extracted to DatasetTypes: _ibl_trials.temperature_C, _ibl_trials.airPressure_mb, _ibl_trials.relativeHumidity.

Returns a list of dicts, one dict per trial. The dict keys are: ‘Temperature_C’, ‘AirPressure_mb’, ‘RelativeHumidity’.

Parameters:

session_path (str) – Absolute path of session folder

Returns:

list of dicts

Return type:

list

load_mic(session_path, task_collection='raw_behavior_data')[source]

Load Microphone wav file to np.array of len nSamples

Parameters:

session_path (str) – Absolute path of session folder

Returns:

An array of values of the sound waveform

Return type:

numpy.array

sync_trials_robust(t0, t1, diff_threshold=0.001, drift_threshold_ppm=200, max_shift=5, return_index=False)[source]

Attempts to find matching timestamps in 2 time-series that have an offset, are drifting, and are most likely incomplete: sizes don’t have to match, some pulses may be missing in any series. Only works with irregular time series as it relies on the derivative to match sync.

Parameters:
  • t0

  • t1

  • diff_threshold

  • drift_threshold_ppm

  • max_shift

  • return_index – Defaults to False
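To illustrate why matching on the derivative tolerates an offset, the sketch below compares the inter-pulse intervals of two series over a small range of index shifts and keeps the shift with the most agreeing intervals. It is a deliberately simplified illustration, not the algorithm used by sync_trials_robust: it handles only pulses missing at the start of a series and ignores drift. The function name `best_index_shift` is hypothetical, and the parameter names merely echo the signature above.

```python
import numpy as np


def best_index_shift(t0, t1, max_shift=5, diff_threshold=0.001):
    """Return (shift, n_matches) such that t0[i + shift] pairs with t1[i]."""
    d0, d1 = np.diff(t0), np.diff(t1)  # intervals are unaffected by an offset
    best_shift, best_count = 0, -1
    for shift in range(-max_shift, max_shift + 1):
        # Align the two interval series for this candidate shift.
        a = d0[shift:] if shift >= 0 else d0
        b = d1 if shift >= 0 else d1[-shift:]
        n = min(len(a), len(b))
        count = int(np.sum(np.abs(a[:n] - b[:n]) < diff_threshold))
        if count > best_count:
            best_shift, best_count = shift, count
    return best_shift, best_count


# Irregular pulse train; the second series misses the first two pulses
# and is recorded on a clock offset by 12.5 s.
rng = np.random.default_rng(0)
t0 = np.cumsum(rng.uniform(0.5, 2.0, size=50))
t1 = t0[2:] + 12.5
shift, count = best_index_shift(t0, t1)
print(shift, count)
```

With a regular pulse train every interval would be identical and every shift would match equally well, which is why the approach relies on irregular time series.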

load_bpod_fronts(session_path: str, data: list = False, task_collection: str = 'raw_behavior_data') → list[source]

Loads BNC1 and BNC2 bpod channels times and polarities from session_path

Parameters:
  • session_path (str) – a valid session_path

  • data (list, optional) – pre-loaded raw data dict, defaults to False

Returns:

List of dicts BNC1 and BNC2 {“times”: np.array, “polarities”:np.array}

Return type:

list

get_port_events(trial: dict, name: str = '') → list[source]

Return all event timestamps from a Bpod raw data trial that match ‘name’. Looks in trial[‘behavior_data’][‘Events timestamps’].

Parameters:
  • trial (dict) – raw trial dict

  • name (str, optional) – name of event, defaults to ‘’

Returns:

Sorted list of event timestamps

Return type:

list

TODO: add polarities?
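A minimal sketch of the lookup described for get_port_events follows. It assumes that "match" means the name is a substring of the event key, which the description does not state explicitly. The function name `port_events_sketch` and the event names in the example trial are hypothetical.

```python
def port_events_sketch(trial: dict, name: str = '') -> list:
    """Collect and sort timestamps of all events whose key contains `name`."""
    events = trial['behavior_data']['Events timestamps']
    out = []
    for key, stamps in events.items():
        # ASSUMPTION: substring matching; an empty name matches every event.
        if name in key:
            out.extend(stamps)
    return sorted(out)


# Hypothetical raw trial with three event types.
example_trial = {
    'behavior_data': {
        'Events timestamps': {
            'BNC1High': [0.5, 2.0],
            'BNC1Low': [1.0],
            'Port1In': [0.7],
        }
    }
}
bnc1_times = port_events_sketch(example_trial, 'BNC1')
print(bnc1_times)
```

Note that merging high and low events into one sorted list discards which edge each timestamp came from, which is the gap the polarities TODO points at.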

load_widefield_mmap(session_path, dtype=<class 'numpy.uint16'>, shape=(540, 640), n_frames=None, mode='r')[source]

TODO Document this function

Parameters:

session_path

patch_settings(session_path, collection='raw_behavior_data', new_collection=None, subject=None, number=None, date=None)[source]

Modify various details in a settings file.

This function makes it easier to change things like subject name in a settings as it will modify the subject name in the myriad paths. NB: This saves the settings into the same location it was loaded from.

Parameters:
  • session_path (str, pathlib.Path) – The session path containing the settings file.

  • collection (str) – The subfolder containing the settings file.

  • new_collection (str) – An optional new subfolder to change in the settings paths.

  • subject (str) – An optional new subject name to change in the settings.

  • number (str, int) – An optional new number to change in the settings.

  • date (str, datetime.date) – An optional date to change in the settings.

Returns:

The modified settings.

Return type:

dict

Examples

File is in /data/subject/2020-01-01/002/raw_behavior_data. Patch the file then move to new location.

>>> patch_settings('/data/subject/2020-01-01/002', number='001')
>>> shutil.move('/data/subject/2020-01-01/002/raw_behavior_data/', '/data/subject/2020-01-01/001/raw_behavior_data/')

File is moved into new collection within the same session, then patched.

>>> shutil.move('./subject/2020-01-01/002/raw_task_data_00/', './subject/2020-01-01/002/raw_task_data_01/')
>>> patch_settings('/data/subject/2020-01-01/002', collection='raw_task_data_01', new_collection='raw_task_data_01')

Update subject, date and number.

>>> new_session_path = Path('/data/foobar/2024-02-24/002')
>>> old_session_path = Path('/data/baz/2024-02-23/001')
>>> patch_settings(old_session_path, collection='raw_task_data_00',
...     subject=new_session_path.parts[-3], date=new_session_path.parts[-2], number=new_session_path.parts[-1])
>>> shutil.move(old_session_path, new_session_path)