ibllib.io.raw_data_loaders

Raw Data Loader functions for PyBpod rig.

Module contains one loader function per raw datafile.

Functions

`get_port_events`	Return all event timestamps from bpod raw data trial that match 'name' --> looks in trial['behavior_data']['Events timestamps']
`load_ambient_sensor`	Load Ambient Sensor data from session.
`load_bpod`	Load both settings and data from bpod (.json and .jsonable)
`load_bpod_fronts`	Loads BNC1 and BNC2 bpod channels times and polarities from session_path
`load_camera_frameData`	Loads binary frame data from Bonsai camera recording workflow.
`load_camera_frame_count`	Load the embedded frame count for a given session.
`load_camera_gpio`	Load the GPIO for a given session.
`load_camera_ssv_times`	Load the bonsai frame and camera timestamps from Camera.timestamps.ssv
`load_data`	Load PyBpod data files (.jsonable).
`load_embedded_frame_data`	Load the embedded frame count and GPIO for a given session.
`load_encoder_events`	Load Rotary Encoder (RE) events raw data file.
`load_encoder_positions`	Load Rotary Encoder (RE) positions from raw data file within a session path.
`load_encoder_trial_info`	Load Rotary Encoder trial info from raw data file.
`load_mic`	Load Microphone wav file to np.array of len nSamples
`load_settings`	Load PyBpod Settings files (.json).
`load_stim_position_screen`
`load_widefield_mmap`	TODO Document this function
`patch_settings`	Modify various details in a settings file.
`sync_trials_robust`	Attempts to find matching timestamps in 2 time-series that have an offset, are drifting, and are most likely incomplete: sizes don't have to match, some pulses may be missing in any series.
`trial_times_to_times`	Parse and convert all trial timestamps to "absolute" time.

trial_times_to_times(raw_trial)[source]

Parse and convert all trial timestamps to “absolute” time. Float64 seconds from session start.

0---BpodStart---TrialStart0---------TrialEnd0-----TrialStart1---TrialEnd1...

0---ts0---ts1---tsN...

absTS = tsN + TrialStartN - BpodStart

Bpod timestamps are in microseconds (µs); PyBpod timestamps are in seconds (s).

Parameters:: raw_trial (dict) – raw trial data
Returns:: trial data with modified timestamps
Return type:: dict
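
The rule absTS = tsN + TrialStartN - BpodStart can be sketched in a few lines. This is only an illustration of the arithmetic, not the library code: the helper name `to_absolute` and its arguments are hypothetical, and the real function takes a whole raw trial dict.

```python
def to_absolute(trial_timestamps, trial_start, bpod_start):
    """Shift trial-relative timestamps (s) to seconds from session start.

    Hypothetical helper: implements absTS = tsN + TrialStartN - BpodStart.
    """
    offset = trial_start - bpod_start
    return [ts + offset for ts in trial_timestamps]


# The trial began at 12.5 s on the Bpod clock; the session started at 2.0 s.
absolute = to_absolute([0.0, 0.25, 1.5], trial_start=12.5, bpod_start=2.0)
```

Every timestamp in a trial is shifted by the same offset, so intervals within a trial are unchanged.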

load_bpod(session_path, task_collection='raw_behavior_data')[source]

Load both settings and data from bpod (.json and .jsonable)

Parameters:

session_path – Absolute path of session folder
task_collection – Collection within session path with behavior data

Returns:

A settings dict and a list of trial data dicts

load_data(session_path: str | Path, task_collection='raw_behavior_data', time='absolute')[source]

Load PyBpod data files (.jsonable).

Bpod timestamps are in microseconds (µs); PyBpod timestamps are in seconds (s).

Parameters:: session_path (str, Path) – Absolute path of session folder
Returns:: A list of len ntrials each trial being a dictionary
Return type:: list of dicts
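
To make the "list of dicts, one per trial" return shape concrete, here is a minimal sketch of reading such a file by hand. It assumes that a .jsonable file stores one JSON object per line; that layout is an assumption, not something stated above. The helper name `read_jsonable` and the file content are made up.

```python
import json
import tempfile
from pathlib import Path


def read_jsonable(path):
    """Read a file with one JSON object per line into a list of dicts.

    Assumption: a .jsonable file stores one trial per line.
    """
    with open(path) as fid:
        return [json.loads(line) for line in fid if line.strip()]


# Write two made-up trials to a temporary file and read them back.
with tempfile.TemporaryDirectory() as tmp:
    file = Path(tmp) / 'example.jsonable'
    file.write_text('{"trial_num": 1}\n{"trial_num": 2}\n')
    trials = read_jsonable(file)
```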

load_camera_frameData(session_path, camera: str = 'left', raw: bool = False) → DataFrame[source]

Loads binary frame data from Bonsai camera recording workflow.

Parameters:

session_path (StrPath) – Path to session folder
camera (str, optional) – Load FrameData for specific camera. Defaults to ‘left’.
raw (bool, optional) – Whether to return raw or parsed data. Defaults to False.

Returns:

(raw=False, default) pandas.DataFrame with 4 columns:

Timestamp – float64 (seconds from session start)
embeddedTimeStamp – float64 (seconds from session start)
embeddedFrameCounter – int64 (frame number from session start)
embeddedGPIOPinState – object (state of each of the 4 GPIO pins as a list of numpy boolean arrays, e.g. np.array([True, False, False, False]))

(raw=True) pandas.DataFrame with 4 int64 columns:

Timestamp – UTC ticks from BehaviorPC (100’s of ns since midnight 1/1/0001)
embeddedTimeStamp – camera timestamp (needs uncycling and conversion)
embeddedFrameCounter – frame counter (int)
embeddedGPIOPinState – GPIO pin state, integer representation of 4 pins

Return type:

pandas.DataFrame
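
The raw Timestamp column counts 100 ns ticks since midnight 1/1/0001. A minimal sketch of turning one such value into a Python datetime follows; the helper name `ticks_to_datetime` is hypothetical and this is not necessarily how the library parses the column.

```python
from datetime import datetime, timedelta


def ticks_to_datetime(ticks):
    """Convert 100 ns ticks since midnight 0001-01-01 to a naive datetime.

    Hypothetical helper; sub-microsecond precision is discarded.
    """
    return datetime(1, 1, 1) + timedelta(microseconds=ticks // 10)


# Round trip with a known date to check the conversion.
known = datetime(2020, 1, 1)
ticks = (known - datetime(1, 1, 1)) // timedelta(microseconds=1) * 10
converted = ticks_to_datetime(ticks)
```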

load_camera_ssv_times(session_path, camera: str)[source]

Load the bonsai frame and camera timestamps from Camera.timestamps.ssv

NB: For some sessions the frame times are in the first column, in others the order is reversed.

NB: If using the new bin file, bonsai_times is a float in seconds since the first frame.

Parameters:

session_path – Absolute path of session folder
camera – Name of the camera to load, e.g. ‘left’

Returns:

array of datetimes, array of frame times in seconds

load_embedded_frame_data(session_path, label: str, raw=False)[source]

Load the embedded frame count and GPIO for a given session. If the file doesn’t exist, or is empty, None values are returned.

Parameters:

session_path – Absolute path of session folder
label – The specific video to load, one of (‘left’, ‘right’, ‘body’)
raw – If True the raw data are returned without preprocessing, otherwise frame count is returned starting from 0 and the GPIO is returned as a dict of indices

Returns:

The frame count, GPIO

load_camera_frame_count(session_path, label: str, raw=True)[source]

Load the embedded frame count for a given session. If the file doesn’t exist, or is empty, a None value is returned.

Parameters:

session_path – Absolute path of session folder
label – The specific video to load, one of (‘left’, ‘right’, ‘body’)
raw – If True the raw data are returned without preprocessing, otherwise frame count is returned starting from 0

Returns:

The frame count

load_camera_gpio(session_path, label: str, as_dicts=False)[source]

Load the GPIO for a given session. If the file doesn’t exist, or is empty, a None value is returned.

The raw binary file contains uint32 values (saved as doubles) where the first 4 bits represent the state of each of the 4 GPIO pins. The array is expanded to an n x 4 array by shifting each bit to the end and checking whether it is 0 (low state) or 1 (high state).

Parameters:

session_path – Absolute path of session folder
label – The specific video to load, one of (‘left’, ‘right’, ‘body’)
as_dicts – If False the raw data are returned as a boolean array with shape (n_frames, n_pins), otherwise GPIO is returned as a list of dictionaries with keys (‘indices’, ‘polarities’).

Returns:

An nx4 boolean array where columns represent state of GPIO pins 1-4. If as_dicts is True, a list of dicts is returned with keys (‘indices’, ‘polarities’), or None if the dictionary is empty.
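
The bit expansion described for load_camera_gpio can be sketched without any dependencies. The sketch assumes that the "first 4 bits" are the four most significant bits of a 32-bit word, with pin 1 as the top bit; that bit order is an assumption, and the helper name `unpack_gpio` is hypothetical.

```python
def unpack_gpio(values, n_pins=4, word_bits=32):
    """Expand integer pin states into rows of booleans, one column per pin.

    Assumption: pin 1 is the most significant bit of a 32-bit word, pin 2 the
    next, and so on. Values stored as floats are cast to int first.
    """
    shifts = [word_bits - 1 - pin for pin in range(n_pins)]
    return [[bool((int(v) >> s) & 1) for s in shifts] for v in values]


# 0x80000000 sets only the top bit; 0x30000000 sets the third and fourth bits.
states = unpack_gpio([0x80000000, 0x30000000, 0.0])
```

Each row of `states` has one boolean per pin, which matches the (n_frames, n_pins) shape described above.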

load_settings(session_path: str | Path, task_collection='raw_behavior_data')[source]

Load PyBpod Settings files (.json).

Parameters:: session_path (str, Path) – Absolute path of session folder
Returns:: Settings dictionary
Return type:: dict

load_stim_position_screen(session_path, task_collection='raw_behavior_data')[source]

load_encoder_events(session_path, task_collection='raw_behavior_data', settings=False)[source]

Load Rotary Encoder (RE) events raw data file.

Assumes that a folder called “raw_behavior_data” exists in the session folder.

Event numbers correspond to the following bpod states:

1: correct / hide_stim
2: stim_on
3: closed_loop
4: freeze_error / freeze_correct

>>> data.columns
>>> ['re_ts',   # Rotary Encoder Timestamp (ms) 'numpy.int64'
     'sm_ev',   # State Machine Event           'numpy.int64'
     'bns_ts']  # Bonsai Timestamp (int)        'pandas.Timestamp'
    # pd.to_datetime(data.bns_ts) to work in datetimes

Parameters:: session_path (str, Path) – Absolute path of session folder
Returns:: dataframe w/ 3 cols and (ntrials * 3) lines
Return type:: pandas.DataFrame

load_encoder_positions(session_path, task_collection='raw_behavior_data', settings=False)[source]

Load Rotary Encoder (RE) positions from raw data file within a session path.

Assumes that a folder called “raw_behavior_data” exists in the session folder.

Positions are RE ticks: [-512, 512] == [-180º, 180º]
0 == trial stim init position
Positive nums are rightwards movements (mouse) or RE CW (mouse)

Variable line number, depends on movements.

Raw datafile Columns:: Position, RE timestamp, RE Position, Bonsai Timestamp

The first column always contains the literal string ‘Position’, so it is dropped.

>>> data.columns
>>> ['re_ts',   # Rotary Encoder Timestamp (ms)     'numpy.int64'
     're_pos',  # Rotary Encoder position (ticks)   'numpy.int64'
     'bns_ts']  # Bonsai Timestamp                  'pandas.Timestamp'
    # pd.to_datetime(data.bns_ts) to work in datetimes

Parameters:: session_path (str) – Absolute path of session folder
Returns:: dataframe w/ 3 cols and N positions
Return type:: pandas.DataFrame
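
The tick scale given for load_encoder_positions ([-512, 512] ticks == [-180º, 180º]) implies a simple linear conversion. A minimal sketch, with a hypothetical helper name:

```python
def ticks_to_degrees(ticks, ticks_per_half_turn=512):
    """Convert rotary encoder ticks to degrees, using 512 ticks == 180 degrees."""
    return ticks * 180.0 / ticks_per_half_turn


degrees = [ticks_to_degrees(t) for t in (-512, 0, 128, 512)]
```

Positive results correspond to rightward (clockwise) movements, following the sign convention above.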

load_encoder_trial_info(session_path, task_collection='raw_behavior_data')[source]

Load Rotary Encoder trial info from raw data file.

Assumes that a folder called “raw_behavior_data” exists in the session folder.

NOTE: Data for the last trial are probably missing (trial info is sent on trial start and data are only saved on trial exit), so max(trialnum) should be N+1 if N is the number of trials with saved data.

Raw datafile Columns:

>>> data.columns
>>> ['trial_num',     # Trial Number                     'numpy.int64'
     'stim_pos_init', # Initial position of visual stim  'numpy.int64'
     'stim_contrast', # Contrast of visual stimulus      'numpy.float64'
     'stim_freq',     # Frequency of gabor patch         'numpy.float64'
     'stim_angle',    # Angle of Gabor 0 = Vertical      'numpy.float64'
     'stim_gain',     # Wheel gain (mm/º of stim)        'numpy.float64'
     'stim_sigma',    # Size of patch                    'numpy.float64'
     'stim_phase',    # Phase of gabor                    'numpy.float64'
     'bns_ts' ]       # Bonsai Timestamp                 'pandas.Timestamp'
    # pd.to_datetime(data.bns_ts) to work in datetimes

Parameters:: session_path (str) – Absolute path of session folder
Returns:: dataframe w/ 9 cols and ntrials lines
Return type:: pandas.DataFrame

load_ambient_sensor(session_path, task_collection='raw_behavior_data')[source]

Load Ambient Sensor data from session.

Probably could be extracted to DatasetTypes: _ibl_trials.temperature_C, _ibl_trials.airPressure_mb, _ibl_trials.relativeHumidity

Returns a list of dicts, one dict per trial. dict keys are: dict_keys([‘Temperature_C’, ‘AirPressure_mb’, ‘RelativeHumidity’])

Parameters:: session_path (str) – Absolute path of session folder
Returns:: list of dicts
Return type:: list
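
Because load_ambient_sensor returns one dict per trial, a common next step is to regroup the values by measurement. The sketch below uses made-up readings; the three keys come from the description above, while the scalar value format is an assumption.

```python
# Made-up readings in the shape described above: one dict per trial.
ambient = [
    {'Temperature_C': 22.5, 'AirPressure_mb': 1012.0, 'RelativeHumidity': 40.0},
    {'Temperature_C': 22.75, 'AirPressure_mb': 1011.5, 'RelativeHumidity': 41.0},
]

# Transpose to one list per measurement, keyed by the same names.
columns = {key: [trial[key] for trial in ambient] for key in ambient[0]}
```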

load_mic(session_path, task_collection='raw_behavior_data')[source]

Load Microphone wav file to np.array of len nSamples

Parameters:: session_path (str) – Absolute path of session folder
Returns:: An array of values of the sound waveform
Return type:: numpy.array

sync_trials_robust(t0, t1, diff_threshold=0.001, drift_threshold_ppm=200, max_shift=5, return_index=False)[source]

Attempts to find matching timestamps in 2 time-series that have an offset, are drifting, and are most likely incomplete: sizes don’t have to match, some pulses may be missing in any series. Only works with irregular time series as it relies on the derivative to match sync.

Parameters:

t0
t1
diff_threshold
drift_threshold_ppm
max_shift
return_index – Defaults to False
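
To illustrate why matching on the derivative needs irregular time series, here is a deliberately simplified sketch. It is not the algorithm used by sync_trials_robust: it ignores drift and pulses missing in the middle of a series, and only searches for the index shift at which the inter-pulse intervals agree. The helper name `best_shift` is hypothetical.

```python
def best_shift(t0, t1, diff_threshold=0.001, max_shift=5):
    """Return the index shift k for which diff(t0)[i + k] best matches diff(t1)[i].

    Simplified illustration only: drift and pulses missing mid-series are ignored.
    """
    d0 = [b - a for a, b in zip(t0, t0[1:])]
    d1 = [b - a for a, b in zip(t1, t1[1:])]
    best, best_count = 0, -1
    for k in range(-max_shift, max_shift + 1):
        count = sum(
            1
            for i, interval in enumerate(d1)
            if 0 <= i + k < len(d0) and abs(d0[i + k] - interval) < diff_threshold
        )
        if count > best_count:
            best, best_count = k, count
    return best


# t1 misses the first two pulses of t0 and runs on a clock offset by 100 s.
t0 = [0.0, 0.4, 1.5, 1.75, 3.0, 3.25, 5.0]
t1 = [t + 100.0 for t in t0[2:]]
shift = best_shift(t0, t1)
```

The constant offset drops out when differencing, so only the interval pattern matters. With perfectly regular pulses every shift would match equally well, which is why the approach relies on irregular series.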

load_bpod_fronts(session_path: str, data: list = False, task_collection: str = 'raw_behavior_data') → list[source]

Loads BNC1 and BNC2 bpod channels times and polarities from session_path

Parameters:

session_path (str) – a valid session_path
data (list, optional) – pre-loaded raw data dict, defaults to False

Returns:

List of dicts BNC1 and BNC2 {“times”: np.array, “polarities”:np.array}

Return type:

list

get_port_events(trial: dict, name: str = '') → list[source]

Return all event timestamps from bpod raw data trial that match ‘name’ –> looks in trial[‘behavior_data’][‘Events timestamps’]

Parameters:

trial (dict) – raw trial dict
name (str, optional) – name of event, defaults to ‘’

Returns:

Sorted list of event timestamps

Return type:

list
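
The lookup described for get_port_events can be sketched as below. It assumes that an event matches when its name contains the `name` string; that matching rule is an assumption. The helper name `port_events` and the event names in the sample trial are made up.

```python
def port_events(trial, name=''):
    """Collect and sort timestamps of every event whose name contains `name`.

    Assumption: matching is by substring; the library may match differently.
    """
    events = trial['behavior_data']['Events timestamps']
    out = []
    for event_name, timestamps in events.items():
        if name in event_name:
            out.extend(timestamps)
    return sorted(out)


# Made-up trial with hypothetical event names.
trial = {'behavior_data': {'Events timestamps': {
    'Port1In': [0.75, 0.25],
    'Port1Out': [0.5],
    'Tup': [1.0],
}}}
port1 = port_events(trial, 'Port1')
```

With the default empty `name`, every event matches under this substring rule, so all timestamps in the trial are returned.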

TODO: add polarities?

load_widefield_mmap(session_path, dtype=<class 'numpy.uint16'>, shape=(540, 640), n_frames=None, mode='r')[source]

TODO Document this function

Parameters:: session_path

patch_settings(session_path, collection='raw_behavior_data', new_collection=None, subject=None, number=None, date=None)[source]

Modify various details in a settings file.

This function makes it easier to change things like the subject name in a settings file, as it also updates the subject name in the many paths stored there. NB: This saves the settings to the same location they were loaded from.

Parameters:

session_path (str, pathlib.Path) – The session path containing the settings file.
collection (str) – The subfolder containing the settings file.
new_collection (str) – An optional new subfolder to change in the settings paths.
subject (str) – An optional new subject name to change in the settings.
number (str, int) – An optional new number to change in the settings.
date (str, datetime.date) – An optional date to change in the settings.

Returns:

The modified settings.

Return type:

dict

Examples

File is in /data/subject/2020-01-01/002/raw_behavior_data. Patch the file then move to new location.

>>> import shutil
>>> from pathlib import Path
>>> from ibllib.io.raw_data_loaders import patch_settings
>>> patch_settings('/data/subject/2020-01-01/002', number='001')
>>> shutil.move('/data/subject/2020-01-01/002/raw_behavior_data/', '/data/subject/2020-01-01/001/raw_behavior_data/')

File is moved into new collection within the same session, then patched.

>>> shutil.move('./subject/2020-01-01/002/raw_task_data_00/', './subject/2020-01-01/002/raw_task_data_01/')
>>> patch_settings('/data/subject/2020-01-01/002', collection='raw_task_data_01', new_collection='raw_task_data_01')

Update subject, date and number.

>>> new_session_path = Path('/data/foobar/2024-02-24/002')
>>> old_session_path = Path('/data/baz/2024-02-23/001')
>>> patch_settings(old_session_path, collection='raw_task_data_00',
...                subject=new_session_path.parts[-3], date=new_session_path.parts[-2], number=new_session_path.parts[-1])
>>> shutil.move(old_session_path, new_session_path)