HDF5 format specification
This page documents the on-disk layout of .h5 files written by lfpack.
Layout (current — multi-recording, pyramidal)
<file>.h5
└─ <recording>/                 # one group per recording (e.g. "probe00")
   └─ <scale_2digit>/           # zero-padded scale index (e.g. "00", "01")
      ├─ meta                   # scalar/array attributes (see below)
      └─ chunks/
         ├─ 0/                  # first compressed chunk
         │  ├─ U_scaled     float32  (nc, r)     shuffle+gzip
         │  ├─ vh_indices   int32    (n_kept,)   gzip
         │  └─ vh_values    float32  (n_kept,)   gzip
         ├─ 1/
         ...
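The hierarchy above maps directly onto HDF5 object paths. The following is a minimal sketch of how such paths could be assembled; the helper names `scale_group_path` and `chunk_dataset_path` are hypothetical and not part of lfpack.

```python
# Dataset names taken from the layout diagram.
CHUNK_DATASETS = ("U_scaled", "vh_indices", "vh_values")


def scale_group_path(recording: str, scale: int) -> str:
    """Return '<recording>/<scale_2digit>', e.g. 'probe00/01'."""
    if scale < 0:
        raise ValueError("scale index must be non-negative")
    # The scale index is zero-padded to two digits.
    return f"{recording}/{scale:02d}"


def chunk_dataset_path(recording: str, scale: int, chunk: int, dataset: str) -> str:
    """Return the path of one dataset inside chunks/<chunk>/."""
    if dataset not in CHUNK_DATASETS:
        raise ValueError(f"unknown chunk dataset: {dataset!r}")
    if chunk < 0:
        raise ValueError("chunk index must be non-negative")
    # Chunk groups are named by their plain (unpadded) index.
    return f"{scale_group_path(recording, scale)}/chunks/{chunk}/{dataset}"
```

A path built this way, such as `chunk_dataset_path("probe00", 0, 1, "U_scaled")`, can be used to index an open HDF5 file object.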
meta attributes
| Attribute | Type | Description |
|---|---|---|
| nc | int | Number of channels |
| ns_total | int | Total number of samples |
| fs | float | Nominal sample rate (Hz) |
| fs_sync | float | Sync-corrected sample rate (Hz); NaN when not available |
| t0_sync | float | Session-clock time in seconds at sample 0; NaN when not available |
| epsilon | float | SVD noise-floor threshold |
| alpha | float | Wavelet-packet threshold multiplier |
| compress_chunk | int | Samples per compressed chunk |
| geometry_x | float32 array | Channel x positions (µm) |
| geometry_y | float32 array | Channel y positions (µm) |
| sglx_meta | JSON string | Original SpikeGLX metadata |
fs_sync and t0_sync are written by compress_to_h5 when the caller supplies the corresponding keyword arguments. LFPackReader.fs returns fs_sync when it is finite, otherwise falls back to fs. LFPackReader.t0 returns t0_sync when finite, otherwise NaN. LFPackReader.times produces the full session-clock time vector using these values.
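The fallback rules can be sketched in plain Python as follows. The function names are hypothetical, and the time formula `t0 + i / fs` is an assumption: the text only states that the time vector is derived from these two values.

```python
import math


def effective_fs(fs: float, fs_sync: float) -> float:
    """Use the sync-corrected rate when it is finite, else the nominal rate."""
    return fs_sync if math.isfinite(fs_sync) else fs


def effective_t0(t0_sync: float) -> float:
    """Use the session-clock offset when it is finite, else NaN."""
    return t0_sync if math.isfinite(t0_sync) else math.nan


def sample_times(ns_total: int, fs: float, fs_sync: float, t0_sync: float) -> list:
    """Session-clock time of every sample.

    Assumption: t[i] = t0 + i / fs. With an unavailable t0 (NaN),
    every entry of the result is NaN under this formula.
    """
    rate = effective_fs(fs, fs_sync)
    t0 = effective_t0(t0_sync)
    return [t0 + i / rate for i in range(ns_total)]
```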
Chunk datasets
Each chunk group (chunks/<i>/) stores three datasets.
U_scaled
Shape (nc, r), dtype float32. Stores U[:, :r] * sv[:r], the left singular vectors scaled by their singular values. Written with the HDF5 shuffle filter followed by gzip (level 4). The shuffle filter regroups the bytes of the dataset by their position within each element (the first byte of every value, then the second byte of every value, and so on), so that similar bytes such as sign and exponent bytes sit next to each other. This improves gzip compression on dense float32 data by ~10–20%.
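To illustrate what the shuffle step does to the byte stream, here is a minimal standard-library sketch. It is not HDF5's implementation: `byte_shuffle` and `byte_unshuffle` are hypothetical helpers, the sample values are made up, and `zlib` stands in for the gzip filter (both use deflate).

```python
import struct
import zlib


def byte_shuffle(raw: bytes, itemsize: int) -> bytes:
    """Group bytes by position within each element: all byte 0s, then all byte 1s, ..."""
    if len(raw) % itemsize:
        raise ValueError("buffer length must be a multiple of itemsize")
    return b"".join(raw[k::itemsize] for k in range(itemsize))


def byte_unshuffle(shuffled: bytes, itemsize: int) -> bytes:
    """Invert byte_shuffle, restoring the original element order."""
    if len(shuffled) % itemsize:
        raise ValueError("buffer length must be a multiple of itemsize")
    n = len(shuffled) // itemsize
    out = bytearray(len(shuffled))
    for k in range(itemsize):
        # Plane k holds byte k of every element.
        out[k::itemsize] = shuffled[k * n:(k + 1) * n]
    return bytes(out)


# Illustrative data only: 1000 slowly varying little-endian float32 values.
values = [1.0 + 1e-4 * i for i in range(1000)]
raw = struct.pack("<1000f", *values)
shuffled = byte_shuffle(raw, 4)

# Compare deflate output sizes at level 4; the gain depends on the data.
plain_size = len(zlib.compress(raw, 4))
shuffled_size = len(zlib.compress(shuffled, 4))
print(f"plain: {plain_size} bytes, shuffled: {shuffled_size} bytes")
```

The shuffle is lossless and is undone transparently on read, so readers never see the shuffled bytes.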
vh_indices and vh_values
Shape (n_kept,), dtype int32 and float32 respectively. Store the sparse non-zero wavelet-packet coefficients of the right singular vectors. The dense array can be reconstructed as:
import numpy as np

Vh_hat = np.zeros(vh_shape, dtype=np.float32)  # dense array, all zeros
Vh_hat.flat[vh_indices] = vh_values            # scatter the kept coefficients

vh_shape is stored as an attribute on the chunk group. Storing indices and values, instead of writing a dense zero-filled array and relying on gzip to compress the zero runs, gives a smaller file and faster reads.
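A small round trip makes the index/value encoding concrete. This is a sketch under assumptions: `sparsify` is a hypothetical helper that keeps every non-zero entry, whereas how lfpack actually selects the kept coefficients is not described here. The indices are flat, row-major (C-order) positions, matching what `.flat` expects.

```python
import numpy as np


def sparsify(dense: np.ndarray):
    """Return (indices, values, shape) for the non-zero entries of a 2-D array."""
    flat = dense.ravel()  # row-major flattening, consistent with ndarray.flat
    indices = np.flatnonzero(flat).astype(np.int32)
    values = flat[indices].astype(np.float32)
    return indices, values, dense.shape


def densify(vh_indices: np.ndarray, vh_values: np.ndarray, vh_shape) -> np.ndarray:
    """Rebuild the dense array from the three stored pieces."""
    vh_hat = np.zeros(vh_shape, dtype=np.float32)
    vh_hat.flat[vh_indices] = vh_values
    return vh_hat


# Made-up coefficients: only two of the 24 entries are non-zero.
coeffs = np.zeros((3, 8), dtype=np.float32)
coeffs[0, 2] = 1.5
coeffs[2, 7] = -0.25

vh_indices, vh_values, vh_shape = sparsify(coeffs)
restored = densify(vh_indices, vh_values, vh_shape)
```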
Chunk group attributes
| Attribute | Type | Description |
|---|---|---|
| r | int | SVD rank for this chunk |
| ns | int | Number of samples in this chunk |
| ns_extended | int | Samples including guard bands |
| vh_shape | 2-tuple | Shape of the dense Vh_hat array |
HDF5 version compatibility
All files are written with libver=('earliest', 'v110'), which pins the format to HDF5 1.10 features (released 2017). Any HDF5 ≥ 1.10 — and therefore any h5py ≥ 3.0 — can read lfpack files.
Do not change this to libver='latest'. That setting resolves to whatever the installed library considers current at write time, so files written on different machines end up in different formats. Reading them elsewhere then fails with cryptic low-level errors (bad version number for layout message) with no indication that a version mismatch is the cause.
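When a file fails to open with an error of this kind, one quick hint is the superblock version stored right after the 8-byte HDF5 signature. The sketch below reads it with the standard library only. `superblock_version` is a hypothetical helper, it only looks at offset 0 (files with a user block place the signature at a later offset), and the result is a hint rather than proof: files written with an 'earliest' lower bound usually carry version 0, while 'latest' typically produces 2 or 3.

```python
# The fixed 8-byte signature that starts an HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def superblock_version(path):
    """Return the superblock version byte, or None if no signature at offset 0."""
    with open(path, "rb") as fh:
        header = fh.read(9)
    if len(header) < 9 or header[:8] != HDF5_SIGNATURE:
        return None
    # The version number is the single byte that follows the signature.
    return header[8]
```

Calling `superblock_version("session.h5")` (a placeholder file name) on a problem file and on a known-good file shows whether the two were written under different format bounds.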
Legacy flat layout
Files written by lfpack ≤ 0.0.x use a flat layout with meta and chunks/ at the root (no recording/scale hierarchy). LFPackReader detects this layout automatically by checking for a meta key at root level, so these files remain fully readable. Writing this format is no longer supported.
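The detection rule amounts to a single membership test on the root of the file. A minimal sketch, assuming a hypothetical helper `detect_layout` and hypothetical layout labels; this is not LFPackReader's actual code:

```python
def detect_layout(root_keys) -> str:
    """Classify a file from the names found at its root.

    `root_keys` is any container supporting `in`, for example a set of
    names or an open HDF5 file object.
    """
    # Legacy files keep meta (and chunks/) directly at the root;
    # current files have one group per recording there instead.
    return "legacy-flat" if "meta" in root_keys else "multi-recording"
```

For example, `detect_layout({"meta", "chunks"})` classifies a legacy file, while `detect_layout({"probe00", "probe01"})` classifies a current one.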