Add hdf5-meta extension for an optional HDF5 metadata sidecar #355

Closed
daltoncass wants to merge 2 commits into sigmf:main from daltoncass:feature/hdf5_meta


@daltoncass

Contributor

Proposed in #354.

Summary

Adds a canonical hdf5-meta extension namespace (v1.0.0) defining an OPTIONAL HDF5 sidecar file — recording.sigmf-meta.h5 — that stores a columnar duplicate of a Recording's metadata for faster, smaller, column-oriented access on Recordings with large captures/annotations arrays.

Design

  • Pure sidecar. The .sigmf-meta JSON file remains complete, authoritative, and SigMF Compliant. The HDF5 file is a derived cache and is never required to read a Recording.
  • Backwards compatible. Tools without HDF5 support ignore the extension fields and read the JSON exactly as today. The extension is declared with "optional": true in core:extensions.
  • Columnar layout. global → HDF5 attributes; captures/annotations → compound (structured) datasets, one column per field, with dot-notation field names (core.sample_start). Datasets SHOULD use gzip compression.
  • No on-disk format change to existing SigMF files. This is purely an additive extension, permitted under the spec's allowance for extensions to define new files (additional_content.md: extensions "MAY define new top-level SigMF Objects, key/value pairs, new files, ...").
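To make the columnar layout concrete, here is a minimal sketch of the row-to-column reshaping in plain Python. It does not write HDF5 and is not taken from the reference implementation. The helper name to_columns is hypothetical, and the rule that core:sample_start maps to core.sample_start by replacing the colon with a dot is an assumption inferred from the example field name above.

```python
def to_columns(rows):
    """Turn a list of SigMF capture/annotation dicts into one list per field."""
    # Collect every key that appears in any row, preserving first-seen order.
    fields = []
    for row in rows:
        for key in row:
            if key not in fields:
                fields.append(key)
    # Assumption: "core:sample_start" is stored as "core.sample_start".
    # Rows that lack a field contribute None to that column.
    return {
        key.replace(":", "."): [row.get(key) for row in rows]
        for key in fields
    }


# Hypothetical annotations array, as it would appear in the JSON metadata.
annotations = [
    {"core:sample_start": 0, "core:sample_count": 1024, "core:label": "burst"},
    {"core:sample_start": 4096, "core:sample_count": 512},
]
columns = to_columns(annotations)
```

Each resulting column would correspond to one field of the compound dataset. How missing values are actually encoded in the sidecar is left to the extension doc.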

Extension fields (global)

| Field | Required | Type | Description |
|---|---|---|---|
| hdf5-meta:file | yes | string | sidecar filename, relative to the .sigmf-meta |
| hdf5-meta:version | no | string | extension version the sidecar conforms to |
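As an illustration of how these fields might sit in a Recording's global object, the sketch below builds the metadata as a Python dict and serializes it to JSON. The core:datatype and core:version values are placeholders chosen for the example, not values prescribed by this PR.

```python
import json

# Illustrative metadata; only the hdf5-meta entries come from this PR.
meta = {
    "global": {
        "core:datatype": "cf32_le",  # placeholder value
        "core:version": "1.0.0",     # placeholder value
        "core:extensions": [
            # Declared optional so tools without HDF5 support can ignore it.
            {"name": "hdf5-meta", "version": "1.0.0", "optional": True}
        ],
        "hdf5-meta:file": "recording.sigmf-meta.h5",
        "hdf5-meta:version": "1.0.0",
    },
    "captures": [{"core:sample_start": 0}],
    "annotations": [],
}

# The JSON text is what would be written to recording.sigmf-meta.
text = json.dumps(meta, indent=2)
```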

Notes

Compliance

A Recording with this extension is compliant when (1) the JSON file is a valid SigMF Recording without the sidecar, (2) hdf5-meta is listed in core:extensions with hdf5-meta:file set, and (3) if present, the sidecar matches the structure in the extension doc and is content-equivalent to the JSON.
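Condition (2) and the sidecar lookup can be sketched as follows. The function names declares_hdf5_meta and sidecar_path are hypothetical and are not part of sigmf-python. Checking condition (3), content equivalence with the JSON, would need an HDF5 reader and is not attempted here.

```python
from pathlib import Path


def declares_hdf5_meta(meta):
    """Return True if hdf5-meta is listed in core:extensions and hdf5-meta:file is set."""
    global_obj = meta.get("global", {})
    names = [ext.get("name") for ext in global_obj.get("core:extensions", [])]
    return "hdf5-meta" in names and bool(global_obj.get("hdf5-meta:file"))


def sidecar_path(meta_path, meta):
    """Resolve hdf5-meta:file against the directory holding the .sigmf-meta file."""
    return Path(meta_path).parent / meta["global"]["hdf5-meta:file"]


# Hypothetical metadata declaring the extension.
example = {
    "global": {
        "core:extensions": [
            {"name": "hdf5-meta", "version": "1.0.0", "optional": True}
        ],
        "hdf5-meta:file": "recording.sigmf-meta.h5",
    }
}
```

A reader that finds no file at the resolved path can fall back to the JSON, since the sidecar is never required to read a Recording.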

Defines the hdf5-meta v1.0.0 extension: an OPTIONAL HDF5 sidecar file
(recording.sigmf-meta.h5) that stores a columnar duplicate of a
Recording's metadata for faster, smaller column-oriented access. The
.sigmf-meta JSON remains complete and authoritative; the sidecar is a
derived cache. Tools without HDF5 support ignore the extension and read
the JSON unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Clarify that hdf5-meta is a metadata-only performance cache: it is not a
sample-data container, does not represent multiple channels or arrays, and
does not change the SigMF data model. In particular it leaves both
multichannel mechanisms (core:num_channels and SigMF Collections) untouched
and composes with Collections per-Recording. Pre-empts a reviewer question
on the spec PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@daltoncass

Contributor Author

Moving this to the community-extensions repo per your guidance that extensions start there and migrate into core if adopted by multiple entities: sigmf/community-extensions#2

The extension text is unchanged aside from a new Scope and Non-Goals section (§0.1) addressing the multichannel question — it clarifies that hdf5-meta is a metadata-only cache and does not touch sample data or overlap with core:num_channels/Collections. Reference implementation remains at sigmf/sigmf-python#157. Let's keep discussion going on #354.
