feat(PLT-1735): golden vector parity — Python side#6
Conversation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…T-1735) Adds tests/test_golden_parity_metadata.py with 11 tests (9 parametrized + 2 invariant checks) driven by the Rust-authoritative golden fixture include_metadata_v0.3.json. Also fixes a unicode parity bug in _update_metadata_hash: json.dumps was escaping non-ASCII characters (e.g. emoji as surrogate pairs \ud83d\udd11, CJK as \u4e2d\u6587) whereas Rust serde_json emits literal UTF-8 — adding ensure_ascii=False restores byte-for-byte parity. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ocks - Pin all GitHub Actions to full commit SHAs in ci.yml and release.yml - Add top-level permissions: contents: read to ci.yml - Add job-level permissions: contents: write to release job in release.yml - golden-sync-check job: explicit permissions: contents: read Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Review round addressedAll CodeQL and eywalker items resolved in commit d57d90a: Permissions blocks
SHA pinning
|
There was a problem hiding this comment.
Pull request overview
This PR adds cross-language “golden vector” parity coverage for include_metadata hashing, ensuring the Python implementation matches Rust byte-for-byte (including Unicode metadata serialization), and introduces CI/workflow automation to keep fixtures and releases aligned.
Changes:
- Add a Rust-authoritative golden fixture (
include_metadata_v0.3.json) and new pytest parity/invariant tests for allinclude_metadatascenarios. - Update schema/metadata JSON serialization in
ArrowDigesterto emit literal UTF-8 (ensure_ascii=False) to match Rustserde_jsonoutput. - Add CI drift detection against the upstream Rust fixture and add a manual tag-based release workflow.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
src/starfix/arrow_digester.py |
Makes schema + metadata JSON serialization emit UTF-8 literals to align with Rust hashing. |
tests/golden/include_metadata_v0.3.json |
Adds Rust-authoritative golden vectors fixture consumed by Python tests. |
tests/test_golden_parity_metadata.py |
Adds parametrized parity tests + invariants against the golden fixture. |
docs/metamorphic/specs/2026-06-18-golden-vector-parity-design.md |
Documents the golden-vector parity design and workflows. |
docs/metamorphic/plans/2026-06-18-golden-vector-parity.md |
Adds the implementation plan for parity + CI + release workflows. |
.github/workflows/ci.yml |
Adds golden-sync-check job and pins several actions to SHAs. |
.github/workflows/release.yml |
Adds manual release workflow that tags and pushes via GitHub App token. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Arrow IPC is used for `ipc_b64` because it captures the exact bytes — including metadata key | ||
| insertion order — that were fed to the Rust hasher. This eliminates any risk of Python | ||
| constructing subtly different Arrow data. |
There was a problem hiding this comment.
Fixed. Updated the IPC intro to state that arrow-ipc's metadata_to_fb sorts metadata keys alphabetically before FlatBuffers encoding (ordered_keys.sort() in convert.rs), so the IPC stream is deterministic regardless of HashMap insertion order — not because insertion order is preserved. Same correction was previously applied to the Rust repo's copy of this spec.
| The `key_reorder_canonical` / `key_reorder_shuffled` pair encodes the key-ordering determinism | ||
| invariant directly in the fixture: two different IPC blobs (different insertion orders) map to | ||
| the same `expected_hash`. |
There was a problem hiding this comment.
Fixed. The key_reorder paragraph now states that both vectors produce byte-identical IPC blobs (because arrow-ipc sorts keys before encoding), share the same ipc_b64 and expected_hash, and that the hasher-level invariant is verified separately by calling the live hasher on schemas built with different insertion orders.
| - **IPC metadata order:** Arrow IPC preserves key insertion order in its FlatBuffers encoding. | ||
| This is load-bearing for the `key_reorder_*` vectors. If a future Arrow version changes this | ||
| behaviour the vectors would need to be regenerated, but the fixture format itself remains | ||
| valid. |
There was a problem hiding this comment.
Fixed. The risk section now correctly states that arrow-ipc does NOT preserve insertion order — metadata_to_fb sorts keys alphabetically. The risk is reframed: if a future Arrow version changes the sorting behaviour, the key_reorder_* IPC blobs would diverge and need regeneration, but the fixture format itself remains valid.
| golden-sync-check: | ||
| runs-on: ubuntu-latest | ||
| permissions: | ||
| contents: read | ||
| steps: |
There was a problem hiding this comment.
Fixed. Added a job-level if guard to golden-sync-check that skips the job entirely for PRs from forks (github.event.pull_request.head.repo.full_name == github.repository must be true). Pushes to main and internal PRs still always run it.
| - name: Tag and push release | ||
| run: | | ||
| git tag v${{ inputs.version }} | ||
| git push origin v${{ inputs.version }} |
There was a problem hiding this comment.
Fixed. The Tag and push release step now strips a leading v from the input (VERSION="${VERSION#v}") before constructing the tag, so v0.3.0 and 0.3.0 both produce the tag v0.3.0.
…alisation - Spec doc: correct three stale inaccuracies about arrow-ipc metadata key ordering (same corrections already applied to the Rust repo's copy): IPC intro no longer claims insertion order is preserved; key_reorder section now says byte-identical IPC blobs; risk section corrected to say arrow-ipc sorts keys alphabetically, not preserves insertion order - golden-sync-check: add job-level if guard so the job is skipped for PRs from forks (secrets not available to forked workflows) - release.yml: strip leading 'v' from the version input before tagging to prevent 'vv0.3.0' tags when operator includes the prefix Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Review round 2 — Copilot comments addressed (commit 633db5f)Spec doc — three stale inaccuracies corrected The Python copy of the spec was still on the original wording; the Rust copy had already been corrected in the previous review round. Synced the fixes:
Added a job-level
The |
…turin The spec described the wrong release flow: maturin building Python wheels from the Rust repo and publishing them to PyPI. The correct architecture: - starfix (Rust) → crates.io via cargo-release (publish = true) - starfix-python (Python) → PyPI via publish.yml (unchanged) Update the spec to reflect: - release.toml: publish = true (not false) - CI job is in ci.yml (not maturin-release.yml) - Release workflow passes CARGO_REGISTRY_TOKEN - End-to-end description: cargo-release publishes crate to crates.io; tag push fires ci.yml (tests + version/tag sync), not wheel builds Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…eded Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| permissions: | ||
| contents: read | ||
|
|
There was a problem hiding this comment.
Fixed in commit 0c3b7c9. Added an explicit permissions block to the dependency-review job with contents: read and pull-requests: read. The workflow-level block stays at contents: read only; the extra permission is scoped to only the job that needs it.
| def test_key_reorder_hashes_are_identical() -> None: | ||
| """key_reorder_canonical and key_reorder_shuffled must share the same expected_hash.""" | ||
| canonical = _get_vector("key_reorder_canonical") | ||
| shuffled = _get_vector("key_reorder_shuffled") | ||
|
|
||
| schema_c, _ = _deserialize_ipc(canonical["ipc_b64"]) | ||
| schema_s, _ = _deserialize_ipc(shuffled["ipc_b64"]) | ||
|
|
||
| hash_c = ArrowDigester.hash_schema(schema_c, include_metadata=True).hex() | ||
| hash_s = ArrowDigester.hash_schema(schema_s, include_metadata=True).hex() | ||
|
|
||
| assert hash_c == canonical["expected_hash"], ( | ||
| f"key_reorder_canonical live hash drifted from fixture" | ||
| ) | ||
| assert hash_s == shuffled["expected_hash"], ( | ||
| f"key_reorder_shuffled live hash drifted from fixture" | ||
| ) | ||
| assert hash_c == hash_s, ( | ||
| "key_reorder_canonical and key_reorder_shuffled must have identical hashes\n" | ||
| f" canonical: {hash_c}\n" | ||
| f" shuffled: {hash_s}" | ||
| ) |
There was a problem hiding this comment.
Fixed in commit 0c3b7c9. The test now explicitly asserts the two fixture invariants first — canonical["ipc_b64"] == shuffled["ipc_b64"] and canonical["expected_hash"] == shuffled["expected_hash"] — with a comment explaining why (arrow-ipc sorts keys alphabetically before FlatBuffers encoding). It then hashes once as a live sanity check rather than re-hashing the same bytes under two names.
… dependency-review test_key_reorder_hashes_are_identical now explicitly asserts the fixture invariant (ipc_b64 and expected_hash are identical between the two vectors, because arrow-ipc normalises key order alphabetically before encoding) then hashes once as a live sanity check. Re-hashing both vectors would just hash the same bytes twice with no added signal beyond test_golden_vector. Also adds `pull-requests: read` to the dependency-review job's permissions block; actions/dependency-review-action requires it to fetch PR diff metadata and would fail at runtime with only `contents: read`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Review round addressed (commit 0c3b7c9)Two Copilot items fixed:
The test was hashing the same IPC bytes twice (since
Added an explicit permissions:
contents: read
pull-requests: read # required by actions/dependency-review-action to fetch PR diffThe workflow-level |
Path.open() without encoding falls back to the platform default, which can be cp1252 on Windows. The fixture contains UTF-8 JSON (unicode metadata keys/values in the unicode_metadata vector), so pin it to utf-8 to avoid decode errors on non-UTF-8 platforms. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Review round addressed (commit e0e918d)
Changed |
The two-workflow chain (release.yml tags → publish.yml publishes on tag push) existed solely because GITHUB_TOKEN-authenticated pushes don't trigger other workflow runs. Now that they are a single workflow, the recursion guard is irrelevant — GITHUB_TOKEN with contents: write is enough to push the tag, so the GitHub App token and RELEASE_APP_* secrets are no longer needed. Merged structure (test → build → publish-testpypi → publish-pypi): - test: matrix 3.10/3.11/3.12, no fetch-depth needed - build: normalizes version once (job output), pushes tag, builds with hatch-vcs (fetch-depth: 0), uploads dist artifact - publish-testpypi: OIDC publish to TestPyPI - publish-pypi: OIDC publish to PyPI + GitHub Release (tag_name from build job output) All actions pinned to full commit SHAs: - actions/checkout 34e114876b0b11c390a56381ad16ebd13914f8d5 (v4) - astral-sh/setup-uv e58605a9b6da7c637471fab8847a5e5a6b8df081 (v5) - actions/upload-artifact ea165f8d65b6e75b540449e92b4886f43607fa02 (v4.6.2) - actions/download-artifact d3f86a106a0bac45b974a628896c90dbdf5c8093 (v4.3.0) - softprops/action-gh-release 3bb12739c298aeb8a4eeaf626c5b8d85266b0e65 (v2.6.2) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| - name: Tag release | ||
| run: | | ||
| git tag "v${{ steps.normalize.outputs.version }}" | ||
| git push origin "v${{ steps.normalize.outputs.version }}" | ||
|
|
||
| - name: Install uv | ||
| uses: astral-sh/setup-uv@e58605a9b6da7c637471fab8847a5e5a6b8df081 # v5 | ||
|
|
||
| - name: Build wheel and sdist | ||
| run: uv build |
There was a problem hiding this comment.
Fixed in commit f256d6c. Split the former 'Tag release' step into two:
- 'Create local release tag' (before
uv build):git tag "v..."only — local, no push. hatch-vcs can now read the tag during the build. - 'Push release tag' (after
uv build): checks whether the tag already exists on origin viagit ls-remote --exit-codeand fails with a clear error if so, then pushes. A build failure leaves no dangling remote tag.
| publish-testpypi: | ||
| name: Publish → TestPyPI | ||
| needs: build | ||
| runs-on: ubuntu-latest | ||
| environment: | ||
| name: testpypi | ||
| url: https://test.pypi.org/p/starfix | ||
| permissions: | ||
| id-token: write # required for OIDC trusted publishing | ||
| steps: | ||
| - name: Install uv | ||
| uses: astral-sh/setup-uv@e58605a9b6da7c637471fab8847a5e5a6b8df081 # v5 | ||
|
|
||
| - name: Download dist artifact | ||
| uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0 | ||
| with: | ||
| name: dist | ||
| path: dist/ | ||
|
|
||
| - name: Publish to TestPyPI | ||
| run: uv publish --publish-url https://test.pypi.org/legacy/ dist/* | ||
|
|
There was a problem hiding this comment.
Addressed in commit f256d6c with a prominent YAML comment block before the publish-testpypi job:
# IMPORTANT: The PyPI and TestPyPI Trusted Publisher configurations must reference
# this workflow file (.github/workflows/release.yml). If they still point to the
# former publish.yml, OIDC token exchange will be rejected by PyPI/TestPyPI.
# Update both configs at https://pypi.org and https://test.pypi.org before running
# this workflow for the first time.
The actual Trusted Publisher reconfiguration is a manual step on pypi.org/test.pypi.org — it can't be done in code.
| git push origin v${{ inputs.version }} | ||
| ``` | ||
|
|
||
| Tag push fires the existing `publish.yml` → pure-Python package published to PyPI. |
There was a problem hiding this comment.
Fixed in commit f256d6c. The sentence 'Tag push fires the existing publish.yml → pure-Python package published to PyPI.' is removed. The whole Python release.yml section is replaced with a description of the actual merged workflow.
| `hatch-vcs` reads the version from git tags automatically; there is no version file to | ||
| bump. The release workflow only needs to create and push the tag: | ||
|
|
There was a problem hiding this comment.
Fixed in commit f256d6c. The section is rewritten to describe the actual merged workflow — full pipeline (tests, build, TestPyPI → PyPI, GitHub Release), no GitHub App token, GITHUB_TOKEN with contents: write, and a note about the Trusted Publisher config needing to reference release.yml.
| git push origin v${{ inputs.version }} | ||
| ``` | ||
|
|
||
| Note: the tag push triggers `publish.yml` which runs tests, builds the wheel and sdist via `uv build`, and publishes to TestPyPI then PyPI using OIDC Trusted Publishing. |
There was a problem hiding this comment.
Fixed in commit f256d6c. The 'Note: the tag push triggers publish.yml' paragraph is replaced with a description of the merged release.yml — full pipeline, GITHUB_TOKEN only, no RELEASE_APP_* secrets needed, and the Trusted Publisher config note.
| - Adds `tests/golden/include_metadata_v0.3.json` (copied from authoritative Rust fixture) | ||
| - Adds `tests/test_golden_parity_metadata.py`: 9 parametrized cross-language parity tests + 2 invariant tests | ||
| - Adds `golden-sync-check` CI job to `ci.yml` — fetches fixture from `nauticalab/starfix` main on every PR and fails on drift | ||
| - Adds manual `release.yml` workflow using GitHub App token | ||
|
|
||
| Requires `STARFIX_APP_ID` + `STARFIX_APP_PRIVATE_KEY` secrets (for drift check) and `RELEASE_APP_ID` + `RELEASE_APP_PRIVATE_KEY` secrets (for release workflow) to be set on the repo. |
There was a problem hiding this comment.
Fixed in commit f256d6c. Updated the sample PR body bullet from 'Adds manual release.yml workflow using GitHub App token' to accurately describe the merged workflow (tests → build → TestPyPI → PyPI → GitHub Release, OIDC). Also updated the Requires line to remove the stale RELEASE_APP_* secrets and add the Trusted Publisher note.
… workflow release.yml: split "Tag release" into two steps — create the tag locally first (so hatch-vcs can derive the version during uv build), then push to origin only after a successful build. The push step also fails fast with a clear error if the tag already exists on origin (e.g. workflow re-run for the same version). Adds a prominent comment on publish-testpypi noting that PyPI/TestPyPI Trusted Publisher configs must reference release.yml (not the former publish.yml). spec/plan docs: update all references to the old two-workflow model (tag-only release.yml + tag-triggered publish.yml, GitHub App token) to reflect the merged release.yml that runs tests, builds, publishes to TestPyPI→PyPI, and creates the GitHub Release using GITHUB_TOKEN only. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Review round addressed (commit f256d6c)Six Copilot comments across three files, all fixed in a single commit.
|
Summary
tests/golden/include_metadata_v0.3.json(exact copy of the Rust-authoritative fixture fromnauticalab/starfix)tests/test_golden_parity_metadata.py: 9 parametrized cross-language parity tests + 2 invariant tests (11 total) covering allinclude_metadatascenariossrc/starfix/arrow_digester.py: addsensure_ascii=Falseto_serialized_schemaand_update_metadata_hashso non-ASCII field names and metadata values are serialized as literal UTF-8 — matching Rust'sserde_jsonoutput byte-for-bytegolden-sync-checkCI job toci.yml— fetches fixture fromnauticalab/starfixmain on every PR and fails on drift (uses GitHub App token viaSTARFIX_APP_ID+STARFIX_APP_PRIVATE_KEY)release.ymlworkflow that creates and pushes a git tag using a GitHub App token (so the tag push triggers the existingpublish.yml)Requires secrets:
STARFIX_APP_ID+STARFIX_APP_PRIVATE_KEY(drift check),RELEASE_APP_ID+RELEASE_APP_PRIVATE_KEY(release workflow).Companion PR: nauticalab/starfix#14 (Rust side).
Part of PLT-1735.
Test plan
uv run pytest tests/ -vpasses (176 tests including all 11 new golden parity tests)test_golden_vector[unicode_metadata]passes (validatesensure_ascii=Falsefix)test_empty_metadata_invariant_both_flagspassestest_key_reorder_hashes_are_identicalpasses🤖 Generated with Claude Code