api: run discovery and pin currently-passing corpus files to the round-trip baseline #209

Description

@webern

Sub-task of #208. See the parent tracking issue for the full plan and context.

Goal

Baseline the current reality and bank the zero-cost win: run discovery, record the aggregate counts, and pin every file that already PASSes into roundtrip-baseline.txt. No api changes in this phase — only files that pass today get pinned.

Steps

  1. Run discovery and capture the output:

    make discover-api-roundtrip | tee /tmp/discovery.txt
    

    Each line is STATUS<TAB>relpath<TAB>detail, where STATUS is one of PASS, FAIL, SKIP, LOADFAIL, GETDATAFAIL, or CREATEFAIL; the last line is the aggregate summary. The binary always exits 0, so read the results from the output, not the exit code.

  2. Record the aggregate counts from the summary line (e.g. N PASS, N FAIL, ... and the total discovered). These go into the baseline header comment (see below) — they are the durable historical record.

  3. Extract the PASS paths (the relpath column of every PASS line):

    grep -P '^PASS\t' /tmp/discovery.txt | cut -f2 | sort
    
  4. Append the new PASS paths to src/private/mxtest/api/roundtrip-baseline.txt, keeping the file sorted and excluding the one already pinned (ksuite/k016a_Miscellaneous_Fields.xml).

  5. Update the baseline header comment to reflect the new discovery run: add a dated entry with the fresh aggregate counts and the new pinned total, alongside the existing 2026-06-12 capture note (don't delete the history — append to it).

  6. Confirm the gate with the grown baseline:

    make test-api-roundtrip
    

    Every pinned file must pass (zero tolerance). Then run make check to verify formatting.

  7. Commit the grown baseline + updated header in one deliberate commit referencing this issue.
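
Steps 3 and 4 can be combined into one merge, sketched below. This is a minimal sketch, not part of the harness: the function name merge_pass_paths is hypothetical, and it assumes that every comment or blank line in the baseline belongs to the header block at the top of the file, that "sorted" means byte order (LC_ALL=C), and that the summary line does not begin with PASS followed by a tab. Step 5 (the dated header entry) is still a manual edit.

```shell
#!/bin/sh
# Hypothetical helper: merge PASS paths from a discovery capture into the baseline.
# Assumptions: comment/blank lines form the header at the top of the baseline,
# and path lines are kept in byte order (LC_ALL=C).
merge_pass_paths() {
    discovery=$1   # e.g. /tmp/discovery.txt
    baseline=$2    # e.g. src/private/mxtest/api/roundtrip-baseline.txt
    tmp=$(mktemp) || return 1

    {
        # Keep comment and blank lines unchanged, in their original order.
        grep -E '^[[:space:]]*(#|$)' "$baseline"
        # Union of already-pinned paths and new PASS paths, sorted and deduped.
        {
            grep -Ev '^[[:space:]]*(#|$)' "$baseline"
            awk -F '\t' '$1 == "PASS" { print $2 }' "$discovery"
        } | LC_ALL=C sort -u
    } > "$tmp"

    # Write back through redirection so the baseline keeps its file mode.
    cat "$tmp" > "$baseline"
    rm -f "$tmp"
}
```

Because the existing entry goes through the same sort -u as the new paths, ksuite/k016a_Miscellaneous_Fields.xml is deduplicated automatically rather than excluded by hand. Review the result with git diff before committing.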

What gets checked in

  • src/private/mxtest/api/roundtrip-baseline.txt — the new PASS paths appended, and the header comment updated with the dated counts. This is the only file changed.

No per-file discovery snapshot is checked in. Decision and rationale below.

Snapshot decision: ephemeral, not durable

The full per-file PASS/FAIL/... table is a point-in-time observation, not a contract. Nothing in the build reads it — the regression gate reads only roundtrip-baseline.txt. Phases 1–3 re-run discovery against the then-current code, so a checked-in dump would immediately go stale and actively mislead. Checking it in would add a consumer-less artifact that rots — against the repo's grain (data/corpus.xml and *.features.xml exist because tooling consumes them and are regenerable via make audit; a frozen discovery dump is regenerable via make discover-api-roundtrip but has no consumer).

The two facts worth keeping durable are (a) the grown baseline and (b) the aggregate counts. The baseline header is already the canonical home for this history (it documents the 2026-06-12 capture: 829 discovered, 794 produced output, 1 passed strict). Recording the new counts there keeps the record co-located with the artifact it explains — zero new files, zero rot.

Paste the full per-file table into this issue (or the PR) as ephemeral evidence of the run; it lives in the issue history, not the tree.

Format

  • Baseline path lines — existing format: one data/-relative path per line, # starts a comment, file kept sorted. Example: ksuite/k016a_Miscellaneous_Fields.xml.
  • Baseline header counts — a new dated paragraph in the existing # comment block, matching the prose style already there: date of run, total discovered, the PASS/FAIL/SKIP/LOADFAIL/GETDATAFAIL/CREATEFAIL breakdown, and the resulting pinned total.
  • Issue evidence — the raw harness TSV (STATUS<TAB>relpath<TAB>detail) plus the summary line, in a fenced block. No reformatting.
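
Before writing the dated header entry, the per-status breakdown can be recomputed from the raw TSV and compared against the harness summary line. The sketch below is an assumption-laden cross-check, not a harness feature: tally_statuses is a hypothetical name, and it counts only lines whose first tab-separated field is exactly one of the six statuses, so a summary line in any other shape is ignored.

```shell
#!/bin/sh
# Hypothetical helper: count discovery lines per status from the raw TSV.
# Only lines with at least two tab-separated fields and a known status are counted.
tally_statuses() {
    awk -F '\t' '
        NF >= 2 && $1 ~ /^(PASS|FAIL|SKIP|LOADFAIL|GETDATAFAIL|CREATEFAIL)$/ {
            n[$1]++
            total++
        }
        END {
            split("PASS FAIL SKIP LOADFAIL GETDATAFAIL CREATEFAIL", order, " ")
            # Print every status, including those with a zero count.
            for (i = 1; i <= 6; i++) printf "%s\t%d\n", order[i], n[order[i]]
            printf "TOTAL\t%d\n", total
        }' "$1"
}
```

Usage: tally_statuses /tmp/discovery.txt. If the totals disagree with the summary line, trust neither until the difference is explained; the header should record numbers that match the posted evidence.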

Definition of done

  • make discover-api-roundtrip has been run and its aggregate counts recorded.
  • Every file reported PASS by discovery is pinned in roundtrip-baseline.txt (sorted, deduped against the existing entry).
  • The baseline header comment is updated with a dated entry: new counts + new pinned total.
  • make test-api-roundtrip passes against the grown baseline (zero failures).
  • make check passes.
  • The full per-file discovery table + summary line is posted in this issue (or the linked PR) as evidence.
  • No discovery snapshot file is checked into the tree (baseline header carries the durable counts).
  • Commit references this issue (#209); this issue is checked off in the parent tracking issue, api: expand corpus coverage of the api round-trip pass-list #208.
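
Two of the items above (sorted and deduped; every PASS file pinned) can be checked mechanically before committing. The following is a rough sketch under the same assumptions as the rest of this issue's tooling: check_baseline is a hypothetical name, path lines are assumed to be in byte order (LC_ALL=C), and comment or blank lines are ignored. It does not replace make test-api-roundtrip, which remains the actual gate.

```shell
#!/bin/sh
# Hypothetical helper: verify the baseline against a discovery capture.
# Returns 0 only if path lines are sorted and unique and every PASS path is pinned.
check_baseline() {
    discovery=$1
    baseline=$2
    pinned=$(mktemp) && passed=$(mktemp) || return 1

    # Path lines only: drop comments and blank lines.
    grep -Ev '^[[:space:]]*(#|$)' "$baseline" > "$pinned"
    awk -F '\t' '$1 == "PASS" { print $2 }' "$discovery" | LC_ALL=C sort -u > "$passed"

    status=0
    # sort -c -u fails if the lines are out of order or contain a duplicate.
    LC_ALL=C sort -c -u "$pinned" 2>/dev/null ||
        { echo "baseline path lines are not sorted and unique" >&2; status=1; }

    # comm -23 prints lines found only in the first input: PASS paths not yet pinned.
    missing=$(LC_ALL=C sort -u "$pinned" | LC_ALL=C comm -23 "$passed" -)
    [ -z "$missing" ] ||
        { printf 'PASS but not pinned:\n%s\n' "$missing" >&2; status=1; }

    rm -f "$pinned" "$passed"
    return $status
}
```

Usage: check_baseline /tmp/discovery.txt src/private/mxtest/api/roundtrip-baseline.txt && echo ok.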

Metadata

    Labels

    ai (issues opened by, or through, a coding agent), area/mx::api, non-breaking (fixes or implementation that do not require breaking changes), testing
