Skip to content

feat(protogen): detect enum/union Number-column collisions and attach book/sheet name to single-mode errors#422

Open
Kybxd wants to merge 1 commit into
masterfrom
detect-enum/union-number
Open

feat(protogen): detect enum/union Number-column collisions and attach book/sheet name to single-mode errors#422
Kybxd wants to merge 1 commit into
masterfrom
detect-enum/union-number

Conversation

@Kybxd

@Kybxd Kybxd commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

This PR fixes two related issues in parseEnumType / parseUnionType and the
single-mode error path that surfaces their errors. Both touch
internal/protogen/sheet_mode.go and the single-mode arm of
internal/protogen/protogen.go, so they are bundled into one change.


Part 1 — Detect Number-column collisions in enum / union value sheets

Problem

Enum and union descriptor sheets have an optional Number column.
The legacy rule was simple:

  • if a row sets Number explicitly → use it;
  • otherwise → fall back to i + 1 (the 0-based row index plus one).

unique: true on the descriptor's Number field enforces that
explicitly-set values don't collide with each other. But when some
rows set Number and others leave it blank, nothing prevents the
i + 1 fallback of a blank row from colliding with another row's
explicit value. The result was silently-duplicated proto field numbers,
which in turn produced duplicate generated enum values / union arms —
and the failure only surfaced much later, far from the offending sheet.

Concretely, a sheet like

Name Number
V1 3
V2
V3

would compile cleanly and emit two values with proto number 3.

Fix

Introduce resolveFieldNumber as the shared resolution helper for both
parseEnumType and parseUnionType:

  • Build a usedNumbers set from all rows that set Number explicitly,
    in a pre-pass.

  • Per row, if Number is set, return it as-is (uniqueness across
    explicit values is still enforced by the existing unique prop on
    the descriptor field).

  • Otherwise return i + 1, but check usedNumbers first; on collision
    return a structured error that names the offending row and tells the
    author exactly what to do:

    <kind> value "<name>" (row index <i>) has empty Number column,
    but the fallback value <N> is already taken by another row's
    explicit Number. Either fill in the Number column for this row,
    or leave Number blank for all rows
    

When no row sets Number, usedNumbers is empty and the collision
check is a no-op, so the byte-for-byte behaviour of the legacy
auto-numbered 1..N case is preserved.

Tests

internal/protogen/sheet_mode_test.go (new):

  • TestResolveFieldNumber
    • all-blank-uses-fallback — pure auto-numbering parity.
    • all-explicit-uses-explicit-values.
    • mixed-no-collision-uses-fallback — explicit 10, 30 + blanks at
      i=1,310, 2, 30, 4.
    • mixed-collision-returns-error — explicit 3 at i=0 + blank at
      i=2 (fallback 3) → error on the third row; first two resolve
      cleanly. Asserts the Reason text references the offending value
      name and the union kind.
  • TestNewUnionField_NumberFallback/collision-reported — same scenario
    driven through newUnionField, guards against future regressions
    that bypass resolveFieldNumber.

Part 2 — Attach BookName / SheetName to single-mode sheet errors

Problem

Errors raised inside single-mode sheets (MODE_ENUM_TYPE /
MODE_STRUCT_TYPE / MODE_UNION_TYPE) — including the new collision
error from Part 1 — render with empty structured fields in the Desc
summary:

BookName:  <no value>
SheetName: <no value>
Reason:    <kind> value "..." (row index N) has empty Number column, ...

The renderer template has dedicated BookName / SheetName slots, but
this code path never populated them.

Root cause

parseSpecialSheetMode's single-mode branches return their inner error
bare:

case tableaupb.Mode_MODE_UNION_TYPE:
    if err := parseUnionType(...); err != nil {
        return nil, err   // no WrapKV
    }

convertTableSheet only does sheetCollector.Collect(err) — also
without attaching anything — and bookCollector.Collect(sheetErr) at
the outer layer doesn't wrap either. So the entire single-mode error
chain never carries KeyBookName / KeySheetName as structured fields.

Multi-mode (MODE_*_TYPE_MULTI) was already fine because each block
already does WrapKV(KeyBookName, KeySheetName) per-block at the
parse-error site. This PR brings single-mode to parity using one unified
attach point instead of duplicating the wrap across three branches.

Fix

Make convertTableSheet the single source of truth for
BookName / SheetName on the single-mode error path:

worksheets, err := gen.parseSpecialSheetMode(...)
if err != nil {
    return sheetCollector.Collect(xerrors.WrapKV(err,
        xerrors.KeyBookName,  debugBookName,
        xerrors.KeySheetName, debugSheetName))
}

One edit covers all three single-mode branches.

xerrors.NewDesc merges duplicate KV with innermost-wins semantics
(internal/x/xerrors/desc.go), so this outer layer doesn't fight with
anything wrapped deeper down.

Cleanup of the downstream sources

With the unified attach point in place, downstream error sources no
longer need to plumb / wrap sheet name themselves:

  • resolveFieldNumber does not accept sheetName and does not
    WrapKV(KeySheetName, ...). Its Reason stays purely about the
    collision (kind + value name + row index + conflicting number).
  • newUnionField does not accept sheetName. The previous
    xerrors.Wrapf(err, "failed to parse union type %s of sheet: %s", ...)
    loses its sheet-name tail and becomes "failed to parse union type %s".
  • parseEnumType / parseUnionType call sites updated accordingly.

This eliminates a second source of truth and removes the temptation for
future contributors to inline sheet.Name into free-text "for
convenience".


After this change

Single-mode sheets now render fully-populated BookName / SheetName,
matching multi-mode behaviour, and the new collision error is reported
exactly where the author can act on it:

BookName:  <actual book>
SheetName: <actual sheet>
Reason:    <kind> value "..." (row index N) has empty Number column,
           but the fallback value M is already taken by another row's
           explicit Number. Either fill in the Number column for this
           row, or leave Number blank for all rows

Verification

  • go test ./... green, incl. test/functest e2e codegen.
  • go build ./... clean.
  • No new lint findings.

parseSpecialSheetMode's single-mode branches (MODE_ENUM_TYPE / MODE_STRUCT_TYPE / MODE_UNION_TYPE) return their inner error bare, while multi-mode branches already WrapKV per-block. As a result errors surfaced from single-mode sheets rendered 'BookName: <no value>' / 'SheetName: <no value>' in the Desc summary.

Make convertTableSheet the single source of truth for those structured fields: WrapKV(KeyBookName, KeySheetName) once on the err path right after parseSpecialSheetMode returns. This covers all three single-mode branches with one edit and keeps downstream error sources (resolveFieldNumber / newUnionField) focused on their pure free-text Reason.

Also drop the now-redundant sheetName plumbing from resolveFieldNumber / newUnionField and stop inlining the sheet name into the 'failed to parse union type ...' Reason, so there is exactly one layer that owns Book/SheetName.
@github-actions

Copy link
Copy Markdown

The latest Buf updates on your PR. Results from workflow Buf CI / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedJun 26, 2026, 7:50 AM

@Kybxd Kybxd requested a review from wenchy June 26, 2026 07:52
@codecov

codecov Bot commented Jun 26, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 80.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.95%. Comparing base (b6690c8) to head (5ec1c0f).

Files with missing lines Patch % Lines
internal/protogen/protogen.go 0.00% 3 Missing ⚠️
internal/protogen/sheet_mode.go 88.88% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #422      +/-   ##
==========================================
+ Coverage   74.94%   74.95%   +0.01%     
==========================================
  Files          88       88              
  Lines        9388     9408      +20     
==========================================
+ Hits         7036     7052      +16     
- Misses       1781     1784       +3     
- Partials      571      572       +1     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment on lines +93 to +94
// resolveFieldNumber computes the proto field number for one row of an
// enum/union value table whose "Number" column is optional.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also for Struct definition in Sheet.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not allowed to specify Number for structs since they may nest other structs.
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants