
fix(svm): lower-bound quorum for getSlot to reject single-provider tip lag #1454

Open
droplet-rl wants to merge 1 commit into master from droplet/T90K0AL22-C09KEFHB9JB-1780055845-867349

Summary

QuorumFallbackSolanaRpcFactory._getQuorum only enforces quorum for getBlock / getBlockTime; getSlot falls through to quorum 1. The dataworker disputer's chain-tip read is therefore single-provider even when configured with NODE_QUORUM=2. On 2026-05-29 a transient Chainstack lag drove the disputer's latestHeightSearched below the previous bundle's end slot. getWidestPossibleExpectedBlockRange soft-paused, and validatePendingRootBundle then fired its "end block > expected + buffer" dispute on the proposer's (healthy) bundle.

Strict-equality quorum cannot apply to getSlot — providers' tip values converge but never agree exactly at the head of the chain. This PR adds lower-bound quorum semantics for getSlot: query all configured providers in parallel, sort successful bigint responses descending, and return the value at index K-1. Semantically: "at least K providers report the chain has reached at least this slot."

  • Rejects one lagging provider when N > K (the laggard drops out of the top-K window).
  • Forces a single outlier-high provider to have at least one ally to influence the result.
  • Guarded on nodeQuorumThreshold > 1 so quorum=1 bots keep the single-provider fast path (no extra RPC load on relayer fills).
  • Throws with the existing wrap format when fewer than K providers respond successfully.

Logs divergentProviders plus per-provider tip values when providers disagree, mirroring the strict-equality path's mismatch warn.
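
The selection rule described above can be sketched in TypeScript as follows. This is a minimal illustration, not the code in this PR: `selectLowerBoundSlot` and its parameters are hypothetical names, and the sketch assumes the successful provider responses have already been collected into an array of bigint slots.

```typescript
// Hypothetical helper illustrating lower-bound quorum; not the SDK implementation.
// `slots` holds the successful getSlot responses, `k` is the quorum threshold.
function selectLowerBoundSlot(slots: bigint[], k: number): bigint {
  if (k < 1) {
    throw new Error("quorum threshold must be at least 1");
  }
  if (slots.length < k) {
    throw new Error(
      `Not enough providers succeeded on getSlot call to reach lower-bound quorum (${slots.length}/${k})`
    );
  }
  // Sort a copy in descending order. bigint values need an explicit comparator
  // because subtracting them does not yield the number that sort() expects.
  const sorted = [...slots].sort((a, b) => (a < b ? 1 : a > b ? -1 : 0));
  // The K-th highest value: at least k providers report a slot >= this one.
  return sorted[k - 1];
}

// One lagging provider (950n) with N = 3, K = 2: the laggard is ignored.
const withLaggard = selectLowerBoundSlot([1000n, 998n, 950n], 2); // 998n

// One outlier-high provider (5000n) with no ally: it cannot set the result.
const withOutlier = selectLowerBoundSlot([5000n, 1000n, 999n], 2); // 1000n
```

Returning the K-th highest value rather than the maximum or the median keeps the result as close to the tip as the quorum allows while still requiring K providers to vouch for it.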

Precondition

For a bot configured with NODE_QUORUM=2, the rejection of a single lagging provider only kicks in when N ≥ 3 providers are configured. Solana RPC lists in bot-configs already exceed this in practice (quicknode/quicknode_jito + alchemy + chainstack + helius), so no config change is needed to realize the benefit.
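
To make the N ≥ 3 precondition concrete, the sketch below compares N = 2 and N = 3 with K = 2. The slot values and the `kthHighest` helper are made up for illustration; they are not taken from the incident or from the SDK.

```typescript
// Hypothetical helper: K-th highest of the reported slots (descending sort, index k - 1).
function kthHighest(slots: bigint[], k: number): bigint {
  const sorted = [...slots].sort((a, b) => (a < b ? 1 : a > b ? -1 : 0));
  return sorted[k - 1];
}

const quorum = 2;
const healthyTip = 1000n; // illustrative value
const laggingTip = 950n; // illustrative value

// N = 2 = K: the top-K window is the whole set, so the laggard is the K-th highest.
const twoProviders = kthHighest([healthyTip, laggingTip], quorum); // 950n

// N = 3 > K: the laggard drops out of the top-K window.
const threeProviders = kthHighest([healthyTip, 999n, laggingTip], quorum); // 999n
```

More generally, the K-th highest value ignores up to N - K lagging providers, which is zero when N = K.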

Why this is safe

  • The new path is only taken for method === "getSlot" with nodeQuorumThreshold > 1. Every other method retains the existing equality-quorum + fallback flow, byte-for-byte.
  • The K-th highest is a deterministic, well-defined point on the sorted distribution; it can only equal a value some real provider returned.
  • Failure mode preserved: when < K providers succeed, the call throws with the same createSendErrorWithMessage wrap shape (Not enough providers succeeded on getSlot call to reach lower-bound quorum (X/K)), so callers handle it identically.
  • No SDK consumer relies on getSlot returning the single highest provider's value — getNearestSlotTime walks back from whatever slot it receives, so a slightly lower tip is harmless.
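
The preserved failure mode can be sketched as below, assuming providers are queried in parallel and rejected calls are simply excluded from the count. `SlotFetcher` and `getSlotLowerBound` are hypothetical names, and a plain `Error` stands in for the `createSendErrorWithMessage` wrap, whose signature is not shown in this PR description.

```typescript
// Hypothetical stand-in for one provider's getSlot call.
type SlotFetcher = () => Promise<bigint>;

async function getSlotLowerBound(fetchers: SlotFetcher[], k: number): Promise<bigint> {
  // Query every provider in parallel; allSettled never rejects, so one failing
  // provider cannot abort the others.
  const settled = await Promise.allSettled(fetchers.map((fetchSlot) => fetchSlot()));

  // Keep only the successful responses.
  const slots: bigint[] = [];
  for (const result of settled) {
    if (result.status === "fulfilled") {
      slots.push(result.value);
    }
  }

  if (slots.length < k) {
    // Assumption: the real code wraps this message with createSendErrorWithMessage.
    throw new Error(
      `Not enough providers succeeded on getSlot call to reach lower-bound quorum (${slots.length}/${k})`
    );
  }

  // Descending sort, then take the K-th highest value.
  slots.sort((a, b) => (a < b ? 1 : a > b ? -1 : 0));
  return slots[k - 1];
}
```

With K = 2, two successes and one failure still resolve (to the lower of the two successful values), while a single success rejects with the `(1/2)` message.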

Out of scope / follow-ups

This PR is scoped narrowly to getSlot. The same pattern applies cleanly to getBlockHeight and (with shape-specific aggregation) to getLatestBlockhash; I'd handle each in a follow-up PR once lower-bound quorum has been validated in production for getSlot.

Test plan

  • yarn hardhat test test/providers/solana/quorumFallbackRpcFactory.test.ts — 14/14 (6 new lower-bound cases: K-th-highest selection, outlier-high rejection, fewer-than-K failure, all-equal short-circuit, single-provider fast path on quorum=1, exact min-of-K when only K succeed).
  • yarn tsc --noEmit -p tsconfig.build.json clean.
  • yarn lint-check clean.
  • Watch a disputer run on zion-across-disputer after SDK bump lands and confirm no false dispute on getSlot divergence.

🤖 Generated with Claude Code

…p lag

`QuorumFallbackSolanaRpcFactory._getQuorum` only enforced quorum for
`getBlock` / `getBlockTime`; `getSlot` fell through to quorum 1, so the
dataworker disputer's chain-tip read was single-provider even with
`NODE_QUORUM=2`. A Chainstack lag drove the disputer's
`latestHeightSearched` below the previous bundle's end slot, soft-pausing
the expected range and disputing the proposer's (healthy) bundle.

Strict-equality quorum cannot apply to `getSlot` — providers' tip values
converge but never agree exactly at the head of the chain. Instead this
PR adds "lower-bound quorum" semantics for `getSlot`: query all
configured providers in parallel, sort successful bigint responses
descending, and return the value at index `(K-1)`. Semantically: "at
least K providers report the chain has reached at least this slot."

- Rejects one lagging provider when N > K (the laggard drops out of the
  top-K window).
- Forces a single outlier-high provider to have at least one ally to
  influence the result.
- Guarded on `nodeQuorumThreshold > 1` so quorum=1 bots keep the
  single-provider fast path (no extra RPC load on relayer fills).
- Throws with the existing wrap format when fewer than K providers
  respond successfully.

Logs `divergentProviders` + per-provider tip values when providers
disagree, mirroring the strict-equality path's mismatch warn.

Test plan
- `yarn hardhat test test/providers/solana/quorumFallbackRpcFactory.test.ts` (14/14)
- `yarn tsc --noEmit -p tsconfig.build.json` clean
- `yarn lint-check` clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>