fix(svm): lower-bound quorum for getSlot to reject single-provider tip lag #1454
Open
droplet-rl wants to merge 1 commit into
…p lag

`QuorumFallbackSolanaRpcFactory._getQuorum` only enforced quorum for `getBlock` / `getBlockTime`; `getSlot` fell through to quorum 1, so the dataworker disputer's chain-tip read was single-provider even with `NODE_QUORUM=2`. A Chainstack lag drove the disputer's `latestHeightSearched` below the previous bundle's end slot, soft-pausing the expected range and disputing the proposer's (healthy) bundle.

Strict-equality quorum cannot apply to `getSlot`: providers' tip values converge but never agree exactly at the head of the chain. Instead this PR adds "lower-bound quorum" semantics for `getSlot`: query all configured providers in parallel, sort successful bigint responses descending, and return the value at index `(K-1)`. Semantically: "at least K providers report the chain has reached at least this slot."

- Rejects one lagging provider when N > K (the laggard drops out of the top-K window).
- Forces a single outlier-high provider to have at least one ally to influence the result.
- Guarded on `nodeQuorumThreshold > 1` so quorum=1 bots keep the single-provider fast path (no extra RPC load on relayer fills).
- Falls back to throw with the existing wrap format when fewer than K providers respond successfully.

Logs `divergentProviders` + per-provider tip values when providers disagree, mirroring the strict-equality path's mismatch warn.

Test plan
- `yarn hardhat test test/providers/solana/quorumFallbackRpcFactory.test.ts` (14/14)
- `yarn tsc --noEmit -p tsconfig.build.json` clean
- `yarn lint-check` clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
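The selection rule described in the commit message can be sketched as a small pure function. This is a minimal illustration rather than the PR's actual code: the name `lowerBoundQuorumSlot` and the error text are hypothetical, and it assumes the successful provider responses have already been collected as `bigint` values.

```typescript
// Hypothetical helper, not the PR's implementation.
// Returns the K-th highest slot among the successful provider responses,
// i.e. the highest slot that at least `quorum` providers have reached.
function lowerBoundQuorumSlot(slots: bigint[], quorum: number): bigint {
  if (quorum < 1) {
    throw new Error("quorum must be at least 1");
  }
  if (slots.length < quorum) {
    throw new Error(`only ${slots.length}/${quorum} providers responded`);
  }
  // Sort a copy in descending order. bigint values cannot be subtracted
  // into a number, so the comparator uses explicit comparisons.
  const sorted = slots.slice().sort((a, b) => (a < b ? 1 : a > b ? -1 : 0));
  return sorted[quorum - 1];
}

// N = 3, K = 2: the lagging provider (slot 900) falls outside the top-K window.
const withLaggard = lowerBoundQuorumSlot([BigInt(1000), BigInt(900), BigInt(1001)], 2);

// N = 3, K = 2: a lone outlier-high value (5000) cannot win without an ally.
const withOutlier = lowerBoundQuorumSlot([BigInt(5000), BigInt(1000), BigInt(1001)], 2);

// N = K = 2: nothing can be dropped, so the result is the minimum of the two.
const exactK = lowerBoundQuorumSlot([BigInt(1000), BigInt(900)], 2);
```

The last case shows why laggard rejection needs N > K: with exactly K responses, the K-th highest value is simply the lowest one.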
Summary
`QuorumFallbackSolanaRpcFactory._getQuorum` only enforces quorum for `getBlock` / `getBlockTime`; `getSlot` falls through to quorum 1. The dataworker disputer's chain-tip read is therefore single-provider even when configured with `NODE_QUORUM=2`. On 2026-05-29 a transient Chainstack lag drove the disputer's `latestHeightSearched` below the previous bundle's end slot: `getWidestPossibleExpectedBlockRange` soft-paused, and `validatePendingRootBundle` then fired its "end block > expected + buffer" dispute on the proposer's (healthy) bundle.

Strict-equality quorum cannot apply to `getSlot`: providers' tip values converge but never agree exactly at the head of the chain. This PR adds lower-bound quorum semantics for `getSlot`: query all configured providers in parallel, sort successful bigint responses descending, and return the value at index `K-1`. Semantically: "at least K providers report the chain has reached at least this slot."

The new path is guarded on `nodeQuorumThreshold > 1` so quorum=1 bots keep the single-provider fast path (no extra RPC load on relayer fills).

Logs `divergentProviders` plus per-provider tip values when providers disagree, mirroring the strict-equality path's mismatch warn.

Precondition
For a bot configured with `NODE_QUORUM=2`, the rejection of a single lagging provider only kicks in when `N ≥ 3` providers are configured. Solana RPC lists in `bot-configs` already exceed this in practice (quicknode/quicknode_jito + alchemy + chainstack + helius), so no config change is needed to realize the benefit.

Why this is safe
- The new path applies only when `method === "getSlot"` with `nodeQuorumThreshold > 1`. Every other method retains the existing equality-quorum + fallback flow, byte-for-byte.
- When `< K` providers succeed, the call throws with the same `createSendErrorWithMessage` wrap shape (`Not enough providers succeeded on getSlot call to reach lower-bound quorum (X/K)`), so callers handle it identically.
- Callers do not depend on `getSlot` returning the single highest provider's value: `getNearestSlotTime` walks back from whatever slot it receives, so a slightly lower tip is harmless.

Out of scope / follow-ups
This PR scopes narrowly to `getSlot`. The same pattern applies cleanly to `getBlockHeight` and (with shape-specific aggregation) to `getLatestBlockhash`; I'd handle each in a follow-up PR once we've validated lower-bound quorum in production for `getSlot` first.

Test plan
- `yarn hardhat test test/providers/solana/quorumFallbackRpcFactory.test.ts`: 14/14 (6 new lower-bound cases: K-th-highest selection, outlier-high rejection, fewer-than-K failure, all-equal short-circuit, single-provider fast path on quorum=1, exact min-of-K when only K succeed).
- `yarn tsc --noEmit -p tsconfig.build.json` clean.
- `yarn lint-check` clean.
- `zion-across-disputer`: after SDK bump lands, confirm no false dispute on `getSlot` divergence.

🤖 Generated with Claude Code
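The fewer-than-K failure and quorum=1 fast-path cases from the test plan could be exercised against a sketch like the one below. Every name here (`SlotFetcher`, `getSlotWithLowerBoundQuorum`) is hypothetical and stands in for the factory's real provider plumbing. The sketch throws a plain `Error` carrying the message quoted in the PR, rather than going through `createSendErrorWithMessage`, whose signature is not shown here.

```typescript
// Hypothetical stand-in for one provider's getSlot call.
type SlotFetcher = () => Promise<bigint>;

// Sketch only: queries providers and applies lower-bound quorum to the results.
async function getSlotWithLowerBoundQuorum(
  fetchers: SlotFetcher[],
  quorum: number
): Promise<bigint> {
  if (fetchers.length === 0) {
    throw new Error("no providers configured");
  }
  // quorum <= 1 keeps the single-provider fast path: only the first
  // provider is queried, so no extra RPC load is generated.
  if (quorum <= 1) {
    return fetchers[0]();
  }
  // Query every provider in parallel. A rejection is mapped to null so that
  // one failing provider does not reject the whole batch.
  const settled = await Promise.all(
    fetchers.map((fetchSlot) =>
      fetchSlot().then(
        (slot): bigint | null => slot,
        (): bigint | null => null
      )
    )
  );
  const slots = settled.filter((slot): slot is bigint => slot !== null);
  if (slots.length < quorum) {
    throw new Error(
      `Not enough providers succeeded on getSlot call to reach lower-bound quorum (${slots.length}/${quorum})`
    );
  }
  // Descending sort, then take the K-th highest value.
  slots.sort((a, b) => (a < b ? 1 : a > b ? -1 : 0));
  return slots[quorum - 1];
}
```

A failed provider simply shrinks the candidate set: with three providers and a quorum of 2, one failure leaves exactly K responses and the lower of the two is returned.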