fix(svm): resolve slot timestamps via getBlocks instead of per-slot walk #1447
droplet-rl wants to merge 4 commits into
SVMSpokePoolClient.getTimestampForBlock() and getNearestSlotTime() both walked backwards one slot at a time via getBlockTime() to recover from skipped slots. When a bundle boundary lands inside a skipped-slot run (observed on mainnet at slots 422461540-422461547), every BundleDataClient lookup for that boundary fans out into N getBlockTime() RPCs per call times NODE_QUORUM providers, repeated across every propose/validate/execute pass. In a production dataworker run on 2026-05-27 this kept the SVM executor primary alive for 57+ minutes making nothing but getBlockTime calls against the same 8-slot range, ultimately tripping the serverless hub's rejectSpokeDelay and triggering a retry that raced the primary.

Replace the per-slot walk with findNearestProducedSlot(), which issues a single getBlocks() call to ask the RPC which slots in a 32-slot window produced a block, then iterates the window backwards in fixed steps if nothing is produced in the initial range. getBlocks() latency is flat (~75 ms) up to ~1024 slots on every provider we benchmarked, so the recovery cost is bounded regardless of skip-run length.

Also memoise per-block timestamp lookups inside SVMSpokePoolClient so repeated boundary-slot queries across a single dataworker run resolve once.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
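The windowed search described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the `BlocksSource` interface, the `findNearestProducedSlotSketch` name, the use of plain `number` slots, and the default values are all hypothetical (the 32-slot window and 32-iteration cap are the figures quoted in the PR description). The real code talks to a `@solana/kit` RPC client, whose call shape differs.

```typescript
// Hypothetical minimal provider surface: returns the produced slots in the
// inclusive range [startSlot, endSlot], in ascending order.
interface BlocksSource {
  getBlocks(startSlot: number, endSlot: number): Promise<number[]>;
}

// Assumed defaults for this sketch only.
const SKETCH_WINDOW = 32;
const SKETCH_MAX_ITERATIONS = 32;

// Returns the highest produced slot at or below targetSlot, or undefined if
// none is found within maxIterations windows (or before reaching slot 0).
async function findNearestProducedSlotSketch(
  source: BlocksSource,
  targetSlot: number,
  window: number = SKETCH_WINDOW,
  maxIterations: number = SKETCH_MAX_ITERATIONS
): Promise<number | undefined> {
  let upper = targetSlot;
  for (let i = 0; i < maxIterations; ++i) {
    // Clamp the lower edge so the search never descends past slot 0.
    const lower = Math.max(0, upper - window + 1);
    const produced = await source.getBlocks(lower, upper);
    if (produced.length > 0) {
      // One RPC answered for the whole window; take the slot closest to the target.
      return Math.max(...produced);
    }
    if (lower === 0) {
      break;
    }
    // Nothing produced here: slide the window back by one full step.
    upper = lower - 1;
  }
  return undefined;
}
```

With these defaults the search issues at most 32 `getBlocks` calls and covers 1024 slots, so the cost no longer grows with one RPC per skipped slot.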
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 05a4e05372
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Attach a .catch() on the cached promise so a transient RPC failure in findNearestProducedSlot/getBlocks (or any other rejection from the resolver) clears the cache entry instead of poisoning getTimestampForBlock for the lifetime of the client.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
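The cache-eviction pattern this commit describes might look something like the sketch below. The `TimestampCacheSketch` class and its `resolver` callback are hypothetical names for illustration; the PR keeps this logic inside `SVMSpokePoolClient`, and the actual field names are not shown here.

```typescript
// Hypothetical stand-alone cache; stores the in-flight Promise per slot so
// concurrent lookups share one resolution.
class TimestampCacheSketch {
  private readonly cache = new Map<number, Promise<number>>();
  private readonly resolver: (slot: number) => Promise<number>;

  constructor(resolver: (slot: number) => Promise<number>) {
    this.resolver = resolver;
  }

  get(slot: number): Promise<number> {
    const cached = this.cache.get(slot);
    if (cached !== undefined) {
      return cached;
    }
    const pending = this.resolver(slot);
    this.cache.set(slot, pending);
    // On rejection, drop the entry so the next caller retries instead of
    // receiving the same rejected promise for the lifetime of the cache.
    pending.catch(() => {
      if (this.cache.get(slot) === pending) {
        this.cache.delete(slot);
      }
    });
    return pending;
  }
}
```

The caller still observes the original rejection; the `.catch()` handler only evicts the entry. The identity check before `delete` guards against removing a newer promise that replaced the failed one.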
Addressed both review comments in 73ae3a6:
- Trim verbose docstrings on findNearestProducedSlot and its defaults.
- Loop uses ++i and `break` (over `return undefined`) so the function has a single trailing exit.
- Invert the timestamp check in _resolveTimestampForBlock for a positive early return.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Addressed all five inline comments from this review in c612662:
The previous commit unconditionally issued getBlocks() + getBlockTime() for every slot lookup, regressing the common case (target slot has a block) from 1 RPC to 2. Restore the 1-RPC happy path in both SVMSpokePoolClient.getTimestampForBlock() and getNearestSlotTime() by probing getTimestampForSlot() for the target slot first, only falling through to findNearestProducedSlot() when the target itself was skipped.

Also evict the per-block timestamp cache on rejection so transient RPC errors don't poison the cache for the rest of the run.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
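The probe-first ordering this commit restores can be sketched as below. The `SlotTimeSource` interface, the `resolveTimestampSketch` name, the convention that a skipped slot resolves to `undefined`, and the error strings are all assumptions made for the example; the real methods in the SDK may signal a skipped slot differently.

```typescript
// Hypothetical provider surface for this sketch.
interface SlotTimeSource {
  // Assumed convention: resolves to undefined when the slot was skipped.
  getTimestampForSlot(slot: number): Promise<number | undefined>;
  // Resolves to the nearest produced slot at or below `slot`, if any.
  findNearestProducedSlot(slot: number): Promise<number | undefined>;
}

async function resolveTimestampSketch(source: SlotTimeSource, targetSlot: number): Promise<number> {
  // Happy path: the target slot produced a block, so one RPC is enough.
  const direct = await source.getTimestampForSlot(targetSlot);
  if (direct !== undefined) {
    return direct;
  }
  // Target was skipped: only now pay for the windowed search.
  const nearest = await source.findNearestProducedSlot(targetSlot);
  if (nearest === undefined) {
    // Placeholder message, not the SDK's actual error text.
    throw new Error(`no produced slot found at or below ${targetSlot}`);
  }
  const fallback = await source.getTimestampForSlot(nearest);
  if (fallback === undefined) {
    throw new Error(`no block time for produced slot ${nearest}`);
  }
  return fallback;
}
```

In this arrangement a produced target costs one call, and a skipped target costs the probe, the search, and one final timestamp lookup.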
53c8055 to 204db04
@codex review
No action needed from me: that comment is a
Codex Review: Didn't find any major issues. Bravo.
Codex came back clean ("Didn't find any major issues") on the current HEAD (204db04). No further code changes warranted from this trigger. PR is ready for human review.
Approval received from pxrl on review 4371702294: no inline comments, empty body. PR is approved at HEAD 204db04. No code changes needed; ready for merge whenever you'd like.
Question, not a change request; replied inline on the PR. Short version: JS is single-threaded so the
204db04 to 53c8055
Summary
- `SVMSpokePoolClient.getTimestampForBlock()` and `getNearestSlotTime()` recover from skipped slots by decrementing `_slot` one at a time and re-calling `getBlockTime`. When a target slot lands inside a real skipped-slot run on Solana mainnet, the cost compounds: every `BundleDataClient.getBundleBlockTimestamps` lookup × NODE_QUORUM providers × propose/validate/execute pass.
- In a production dataworker run on 2026-05-27, the SVM executor primary called `getBlockTime` for 57+ minutes against the same 8 slots before the serverless-orchestration hub's `rejectSpokeDelay` fired a retry that then raced the primary.
- Replace the per-slot walk with `findNearestProducedSlot(provider, targetSlot)`: a single `getBlocks(lower, upper)` call asks the RPC which slots in a 32-slot window produced a block, then iterates the window backwards in fixed steps if the initial range is empty.
- `findNearestProducedSlot` is exported from `arch/svm/SpokeUtils.ts` and used by both `SVMSpokePoolClient.getTimestampForBlock` and `getNearestSlotTime`.
- Memoise per-block timestamp lookups inside `SVMSpokePoolClient`: bundle data loading hits the same boundary slots repeatedly across propose/validate/execute passes; storing the Promise (not the resolved value) also dedupes concurrent lookups.
Why this is safe
- `getBlocks` is a standard Solana JSON-RPC method (`GetBlocksApi` in `@solana/kit`; in the spec since v1.7 / mid-2021 when `getConfirmedBlocks` was renamed). Verified live against `api.mainnet-beta.solana.com` and shape-checked against the kit type surface.
- `getBlockTime` RPCs × 2 providers per single resolution.
- The backward search is bounded by `maxIterations` × `window` (default 32 × 32 = 1024 slots backward). If we still find nothing, the function returns `undefined` and the caller throws the same `Unable to resolve time…` error the old code did.
- `getTimestampForBlock` and `getNearestSlotTime` are unchanged; this is a pure implementation swap.
Test plan
- `yarn lint-check` clean on touched files.
- `yarn build` (tsc) clean.
- `test/Solana.findNearestProducedSlot.ts` covers: produced slot in initial window, iterate backwards when initial window is empty, return undefined when no produced slot inside the iteration bound, stop descending past slot 0.
- `yarn test`.
Follow-ups (separate PRs)
- `across-protocol/relayer`.
🤖 Generated with Claude Code