
evm-rpc: trace state diffs per-transaction when a block's whole-block response is too big #507

Open
elina-chertova wants to merge 1 commit into master from alert-fix/L8b2z3-base-sepolia-statediff-pertx

@elina-chertova

Contributor

Cause (proven)

The base-sepolia EVM dumper (dump-base-sepolia-0, ns evm-archive) stalled — block rate collapsed to 0–2 blocks/sec with a 100h+ ETA, stuck around block 42625944 (0x28a6b98).

That block is huge (4676 txns, ~28 MB callTracer). The dumper's addDebugStateDiffs issues a whole-block debug_traceBlockByNumber with prestateTracer + diffMode. Every probed provider rejects that one block's response as oversized:

{"code":-32008,"message":"Response is too big","data":"Exceeded max limit of 167772160"}

Confirmed against alchemy, dwellir (the newly-configured base-sepolia archive endpoint), and uniblock. A block batch can't be split below a single block, so the dumper retried/crash-looped on that block forever, emitting no new data.

Probing the per-transaction path (debug_traceTransaction prestateTracer/diffMode) on txns of the same block returns small responses (~4.7 KB, ~0.11s each) — so the data is fetchable, just not as one whole-block response.
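
For reference, the two request shapes being compared can be sketched as follows. The method names and the prestateTracer / diffMode options are the ones named above; the helper names (`wholeBlockStateDiffRequest`, `perTxStateDiffRequest`) and the `JsonRpcRequest` type are illustrative assumptions, not code from rpc.ts.

```typescript
// Illustrative sketch only: helper and type names are hypothetical.
interface JsonRpcRequest {
    jsonrpc: '2.0'
    id: number
    method: string
    params: unknown[]
}

// Tracer options shared by both calls: prestateTracer in diffMode.
const STATE_DIFF_TRACER = {
    tracer: 'prestateTracer',
    tracerConfig: {diffMode: true},
}

// Whole-block call: a single response covering every transaction in the block.
function wholeBlockStateDiffRequest(id: number, blockNumber: number): JsonRpcRequest {
    return {
        jsonrpc: '2.0',
        id,
        method: 'debug_traceBlockByNumber',
        params: ['0x' + blockNumber.toString(16), STATE_DIFF_TRACER],
    }
}

// Per-transaction call: one small response per transaction hash.
function perTxStateDiffRequest(id: number, txHash: string): JsonRpcRequest {
    return {
        jsonrpc: '2.0',
        id,
        method: 'debug_traceTransaction',
        params: [txHash, STATE_DIFF_TRACER],
    }
}
```

Both calls carry the same tracer options; only the method and the first parameter change, which is why the per-transaction results can be reassembled into the whole-block shape.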

Fix

In evm/evm-rpc/src/rpc.ts (@subsquid/evm-rpc, used by the dumper):

  • Detect the size-cap error (-32008 / "Response is too big" / "Exceeded max limit") in the whole-block state-diff validateError, returning a RESPONSE_TOO_BIG sentinel instead of throwing.
  • On that sentinel, fall back to traceStateDiffsPerTransaction: trace each tx individually via debug_traceTransaction and reassemble the per-block DebugStateDiffResult[] (reattaching tx hashes, since the per-tx call returns a bare diff with no envelope).
  • If any single tx still can't be traced, the block is flagged invalid for retry (existing behaviour) — no silent data loss.
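
The three steps above can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in rpc.ts: the `StateDiff`, `TxStateDiff`, and `RpcCall` types and the function names `isResponseTooBig` and `traceBlockStateDiffs` are hypothetical, the transport is injected as a plain function, and the sentinel is replaced by a simple catch for brevity.

```typescript
// Hypothetical shapes; the real types in @subsquid/evm-rpc may differ.
interface StateDiff {
    pre: Record<string, unknown>
    post: Record<string, unknown>
}

interface TxStateDiff {
    txHash: string
    result: StateDiff
}

// Injected transport so the sketch does not depend on a real RPC client.
type RpcCall = (method: string, params: unknown[]) => Promise<unknown>

const TRACER = {tracer: 'prestateTracer', tracerConfig: {diffMode: true}}

// True for the size-cap rejection quoted in this PR (code -32008 or its messages).
function isResponseTooBig(err: unknown): boolean {
    if (typeof err !== 'object' || err === null) return false
    const e = err as {code?: unknown; message?: unknown; data?: unknown}
    if (e.code === -32008) return true
    const text = `${String(e.message ?? '')} ${String(e.data ?? '')}`
    return /response is too big|exceeded max limit/i.test(text)
}

async function traceBlockStateDiffs(
    call: RpcCall,
    blockNumberHex: string,
    txHashes: string[]
): Promise<TxStateDiff[]> {
    try {
        // Preferred path: one whole-block request.
        return (await call('debug_traceBlockByNumber', [blockNumberHex, TRACER])) as TxStateDiff[]
    } catch (err) {
        // Anything other than the size cap is rethrown unchanged.
        if (!isResponseTooBig(err)) throw err
    }
    // Fallback: trace each transaction on its own. A failure here propagates,
    // so the caller can mark the block invalid and retry it.
    const out: TxStateDiff[] = []
    for (const txHash of txHashes) {
        const result = (await call('debug_traceTransaction', [txHash, TRACER])) as StateDiff
        // The per-tx response is a bare diff, so the hash is reattached here.
        out.push({txHash, result})
    }
    return out
}
```

The sketch traces transactions sequentially to keep ordering obvious; batching or bounded concurrency would be a separate design choice.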

This is distinct from #505, which retries the transient -32020 "response too large"; that handling explicitly excludes the persistent oversized single-block case fixed here.
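
The split between the two error codes can be pictured as a small classifier. This is only a sketch of the distinction drawn above: the `Recovery` type and `classifyTraceError` are hypothetical names, and the actual handling in #505 and in this PR may be structured differently.

```typescript
// Hypothetical classifier; the codes are the ones quoted in this PR.
type Recovery = 'retry' | 'per-tx-fallback' | 'fatal'

function classifyTraceError(code: number): Recovery {
    switch (code) {
        case -32020:
            // Transient "response too large": retrying the same request can succeed.
            return 'retry'
        case -32008:
            // Persistent size cap on a single block: retrying cannot help,
            // so the request is split per transaction instead.
            return 'per-tx-fallback'
        default:
            return 'fatal'
    }
}
```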

Test

Added evm/evm-rpc/test/state-diff-too-big.test.ts (+ MockRpcClient error/validateError support):

  • RED against pre-fix rpc.ts: throws RpcError: Response is too big (reproduces the production crash).
  • GREEN after fix: falls back to per-tx tracing, logs the fallback, block succeeds.
  • Full @subsquid/evm-rpc suite: 148 tests pass, tsc clean. Rush change file included.

🤖 Generated with Claude Code

… response is too big

A very large block's debug_traceBlockByNumber prestateTracer (diffMode)
response can exceed the provider's JSON-RPC response size cap (commonly
160 MiB), which geth/erigon and managed providers (alchemy, dwellir,
uniblock) reject with -32008 "Response is too big" / "Exceeded max limit".
The block batch can't be split below a single block, so the dumper
stalled/crash-looped forever on that block, emitting no new data.

Detect the size-cap error and fall back to per-transaction
debug_traceTransaction, where each response is small, then reassemble the
per-block state diff. Mirrors the existing tolerant trace paths instead of
treating an oversized response as fatal.
@elina-chertova

Contributor Author

Same root cause recurred today — fresh live evidence confirming this fix is still needed (and not yet deployed).

Alert: base-sepolia_No_Dumper_Data, onset 2026-06-27T00:20 UTC. dump-base-sepolia-0 (ns evm-archive) went silent at 00:04 UTC, stuck after raw block 42971578; chain head is 43375082 (~403k blocks behind). 0 restarts, no error logs — the dumper is wedged on the next block's whole-block state-diff exactly as described here.

Stuck block 42971579 (0x28fb1bb): 6294 txns, 712M gas, ~38 MB callTracer. Whole-block debug_traceBlockByNumber + prestateTracer(diffMode) is rejected identically by two independent providers:

  • dwellir (api-base-sepolia-archive.n.dwellir.com, the active archive endpoint): {"code":-32008,"message":"Response is too big","data":"Exceeded max limit of 167772160"}
  • alchemy (base-sepolia.g.alchemy.com): identical -32008 … Exceeded max limit of 167772160

Per-transaction debug_traceTransaction prestateTracer/diffMode on the same block returns ~4.7 KB in ~0.14s — i.e. the per-tx fallback in this PR resolves it. A provider swap cannot help (160 MiB cap is a node-level limit hit by every provider), so this code fix is the durable resolution. Requesting review/merge + an evm-dump image rebuild & deploy so base-sepolia stops re-paging.
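
A quick arithmetic check of the figures above, as a sketch. The cap and the transaction count are taken from this comment; treating every transaction as roughly 4.7 KB is an assumption (that figure comes from probing some transactions, not all of them), so the total is only a rough estimate.

```typescript
// The limit reported in the error payload, in bytes.
const CAP_BYTES = 167772160
const MIB = 1024 * 1024

// 167772160 bytes is exactly 160 MiB.
const capMiB = CAP_BYTES / MIB

// Rough total if all 6294 txns of block 42971579 were about 4.7 KB each (assumption).
const txCount = 6294
const approxPerTxBytes = 4.7 * 1024
const approxTotalMiB = (txCount * approxPerTxBytes) / MIB

console.log({capMiB, approxTotalMiB: approxTotalMiB.toFixed(1)})
```

Each per-transaction response is several orders of magnitude below the cap, so no individual request in the fallback path should come near the limit under this assumption.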

@elina-chertova

Contributor Author

Re-paged again today (2nd recurrence in ~8h). Same root cause, fix is already in this PR — just not merged/deployed yet.

Alert: base-sepolia_No_Dumper_Data, onset 2026-06-27T08:06 UTC. dump-base-sepolia-0 (ns evm-archive, image subsquid/evm-dump:b9e5131b, 0 restarts) went silent at ~08:02 UTC, wedged after block 43012892; chain head ~43258109 (~245k behind). Rate decayed 2→0 blocks/sec, no error logs — the classic whole-block state-diff wedge.

Stuck block 43012893 (0x290531d, only 21 txns / 4.9M gas): whole-block debug_traceBlockByHash + prestateTracer(diffMode) is rejected with -32008 "Response is too big" … Exceeded max limit of 167772160. Note the block is tiny: the oversized response comes from the node tracer's whole-block aggregation, not from heavy chain data.

New independent cross-check this run: besides dwellir (active archive endpoint), the official public endpoint https://sepolia.base.org returns the byte-identical -32008 … 167772160 for the same block — a different operator/node than dwellir, reconfirming a node-level cap, not a provider-config issue (a swap can't help). Per-transaction debug_traceTransaction prestateTracer/diffMode on this block's txns returns ~4.7–30 KB each — i.e. the per-tx fallback in this PR resolves it.

Requesting review/merge + an evm-dump image rebuild & deploy so base-sepolia stops re-paging.

@elina-chertova elina-chertova requested a review from tmcgroul June 29, 2026 14:54