fix(evm-rpc): keep finalizer queue contiguous so finalized head advances past probe window #517

Open
elina-chertova wants to merge 1 commit into master from alert-fix/1suVP6-evm-finalizer-gap

@elina-chertova

Contributor

Problem

For chains whose finalized head advances in large jumps (L2 / rollup batch finality, e.g. zksync-mainnet, worldchain-mainnet, alpen-testnet, robinhood-mainnet), the hot-block data service stops advancing its reported finalized head. The hotblocks_last_finalized_block metric freezes while hotblocks_last_block (the unfinalized head) keeps advancing, and on every new head the data service logs "block finalization lags behind and prevents cache purging". This fires the Portal_Hotblocks_Head_Metrics_Absent_* alert, whose condition is increase(hotblocks_last_finalized_block[12h]) == 0.

Observed in production: zksync-mainnet reported finalized frozen at 71020159 while the chain's real finalized head (eth_getBlockByNumber("finalized"), agreed by two independent providers) was 71027108 — a ~6,900-block gap that only grows.

Root cause

Finalizer.visit() in evm/evm-rpc/src/data-source/finalizer.ts bounds its probe queue by overwriting the last slot once it exceeds ~50 entries:

if (this.queue.length > 50) {
    this.queue[this.queue.length - 1] = getBlockRef(block)   // drops the middle
} else {
    this.queue.push(getBlockRef(block))
}

This keeps only the oldest ~50 unfinalized refs plus the single latest block, leaving a gap in between. The finalizer probes from the front and can only advance the finalized head through contiguous refs. When the finality lag exceeds the window, it finalizes the ~50 contiguous refs, hits the gap, and can never reach the true finalized head — so the reported finalized head freezes.
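The effect of the overwrite can be reproduced in isolation. The following is a minimal sketch, not the actual Finalizer code: BlockRef is modelled as number + hash as described in this PR, while enqueueCapped, lastContiguousNumber, and the constant CAP are hypothetical names introduced only for illustration. It feeds 120 consecutive blocks through the capped enqueue and reports how far a front-to-back walk can get before hitting the gap.

```typescript
// Simplified model of the pre-fix queue behaviour (assumption: not the real Finalizer class).
interface BlockRef {
    number: number
    hash: string
}

const CAP = 50 // mirrors the `queue.length > 50` check quoted above

// Hypothetical helper reproducing the capped enqueue: once the queue holds
// more than CAP refs, the last slot is overwritten instead of appended to.
function enqueueCapped(queue: BlockRef[], ref: BlockRef): void {
    if (queue.length > CAP) {
        queue[queue.length - 1] = ref
    } else {
        queue.push(ref)
    }
}

// Hypothetical helper: walk from the front and return the last block number
// reachable without skipping a height.
function lastContiguousNumber(queue: BlockRef[]): number | undefined {
    let last: number | undefined
    for (const ref of queue) {
        if (last !== undefined && ref.number !== last + 1) break
        last = ref.number
    }
    return last
}

const queue: BlockRef[] = []
for (let n = 1; n <= 120; n++) {
    enqueueCapped(queue, {number: n, hash: `0x${n.toString(16)}`})
}

console.log(queue.length)                 // 51
console.log(lastContiguousNumber(queue))  // 50
console.log(queue[queue.length - 1])      // the ref for block 120
```

In this model the queue ends up holding blocks 1 to 50 followed directly by block 120, so a contiguous walk stops at 50 regardless of how far the chain's finalized head has moved.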

Chains with smooth, incremental finality (Ethereum PoS, most L1s) slide within the window and are unaffected — matching the fact that only large-jump-finality networks froze.

Fix

Keep the queue contiguous — always push. The queue is now bounded by the finality lag (the range we must track anyway), and it holds tiny BlockRefs (number + hash), so memory is negligible relative to the full-block buffer the data service already keeps.
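As a rough illustration of why always pushing keeps memory bounded by the finality lag, here is a minimal sketch under stated assumptions. ContiguousQueue, advance, and size are hypothetical names, and advance takes the chain's finalized block number directly, whereas the real finalizer probes the RPC; only the "always push" behaviour in visit is taken from this PR.

```typescript
// Simplified model of the post-fix queue (assumption: not the real Finalizer class).
interface BlockRef {
    number: number
    hash: string
}

class ContiguousQueue {
    private queue: BlockRef[] = []
    private finalized: BlockRef | undefined

    // Always push, so the refs stay contiguous.
    visit(ref: BlockRef): void {
        this.queue.push(ref)
    }

    // Hypothetical stand-in for probing: consume every ref at or below the
    // chain's finalized height and drop it from the queue.
    advance(finalizedNumber: number): BlockRef | undefined {
        let reached = 0
        for (const ref of this.queue) {
            if (ref.number > finalizedNumber) break
            this.finalized = ref
            reached++
        }
        this.queue.splice(0, reached)
        return this.finalized
    }

    get size(): number {
        return this.queue.length
    }
}

const tracker = new ContiguousQueue()
for (let n = 1; n <= 200; n++) {
    tracker.visit({number: n, hash: `0x${n.toString(16)}`})
}

const head = tracker.advance(120) // finalized head jumps by 120 blocks at once
console.log(head?.number)         // 120
console.log(tracker.size)         // 80
```

After the jump, the queue holds only the 80 refs above the finalized head, which is the finality lag in this example.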

Test

Added evm/evm-rpc/src/data-source/finalizer.test.ts. The regression test feeds >50 unfinalized blocks in one batch with the finalized head beyond the old cap and asserts the finalizer reaches the true finalized head.

  • pre-fix: expected 50 to be 120 (frozen at the 50-block cap)
  • post-fix: passes; full evm-rpc suite green (154 passed, 27 skipped), tsc build and tsc --noEmit -p tsconfig.test.json clean.

Falsification

If, after deploying an evm-data-service image built from this commit, a large-jump-finality network's hotblocks_last_finalized_block still stops advancing while its RPC finalized tag advances, this fix is insufficient. (Note: running pods must be redeployed to a rebuilt image — the fix cannot take effect via restart alone, since the same code re-freezes once the lag re-exceeds the window.)

elina-chertova requested a review from tmcgroul on July 3, 2026, 02:37
@elina-chertova

Contributor Author

Related to #499 (also in finalizer.ts), but a different cause: #499 removes the finalized-only output.put that stalls head ingestion (the blockNumber % 32 lag spike); this PR fixes the queue.length > 50 overwrite that drops middle refs and freezes the reported finalized head on large-jump-finality chains. #499 leaves the queue-cap logic untouched, so the two are complementary.
