fix(evm-rpc): keep finalizer queue contiguous so finalized head advances past probe window #517
Open
elina-chertova wants to merge 1 commit into
Related to #499 (also in |
Problem
For chains whose finalized head advances in large jumps (L2 / rollup batch finality — e.g. zksync-mainnet, worldchain-mainnet, alpen-testnet, robinhood-mainnet), the hot-block data service stops advancing its reported finalized head.
hotblocks_last_finalized_block freezes while hotblocks_last_block (unfinalized head) keeps advancing, and the data service logs "block finalization lags behind and prevents cache purging" on every new head. This fires the Portal_Hotblocks_Head_Metrics_Absent_* alert (increase(hotblocks_last_finalized_block[12h]) == 0).
Observed in production: zksync-mainnet reported a finalized head frozen at 71020159 while the chain's real finalized head (eth_getBlockByNumber("finalized"), agreed by two independent providers) was 71027108, a ~6,900-block gap that only grows.
Root cause
Finalizer.visit() in evm/evm-rpc/src/data-source/finalizer.ts bounds its probe queue by overwriting the last slot once the queue exceeds ~50 entries.
This keeps only the oldest ~50 unfinalized refs plus the single latest block, leaving a gap in between. The finalizer probes from the front and can only advance the finalized head through contiguous refs. When the finality lag exceeds the window, it finalizes the ~50 contiguous refs, hits the gap, and can never reach the true finalized head, so the reported finalized head freezes.
Chains with smooth, incremental finality (Ethereum PoS, most L1s) slide within the window and are unaffected — matching the fact that only large-jump-finality networks froze.
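To make the failure mode concrete, here is a simplified model of the queue behaviour described above. It is a sketch, not the actual Finalizer code: the names enqueue, advance, and simulate are hypothetical, the exact cap comparison is an assumption, and real probing goes through RPC rather than a plain number comparison.

```typescript
// Simplified model for illustration only; not the code in finalizer.ts.
interface BlockRef {
  number: number
  hash: string
}

// Append a ref to the probe queue.
// With a numeric cap, the last slot is overwritten once the queue has grown
// past the cap (assumed shape of the old behaviour). With cap = null, every
// ref is pushed, so the queue stays contiguous.
function enqueue(queue: BlockRef[], ref: BlockRef, cap: number | null): void {
  if (cap !== null && queue.length > cap) {
    queue[queue.length - 1] = ref
  } else {
    queue.push(ref)
  }
}

// Advance the reported finalized head through contiguous refs at the front of
// the queue, stopping at the first gap or at the chain's finalized head.
function advance(queue: BlockRef[], reported: number, chainFinalized: number): number {
  let head = reported
  while (queue.length > 0) {
    const next = queue[0]
    if (next.number !== head + 1 || next.number > chainFinalized) break
    head = next.number
    queue.shift()
  }
  return head
}

// Feed blocks 1..blocks in one batch, then try to advance to chainFinalized.
function simulate(blocks: number, chainFinalized: number, cap: number | null): number {
  const queue: BlockRef[] = []
  for (let n = 1; n <= blocks; n++) {
    enqueue(queue, { number: n, hash: `0x${n.toString(16)}` }, cap)
  }
  return advance(queue, 0, chainFinalized)
}

const capped = simulate(120, 120, 50)       // stalls at the gap after block 50
const contiguous = simulate(120, 120, null) // reaches block 120
console.log({ capped, contiguous })
```

In this model the capped queue ends up holding blocks 1..50 plus block 120, so the head stops at 50, while the contiguous queue lets it reach 120.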
Fix
Keep the queue contiguous: always push. The queue is now bounded by the finality lag (the range we must track anyway), and it holds tiny BlockRefs (number + hash), so memory is negligible relative to the full-block buffer the data service already keeps.
Test
Added evm/evm-rpc/src/data-source/finalizer.test.ts. The regression test feeds >50 unfinalized blocks in one batch with the finalized head beyond the old cap and asserts the finalizer reaches the true finalized head.
"expected 50 to be 120" (frozen at the 50-block cap)
evm-rpc suite green (154 passed, 27 skipped), tsc build and tsc --noEmit -p tsconfig.test.json clean.
Falsification
If, after deploying an evm-data-service image built from this commit, a large-jump-finality network's hotblocks_last_finalized_block still stops advancing while its RPC finalized tag advances, this fix is insufficient. (Note: running pods must be redeployed to a rebuilt image; the fix cannot take effect via restart alone, since the same code re-freezes once the lag re-exceeds the window.)
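The falsification condition can be written as a small check over two observations taken some time apart. This is a minimal sketch under assumptions: Sample, parseBlockNumber, and isFrozen are hypothetical names, and how the metric value and the RPC response are fetched is left out. The only external fact relied on is that eth_getBlockByNumber returns the block number as a hex string.

```typescript
// One observation of a network, taken at a single point in time.
interface Sample {
  reportedFinalized: number // value of hotblocks_last_finalized_block
  rpcFinalized: number      // block number from eth_getBlockByNumber("finalized")
}

// Parse the hex-encoded block number from an eth_getBlockByNumber result.
function parseBlockNumber(block: { number: string }): number {
  return parseInt(block.number, 16)
}

// True when the RPC finalized tag advanced between the two samples
// but the reported finalized head did not.
function isFrozen(before: Sample, after: Sample): boolean {
  const rpcAdvanced = after.rpcFinalized > before.rpcFinalized
  const reportedAdvanced = after.reportedFinalized > before.reportedFinalized
  return rpcAdvanced && !reportedAdvanced
}

// The later sample uses the zksync-mainnet numbers from this description;
// the earlier RPC value (71027000) is made up for illustration.
const rpcNow = parseBlockNumber({ number: '0x43bc9a4' })
const frozen = isFrozen(
  { reportedFinalized: 71020159, rpcFinalized: 71027000 },
  { reportedFinalized: 71020159, rpcFinalized: rpcNow },
)
console.log({ rpcNow, frozen })
```

If isFrozen still returns true for a large-jump-finality network after the rebuilt image is deployed, that matches the condition above.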