
Delay before reconnecting to the worker NET-883 #207

Merged: kalabukdima merged 1 commit into main from fix/portal-reconnect-cooldown on Jun 17, 2026


@kalabukdima kalabukdima commented Jun 17, 2026

Collaborator

Problem

The portal was burning a lot of CPU and running out of memory (OOM). Root cause: when many workers refuse connections (NET-883), each refusal closes the connection, and on_connection_closed redialed immediately. The dial→refuse→redial loop ran at QUIC-handshake speed (transport_libp2p_identify_errors_total climbed at ~4,100/sec), churning anonymous heap memory that jemalloc retained until the process hit OOM.

Fix

Add a flat per-peer reconnect cooldown between a connection close and its redial. Default 30s, override via RECONNECT_COOLDOWN_SEC.
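
For illustration only, here is a minimal sketch of what a flat per-peer cooldown could look like. It is not the code from this PR: `ReconnectCooldown`, `cooldown_from`, `can_dial`, and the use of `String` as a peer ID are hypothetical names chosen for the example. Only the 30s default and the `RECONNECT_COOLDOWN_SEC` variable come from the description above. The sketch assumes that an unparseable value falls back to the default.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Default taken from the PR description; everything else here is hypothetical.
const DEFAULT_COOLDOWN_SEC: u64 = 30;

/// Parse an optional RECONNECT_COOLDOWN_SEC value, falling back to the default
/// when the value is missing or not a valid unsigned integer (an assumption).
fn cooldown_from(raw: Option<&str>) -> Duration {
    let secs = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_COOLDOWN_SEC);
    Duration::from_secs(secs)
}

/// Tracks, per peer, the earliest instant at which a redial is allowed.
struct ReconnectCooldown {
    cooldown: Duration,
    // String stands in for a real peer ID type.
    next_dial_at: HashMap<String, Instant>,
}

impl ReconnectCooldown {
    fn new(cooldown: Duration) -> Self {
        Self { cooldown, next_dial_at: HashMap::new() }
    }

    /// Record a connection close and return when the peer may be dialed again.
    fn on_connection_closed(&mut self, peer: &str, now: Instant) -> Instant {
        let at = now + self.cooldown;
        self.next_dial_at.insert(peer.to_owned(), at);
        at
    }

    /// True if the peer has no recorded close or its cooldown has elapsed.
    fn can_dial(&self, peer: &str, now: Instant) -> bool {
        self.next_dial_at.get(peer).map_or(true, |at| now >= *at)
    }
}

fn main() {
    let raw = std::env::var("RECONNECT_COOLDOWN_SEC").ok();
    let mut cd = ReconnectCooldown::new(cooldown_from(raw.as_deref()));

    let t0 = Instant::now();
    let retry_at = cd.on_connection_closed("worker-1", t0);

    println!("can dial immediately: {}", cd.can_dial("worker-1", t0));
    println!("can dial at retry_at: {}", cd.can_dial("worker-1", retry_at));
}
```

The cooldown is flat rather than exponential, matching the description: every close schedules the next dial a fixed interval later, which caps the redial rate at one attempt per peer per cooldown period regardless of how quickly the worker refuses.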

@kalabukdima kalabukdima force-pushed the fix/portal-reconnect-cooldown branch from 05e964d to d8829ba Compare June 17, 2026 07:21
@kalabukdima kalabukdima changed the title from "fix(transport): cool down worker reconnects to stop identify-error storm" to "Delay before reconnecting to the worker" Jun 17, 2026
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@kalabukdima kalabukdima force-pushed the fix/portal-reconnect-cooldown branch from d8829ba to 73a5266 Compare June 17, 2026 07:23
@kalabukdima kalabukdima marked this pull request as ready for review June 17, 2026 07:30
@kalabukdima kalabukdima changed the title from "Delay before reconnecting to the worker" to "Delay before reconnecting to the worker NET-883" Jun 17, 2026
@kalabukdima kalabukdima merged commit e9a3434 into main Jun 17, 2026
3 checks passed
@kalabukdima kalabukdima deleted the fix/portal-reconnect-cooldown branch June 17, 2026 11:45