fix(chat): keep MCP server connections alive across their lifetime #42
Merged
ReconcileMCPServers connected config MCP servers with a 30s-timeout context and called cancel() immediately after Connect() returned.

The go-sdk's mcp.Client.Connect stores the context it is given for the connection's ENTIRE lifetime (StreamableClientTransport.Connect derives a cancellable context from it to drive the background "hanging GET" SSE listener and reconnect machinery), not just the initial handshake. Cancelling that context right after connect tore down the background SSE listener, so a later tool call on an otherwise-connected, healthy server that needed to re-establish its SSE stream failed with the real, user-reported error chain:

    connection closed: calling "tools/call": client is closing: hanging GET: failed to reconnect (session ID: ...): connection failed after 5 attempts: Get "https://<server>": context canceled

Fix: hand Connect the session's own long-lived context (s.ctx) directly with no per-connect timeout, mirroring the built-in host-tool connect path in NewSession, which has never exhibited this bug. Deleting only the cancel() call would not suffice: the WithTimeout wrapper would itself auto-cancel the connection 30s after connecting.

Adds a regression test that stands up a real streamable-HTTP MCP server and asserts the background hanging-GET SSE stream is not torn down once ReconcileMCPServers returns (it fails on the pre-fix code, passes now).

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
Summary
- `ReconcileMCPServers` connected user-configured MCP servers with a 30s-timeout context and called `cancel()` immediately after `Connect()` returned.
- `mcp.Client.Connect` stores the context it's given for the connection's entire lifetime, not just the initial handshake: `StreamableClientTransport.Connect` derives a cancellable context from it to drive the background "hanging GET" SSE listener and reconnect machinery.
- Fix: hand `Connect` the session's own long-lived context (`s.ctx`) directly, with no per-connect timeout, mirroring the built-in host-tool connect path in `NewSession`, which has never exhibited this bug. Deleting only the `cancel()` call would not have been sufficient: the `WithTimeout` wrapper itself would still auto-cancel the connection 30s after connecting.

Testing
- `TestReconcileMCPServersKeepsConnectionAliveForToolCalls` (`chat/session_reload_test.go`): stands up a real streamable-HTTP MCP server and asserts the background hanging-GET SSE stream is not torn down once `ReconcileMCPServers` returns. Confirmed it fails on the pre-fix code and passes on the fix.
- `go build ./...`, `go vet ./...`, `go test ./...`: 100% green across all packages.

🤖 Generated with Claude Code