Make executor mcp ensure a durable daemon and bridge to it#1196

Open
RhysSullivan wants to merge 1 commit into main from phase1-mcp-daemon

@RhysSullivan

Owner

What

executor mcp no longer owns the local database. It ensures a durable, detached daemon and bridges stdio JSON-RPC to it over HTTP. Concurrent cold starts run a race-safe election: exactly one process becomes the owner, and the rest wait for its manifest and attach rather than failing. The owner's lifetime is independent of any MCP client, so multiple MCP clients, the web UI, and the desktop app all share one local server.

Supersedes #1033, which had the first executor mcp process start a server in-process and bridge to itself. That tied the shared owner's lifetime to a transient client: when it exited, the server everyone else attached to went down. This builds on the merged start-lock primitive instead, so no client ever owns the database.
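
The one-owner property described above can be illustrated with an exclusive file create. This is a minimal sketch, not the merged start-lock primitive itself (which is not shown in this PR description): the lock file name, the `tryAcquireStartLock` helper, and the sequential simulation are all assumptions made for illustration.

```typescript
import { closeSync, mkdtempSync, openSync, rmSync, writeSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical lock file name; the real primitive may use a different scheme.
const LOCK_FILE = "daemon-start.lock";

type Election = { role: "owner"; release: () => void } | { role: "waiter" };

// Try to become the owner by creating the lock file exclusively.
// The "wx" flag makes the open fail with EEXIST when the file already exists,
// so at most one contender can succeed for a given data directory.
function tryAcquireStartLock(dataDir: string): Election {
  const lockPath = join(dataDir, LOCK_FILE);
  try {
    const fd = openSync(lockPath, "wx");
    writeSync(fd, String(process.pid));
    closeSync(fd);
    return { role: "owner", release: () => rmSync(lockPath, { force: true }) };
  } catch (error) {
    // Contention is an expected outcome, not a failure: this contender waits.
    if ((error as NodeJS.ErrnoException).code === "EEXIST") {
      return { role: "waiter" };
    }
    throw error;
  }
}

// Run several contenders, one after another, against one fresh data directory.
// Real clients race across processes; this only shows the exclusivity rule.
function simulateColdStart(contenders: number): { owners: number; waiters: number } {
  const dataDir = mkdtempSync(join(tmpdir(), "election-sketch-"));
  try {
    const results = Array.from({ length: contenders }, () => tryAcquireStartLock(dataDir));
    return {
      owners: results.filter((result) => result.role === "owner").length,
      waiters: results.filter((result) => result.role === "waiter").length,
    };
  } finally {
    rmSync(dataDir, { recursive: true, force: true });
  }
}
```

Treating `EEXIST` as a "wait" result rather than an error is what lets the losing clients attach instead of failing.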

Tests

  • e2e/local/cli-mcp-daemon-attach-stress.test.ts: cold-start race, attach storm, and kill-under-load against the local dev server.
  • e2e/cli/election-cold-start.test.ts: fires N simultaneous clients at one cold data dir on the cli VM targets and asserts exactly one daemon is elected and every client attaches.
  • e2e/cli/election-cold-start.win.ps1: the same election proven on real Windows.

Verified the one-winner, rest-attach behavior on macOS, Linux, and Windows:

cold  ok=6 n=6 spawned=1 manifests=1 health=200
warm  ok=6 n=6 spawned=0 manifests=1 health=200

(cold: one client spawns the daemon, the other five attach, all round-trip; warm: all six attach, none spawn.)
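
For readers who want to check summary lines like the two above mechanically, here is a small sketch of a parser and invariant check. The field meanings are taken from the parenthetical note; the `parseWaveSummary` and `satisfiesElectionInvariant` names are hypothetical and are not part of the test suite.

```typescript
type WaveSummary = {
  wave: string;
  ok: number;
  n: number;
  spawned: number;
  manifests: number;
  health: number;
};

// Parse a line such as "cold  ok=6 n=6 spawned=1 manifests=1 health=200".
function parseWaveSummary(line: string): WaveSummary {
  const tokens = line.trim().split(/\s+/);
  const fields = new Map<string, number>();
  for (const token of tokens.slice(1)) {
    const eq = token.indexOf("=");
    if (eq > 0) fields.set(token.slice(0, eq), Number(token.slice(eq + 1)));
  }
  const get = (key: string): number => {
    const value = fields.get(key);
    if (value === undefined || Number.isNaN(value)) {
      throw new Error(`missing or invalid field: ${key}`);
    }
    return value;
  };
  return {
    wave: tokens[0] ?? "",
    ok: get("ok"),
    n: get("n"),
    spawned: get("spawned"),
    manifests: get("manifests"),
    health: get("health"),
  };
}

// One winner on a cold wave, no spawns on a warm wave, and every client succeeds.
function satisfiesElectionInvariant(summary: WaveSummary): boolean {
  const expectedSpawned = summary.wave === "cold" ? 1 : 0;
  return (
    summary.ok === summary.n &&
    summary.spawned === expectedSpawned &&
    summary.manifests === 1 &&
    summary.health === 200
  );
}
```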

executor mcp no longer starts a server in-process. It ensures a durable
detached daemon and bridges stdio JSON-RPC to that owner over HTTP.
Concurrent cold starts run a race-safe election: one process becomes the
owner and the rest wait for its manifest and attach instead of failing.
The owner's lifetime is independent of any MCP client, so many clients,
the web UI, and the desktop app share one local server.

This builds on the merged start-lock primitive so no client ever owns the
database, replacing the earlier approach where the first mcp process
started a server in-process and bridged to itself.

Adds a cold-start election probe across the cli VM targets plus a local
attach stress test; the one-winner, rest-attach behavior is verified on
macOS, Linux, and Windows.
@cloudflare-workers-and-pages

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status: ✅ Deployment successful! (View logs)
Name: executor-marketing
Latest Commit: ee75f4a
Preview URL: Commit Preview URL, Branch Preview URL
Updated (UTC): Jun 28 2026, 11:24 PM

@cloudflare-workers-and-pages

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status: ❌ Deployment failed (View logs)
Name: executor-cloud
Latest Commit: ee75f4a
Updated (UTC): Jun 28 2026, 11:24 PM

@github-actions

Contributor

Cloudflare preview

Console https://executor-preview-pr-1196.executor-e2e.workers.dev
MCP https://executor-preview-pr-1196.executor-e2e.workers.dev/mcp
Deployed commit ee75f4a

Sign-in is Cloudflare Access (one-time PIN to an allowed email). The preview has its own database and encryption key; it is destroyed when this PR closes.

@greptile-apps

greptile-apps Bot commented Jun 28, 2026

Greptile Summary

This PR replaces the in-process MCP server in executor mcp with a pure stdio-to-HTTP bridge that forwards JSON-RPC to a durable detached daemon, decoupling the server lifetime from any transient MCP client. It also hardens concurrent cold starts with a race-safe election loop: one process wins the start lock and spawns the daemon, while the rest wait for its manifest and attach rather than failing.

  • apps/cli/src/main.ts: removes runMcpStdioServer / getExecutor ownership, adds runMcpHttpBridge (bidirectional stdio ↔ Streamable-HTTP forwarding with clean shutdown), and replaces the old lock-then-spawn path with a 3-attempt election loop (spawnAndWaitForDaemon + spawnDaemonAsLockHolder) that treats lock-contention errors as a "wait for winner" signal rather than a hard failure.
  • e2e/local/cli-mcp-daemon-attach-stress.test.ts and e2e/cli/election-cold-start.test.ts: new stress and cross-OS election proofs covering attach-storm, cold-start race, kill-under-load, and the one-winner invariant on Linux, macOS, and Windows.
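
The bidirectional forwarding with clean shutdown summarized above can be sketched as follows. This is not the PR's `runMcpHttpBridge`; `MessageTransport` is a local stand-in loosely modeled on the MCP SDK's transport shape so the example runs without the SDK, and `bridgeTransports` and `RecordingTransport` are hypothetical names.

```typescript
// Minimal transport shape used only for this sketch (an assumption, not the SDK type).
interface MessageTransport<M> {
  onmessage?: (message: M) => void;
  onclose?: () => void;
  send(message: M): Promise<void>;
  close(): Promise<void>;
}

// Forward every message from one side to the other and tear both down together.
function bridgeTransports<M>(
  stdio: MessageTransport<M>,
  http: MessageTransport<M>,
  onError: (error: unknown) => void = () => {},
): () => Promise<void> {
  let closed = false;

  const shutdown = async (): Promise<void> => {
    if (closed) return; // idempotency guard: either side may trigger shutdown
    closed = true;
    await Promise.all([stdio.close().catch(onError), http.close().catch(onError)]);
  };

  stdio.onmessage = (message) => {
    http.send(message).catch(onError);
  };
  http.onmessage = (message) => {
    stdio.send(message).catch(onError);
  };
  stdio.onclose = () => {
    void shutdown();
  };
  http.onclose = () => {
    void shutdown();
  };

  return shutdown;
}

// In-memory transport that records what it was asked to send.
class RecordingTransport implements MessageTransport<string> {
  onmessage?: (message: string) => void;
  onclose?: () => void;
  sent: string[] = [];
  closeCalls = 0;

  async send(message: string): Promise<void> {
    this.sent.push(message);
  }

  async close(): Promise<void> {
    this.closeCalls += 1;
    this.onclose?.(); // closing one side notifies the bridge, which is already guarded
  }
}
```

The `closed` flag matters because closing either transport fires its `onclose`, which would otherwise re-enter `shutdown` and close the peer a second time.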

Confidence Score: 4/5

The production path in main.ts is well-structured: the election loop, lock release, and bridge teardown all look correct. The one concrete defect is in the new stress test's cleanup helper, which leaves orphan daemons behind on test runs but does not affect the shipped CLI.

The bridge and election logic in main.ts is carefully written with proper idempotency guards and lock-release-on-failure semantics. The stress test file has a missing readFileSync import that causes its daemon cleanup to silently no-op, so repeated local test runs may accumulate orphan processes — worth fixing before merging.

e2e/local/cli-mcp-daemon-attach-stress.test.ts — missing import breaks the daemon cleanup finalizer

Important Files Changed

Filename and overview:

  • apps/cli/src/main.ts: Core change. Replaces the in-process MCP server with a pure stdio-to-HTTP bridge (runMcpHttpBridge) and adds a race-safe daemon election loop (spawnAndWaitForDaemon). Logic is sound; shutdown, idempotency guards, and lock release look correct.
  • e2e/local/cli-mcp-daemon-attach-stress.test.ts: New stress tests for attach-storm, cold-start race, and kill-under-load. Contains a missing readFileSync import that silently breaks daemon cleanup, potentially leaving orphan daemons after test runs.
  • e2e/cli/election-cold-start.test.ts: New cross-OS e2e proof of the daemon election: fires N simultaneous CLI clients at a cold data dir over SSH, asserts exactly one daemon elected and all clients succeed. Clean structure with proper cold/warm wave separation.
  • e2e/cli/election-cold-start.win.ps1: PowerShell companion proving the same election on Windows. Uses Start-Process + WaitForExit idiom; gracefully notes the EC2 ExitCode limitation and uses stdout content as the success signal instead.
  • apps/cli/package.json: Adds @modelcontextprotocol/sdk ^1.29.0 as a direct dependency for the new bridge transports.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant C1 as executor mcp (client 1)
    participant C2 as executor mcp (client 2..N)
    participant FS as Filesystem (start-lock + manifest)
    participant D as Daemon (detached)

    par Cold-start race
        C1->>FS: acquireDaemonStartLock()
        C2->>FS: acquireDaemonStartLock()
    end

    FS-->>C1: lock acquired (winner)
    FS-->>C2: contention error (loser)

    C1->>D: spawnDetached()
    D-->>FS: write server manifest
    D-->>C1: health reachable

    C2->>FS: waitForDaemonStartupTarget (polls manifest)
    FS-->>C2: manifest found

    C1->>FS: releaseDaemonStartLock()

    C1->>FS: readActiveLocalServerManifest()
    FS-->>C1: manifest (URL + auth token)
    C2->>FS: readActiveLocalServerManifest()
    FS-->>C2: manifest (URL + auth token)

    par Bridge stdio to HTTP
        C1->>D: StreamableHTTPClientTransport /mcp
        C2->>D: StreamableHTTPClientTransport /mcp
    end

    Note over C1,D: MCP client stdin/stdout bridged to daemon over HTTP
    Note over C1,C2: Daemon lifetime independent of any MCP client
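
The flow in the diagram can be sketched as a bounded election loop. This is a sketch under stated assumptions, not the code in main.ts: `ensureDaemon`, the `ElectionDeps` shape, the poll count, and the `Manifest` fields are hypothetical, and the real helpers (`spawnAndWaitForDaemon`, `waitForDaemonStartupTarget`) may be structured differently. The three-attempt default mirrors the summary above.

```typescript
type Manifest = { url: string };

// Everything the loop needs is injected so the sketch stays self-contained.
interface ElectionDeps {
  tryAcquireLock: () => boolean; // true when this process won the start lock
  releaseLock: () => void;
  spawnDaemon: () => Promise<void>; // resolves once the daemon reports healthy
  readManifest: () => Manifest | undefined; // undefined until a manifest is published
  sleep: () => Promise<void>;
}

async function ensureDaemon(
  deps: ElectionDeps,
  maxAttempts = 3,
  pollsPerAttempt = 20,
): Promise<Manifest> {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    // Warm path: a daemon is already running, so just attach.
    const existing = deps.readManifest();
    if (existing) return existing;

    if (deps.tryAcquireLock()) {
      // Winner: spawn the daemon, releasing the lock on success and on failure.
      try {
        await deps.spawnDaemon();
      } finally {
        deps.releaseLock();
      }
    }

    // Winner and losers alike end by reading the manifest the daemon wrote.
    // Losing the lock is treated as "wait for the winner", not as an error.
    for (let poll = 0; poll < pollsPerAttempt; poll += 1) {
      const manifest = deps.readManifest();
      if (manifest) return manifest;
      await deps.sleep();
    }
  }
  throw new Error(`no daemon manifest after ${maxAttempts} attempts`);
}
```

Retrying the whole loop covers the case where the winner dies before publishing a manifest: a later attempt can then acquire the released lock and spawn a replacement.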

Reviews (1): Last reviewed commit: "feat(cli): make executor mcp ensure a du..."

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";
import { Effect } from "effect";
import { mkdtempSync, readdirSync, rmSync } from "node:fs";

P1 readFileSync is used in stopAutoSpawnedDaemon but is not present in the node:fs import. At runtime this throws a ReferenceError, which the surrounding try/catch silently swallows, so the auto-spawned daemon is never sent SIGTERM. Subsequent rmSync still removes the data directory, leaving an orphan daemon process behind — accumulating across repeated test runs.

Suggested change
- import { mkdtempSync, readdirSync, rmSync } from "node:fs";
+ import { mkdtempSync, readdirSync, readFileSync, rmSync } from "node:fs";
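
Beyond adding the import, the cleanup helper could avoid hiding this class of bug by swallowing only the errors it expects. The sketch below is an assumption about what such a helper might look like, not the actual `stopAutoSpawnedDaemon`: the manifest file name, its `pid` field, and the injectable `kill` parameter are hypothetical.

```typescript
import { mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical manifest file name and shape, for illustration only.
const MANIFEST_FILE = "server.json";

type StopResult = "signalled" | "no-manifest" | "already-exited";

function stopDaemonFromManifest(
  dataDir: string,
  kill: (pid: number, signal: NodeJS.Signals) => void = (pid, signal) => {
    process.kill(pid, signal);
  },
): StopResult {
  let raw: string;
  try {
    raw = readFileSync(join(dataDir, MANIFEST_FILE), "utf8");
  } catch (error) {
    // Only a missing manifest is expected; anything else should surface.
    if ((error as NodeJS.ErrnoException).code === "ENOENT") return "no-manifest";
    throw error;
  }

  const pid = (JSON.parse(raw) as { pid?: unknown }).pid;
  if (typeof pid !== "number") throw new Error("manifest has no numeric pid");

  try {
    kill(pid, "SIGTERM");
    return "signalled";
  } catch (error) {
    // ESRCH means the process is already gone, which is fine during cleanup.
    if ((error as NodeJS.ErrnoException).code === "ESRCH") return "already-exited";
    throw error;
  }
}

// Usage: stop the daemon first, then remove the data directory.
// A recording kill stands in for process.kill so no real process is signalled.
function demoCleanup(): { result: StopResult; signalled: Array<[number, string]> } {
  const dataDir = mkdtempSync(join(tmpdir(), "cleanup-sketch-"));
  const signalled: Array<[number, string]> = [];
  try {
    writeFileSync(join(dataDir, MANIFEST_FILE), JSON.stringify({ pid: 4242 }));
    const result = stopDaemonFromManifest(dataDir, (pid, signal) => {
      signalled.push([pid, signal]);
    });
    return { result, signalled };
  } finally {
    rmSync(dataDir, { recursive: true, force: true });
  }
}
```

With a narrow catch like this, a `ReferenceError` from a missing import would fail the test run instead of leaving an orphan daemon behind.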

@pkg-pr-new

pkg-pr-new Bot commented Jun 28, 2026

@executor-js/cli

npm i https://pkg.pr.new/@executor-js/cli@1196

@executor-js/config

npm i https://pkg.pr.new/@executor-js/config@1196

@executor-js/execution

npm i https://pkg.pr.new/@executor-js/execution@1196

@executor-js/sdk

npm i https://pkg.pr.new/@executor-js/sdk@1196

@executor-js/codemode-core

npm i https://pkg.pr.new/@executor-js/codemode-core@1196

@executor-js/runtime-quickjs

npm i https://pkg.pr.new/@executor-js/runtime-quickjs@1196

@executor-js/plugin-file-secrets

npm i https://pkg.pr.new/@executor-js/plugin-file-secrets@1196

@executor-js/plugin-graphql

npm i https://pkg.pr.new/@executor-js/plugin-graphql@1196

@executor-js/plugin-keychain

npm i https://pkg.pr.new/@executor-js/plugin-keychain@1196

@executor-js/plugin-mcp

npm i https://pkg.pr.new/@executor-js/plugin-mcp@1196

@executor-js/plugin-onepassword

npm i https://pkg.pr.new/@executor-js/plugin-onepassword@1196

@executor-js/plugin-openapi

npm i https://pkg.pr.new/@executor-js/plugin-openapi@1196

executor

npm i https://pkg.pr.new/executor@1196

commit: ee75f4a
