Skip to content

fix(mcp): delegate_research runs out of the box (baseUrl default + researcher harness/model/BYOK env)#302

Merged
drewstone merged 4 commits into
mainfrom
fix/mcp-delegate-e2e
Jun 14, 2026
Merged

fix(mcp): delegate_research runs out of the box (baseUrl default + researcher harness/model/BYOK env)#302
drewstone merged 4 commits into
mainfrom
fix/mcp-delegate-e2e

Conversation

@drewstone

Copy link
Copy Markdown
Contributor

Problem

agent-runtime-mcp's delegate_research could not complete a run in a fresh environment:

  1. Server crashes on startup without SANDBOX_BASE_URL@tangle-network/sandbox ≥0.6 makes baseUrl required, but loadSandboxClient only passed it when the env var was set.
  2. Researcher "produces no winner" instantly — the delegate built the profile with no harness/model and no provider auth. The profile's default harness (opencode/zai-coding-plan/glm-5.1) isn't broadly provisionable, and the sandbox SDK does not wire backend.model.apiKey into the in-box agent's OpenAI-compatible provider, so the agent made zero LLM calls.

Fix

  • Default SANDBOX_BASE_URL to the public Tangle sandbox so the server starts.
  • Researcher delegate now passes a working harness + model and injects the router creds as box env (OPENAI_API_KEY / OPENAI_BASE_URL). All env-overridable: MCP_RESEARCHER_HARNESS, MCP_RESEARCHER_MODEL (MCP_WORKER_MODEL fallback), MCP_RESEARCHER_ROUTER_BASE_URL; TANGLE_API_KEY doubles as the router key.

Evidence

With these, a researcher run gets past the instant "no winner" and runs a real multi-minute agent turn; the equivalent wiring produced a cited research dossier end-to-end. Typecheck clean.

Note

Full e2e via the sibling-sandbox executor is still gated on worker-box provisioning (the known deployed-sandbox/gateway issue, separate from this PR). This PR fixes the code-level layers (startup, auth, harness/model).

🤖 Generated with Claude Code

…YOK env so delegate_research runs out of the box

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Auto-approved PR — d1adaece

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T22:33:53Z

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Value Audit — better-approach-exists

Verdict better-approach-exists
Concerns 7 (1 medium-concern, 6 weak-concern)
Heuristic 0.0s
Duplication 0.0s
Interrogation 257.8s (2 bridge agents)
Total 257.8s

💰 Value — better-approach-exists

The change makes single-variant delegate_research start and make LLM calls by defaulting the sandbox URL and injecting a provisionable harness/model plus router creds, but it leaves the multi-variant fanout path on the broken defaults, so the fix is incomplete rather than fully out-of-the-box.

  • What it does: In src/mcp/bin.ts: (1) defaults SANDBOX_BASE_URL to https://sandbox.tangle.tools when constructing the sandbox client (line 357); (2) reads env overrides MCP_RESEARCHER_HARNESS, MCP_RESEARCHER_MODEL (with MCP_WORKER_MODEL fallback), and MCP_RESEARCHER_ROUTER_BASE_URL, defaulting to opencode, moonshotai/kimi-k2.6, and https://router.tangle.tools/v1 (lines 410-415); (3) introduce
  • Goals it achieves: Make delegate_research usable in a fresh environment without manually setting SANDBOX_BASE_URL or tuning researcher harness/model/auth; prevent the MCP server from crashing on startup because @tangle-network/sandbox >=0.6 requires baseUrl; prevent the researcher from producing no winner because the profile default harness is not broadly provisionable and the sandbox SDK does not forward `b
  • Assessment: The single-variant fix is coherent and follows codebase patterns: defaulting the sandbox URL matches src/mcp/delegation-profile.ts:48-79, and env-overridable harness/model follows the same style as MCP_CODER_FANOUT_HARNESSES (src/mcp/bin.ts:27-91). However, the change is incomplete. The multi-variant fanout path at src/mcp/bin.ts:502 still calls fanoutFactory({ task }) without passing th
  • Better / existing approach: Apply the same router-cred injection to every AgentRunSpec in the fanout result so multi-variant research runs also work out of the box. I checked src/mcp/bin.ts:367-372 (ResearcherFanoutPreset exposes agentRuns) and src/mcp/bin.ts:502 (fanoutFactory({ task }) is unchanged). The simplest fix is to map over fanout.agentRuns after fanoutFactory returns and overlay the same `sandboxOv

🎯 Usefulness — sound-with-nits

Fixes the researcher's single-variant path so it starts, authenticates, and calls an LLM out of the box, but leaves the multi-variant fanout path with the same broken defaults and introduces env/url conventions that diverge from the repo's shared router-resolution helper.

  • Integration: The new behavior is reachable: loadResearcherSupport in src/mcp/bin.ts:379 is invoked by main when the researcher is enabled; its delegate is registered as delegate_research in src/mcp/server.ts:189, and production callers mount the stdio MCP entry via buildDelegationMcpServer in src/mcp/delegation-profile.ts:69. The SANDBOX_BASE_URLdefault insrc/mcp/bin.ts:357` resolves the sand
  • Fit with existing patterns: It mirrors the coder delegate's profile-factory usage (coderProfile({ task, harness, model }) in src/mcp/delegates.ts:198). However, it hardcodes https://router.tangle.tools/v1 and reads a new MCP_RESEARCHER_ROUTER_BASE_URL env instead of reusing resolveRouterBaseUrl/TANGLE_ROUTER_BASE_URL from src/model-resolution.ts:36-48, and the MCP_WORKER_MODEL fallback (src/mcp/bin.ts:412)
  • Real-world viability: Holds up for the default single-variant path (including detached and streaming dispatch) because the harness/model/router constants are resolved once per process. It does not apply the same wiring to the variants > 1 fanout path (src/mcp/bin.ts:502), which still uses the non-provisionable default harnesses and gets no router env injection. The sandboxOverrides.env assignment in `src/mcp/bin.

💰 Value Audit

🟠 Multi-variant researcher fanout still uses the broken defaults [better-architecture] ``

The fanout path at src/mcp/bin.ts:502 calls fanoutFactory({ task }) without the new harness/model defaults and without injecting OPENAI_API_KEY/OPENAI_BASE_URL into each returned agentRun. So variants > 1 research delegations still use the non-provisionable default harness and make zero LLM calls, directly contradicting the PR title that delegate_research runs out of the box. The better approach is to post-process fanout.agentRuns with the same sandboxOverrides.env injectio

🟡 Resume prompt builder ignores the new harness/model defaults [against-grain] ``

resume.message at src/mcp/bin.ts:528 calls singleFactory({ task }) without passing harness/model and without the router-env injection. While taskToPrompt may not currently depend on them, this path is inconsistent with the buildPreset used for dispatch and could drift. It should use the same preset construction logic as the single-variant dispatch path.

🟡 Router base URL default is not reused from existing resolution [duplication] ``

The new default https://router.tangle.tools/v1 at src/mcp/bin.ts:415 duplicates the router URL constants used across bench files. src/model-resolution.ts:36-48 already centralizes router-base resolution via TANGLE_ROUTER_URL/TANGLE_ROUTER_BASE_URL and DEFAULT_ROUTER_BASE_URL. The env var MCP_RESEARCHER_ROUTER_BASE_URL is reasonable, but it should fall back to the existing TANGLE_ROUTER_BASE_URL/TANGLE_ROUTER_URL env vars before hardcoding the URL.

🎯 Usefulness Audit

🟡 Researcher router URL reinvents the project's model-resolution convention [fit] ``

src/mcp/bin.ts:415 hardcodes https://router.tangle.tools/v1 and reads MCP_RESEARCHER_ROUTER_BASE_URL, while src/model-resolution.ts:36-48 already resolves router bases from TANGLE_ROUTER_URL/TANGLE_ROUTER_BASE_URL. Reusing that helper and appending /v1 would align the MCP child with the rest of the codebase and avoid a one-off env key.

🟡 MCP_WORKER_MODEL fallback does not match the repo's WORKER_MODEL convention [fit] ``

src/mcp/bin.ts:412 falls back to MCP_WORKER_MODEL, a name introduced only here, whereas bench scripts and examples throughout the repo use WORKER_MODEL (e.g. bench/src/run.ts:111, examples/strategy-suite/strategy-suite.ts:128). Falling back to WORKER_MODEL (or MODEL, as in bench/src/research-loop.mts:31) would be more predictable.

🟡 Multi-variant fanout still uses broken defaults and gets no router creds [robustness] ``

When args.variants > 1, src/mcp/bin.ts:502 still calls multiHarnessResearcherFanout({ task }) without passing the env-driven harness/model or injecting OPENAI_API_KEY/OPENAI_BASE_URL. The factory's defaults include opencode/zai-coding-plan/glm-5.1 (agent-knowledge@1.4.0 profiles/index.d.ts), so fanout calls will continue to make zero LLM calls. Extend the buildPreset-style wiring to the fanout path or document that variants>1 requires additional provisioning.

🟡 sandboxOverrides.env is overwritten rather than merged [robustness] ``

In src/mcp/bin.ts:420-422, setting spec.sandboxOverrides.env replaces any env that researcherProfile or a future preset already supplied. It should merge with (spec.sandboxOverrides?.env ?? {}) so other required box env vars are preserved.


What this audit checks

It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.

Pass What it asks
Heuristic Vague title? Whitespace-only or cruft-bearing diff? (content signals only)
Duplication Do added function/class names already exist elsewhere in the repo?
Value Audit What does it do? What goal does it achieve? Is it good? Better architecture or already-exists?
Usefulness Audit Does it integrate and fit? Will it hold up in real use and actually get used?

Findings are concerns, not blocks — the human reviewer decides what to do with them.

value-audit · 20260614T224202Z

@tangletools

Copy link
Copy Markdown
Contributor

❌ Needs Work — d1adaece

Readiness 54/100 · Confidence 65/100 · 8 findings (1 high, 3 medium, 4 low)

deepseek glm aggregate
Readiness 69 54 54
Confidence 65 65 65
Correctness 69 54 54
Security 69 54 54
Testing 69 54 54
Architecture 69 54 54

Full multi-shot audit completed 1/1 planned shots over 1 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 1/1 planned shots over 1 changed files. Global verifier still owns final merge decision.

Blocking

🔴 HIGH Fanout path (variants > 1) bypasses router credential injection — src/mcp/bin.ts

Line 502: const fanout = fanoutFactory({ task }) — the fanout path calls the factory directly without applying buildPreset. The PR's stated goal is to inject OPENAI_API_KEY/OPENAI_BASE_URL into sandbox env so in-box agents can make LLM calls, but this only happens in the single-variant path (lines 435, 453). Multi-variant researcher runs will produce the same 'no winner' failure the PR is fixing. The fanout.agentRuns array needs the same sandboxOverrides.env injection applied to each agent run spec. Fix: either iterate fanout.agentRuns and inject

Other

🟠 MEDIUM No tests for buildPreset or env injection path — src/mcp/bin.ts

0 of 23 test files under tests/mcp/ exercise loadResearcherSupport or buildPreset. The researcher delegate tests use stubs that never invoke the real singleFactory, so the harness/model/env injection path (lines 404-426) and the sandboxOverrides mutation (lines 419-423) have zero test coverage. The env replacement bug (finding above) would not be caught by existing tests. Add a test that mocks singleFactory to return a preset with pre-existing sandboxOverrides.env and asserts the merge behavior, or better, an integration test using a real a

🟠 MEDIUM buildPreset replaces preset env vars instead of merging — src/mcp/bin.ts

At lines 419-423, buildPreset sets spec.sandboxOverrides = { ...(spec.sandboxOverrides ?? {}), env: { OPENAI_API_KEY: routerKey, OPENAI_BASE_URL: routerBaseUrl } }. This preserves other top-level sandboxOverrides properties (name, backend) but completely replaces any pre-existing env map from the researcher preset. If a future preset version ships with additional required env vars, they would be silently dropped. The sandboxOverrides type permits env?: Record<string, string> as a top‑level key, so this is a latent footgun. Fix: env: { ...(spec.sandboxOverrides?.env), OPENAI_API_KEY: routerKey, OPENAI_BASE_URL: routerBaseUrl }.

🟠 MEDIUM sandboxOverrides.env replaces rather than merges existing env vars — src/mcp/bin.ts

Lines 420-423: env: { OPENAI_API_KEY: routerKey, OPENAI_BASE_URL: routerBaseUrl } replaces the entire env object on spec.sandboxOverrides. If the researcher profile's default agentRunSpec already carries env vars the in-box agent needs (e.g., tool-specific config, namespace tokens), they are silently dropped. The top-level spread ...(spec.sandboxOverrides ?? {}) preserves other sandboxOverrides keys but not nested env entries. Fix: env: { ...(spec.sandboxOverrides?.env ?? {}), OPENAI_API_KEY: routerKey, OPENAI_BASE_URL: routerBaseUrl }.

🟡 LOW Empty-string SANDBOX_BASE_URL no longer falls back to default — src/mcp/bin.ts

Line 357: const baseUrl = process.env.SANDBOX_BASE_URL ?? 'https://sandbox.tangle.tools'. Nullish coalescing only handles null/undefined. The old code at line 356 used a falsy spread guard (...(baseUrl ? { baseUrl } : {})), so SANDBOX_BASE_URL='' would omit baseUrl entirely. Now an empty string passes through to the Sandbox constructor, which could fail to connect. Low probability — nobody intentionally sets it to empty — but a behavior regression.

🟡 LOW No independent router API key override — src/mcp/bin.ts

Line 413: routerKey is hardcoded to process.env.TANGLE_API_KEY with no separate override (e.g., MCP_RESEARCHER_ROUTER_KEY). The comment says 'TANGLE_API_KEY doubles as the router key,' but if the OpenAI-compatible router and the sandbox API use different credentials, there is no way to configure the router key independently. Fix: const routerKey = process.env.MCP_RESEARCHER_ROUTER_KEY ?? process.env.TANGLE_API_KEY.

🟡 LOW Resume message path omits harness/model from singleFactory call — src/mcp/bin.ts

At line 528, the resume driver's message() function calls singleFactory({ task }) without harness/model, unlike buildPreset which passes both. This only affects the display message on the delegation record (not execution — settleSingle correctly calls buildPreset), but if a future profile variant produces a different taskToPrompt based on harness/model, the displayed message would diverge from what was actually executed. Currently harmless since execution and display use the same task object.

🟡 LOW resume.message uses default harness/model, inconsistent with buildPreset — src/mcp/bin.ts

Line 528: singleFactory({ task }) in the resume.message function omits harness and model, while settleSingle (line 435) and the delegate (line 453) use buildPreset(task) with the configured values. If harness/model affect taskToPrompt, the resumed turn's prompt differs from the original dispatch. Low impact in practice — resume re-attaches to a running session and driveTurn is idempotent for in-flight turns — but the inconsistenc


tangletools · 2026-06-14T22:42:04Z · trace

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ 1 Blocking Finding — d1adaece

Full multi-shot audit completed 1/1 planned shots over 1 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 1/1 planned shots over 1 changed files. Global verifier still owns final merge decision.

Full immutable report for this review: trace

Summary comment for this run: full summary


tangletools · 2026-06-14T22:42:04Z · immutable trace

…visioning

Addresses PR #302 audit:
- HIGH: variants>1 fanout path now uses a provisionable harness/model and gets
  router creds injected per agent-run (was: zero LLM calls, same 'no winner').
- MEDIUM: extract resolveResearcherProvisioning + applyRouterEnv into a tested
  module; applyRouterEnv MERGES box env instead of replacing it.
- LOW: empty-string SANDBOX_BASE_URL falls back to default; MCP_RESEARCHER_ROUTER_KEY
  override; reuse resolveRouterBaseUrl (TANGLE_ROUTER_*) + WORKER_MODEL convention;
  resume.message uses buildPreset for prompt consistency.
tangletools
tangletools previously approved these changes Jun 14, 2026

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Auto-approved PR — 2ad0336c

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T22:49:38Z

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Value Audit — sound-with-nits

Verdict sound-with-nits
Concerns 4 (4 weak-concern)
Heuristic 0.0s
Duplication 0.0s
Interrogation 184.2s (2 bridge agents)
Total 184.2s

💰 Value — sound-with-nits

The change makes the MCP server's researcher delegate runnable in a fresh environment by defaulting the sandbox base URL and injecting a provisionable harness/model plus OpenAI-compatible router credentials into every agent-run spec; it's a coherent, scoped fix with only small reuse/doc nits.

  • What it does: It does three concrete things: (1) defaults SANDBOX_BASE_URL to https://sandbox.tangle.tools in loadSandboxClient (src/mcp/bin.ts:363) so the MCP server starts without the env var; (2) adds a resolveResearcherProvisioning() helper (src/mcp/researcher-provisioning.ts:42) that picks a worker harness/model and resolves router credentials from env (MCP_RESEARCHER_HARNESS, `MCP_RESEARCHER
  • Goals it achieves: It achieves the goal of making delegate_research work out-of-the-box: the server no longer crashes on startup when SANDBOX_BASE_URL is unset, and the in-box researcher agent no longer makes zero LLM calls due to an unprovisionable default harness and missing provider auth. It also keeps the wiring env-overridable so deployments can override harness/model/router without code changes.
  • Assessment: Good on its merits. The fix is narrowly scoped to the broken path, extracts testable logic into its own module, reuses the existing resolveRouterBaseUrl helper from src/model-resolution.ts, and applies the same provisioning consistently to single-variant, multi-variant fanout, and detached-resume paths. It matches the codebase's pattern of env-driven MCP configuration.
  • Better / existing approach: No materially better architecture or existing equivalent exists for the researcher provisioning. I looked at src/mcp/delegation-profile.ts (env forwarding for the MCP child), src/model-resolution.ts (router base resolution), src/mcp/delegates.ts and src/profiles/coder.ts (coder harness/model wiring), and src/runtime/sandbox-backend.ts/src/runtime/types.ts (sandbox override shape). The

🎯 Usefulness — sound-with-nits

Fixes the out-of-the-box researcher path by defaulting the sandbox base URL and injecting provisionable harness/model plus router creds into the MCP delegate, wired correctly through the existing MCP server.

  • Integration: The new behavior is reachable. main() in src/mcp/bin.ts:159-162 calls loadResearcherSupport() when the researcher tool is enabled; the returned delegate is passed to createMcpServer() at bin.ts:187; createMcpServer() registers the delegate_research tool whenever a delegate is supplied (src/mcp/server.ts:189-202). The provisioning helper is imported only by bin.ts and is not dead
  • Fit with existing patterns: It fits established patterns. Router base resolution reuses resolveRouterBaseUrl from src/model-resolution.ts (researcher-provisioning.ts:12), and the env naming mirrors the existing MCP_* convention in bin.ts (e.g. MCP_CODER_FANOUT_HARNESSES at bin.ts:27 vs the new MCP_RESEARCHER_FANOUT_HARNESSES at bin.ts:505). The only duplication is the public sandbox default, which already e
  • Real-world viability: It should hold up under normal use. Provisioning is resolved once per server load and reused across concurrent delegate_research calls, which is the right granularity for env-driven config. The fanout path applies router creds to every generated agent-run spec (bin.ts:512-514), not just the single-variant path. It handles whitespace/empty env values as unset and falls back through model/router

💰 Value Audit

🟡 Repeated default sandbox base URL literal [duplication] ``

src/mcp/delegation-profile.ts:48 already defines DEFAULT_SANDBOX_BASE_URL = 'https://sandbox.tangle.tools' and uses it to forward SANDBOX_BASE_URL to the MCP child. src/mcp/bin.ts:363 now repeats the same literal. Import the existing constant to keep the default in one place.

🟡 New env vars not added to the bin header inventory [maintenance] ``

src/mcp/bin.ts:13-56 lists the environment variables the MCP server honors, but it omits the new MCP_RESEARCHER_HARNESS, MCP_RESEARCHER_MODEL, MCP_WORKER_MODEL, MCP_RESEARCHER_ROUTER_KEY, MCP_RESEARCHER_ROUTER_BASE_URL, and MCP_RESEARCHER_FANOUT_HARNESSES variables introduced by this change. Adding them preserves the header as the canonical discovery surface.

🎯 Usefulness Audit

🟡 Sandbox base URL default is duplicated [ergonomics] ``

src/mcp/delegation-profile.ts:48 already centralizes DEFAULT_SANDBOX_BASE_URL = 'https://sandbox.tangle.tools', but src/mcp/bin.ts:363 hardcodes the same literal. Since both files live under src/mcp, export and reuse the constant to keep the two entry points from drifting.

🟡 Researcher fanout lacks per-harness model override [ergonomics] ``

The coder delegate supports fanoutModels indexed parallel to fanoutHarnesses (src/mcp/delegates.ts:147), but the new researcher fanout forces every variant to the same resolved model (bin.ts:510: models: fanoutHarnesses.map(() => model)). There is no MCP_RESEARCHER_FANOUT_MODELS counterpart. For now this is fine for the minimal fix, but it is an asymmetry a future caller will likely ask for.


What this audit checks

It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.

Pass What it asks
Heuristic Vague title? Whitespace-only or cruft-bearing diff? (content signals only)
Duplication Do added function/class names already exist elsewhere in the repo?
Value Audit What does it do? What goal does it achieve? Is it good? Better architecture or already-exists?
Usefulness Audit Does it integrate and fit? Will it hold up in real use and actually get used?

Findings are concerns, not blocks — the human reviewer decides what to do with them.

value-audit · 20260614T225433Z

@tangletools

Copy link
Copy Markdown
Contributor

✅ No Blockers — 2ad0336c

Readiness 70/100 · Confidence 70/100 · 10 findings (1 medium, 9 low)

deepseek glm aggregate
Readiness 70 83 70
Confidence 70 70 70
Correctness 70 83 70
Security 70 83 70
Testing 70 83 70
Architecture 70 83 70

Full multi-shot audit completed 2/2 planned shots over 3 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 2/2 planned shots over 3 changed files. Global verifier still owns final merge decision.

🟠 MEDIUM Fanout harness count not validated against variants — src/mcp/bin.ts

At lines 504-506, parseHarnesses(process.env.MCP_RESEARCHER_FANOUT_HARNESSES) may return an array shorter than variants. This is passed to fanoutFactory (line 507), which likely creates one AgentRunSpec per harness — fewer than variants. Then fanout.agentRuns.slice(0, variants) (line 517) produces fewer specs than maxIterations: variants ([line 526](https://github.com/tangle-network/agent-runtime/blob/2ad0336caacb853291ab318d13f30

🟡 LOW All fanout variants use the same model regardless of harness — src/mcp/bin.ts

Line 510: models: fanoutHarnesses.map(() => model) — every fanout variant gets the same model even when MCP_RESEARCHER_FANOUT_HARNESSES specifies different harnesses. If different harnesses require different models (e.g., a codex harness needs a different model id), there's no way to configure per-harness models. Impact: limits fanout diversity to harness-only. Acceptable as a v1 design but worth a follow-up env like MCP_RESEARCHER_FANOUT_MODELS.

🟡 LOW Fanout harness count vs variants mismatch has no guard — src/mcp/bin.ts

Lines 504-511: when MCP_RESEARCHER_FANOUT_HARNESSES is set with fewer entries than variants, fanoutFactory produces fewer agentRuns than variants. Then agentRuns.slice(0, variants) is a no-op (already shorter), but maxIterations is still set to variants (line 526). The loop may attempt more iterations than there are distinct runs. No validation warns or clamps. Impact: env-misconfiguration silently under-delivers fanout diversity. Fix: clamp maxIterations to Math.min(variants, fanout.agentRuns.length) or validate that fanoutHarnesses.length >= variant

🟡 LOW Missing env variable documentation in bin.ts header — src/mcp/bin.ts

Lines 13-56 document MCP server env variables but omit the 6 new researcher-provisioning variables introduced by this PR: MCP_RESEARCHER_HARNESS, MCP_RESEARCHER_MODEL, MCP_RESEARCHER_FANOUT_HARNESSES, MCP_RESEARCHER_ROUTER_KEY, MCP_RESEARCHER_ROUTER_BASE_URL, and MCP_WORKER_MODEL. Operators configuring researcher behavior have no in-code reference.

🟡 LOW No integration test for buildPreset wrapper in loadResearcherSupport — src/mcp/bin.ts

The buildPreset closure (lines 422-426) wraps singleFactory with harness/model injection and router env application. It is called in 4 code paths (settleSingle, single-variant delegate, fanout delegate, resume message). The unit tests in tests/mcp/researcher-provisioning.test.ts cover the underlying resolveResearcherProvisioning and applyRouterEnv functions, but no test verifies that buildPreset correctly integrates them — e.g., that the spec passed to runDetachedTurn or runLoop actually carries the provisioned harness/model/router env.

🟡 LOW parseHarnesses reads process.env directly, bypassing the injectable env pattern — src/mcp/bin.ts

Line 505: parseHarnesses(process.env.MCP_RESEARCHER_FANOUT_HARNESSES) reads env directly, while resolveResearcherProvisioning() (line 421) accepts an injectable env param for testability. This means the fanout harness override cannot be unit-tested without mocking process.env. Fix: thread env through resolveResearcherProvisioning's return or accept it as a parameter to loadResearcherSupport.

🟡 LOW Biome formatting violations will fail CI lint gate — tests/mcp/researcher-provisioning.test.ts

Biome reports: (1) import sort — type ProvisionableSpec must come after applyRouterEnv (line 2-6); (2) format — multiple assertion lines exceed print width (lines 33, 35-36, 41, 52-53, 57-58, 63). CI runs pnpm run lint = biome check src tests examples per .github/workflows/ci.yml, which will exit non-zero. Fix: pnpm run lint:fix or npx biome check --write tests/mcp/researcher-provisioning.test.ts. Note: the implementation file src/mcp/researcher-provisioning.ts also has one formatting violation

🟡 LOW Missing TANGLE_ROUTER_URL vs TANGLE_ROUTER_BASE_URL precedence test — tests/mcp/researcher-provisioning.test.ts

resolveRouterBaseUrl in model-resolution.ts:42 uses TANGLE_ROUTER_URL ?? TANGLE_ROUTER_BASE_URL, so TANGLE_ROUTER_URL takes priority. No test verifies this precedence when both env vars are set. The 'reuses the repo router base' test at line 50 only sets one at a time. Impact: low — the precedence is a simple ?? operator, unlikely to regress, but it's an untested contract.

🟡 LOW Version-preservation test exercises wrong code path — tests/mcp/researcher-provisioning.test.ts

Line 55-59: The comment says 'already-versioned base is left intact (no double /v1)' but this test passes TANGLE_ROUTER_URL: 'https://r.example.com/v1/'. resolveRouterBaseUrl strips /v1 (via .replace(//v1/?$/, '')) before the researcher-provisioning regex ever sees it. The researcher-provisioning regex then sees 'https://r.example.com' (no version) and appends /v1, coincidentally producing the same result. A direct test of the researcher regex would use MCP_RESEARCHER_ROUTER_BASE_URL: 'https://r.example.com/v1/' which bypasses resolveRouterBaseUrl. The /v2 case IS correctly tested via [line 22](https://github.com/tangle-network/agent-runtime/b

🟡 LOW applyRouterEnv no-op not tested with empty string routerKey — tests/mcp/researcher-provisioning.test.ts

Line 85-89: The no-op test only passes undefined as routerKey. The guard is if (!routerKey) return, which would also catch '' (empty string) and null. No test for routerKey='' — though unlikely in practice from the resolver (trimmed never returns ''), it's an uncovered branch if applyRouterEnv is called directly by other code.


tangletools · 2026-06-14T22:57:08Z · trace

…, lint/tests

Round-2 PR #302 audit:
- MEDIUM: clamp maxIterations/maxConcurrency to actual run count when the fanout
  harness list is shorter than variants (no over-claiming iterations).
- per-harness fanout models (MCP_RESEARCHER_FANOUT_MODELS); fanout harnesses now
  resolved via resolveResearcherProvisioning (injectable env, unit-testable).
- document the 6 new MCP_RESEARCHER_* env vars in the bin header.
- biome-clean; +composition + precedence + empty-key tests (13 total).
tangletools
tangletools previously approved these changes Jun 14, 2026

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Auto-approved PR — 7c972aa7

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T23:08:13Z

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Value Audit — sound-with-nits

Verdict sound-with-nits
Concerns 3 (3 weak-concern)
Heuristic 0.0s
Duplication 0.0s
Interrogation 179.5s (2 bridge agents)
Total 179.5s

💰 Value — sound-with-nits

Makes delegate_research runnable out-of-the-box by defaulting the sandbox baseUrl and injecting provisionable harness/model/OpenAI router creds into every researcher box env; a coherent, well-tested fix that mostly follows existing patterns.

  • What it does: The change does three concrete things in the MCP server: (1) defaults SANDBOX_BASE_URL to https://sandbox.tangle.tools in loadSandboxClient so the server starts without the env var (src/mcp/bin.ts:372); (2) adds src/mcp/researcher-provisioning.ts, which resolves a provisionable researcher harness/model and OpenAI-compatible router key/base from env; and (3) wires that provisioning into `
  • Goals it achieves: The change eliminates two failure modes that made delegate_research unusable in a fresh environment: the MCP server crashing on startup because sandbox ≥0.6 requires baseUrl, and researcher runs returning an instant "produced no winner" because the profile's default harness/model are not broadly provisionable and the sandbox SDK does not forward backend.model.apiKey to the in-box OpenAI-comp
  • Assessment: This is a good change on its merits. It is narrowly scoped to the broken path, extracts the env-resolution logic into a small, testable module, and applies the fix consistently to both single and fanout dispatch. It reuses the repo's existing resolveRouterBaseUrl from src/model-resolution.ts (src/mcp/researcher-provisioning.ts:12) and respects the codebase's pattern of env-driven, optional-p
  • Better / existing approach: No materially better architecture exists for the overall fix. The router-cred injection is a researcher-specific workaround caused by the sandbox SDK not wiring backend.model.apiKey for that profile, so generalizing it prematurely would be speculative. I looked for existing equivalents in src/mcp/delegation-profile.ts, src/mcp/delegates.ts, src/runtime/router-client.ts, and `src/runtime/su

🎯 Usefulness — sound-with-nits

Fixes the MCP researcher's out-of-the-box wiring: it defaults the sandbox base URL and injects a provisionable harness/model plus OpenAI-compatible router creds into the agent-run spec, all reachable through the existing delegate_research tool path.

  • Integration: The new behavior is reached through src/mcp/bin.ts:168-171 -> loadResearcherSupport -> createMcpServer({ researcherDelegate: ... }) at src/mcp/bin.ts:193-200, which registers delegate_research at src/mcp/server.ts:189-202. The provisioning helper is unit-tested in tests/mcp/researcher-provisioning.test.ts, and delegate_research already appears in the integration surface (`tests/mcp
  • Fit with existing patterns: It follows established patterns: env-driven harness/model/fanout config (like MCP_CODER_FANOUT_HARNESSES at src/mcp/bin.ts:27 and parseHarnesses at src/mcp/bin.ts:588), the WORKER_MODEL fallback convention, reuse of resolveRouterBaseUrl from src/model-resolution.ts, and profile factories accepting harness/model just as coderProfile does (src/mcp/delegates.ts:198-202). Keeping
  • Real-world viability: Defaults let the server start without SANDBOX_BASE_URL and the researcher run without manual harness/model/auth; concurrency is capped by maxConcurrency; fanout misconfiguration is clamped (effectiveVariants at src/mcp/bin.ts:528); empty/whitespace env values and already-versioned router bases are handled (src/mcp/researcher-provisioning.ts:36-47, :65). The main caveat is that the cano

💰 Value Audit

🟡 Hand-rolled model precedence chain could reuse resolveChatModel [duplication] ``

The model fallback MCP_RESEARCHER_MODEL → MCP_WORKER_MODEL → WORKER_MODEL → DEFAULT_MODEL in src/mcp/researcher-provisioning.ts:58-62 duplicates the precedence-resolver pattern that src/model-resolution.ts:89-98 (resolveChatModel) was extracted to unify. Since researcher-provisioning.ts already imports resolveRouterBaseUrl from the same module, using resolveChatModel([{source:'MCP_RESEARCHER_MODEL', model: env.MCP_RESEARCHER_MODEL}, ...], {source:'default', model: DEFAULT_MODEL}) w

🟡 Default sandbox base URL literal is duplicated [duplication] ``

src/mcp/bin.ts:372 hardcodes https://sandbox.tangle.tools, while src/mcp/delegation-profile.ts:48 already defines the same DEFAULT_SANDBOX_BASE_URL. The two live at different layers (MCP server vs. profile composer), but exporting and reusing the constant would prevent drift.

🎯 Usefulness Audit

🟡 Canonical profile builder does not forward the new override env vars [integration] ``

src/mcp/delegation-profile.ts:87-97 builds the MCP child's env map and forwards only TANGLE_API_KEY, SANDBOX_BASE_URL, and OTEL keys. The new MCP_RESEARCHER_HARNESS, MCP_RESEARCHER_MODEL, MCP_RESEARCHER_ROUTER_BASE_URL, and TANGLE_ROUTER_URL/TANGLE_ROUTER_BASE_URL overrides are therefore not automatically available when the MCP server is launched by composeProductionAgentProfile. The chosen defaults make this non-blocking, but a follow-up should forward those keys (or documen


What this audit checks

It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.

Pass What it asks
Heuristic Vague title? Whitespace-only or cruft-bearing diff? (content signals only)
Duplication Do added function/class names already exist elsewhere in the repo?
Value Audit What does it do? What goal does it achieve? Is it good? Better architecture or already-exists?
Usefulness Audit Does it integrate and fit? Will it hold up in real use and actually get used?

Findings are concerns, not blocks — the human reviewer decides what to do with them.

value-audit · 20260614T231303Z

@tangletools

Copy link
Copy Markdown
Contributor

✅ No Blockers — 7c972aa7

Readiness 74/100 · Confidence 70/100 · 11 findings (11 low)

deepseek glm aggregate
Readiness 80 74 74
Confidence 70 70 70
Correctness 80 74 74
Security 80 74 74
Testing 80 74 74
Architecture 80 74 74

Full multi-shot audit completed 2/2 planned shots over 3 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 2/2 planned shots over 3 changed files. Global verifier still owns final merge decision.

🟡 LOW Hardcoded sandbox default duplicates delegation-profile constant — src/mcp/bin.ts

Line 372 hardcodes 'https://sandbox.tangle.tools' inline. The same string is already exported as DEFAULT_SANDBOX_BASE_URL in src/mcp/delegation-profile.ts:48. Importing the constant would prevent drift if the default ever changes. Pure DRY nit; behavior is correct and matches the existing convention.

🟡 LOW No integration test for buildPreset wiring into delegate path — src/mcp/bin.ts

The researcher-provisioning.test.ts 'buildPreset composition' block (lines 132-154) manually simulates the resolve→apply flow on a hand-built spec, but no test exercises the actual loadResearcherSupport delegate with a mock peer to confirm buildPreset injects creds into a real factory-returned agentRunSpec. The single and fanout delegate paths (lines 445, 463, 517-524) are only covered indirectly. Given the PR's stated goal is making delegate_research actually produce a winner, a delegate-research integration test asserting OPENAI_API_KEY appears on the d

🟡 LOW Silent fanout harness/model length mismatch — src/mcp/bin.ts

At line 520, fanoutModels?.[i] ?? model gracefully handles out-of-bounds array access (index beyond MCP_RESEARCHER_FANOUT_MODELS length → undefined → falls back to model). However, when the operator misaligns MCP_RESEARCHER_FANOUT_MODELS length with MCP_RESEARCHER_FANOUT_HARNESSES length, no warning is emitted to stderr or logged. Operators debugging 'why does my fanout use the wrong model for harness 3' get no signal. Fix: add a warn-on-mismatch stanza after csv parsing.

🟡 LOW Structural cast as ProvisionableSpec trusts peer runtime shape — src/mcp/bin.ts

Lines 434, 523: preset.agentRunSpec and fanout specs are cast as ProvisionableSpec. If the @tangle-network/agent-knowledge peer ships an agentRunSpec whose sandboxOverrides field has a different runtime shape (e.g., a class instance, readonly fields, or a differently-named env key), applyRouterEnv would either silently no-op (if sandboxOverrides is a getter returning a frozen object) or write a field the peer never reads — producing the exact 'zero LLM calls, no winner' failure this PR fixes, but without any error. The ProvisionableSpec interface is intentionally permissive (Record<string, unknown>), which hides the coupling. A defensive check (e.g., asserting the spec is a

🟡 LOW Misleading JSDoc: applyRouterEnv says 'never replaces' but overwrites OPENAI_API_KEY/OPENAI_BASE_URL — src/mcp/researcher-provisioning.ts

Line 79-80: Merges — never replaces — any env the preset already supplied. The implementation at lines 91-94 spreads existing env first, then unconditionally sets OPENAI_API_KEY and OPENAI_BASE_URL. If a preset already provided these keys, they are overwritten. The comment is misleading — the function DOES replace router-specific keys. Behavior is likely intentional (router creds must win) but the comment contradicts the code. No test verifies the overwrite-vs-preserve behavior for these specific keys: the 'merge

🟡 LOW TANGLE_API_KEY silently promoted to sandbox OPENAI_API_KEY — src/mcp/researcher-provisioning.ts

Line 63: routerKey falls back to TANGLE_API_KEY, then applyRouterEnv injects it as OPENAI_API_KEY into the sandbox box env (line 92). A user who sets TANGLE_API_KEY for chat-model router auth may not expect it to be exposed as OPENAI_API_KEY inside researcher sandbox containers. The behavior IS documented in the module header and is the intended BYOK flow, and injection is gated on the key being present (non-empty). But there is no log/telemetry when the fallback fires, so the promotion is invisible. Consider a one-line st

🟡 LOW applyRouterEnv docstring overstates merge semantics — src/mcp/researcher-provisioning.ts

Line 79 docstring says 'merges — never replaces — any env the preset already supplied' but the spread order (existing env, then OPENAI_API_KEY, then OPENAI_BASE_URL) does overwrite those two specific keys if the preset already set them. In practice this is the correct behavior (router creds should win), but the docstring could mislead future readers into thinking existing OPENAI_* keys from the preset survive. Suggest: 'Merges router creds INTO the existing sandbox env (overrides OPENAI_API_KEY / OPENAI_BASE_URL if already present; preserves all other keys).'

🟡 LOW Composition test claims to mirror bin.ts buildPreset but doesn't exercise it — tests/mcp/researcher-provisioning.test.ts

Lines 137-155: The test comment says 'Mirrors what bin.ts buildPreset does' but it manually calls resolveResearcherProvisioning then applyRouterEnv. If bin.ts's actual buildPreset logic diverges from this manual composition (e.g., adds extra env, reorders calls, conditional branching), this test will not catch the regression. Consider either an integration test that calls buildPreset directly, or soften the comment to 'exercises the public surface that buildPreset composes'. Impact: low — the public API contract is still tested; only the integration glue is unverified.

🟡 LOW No direct test for MCP_RESEARCHER_ROUTER_BASE_URL with non-versioned URL — tests/mcp/researcher-provisioning.test.ts

Line 82 (Test 7) only tests MCP_RESEARCHER_ROUTER_BASE_URL with an already-versioned URL ('https://r.example.com/v1/'). The non-versioned case (e.g., 'https://r.example.com') through this specific env var is not tested directly — it is covered indirectly through TANGLE_ROUTER_BASE_URL in Test 5 (line 67), but the two codepaths converge at line 65 of the source only aft

🟡 LOW No test for MCP_RESEARCHER_FANOUT_MODELS set without MCP_RESEARCHER_FANOUT_HARNESSES — tests/mcp/researcher-provisioning.test.ts

Line 83-90: Tests both fanout envs set together. In bin.ts:516-521, fanoutHarnesses and fanoutModels are index-aligned — models without harnesses would produce an empty harness list (undefined → caller defaults to replicating the single harness N times for N variants — bin.ts:516). But if fanoutModels is set alone, the models array is returned by csv() while fanoutHarnesses is undefined, creating an index alignment that produces unexpected model assignments. A test that asserts fanoutModels is undefined when fanoutHarnesses is unset (or vice versa) would validate the index-aligned contract. Current behavior: silent, not crashing — but easy to mi

🟡 LOW Version-normalization regex inconsistency between resolveRouterBaseUrl and researcher path for /vN where N>1 — tests/mcp/researcher-provisioning.test.ts

resolveRouterBaseUrl (src/model-resolution.ts:46) only strips /v1 (regex //v1/?$/), while the researcher normalization (src/mcp/researcher-provisioning.ts:65) checks //v\d+/?$/ (any version). For a TANGLE_ROUTER_URL like 'https://r.example.com/v2', resolveRouterBaseUrl returns it unstripped, but the researcher regex then correctly recognizes /v2 and avoids double-appending. The final result is correct but for non-obvious reasons — the two layers have inconsistent version matching. No test exercises /v2 or higher through either path. Impact: low — end result is correct today; risk is future refactoring that removes one layer without understanding the补偿.


tangletools · 2026-06-14T23:17:58Z · trace

…s, more tests

Round-3 PR #302 audit (all LOW):
- DRY: export + reuse DEFAULT_SANDBOX_BASE_URL instead of an inline literal.
- Docstring accuracy: applyRouterEnv overlays (preserves preset env, sets the two
  OPENAI_ keys authoritatively) — was misleadingly worded 'never replaces'.
- Honest test naming for the resolve→apply composition; +tests for non-versioned
  router base and fanout-models-without-harnesses.

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Auto-approved PR — 2fcc8773

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T23:21:06Z

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Value Audit — sound

Verdict sound
Concerns 0 (none)
Heuristic 0.0s
Duplication 0.0s
Interrogation 111.3s (2 bridge agents)
Total 111.3s

🎯 Usefulness — sound

The PR fixes a real, reachable breakage in the canonical agent-runtime-mcp path and does so in the grain of the codebase.

  • Integration: The changed code is inside src/mcp/bin.ts, the agent-runtime-mcp binary entry point (package.json:83-85). main() calls loadResearcherSupport at bin.ts:171 whenever MCP_DISABLE_RESEARCHER is unset, and the new resolveResearcherProvisioning + applyRouterEnv are invoked at bin.ts:438-441 and 529-530 to build every researcher preset. The resulting researcherSupport.delegate is passed i
  • Fit with existing patterns: It fits established patterns rather than competing with them. Router-base resolution reuses the repo's resolveRouterBaseUrl from src/model-resolution.ts:44-48. Model/key fallback reuses the pervasive WORKER_MODEL / TANGLE_API_KEY conventions seen across bench/ and examples/. Harness/model/fanout env naming mirrors the coder delegate options in src/mcp/delegates.ts:127-175. Box-env in
  • Real-world viability: It handles realistic edge cases: empty/whitespace env values are treated as unset (trimmed at researcher-provisioning.ts:36-39), model resolution falls back through MCP_RESEARCHER_MODELMCP_WORKER_MODELWORKER_MODEL → default (l.58-62), missing router key makes applyRouterEnv a no-op (l.91), and fanout misconfig is bounded by fanout.agentRuns.slice(0, variants) with `effectiveVaria

No concerns — sound change, no better or existing approach found. ✅


What this audit checks

It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.

Pass What it asks
Heuristic Vague title? Whitespace-only or cruft-bearing diff? (content signals only)
Duplication Do added function/class names already exist elsewhere in the repo?
Value Audit What does it do? What goal does it achieve? Is it good? Better architecture or already-exists?
Usefulness Audit Does it integrate and fit? Will it hold up in real use and actually get used?

Findings are concerns, not blocks — the human reviewer decides what to do with them.

value-audit · 20260614T233054Z

@tangletools

Copy link
Copy Markdown
Contributor

✅ No Blockers — 2fcc8773

Readiness 80/100 · Confidence 70/100 · 8 findings (8 low)

deepseek glm aggregate
Readiness 83 80 80
Confidence 70 70 70
Correctness 83 80 80
Security 83 80 80
Testing 83 80 80
Architecture 83 80 80

Full multi-shot audit completed 2/2 planned shots over 4 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 2/2 planned shots over 4 changed files. Global verifier still owns final merge decision.

🟡 LOW DEFAULT_SANDBOX_BASE_URL behavioral change for unset SANDBOX_BASE_URL — src/mcp/bin.ts

Line 373: Previously when SANDBOX_BASE_URL was unset, baseUrl was omitted from the Sandbox constructor (SDK used its own internal default). Now DEFAULT_SANDBOX_BASE_URL ('https://sandbox.tangle.tools') is always passed. If the SDK's built-in default ever differed from this value, callers who relied on the SDK default would silently switch endpoints. In practice these should match and the SDK ≥0.6 makes baseUrl required, so this is the correct fix — just flagging the behavioral delta.

🟡 LOW No integration test for buildPreset composition in bin.ts — src/mcp/bin.ts

Lines 439-443: buildPreset composes singleFactory({task, harness, model}) + applyRouterEnv. The unit test (researcher-provisioning.test.ts:148-171) explicitly notes this peer-call half is not exercised. A mock singleFactory stub in a bin-level test would verify the full composition (harness/model passed to factory + router env applied to the resulting spec). Low risk because the individual functions are well-tested and the composition is 3 lines, but the integration seam is unverified.

🟡 LOW Duplicate CSV parsing in parseHarnesses vs csv helper — src/mcp/researcher-provisioning.ts

researcher-provisioning.ts:41-47 defines a csv() helper used for fanout harness + model lists. bin.ts:595-602 defines parseHarnesses() with identical logic used for coder fanout harnesses. The coder path does not reuse the researcher's csv helper, creating two copies of the same string→array parser. Impact: maintenance only — a behavior change in one won't reach the other. Fix: extract shared csv parser or have parseHarnesses delegate to the provisioning module's csv.

🟡 LOW CSV parser tested only with multi-entry + whitespace entries; single-entry not covered — tests/mcp/researcher-provisioning.test.ts

The csv() helper (researcher-provisioning.ts:41-47) is tested with two comma-separated entries and whitespace-only entries. A single entry without commas (e.g. 'opencode') is implicitly covered by the helper's logic (split still works), but no explicit test asserts the single-entry array result. Low severity — the helper is simple enough that the risk is negligible.

🟡 LOW Composition test skips the singleFactory integration point — tests/mcp/researcher-provisioning.test.ts

The final describe block (line 143) explicitly notes it does not exercise singleFactory(harness, model) — the actual production composition at bin.ts:440-441 where resolveResearcherProvisioning output feeds singleFactory then applyRouterEnv. The test comment is honest about this gap. This is the one path where a shape mismatch between ProvisionableSpec and the real agentRunSpec returned by singleFactory would surface. Not blocking since the type contract is narrow, but a mock singleFactory returning a spec-shaped object would close the last integration seam.

🟡 LOW Missing edge case: base URL with query parameters breaks /v1 normalization — tests/mcp/researcher-provisioning.test.ts

The regex //v\d+/?$/ at researcher-provisioning.ts:65 assumes the version segment is the last path component. A value like MCP_RESEARCHER_ROUTER_BASE_URL=https://r.example.com/v1?format=openai would fail the regex test (ends with '?format=openai', not '/v1'), causing a double-append: 'https://r.example.com/v1?format=openai/v1'. No test covers this. Low severity because the env var name implies a bare base URL, not a request URL with query params.

🟡 LOW No test for applyRouterEnv overwriting pre-existing OPENAI_API_KEY — tests/mcp/researcher-provisioning.test.ts

applyRouterEnv is documented as 'intentionally authoritative' for OPENAI_API_KEY/OPENAI_BASE_URL (researcher-provisioning.ts:73-74). The tests verify that non-conflicting keys (KEEP_ME, NS_TOKEN) are preserved, but no test asserts that a pre-existing OPENAI_API_KEY in the spec's env is overwritten by the router key. This is the one behavioral guarantee that distinguishes 'merge' from 'replace' semantics for these two specific keys. Add: pass a spec with sandboxOverrides.env.OPENAI_API_KEY already set, apply, assert it equals the new routerKey.

🟡 LOW Untested: MCP_RESEARCHER_ROUTER_BASE_URL with trailing slash but no version — tests/mcp/researcher-provisioning.test.ts

The test suite covers trailing-slash with /v1 (line 74-81) and no-trailing-slash without version (line 100-106), but not the combination of trailing slash without a version path (e.g. 'https://r.example.com/'). The source's .replace(//$/, '') handles this correctly, but there's no explicit assertion proving it.


tangletools · 2026-06-14T23:35:27Z · trace

@drewstone drewstone merged commit 671b8a9 into main Jun 14, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants