fix(mcp): delegate_research runs out of the box (baseUrl default + researcher harness/model/BYOK env)#302
Conversation
…YOK env so delegate_research runs out of the box
tangletools
left a comment
There was a problem hiding this comment.
✅ Auto-approved PR — d1adaece
Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T22:33:53Z
tangletools
left a comment
There was a problem hiding this comment.
🟠 Value Audit — better-approach-exists
| Verdict | better-approach-exists |
| Concerns | 7 (1 medium-concern, 6 weak-concern) |
| Heuristic | 0.0s |
| Duplication | 0.0s |
| Interrogation | 257.8s (2 bridge agents) |
| Total | 257.8s |
💰 Value — better-approach-exists
The change makes single-variant delegate_research start and make LLM calls by defaulting the sandbox URL and injecting a provisionable harness/model plus router creds, but it leaves the multi-variant fanout path on the broken defaults, so the fix is incomplete rather than fully out-of-the-box.
- What it does: In
src/mcp/bin.ts: (1) defaultsSANDBOX_BASE_URLtohttps://sandbox.tangle.toolswhen constructing the sandbox client (line 357); (2) reads env overridesMCP_RESEARCHER_HARNESS,MCP_RESEARCHER_MODEL(withMCP_WORKER_MODELfallback), andMCP_RESEARCHER_ROUTER_BASE_URL, defaulting toopencode,moonshotai/kimi-k2.6, andhttps://router.tangle.tools/v1(lines 410-415); (3) introduce - Goals it achieves: Make
delegate_researchusable in a fresh environment without manually settingSANDBOX_BASE_URLor tuning researcher harness/model/auth; prevent the MCP server from crashing on startup because@tangle-network/sandbox>=0.6 requiresbaseUrl; prevent the researcher from producing no winner because the profile default harness is not broadly provisionable and the sandbox SDK does not forward `b - Assessment: The single-variant fix is coherent and follows codebase patterns: defaulting the sandbox URL matches
src/mcp/delegation-profile.ts:48-79, and env-overridable harness/model follows the same style asMCP_CODER_FANOUT_HARNESSES(src/mcp/bin.ts:27-91). However, the change is incomplete. The multi-variant fanout path atsrc/mcp/bin.ts:502still callsfanoutFactory({ task })without passing th - Better / existing approach: Apply the same router-cred injection to every
AgentRunSpecin the fanout result so multi-variant research runs also work out of the box. I checkedsrc/mcp/bin.ts:367-372(ResearcherFanoutPresetexposesagentRuns) andsrc/mcp/bin.ts:502(fanoutFactory({ task })is unchanged). The simplest fix is to map overfanout.agentRunsafterfanoutFactoryreturns and overlay the same `sandboxOv
🎯 Usefulness — sound-with-nits
Fixes the researcher's single-variant path so it starts, authenticates, and calls an LLM out of the box, but leaves the multi-variant fanout path with the same broken defaults and introduces env/url conventions that diverge from the repo's shared router-resolution helper.
- Integration: The new behavior is reachable:
loadResearcherSupportinsrc/mcp/bin.ts:379is invoked bymainwhen the researcher is enabled; its delegate is registered asdelegate_researchinsrc/mcp/server.ts:189, and production callers mount the stdio MCP entry viabuildDelegationMcpServerinsrc/mcp/delegation-profile.ts:69. TheSANDBOX_BASE_URLdefault insrc/mcp/bin.ts:357` resolves the sand - Fit with existing patterns: It mirrors the coder delegate's profile-factory usage (
coderProfile({ task, harness, model })insrc/mcp/delegates.ts:198). However, it hardcodeshttps://router.tangle.tools/v1and reads a newMCP_RESEARCHER_ROUTER_BASE_URLenv instead of reusingresolveRouterBaseUrl/TANGLE_ROUTER_BASE_URLfromsrc/model-resolution.ts:36-48, and theMCP_WORKER_MODELfallback (src/mcp/bin.ts:412) - Real-world viability: Holds up for the default single-variant path (including detached and streaming dispatch) because the harness/model/router constants are resolved once per process. It does not apply the same wiring to the
variants > 1fanout path (src/mcp/bin.ts:502), which still uses the non-provisionable default harnesses and gets no router env injection. ThesandboxOverrides.envassignment in `src/mcp/bin.
💰 Value Audit
🟠 Multi-variant researcher fanout still uses the broken defaults [better-architecture] ``
The fanout path at
src/mcp/bin.ts:502callsfanoutFactory({ task })without the newharness/modeldefaults and without injectingOPENAI_API_KEY/OPENAI_BASE_URLinto each returnedagentRun. Sovariants > 1research delegations still use the non-provisionable default harness and make zero LLM calls, directly contradicting the PR title thatdelegate_researchruns out of the box. The better approach is to post-processfanout.agentRunswith the samesandboxOverrides.envinjectio
🟡 Resume prompt builder ignores the new harness/model defaults [against-grain] ``
resume.messageatsrc/mcp/bin.ts:528callssingleFactory({ task })without passingharness/modeland without the router-env injection. WhiletaskToPromptmay not currently depend on them, this path is inconsistent with thebuildPresetused for dispatch and could drift. It should use the same preset construction logic as the single-variant dispatch path.
🟡 Router base URL default is not reused from existing resolution [duplication] ``
The new default
https://router.tangle.tools/v1atsrc/mcp/bin.ts:415duplicates the router URL constants used across bench files.src/model-resolution.ts:36-48already centralizes router-base resolution viaTANGLE_ROUTER_URL/TANGLE_ROUTER_BASE_URLandDEFAULT_ROUTER_BASE_URL. The env varMCP_RESEARCHER_ROUTER_BASE_URLis reasonable, but it should fall back to the existingTANGLE_ROUTER_BASE_URL/TANGLE_ROUTER_URLenv vars before hardcoding the URL.
🎯 Usefulness Audit
🟡 Researcher router URL reinvents the project's model-resolution convention [fit] ``
src/mcp/bin.ts:415hardcodeshttps://router.tangle.tools/v1and readsMCP_RESEARCHER_ROUTER_BASE_URL, whilesrc/model-resolution.ts:36-48already resolves router bases fromTANGLE_ROUTER_URL/TANGLE_ROUTER_BASE_URL. Reusing that helper and appending/v1would align the MCP child with the rest of the codebase and avoid a one-off env key.
🟡 MCP_WORKER_MODEL fallback does not match the repo's WORKER_MODEL convention [fit] ``
src/mcp/bin.ts:412falls back toMCP_WORKER_MODEL, a name introduced only here, whereas bench scripts and examples throughout the repo useWORKER_MODEL(e.g.bench/src/run.ts:111,examples/strategy-suite/strategy-suite.ts:128). Falling back toWORKER_MODEL(orMODEL, as inbench/src/research-loop.mts:31) would be more predictable.
🟡 Multi-variant fanout still uses broken defaults and gets no router creds [robustness] ``
When
args.variants > 1,src/mcp/bin.ts:502still callsmultiHarnessResearcherFanout({ task })without passing the env-driven harness/model or injectingOPENAI_API_KEY/OPENAI_BASE_URL. The factory's defaults includeopencode/zai-coding-plan/glm-5.1(agent-knowledge@1.4.0profiles/index.d.ts), so fanout calls will continue to make zero LLM calls. Extend thebuildPreset-style wiring to the fanout path or document that variants>1 requires additional provisioning.
🟡 sandboxOverrides.env is overwritten rather than merged [robustness] ``
In
src/mcp/bin.ts:420-422, settingspec.sandboxOverrides.envreplaces anyenvthatresearcherProfileor a future preset already supplied. It should merge with(spec.sandboxOverrides?.env ?? {})so other required box env vars are preserved.
What this audit checks
It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.
| Pass | What it asks |
|---|---|
| Heuristic | Vague title? Whitespace-only or cruft-bearing diff? (content signals only) |
| Duplication | Do added function/class names already exist elsewhere in the repo? |
| Value Audit | What does it do? What goal does it achieve? Is it good? Better architecture or already-exists? |
| Usefulness Audit | Does it integrate and fit? Will it hold up in real use and actually get used? |
Findings are concerns, not blocks — the human reviewer decides what to do with them.
❌ Needs Work —
|
| deepseek | glm | aggregate | |
|---|---|---|---|
| Readiness | 69 | 54 | 54 |
| Confidence | 65 | 65 | 65 |
| Correctness | 69 | 54 | 54 |
| Security | 69 | 54 | 54 |
| Testing | 69 | 54 | 54 |
| Architecture | 69 | 54 | 54 |
Full multi-shot audit completed 1/1 planned shots over 1 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 1/1 planned shots over 1 changed files. Global verifier still owns final merge decision.
Blocking
🔴 HIGH Fanout path (variants > 1) bypasses router credential injection — src/mcp/bin.ts
Line 502:
const fanout = fanoutFactory({ task })— the fanout path calls the factory directly without applyingbuildPreset. The PR's stated goal is to injectOPENAI_API_KEY/OPENAI_BASE_URLinto sandbox env so in-box agents can make LLM calls, but this only happens in the single-variant path (lines 435, 453). Multi-variant researcher runs will produce the same 'no winner' failure the PR is fixing. Thefanout.agentRunsarray needs the samesandboxOverrides.envinjection applied to each agent run spec. Fix: either iteratefanout.agentRunsand inject
Other
🟠 MEDIUM No tests for buildPreset or env injection path — src/mcp/bin.ts
0 of 23 test files under tests/mcp/ exercise
loadResearcherSupportorbuildPreset. The researcher delegate tests use stubs that never invoke the realsingleFactory, so the harness/model/env injection path (lines 404-426) and the sandboxOverrides mutation (lines 419-423) have zero test coverage. The env replacement bug (finding above) would not be caught by existing tests. Add a test that mockssingleFactoryto return a preset with pre-existingsandboxOverrides.envand asserts the merge behavior, or better, an integration test using a real a
🟠 MEDIUM buildPreset replaces preset env vars instead of merging — src/mcp/bin.ts
At lines 419-423,
buildPresetsetsspec.sandboxOverrides = { ...(spec.sandboxOverrides ?? {}), env: { OPENAI_API_KEY: routerKey, OPENAI_BASE_URL: routerBaseUrl } }. This preserves other top-level sandboxOverrides properties (name, backend) but completely replaces any pre-existingenvmap from the researcher preset. If a future preset version ships with additional required env vars, they would be silently dropped. ThesandboxOverridestype permitsenv?: Record<string, string>as a top‑level key, so this is a latent footgun. Fix:env: { ...(spec.sandboxOverrides?.env), OPENAI_API_KEY: routerKey, OPENAI_BASE_URL: routerBaseUrl }.
🟠 MEDIUM sandboxOverrides.env replaces rather than merges existing env vars — src/mcp/bin.ts
Lines 420-423:
env: { OPENAI_API_KEY: routerKey, OPENAI_BASE_URL: routerBaseUrl }replaces the entireenvobject onspec.sandboxOverrides. If the researcher profile's defaultagentRunSpecalready carries env vars the in-box agent needs (e.g., tool-specific config, namespace tokens), they are silently dropped. The top-level spread...(spec.sandboxOverrides ?? {})preserves other sandboxOverrides keys but not nested env entries. Fix:env: { ...(spec.sandboxOverrides?.env ?? {}), OPENAI_API_KEY: routerKey, OPENAI_BASE_URL: routerBaseUrl }.
🟡 LOW Empty-string SANDBOX_BASE_URL no longer falls back to default — src/mcp/bin.ts
Line 357:
const baseUrl = process.env.SANDBOX_BASE_URL ?? 'https://sandbox.tangle.tools'. Nullish coalescing only handles null/undefined. The old code at line 356 used a falsy spread guard (...(baseUrl ? { baseUrl } : {})), soSANDBOX_BASE_URL=''would omit baseUrl entirely. Now an empty string passes through to the Sandbox constructor, which could fail to connect. Low probability — nobody intentionally sets it to empty — but a behavior regression.
🟡 LOW No independent router API key override — src/mcp/bin.ts
Line 413:
routerKeyis hardcoded toprocess.env.TANGLE_API_KEYwith no separate override (e.g.,MCP_RESEARCHER_ROUTER_KEY). The comment says 'TANGLE_API_KEY doubles as the router key,' but if the OpenAI-compatible router and the sandbox API use different credentials, there is no way to configure the router key independently. Fix:const routerKey = process.env.MCP_RESEARCHER_ROUTER_KEY ?? process.env.TANGLE_API_KEY.
🟡 LOW Resume message path omits harness/model from singleFactory call — src/mcp/bin.ts
At line 528, the resume driver's
message()function callssingleFactory({ task })without harness/model, unlikebuildPresetwhich passes both. This only affects the display message on the delegation record (not execution —settleSinglecorrectly callsbuildPreset), but if a future profile variant produces a differenttaskToPromptbased on harness/model, the displayed message would diverge from what was actually executed. Currently harmless since execution and display use the same task object.
🟡 LOW resume.message uses default harness/model, inconsistent with buildPreset — src/mcp/bin.ts
Line 528:
singleFactory({ task })in theresume.messagefunction omitsharnessandmodel, whilesettleSingle(line 435) and the delegate (line 453) usebuildPreset(task)with the configured values. If harness/model affecttaskToPrompt, the resumed turn's prompt differs from the original dispatch. Low impact in practice — resume re-attaches to a running session anddriveTurnis idempotent for in-flight turns — but the inconsistenc
tangletools · 2026-06-14T22:42:04Z · trace
tangletools
left a comment
There was a problem hiding this comment.
❌ 1 Blocking Finding — d1adaece
Full multi-shot audit completed 1/1 planned shots over 1 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 1/1 planned shots over 1 changed files. Global verifier still owns final merge decision.
Full immutable report for this review: trace
Summary comment for this run: full summary
tangletools · 2026-06-14T22:42:04Z · immutable trace
…visioning Addresses PR #302 audit: - HIGH: variants>1 fanout path now uses a provisionable harness/model and gets router creds injected per agent-run (was: zero LLM calls, same 'no winner'). - MEDIUM: extract resolveResearcherProvisioning + applyRouterEnv into a tested module; applyRouterEnv MERGES box env instead of replacing it. - LOW: empty-string SANDBOX_BASE_URL falls back to default; MCP_RESEARCHER_ROUTER_KEY override; reuse resolveRouterBaseUrl (TANGLE_ROUTER_*) + WORKER_MODEL convention; resume.message uses buildPreset for prompt consistency.
tangletools
left a comment
There was a problem hiding this comment.
✅ Auto-approved PR — 2ad0336c
Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T22:49:38Z
tangletools
left a comment
There was a problem hiding this comment.
🟡 Value Audit — sound-with-nits
| Verdict | sound-with-nits |
| Concerns | 4 (4 weak-concern) |
| Heuristic | 0.0s |
| Duplication | 0.0s |
| Interrogation | 184.2s (2 bridge agents) |
| Total | 184.2s |
💰 Value — sound-with-nits
The change makes the MCP server's researcher delegate runnable in a fresh environment by defaulting the sandbox base URL and injecting a provisionable harness/model plus OpenAI-compatible router credentials into every agent-run spec; it's a coherent, scoped fix with only small reuse/doc nits.
- What it does: It does three concrete things: (1) defaults
SANDBOX_BASE_URLtohttps://sandbox.tangle.toolsinloadSandboxClient(src/mcp/bin.ts:363) so the MCP server starts without the env var; (2) adds aresolveResearcherProvisioning()helper (src/mcp/researcher-provisioning.ts:42) that picks a worker harness/model and resolves router credentials from env (MCP_RESEARCHER_HARNESS, `MCP_RESEARCHER - Goals it achieves: It achieves the goal of making
delegate_researchwork out-of-the-box: the server no longer crashes on startup whenSANDBOX_BASE_URLis unset, and the in-box researcher agent no longer makes zero LLM calls due to an unprovisionable default harness and missing provider auth. It also keeps the wiring env-overridable so deployments can override harness/model/router without code changes. - Assessment: Good on its merits. The fix is narrowly scoped to the broken path, extracts testable logic into its own module, reuses the existing
resolveRouterBaseUrlhelper fromsrc/model-resolution.ts, and applies the same provisioning consistently to single-variant, multi-variant fanout, and detached-resume paths. It matches the codebase's pattern of env-driven MCP configuration. - Better / existing approach: No materially better architecture or existing equivalent exists for the researcher provisioning. I looked at
src/mcp/delegation-profile.ts(env forwarding for the MCP child),src/model-resolution.ts(router base resolution),src/mcp/delegates.tsandsrc/profiles/coder.ts(coder harness/model wiring), andsrc/runtime/sandbox-backend.ts/src/runtime/types.ts(sandbox override shape). The
🎯 Usefulness — sound-with-nits
Fixes the out-of-the-box researcher path by defaulting the sandbox base URL and injecting provisionable harness/model plus router creds into the MCP delegate, wired correctly through the existing MCP server.
- Integration: The new behavior is reachable.
main()insrc/mcp/bin.ts:159-162callsloadResearcherSupport()when the researcher tool is enabled; the returned delegate is passed tocreateMcpServer()atbin.ts:187;createMcpServer()registers thedelegate_researchtool whenever a delegate is supplied (src/mcp/server.ts:189-202). The provisioning helper is imported only bybin.tsand is not dead - Fit with existing patterns: It fits established patterns. Router base resolution reuses
resolveRouterBaseUrlfromsrc/model-resolution.ts(researcher-provisioning.ts:12), and the env naming mirrors the existingMCP_*convention inbin.ts(e.g.MCP_CODER_FANOUT_HARNESSESatbin.ts:27vs the newMCP_RESEARCHER_FANOUT_HARNESSESatbin.ts:505). The only duplication is the public sandbox default, which already e - Real-world viability: It should hold up under normal use. Provisioning is resolved once per server load and reused across concurrent
delegate_researchcalls, which is the right granularity for env-driven config. The fanout path applies router creds to every generated agent-run spec (bin.ts:512-514), not just the single-variant path. It handles whitespace/empty env values as unset and falls back through model/router
💰 Value Audit
🟡 Repeated default sandbox base URL literal [duplication] ``
src/mcp/delegation-profile.ts:48already definesDEFAULT_SANDBOX_BASE_URL = 'https://sandbox.tangle.tools'and uses it to forwardSANDBOX_BASE_URLto the MCP child.src/mcp/bin.ts:363now repeats the same literal. Import the existing constant to keep the default in one place.
🟡 New env vars not added to the bin header inventory [maintenance] ``
src/mcp/bin.ts:13-56lists the environment variables the MCP server honors, but it omits the newMCP_RESEARCHER_HARNESS,MCP_RESEARCHER_MODEL,MCP_WORKER_MODEL,MCP_RESEARCHER_ROUTER_KEY,MCP_RESEARCHER_ROUTER_BASE_URL, andMCP_RESEARCHER_FANOUT_HARNESSESvariables introduced by this change. Adding them preserves the header as the canonical discovery surface.
🎯 Usefulness Audit
🟡 Sandbox base URL default is duplicated [ergonomics] ``
src/mcp/delegation-profile.ts:48already centralizesDEFAULT_SANDBOX_BASE_URL = 'https://sandbox.tangle.tools', butsrc/mcp/bin.ts:363hardcodes the same literal. Since both files live undersrc/mcp, export and reuse the constant to keep the two entry points from drifting.
🟡 Researcher fanout lacks per-harness model override [ergonomics] ``
The coder delegate supports
fanoutModelsindexed parallel tofanoutHarnesses(src/mcp/delegates.ts:147), but the new researcher fanout forces every variant to the same resolved model (bin.ts:510:models: fanoutHarnesses.map(() => model)). There is noMCP_RESEARCHER_FANOUT_MODELScounterpart. For now this is fine for the minimal fix, but it is an asymmetry a future caller will likely ask for.
What this audit checks
It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.
| Pass | What it asks |
|---|---|
| Heuristic | Vague title? Whitespace-only or cruft-bearing diff? (content signals only) |
| Duplication | Do added function/class names already exist elsewhere in the repo? |
| Value Audit | What does it do? What goal does it achieve? Is it good? Better architecture or already-exists? |
| Usefulness Audit | Does it integrate and fit? Will it hold up in real use and actually get used? |
Findings are concerns, not blocks — the human reviewer decides what to do with them.
✅ No Blockers —
|
| deepseek | glm | aggregate | |
|---|---|---|---|
| Readiness | 70 | 83 | 70 |
| Confidence | 70 | 70 | 70 |
| Correctness | 70 | 83 | 70 |
| Security | 70 | 83 | 70 |
| Testing | 70 | 83 | 70 |
| Architecture | 70 | 83 | 70 |
Full multi-shot audit completed 2/2 planned shots over 3 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 2/2 planned shots over 3 changed files. Global verifier still owns final merge decision.
🟠 MEDIUM Fanout harness count not validated against variants — src/mcp/bin.ts
At lines 504-506,
parseHarnesses(process.env.MCP_RESEARCHER_FANOUT_HARNESSES)may return an array shorter thanvariants. This is passed tofanoutFactory(line 507), which likely creates one AgentRunSpec per harness — fewer than variants. Thenfanout.agentRuns.slice(0, variants)(line 517) produces fewer specs thanmaxIterations: variants([line 526](https://github.com/tangle-network/agent-runtime/blob/2ad0336caacb853291ab318d13f30
🟡 LOW All fanout variants use the same model regardless of harness — src/mcp/bin.ts
Line 510:
models: fanoutHarnesses.map(() => model)— every fanout variant gets the same model even when MCP_RESEARCHER_FANOUT_HARNESSES specifies different harnesses. If different harnesses require different models (e.g., a codex harness needs a different model id), there's no way to configure per-harness models. Impact: limits fanout diversity to harness-only. Acceptable as a v1 design but worth a follow-up env like MCP_RESEARCHER_FANOUT_MODELS.
🟡 LOW Fanout harness count vs variants mismatch has no guard — src/mcp/bin.ts
Lines 504-511: when MCP_RESEARCHER_FANOUT_HARNESSES is set with fewer entries than
variants, fanoutFactory produces fewer agentRuns than variants. Then agentRuns.slice(0, variants) is a no-op (already shorter), but maxIterations is still set tovariants(line 526). The loop may attempt more iterations than there are distinct runs. No validation warns or clamps. Impact: env-misconfiguration silently under-delivers fanout diversity. Fix: clamp maxIterations to Math.min(variants, fanout.agentRuns.length) or validate that fanoutHarnesses.length >= variant
🟡 LOW Missing env variable documentation in bin.ts header — src/mcp/bin.ts
Lines 13-56 document MCP server env variables but omit the 6 new researcher-provisioning variables introduced by this PR: MCP_RESEARCHER_HARNESS, MCP_RESEARCHER_MODEL, MCP_RESEARCHER_FANOUT_HARNESSES, MCP_RESEARCHER_ROUTER_KEY, MCP_RESEARCHER_ROUTER_BASE_URL, and MCP_WORKER_MODEL. Operators configuring researcher behavior have no in-code reference.
🟡 LOW No integration test for buildPreset wrapper in loadResearcherSupport — src/mcp/bin.ts
The
buildPresetclosure (lines 422-426) wrapssingleFactorywith harness/model injection and router env application. It is called in 4 code paths (settleSingle, single-variant delegate, fanout delegate, resume message). The unit tests in tests/mcp/researcher-provisioning.test.ts cover the underlying resolveResearcherProvisioning and applyRouterEnv functions, but no test verifies that buildPreset correctly integrates them — e.g., that the spec passed to runDetachedTurn or runLoop actually carries the provisioned harness/model/router env.
🟡 LOW parseHarnesses reads process.env directly, bypassing the injectable env pattern — src/mcp/bin.ts
Line 505:
parseHarnesses(process.env.MCP_RESEARCHER_FANOUT_HARNESSES)reads env directly, while resolveResearcherProvisioning() (line 421) accepts an injectable env param for testability. This means the fanout harness override cannot be unit-tested without mocking process.env. Fix: thread env through resolveResearcherProvisioning's return or accept it as a parameter to loadResearcherSupport.
🟡 LOW Biome formatting violations will fail CI lint gate — tests/mcp/researcher-provisioning.test.ts
Biome reports: (1) import sort —
type ProvisionableSpecmust come afterapplyRouterEnv(line 2-6); (2) format — multiple assertion lines exceed print width (lines 33, 35-36, 41, 52-53, 57-58, 63). CI runspnpm run lint=biome check src tests examplesper .github/workflows/ci.yml, which will exit non-zero. Fix:pnpm run lint:fixornpx biome check --write tests/mcp/researcher-provisioning.test.ts. Note: the implementation file src/mcp/researcher-provisioning.ts also has one formatting violation
🟡 LOW Missing TANGLE_ROUTER_URL vs TANGLE_ROUTER_BASE_URL precedence test — tests/mcp/researcher-provisioning.test.ts
resolveRouterBaseUrl in model-resolution.ts:42 uses TANGLE_ROUTER_URL ?? TANGLE_ROUTER_BASE_URL, so TANGLE_ROUTER_URL takes priority. No test verifies this precedence when both env vars are set. The 'reuses the repo router base' test at line 50 only sets one at a time. Impact: low — the precedence is a simple ?? operator, unlikely to regress, but it's an untested contract.
🟡 LOW Version-preservation test exercises wrong code path — tests/mcp/researcher-provisioning.test.ts
Line 55-59: The comment says 'already-versioned base is left intact (no double /v1)' but this test passes TANGLE_ROUTER_URL: 'https://r.example.com/v1/'. resolveRouterBaseUrl strips /v1 (via .replace(//v1/?$/, '')) before the researcher-provisioning regex ever sees it. The researcher-provisioning regex then sees 'https://r.example.com' (no version) and appends /v1, coincidentally producing the same result. A direct test of the researcher regex would use MCP_RESEARCHER_ROUTER_BASE_URL: 'https://r.example.com/v1/' which bypasses resolveRouterBaseUrl. The /v2 case IS correctly tested via [line 22](https://github.com/tangle-network/agent-runtime/b
🟡 LOW applyRouterEnv no-op not tested with empty string routerKey — tests/mcp/researcher-provisioning.test.ts
Line 85-89: The no-op test only passes undefined as routerKey. The guard is if (!routerKey) return, which would also catch '' (empty string) and null. No test for routerKey='' — though unlikely in practice from the resolver (trimmed never returns ''), it's an uncovered branch if applyRouterEnv is called directly by other code.
tangletools · 2026-06-14T22:57:08Z · trace
…, lint/tests Round-2 PR #302 audit: - MEDIUM: clamp maxIterations/maxConcurrency to actual run count when the fanout harness list is shorter than variants (no over-claiming iterations). - per-harness fanout models (MCP_RESEARCHER_FANOUT_MODELS); fanout harnesses now resolved via resolveResearcherProvisioning (injectable env, unit-testable). - document the 6 new MCP_RESEARCHER_* env vars in the bin header. - biome-clean; +composition + precedence + empty-key tests (13 total).
tangletools
left a comment
There was a problem hiding this comment.
✅ Auto-approved PR — 7c972aa7
Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T23:08:13Z
tangletools
left a comment
There was a problem hiding this comment.
🟡 Value Audit — sound-with-nits
| Verdict | sound-with-nits |
| Concerns | 3 (3 weak-concern) |
| Heuristic | 0.0s |
| Duplication | 0.0s |
| Interrogation | 179.5s (2 bridge agents) |
| Total | 179.5s |
💰 Value — sound-with-nits
Makes delegate_research runnable out-of-the-box by defaulting the sandbox baseUrl and injecting provisionable harness/model/OpenAI router creds into every researcher box env; a coherent, well-tested fix that mostly follows existing patterns.
- What it does: The change does three concrete things in the MCP server: (1) defaults
SANDBOX_BASE_URLtohttps://sandbox.tangle.toolsinloadSandboxClientso the server starts without the env var (src/mcp/bin.ts:372); (2) addssrc/mcp/researcher-provisioning.ts, which resolves a provisionable researcher harness/model and OpenAI-compatible router key/base from env; and (3) wires that provisioning into ` - Goals it achieves: The change eliminates two failure modes that made
delegate_researchunusable in a fresh environment: the MCP server crashing on startup because sandbox ≥0.6 requiresbaseUrl, and researcher runs returning an instant "produced no winner" because the profile's default harness/model are not broadly provisionable and the sandbox SDK does not forwardbackend.model.apiKeyto the in-box OpenAI-comp - Assessment: This is a good change on its merits. It is narrowly scoped to the broken path, extracts the env-resolution logic into a small, testable module, and applies the fix consistently to both single and fanout dispatch. It reuses the repo's existing
resolveRouterBaseUrlfromsrc/model-resolution.ts(src/mcp/researcher-provisioning.ts:12) and respects the codebase's pattern of env-driven, optional-p - Better / existing approach: No materially better architecture exists for the overall fix. The router-cred injection is a researcher-specific workaround caused by the sandbox SDK not wiring
backend.model.apiKeyfor that profile, so generalizing it prematurely would be speculative. I looked for existing equivalents insrc/mcp/delegation-profile.ts,src/mcp/delegates.ts,src/runtime/router-client.ts, and `src/runtime/su
🎯 Usefulness — sound-with-nits
Fixes the MCP researcher's out-of-the-box wiring: it defaults the sandbox base URL and injects a provisionable harness/model plus OpenAI-compatible router creds into the agent-run spec, all reachable through the existing delegate_research tool path.
- Integration: The new behavior is reached through
src/mcp/bin.ts:168-171->loadResearcherSupport->createMcpServer({ researcherDelegate: ... })atsrc/mcp/bin.ts:193-200, which registersdelegate_researchatsrc/mcp/server.ts:189-202. The provisioning helper is unit-tested intests/mcp/researcher-provisioning.test.ts, anddelegate_researchalready appears in the integration surface (`tests/mcp - Fit with existing patterns: It follows established patterns: env-driven harness/model/fanout config (like
MCP_CODER_FANOUT_HARNESSESatsrc/mcp/bin.ts:27andparseHarnessesatsrc/mcp/bin.ts:588), theWORKER_MODELfallback convention, reuse ofresolveRouterBaseUrlfromsrc/model-resolution.ts, and profile factories acceptingharness/modeljust ascoderProfiledoes (src/mcp/delegates.ts:198-202). Keeping - Real-world viability: Defaults let the server start without
SANDBOX_BASE_URLand the researcher run without manual harness/model/auth; concurrency is capped bymaxConcurrency; fanout misconfiguration is clamped (effectiveVariantsatsrc/mcp/bin.ts:528); empty/whitespace env values and already-versioned router bases are handled (src/mcp/researcher-provisioning.ts:36-47,:65). The main caveat is that the cano
💰 Value Audit
🟡 Hand-rolled model precedence chain could reuse resolveChatModel [duplication] ``
The model fallback
MCP_RESEARCHER_MODEL → MCP_WORKER_MODEL → WORKER_MODEL → DEFAULT_MODELinsrc/mcp/researcher-provisioning.ts:58-62duplicates the precedence-resolver pattern thatsrc/model-resolution.ts:89-98(resolveChatModel) was extracted to unify. Sinceresearcher-provisioning.tsalready importsresolveRouterBaseUrlfrom the same module, usingresolveChatModel([{source:'MCP_RESEARCHER_MODEL', model: env.MCP_RESEARCHER_MODEL}, ...], {source:'default', model: DEFAULT_MODEL})w
🟡 Default sandbox base URL literal is duplicated [duplication] ``
src/mcp/bin.ts:372hardcodeshttps://sandbox.tangle.tools, whilesrc/mcp/delegation-profile.ts:48already defines the sameDEFAULT_SANDBOX_BASE_URL. The two live at different layers (MCP server vs. profile composer), but exporting and reusing the constant would prevent drift.
🎯 Usefulness Audit
🟡 Canonical profile builder does not forward the new override env vars [integration] ``
src/mcp/delegation-profile.ts:87-97builds the MCP child'senvmap and forwards onlyTANGLE_API_KEY,SANDBOX_BASE_URL, and OTEL keys. The newMCP_RESEARCHER_HARNESS,MCP_RESEARCHER_MODEL,MCP_RESEARCHER_ROUTER_BASE_URL, andTANGLE_ROUTER_URL/TANGLE_ROUTER_BASE_URLoverrides are therefore not automatically available when the MCP server is launched bycomposeProductionAgentProfile. The chosen defaults make this non-blocking, but a follow-up should forward those keys (or documen
What this audit checks
It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.
| Pass | What it asks |
|---|---|
| Heuristic | Vague title? Whitespace-only or cruft-bearing diff? (content signals only) |
| Duplication | Do added function/class names already exist elsewhere in the repo? |
| Value Audit | What does it do? What goal does it achieve? Is it good? Better architecture or already-exists? |
| Usefulness Audit | Does it integrate and fit? Will it hold up in real use and actually get used? |
Findings are concerns, not blocks — the human reviewer decides what to do with them.
✅ No Blockers —
|
| deepseek | glm | aggregate | |
|---|---|---|---|
| Readiness | 80 | 74 | 74 |
| Confidence | 70 | 70 | 70 |
| Correctness | 80 | 74 | 74 |
| Security | 80 | 74 | 74 |
| Testing | 80 | 74 | 74 |
| Architecture | 80 | 74 | 74 |
Full multi-shot audit completed 2/2 planned shots over 3 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 2/2 planned shots over 3 changed files. Global verifier still owns final merge decision.
🟡 LOW Hardcoded sandbox default duplicates delegation-profile constant — src/mcp/bin.ts
Line 372 hardcodes 'https://sandbox.tangle.tools' inline. The same string is already exported as DEFAULT_SANDBOX_BASE_URL in src/mcp/delegation-profile.ts:48. Importing the constant would prevent drift if the default ever changes. Pure DRY nit; behavior is correct and matches the existing convention.
🟡 LOW No integration test for buildPreset wiring into delegate path — src/mcp/bin.ts
The researcher-provisioning.test.ts 'buildPreset composition' block (lines 132-154) manually simulates the resolve→apply flow on a hand-built spec, but no test exercises the actual loadResearcherSupport delegate with a mock peer to confirm buildPreset injects creds into a real factory-returned agentRunSpec. The single and fanout delegate paths (lines 445, 463, 517-524) are only covered indirectly. Given the PR's stated goal is making delegate_research actually produce a winner, a delegate-research integration test asserting OPENAI_API_KEY appears on the d
🟡 LOW Silent fanout harness/model length mismatch — src/mcp/bin.ts
At line 520, fanoutModels?.[i] ?? model gracefully handles out-of-bounds array access (index beyond MCP_RESEARCHER_FANOUT_MODELS length → undefined → falls back to model). However, when the operator misaligns MCP_RESEARCHER_FANOUT_MODELS length with MCP_RESEARCHER_FANOUT_HARNESSES length, no warning is emitted to stderr or logged. Operators debugging 'why does my fanout use the wrong model for harness 3' get no signal. Fix: add a warn-on-mismatch stanza after csv parsing.
🟡 LOW Structural cast as ProvisionableSpec trusts peer runtime shape — src/mcp/bin.ts
Lines 434, 523: preset.agentRunSpec and fanout specs are cast as ProvisionableSpec. If the @tangle-network/agent-knowledge peer ships an agentRunSpec whose sandboxOverrides field has a different runtime shape (e.g., a class instance, readonly fields, or a differently-named env key), applyRouterEnv would either silently no-op (if sandboxOverrides is a getter returning a frozen object) or write a field the peer never reads — producing the exact 'zero LLM calls, no winner' failure this PR fixes, but without any error. The ProvisionableSpec interface is intentionally permissive (Record<string, unknown>), which hides the coupling. A defensive check (e.g., asserting the spec is a
🟡 LOW Misleading JSDoc: applyRouterEnv says 'never replaces' but overwrites OPENAI_API_KEY/OPENAI_BASE_URL — src/mcp/researcher-provisioning.ts
Line 79-80:
Merges — never replaces — any env the preset already supplied.The implementation at lines 91-94 spreads existing env first, then unconditionally sets OPENAI_API_KEY and OPENAI_BASE_URL. If a preset already provided these keys, they are overwritten. The comment is misleading — the function DOES replace router-specific keys. Behavior is likely intentional (router creds must win) but the comment contradicts the code. No test verifies the overwrite-vs-preserve behavior for these specific keys: the 'merge
🟡 LOW TANGLE_API_KEY silently promoted to sandbox OPENAI_API_KEY — src/mcp/researcher-provisioning.ts
Line 63: routerKey falls back to TANGLE_API_KEY, then applyRouterEnv injects it as OPENAI_API_KEY into the sandbox box env (line 92). A user who sets TANGLE_API_KEY for chat-model router auth may not expect it to be exposed as OPENAI_API_KEY inside researcher sandbox containers. The behavior IS documented in the module header and is the intended BYOK flow, and injection is gated on the key being present (non-empty). But there is no log/telemetry when the fallback fires, so the promotion is invisible. Consider a one-line st
🟡 LOW applyRouterEnv docstring overstates merge semantics — src/mcp/researcher-provisioning.ts
Line 79 docstring says 'merges — never replaces — any env the preset already supplied' but the spread order (existing env, then OPENAI_API_KEY, then OPENAI_BASE_URL) does overwrite those two specific keys if the preset already set them. In practice this is the correct behavior (router creds should win), but the docstring could mislead future readers into thinking existing OPENAI_* keys from the preset survive. Suggest: 'Merges router creds INTO the existing sandbox env (overrides OPENAI_API_KEY / OPENAI_BASE_URL if already present; preserves all other keys).'
🟡 LOW Composition test claims to mirror bin.ts buildPreset but doesn't exercise it — tests/mcp/researcher-provisioning.test.ts
Lines 137-155: The test comment says 'Mirrors what bin.ts buildPreset does' but it manually calls resolveResearcherProvisioning then applyRouterEnv. If bin.ts's actual buildPreset logic diverges from this manual composition (e.g., adds extra env, reorders calls, conditional branching), this test will not catch the regression. Consider either an integration test that calls buildPreset directly, or soften the comment to 'exercises the public surface that buildPreset composes'. Impact: low — the public API contract is still tested; only the integration glue is unverified.
🟡 LOW No direct test for MCP_RESEARCHER_ROUTER_BASE_URL with non-versioned URL — tests/mcp/researcher-provisioning.test.ts
Line 82 (Test 7) only tests MCP_RESEARCHER_ROUTER_BASE_URL with an already-versioned URL ('https://r.example.com/v1/'). The non-versioned case (e.g., 'https://r.example.com') through this specific env var is not tested directly — it is covered indirectly through TANGLE_ROUTER_BASE_URL in Test 5 (line 67), but the two codepaths converge at line 65 of the source only aft
🟡 LOW No test for MCP_RESEARCHER_FANOUT_MODELS set without MCP_RESEARCHER_FANOUT_HARNESSES — tests/mcp/researcher-provisioning.test.ts
Line 83-90: Tests both fanout envs set together. In bin.ts:516-521, fanoutHarnesses and fanoutModels are index-aligned — models without harnesses would produce an empty harness list (undefined → caller defaults to replicating the single harness N times for N variants — bin.ts:516). But if fanoutModels is set alone, the models array is returned by csv() while fanoutHarnesses is undefined, creating an index alignment that produces unexpected model assignments. A test that asserts fanoutModels is undefined when fanoutHarnesses is unset (or vice versa) would validate the index-aligned contract. Current behavior: silent, not crashing — but easy to mi
🟡 LOW Version-normalization regex inconsistency between resolveRouterBaseUrl and researcher path for /vN where N>1 — tests/mcp/researcher-provisioning.test.ts
resolveRouterBaseUrl (src/model-resolution.ts:46) only strips /v1 (regex //v1/?$/), while the researcher normalization (src/mcp/researcher-provisioning.ts:65) checks //v\d+/?$/ (any version). For a TANGLE_ROUTER_URL like 'https://r.example.com/v2', resolveRouterBaseUrl returns it unstripped, but the researcher regex then correctly recognizes /v2 and avoids double-appending. The final result is correct but for non-obvious reasons — the two layers have inconsistent version matching. No test exercises /v2 or higher through either path. Impact: low — end result is correct today; risk is future refactoring that removes one layer without understanding the补偿.
tangletools · 2026-06-14T23:17:58Z · trace
…s, more tests Round-3 PR #302 audit (all LOW): - DRY: export + reuse DEFAULT_SANDBOX_BASE_URL instead of an inline literal. - Docstring accuracy: applyRouterEnv overlays (preserves preset env, sets the two OPENAI_ keys authoritatively) — was misleadingly worded 'never replaces'. - Honest test naming for the resolve→apply composition; +tests for non-versioned router base and fanout-models-without-harnesses.
tangletools
left a comment
There was a problem hiding this comment.
✅ Auto-approved PR — 2fcc8773
Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T23:21:06Z
tangletools
left a comment
There was a problem hiding this comment.
🟢 Value Audit — sound
| Verdict | sound |
| Concerns | 0 (none) |
| Heuristic | 0.0s |
| Duplication | 0.0s |
| Interrogation | 111.3s (2 bridge agents) |
| Total | 111.3s |
🎯 Usefulness — sound
The PR fixes a real, reachable breakage in the canonical agent-runtime-mcp path and does so in the grain of the codebase.
- Integration: The changed code is inside
src/mcp/bin.ts, theagent-runtime-mcpbinary entry point (package.json:83-85).main()callsloadResearcherSupportat bin.ts:171 wheneverMCP_DISABLE_RESEARCHERis unset, and the newresolveResearcherProvisioning+applyRouterEnvare invoked at bin.ts:438-441 and 529-530 to build every researcher preset. The resultingresearcherSupport.delegateis passed i - Fit with existing patterns: It fits established patterns rather than competing with them. Router-base resolution reuses the repo's
resolveRouterBaseUrlfromsrc/model-resolution.ts:44-48. Model/key fallback reuses the pervasiveWORKER_MODEL/TANGLE_API_KEYconventions seen acrossbench/andexamples/. Harness/model/fanout env naming mirrors the coder delegate options insrc/mcp/delegates.ts:127-175. Box-env in - Real-world viability: It handles realistic edge cases: empty/whitespace env values are treated as unset (
trimmedat researcher-provisioning.ts:36-39), model resolution falls back throughMCP_RESEARCHER_MODEL→MCP_WORKER_MODEL→WORKER_MODEL→ default (l.58-62), missing router key makesapplyRouterEnva no-op (l.91), and fanout misconfig is bounded byfanout.agentRuns.slice(0, variants)with `effectiveVaria
No concerns — sound change, no better or existing approach found. ✅
What this audit checks
It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.
| Pass | What it asks |
|---|---|
| Heuristic | Vague title? Whitespace-only or cruft-bearing diff? (content signals only) |
| Duplication | Do added function/class names already exist elsewhere in the repo? |
| Value Audit | What does it do? What goal does it achieve? Is it good? Better architecture or already-exists? |
| Usefulness Audit | Does it integrate and fit? Will it hold up in real use and actually get used? |
Findings are concerns, not blocks — the human reviewer decides what to do with them.
✅ No Blockers —
|
| deepseek | glm | aggregate | |
|---|---|---|---|
| Readiness | 83 | 80 | 80 |
| Confidence | 70 | 70 | 70 |
| Correctness | 83 | 80 | 80 |
| Security | 83 | 80 | 80 |
| Testing | 83 | 80 | 80 |
| Architecture | 83 | 80 | 80 |
Full multi-shot audit completed 2/2 planned shots over 4 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 2/2 planned shots over 4 changed files. Global verifier still owns final merge decision.
🟡 LOW DEFAULT_SANDBOX_BASE_URL behavioral change for unset SANDBOX_BASE_URL — src/mcp/bin.ts
Line 373: Previously when SANDBOX_BASE_URL was unset, baseUrl was omitted from the Sandbox constructor (SDK used its own internal default). Now DEFAULT_SANDBOX_BASE_URL ('https://sandbox.tangle.tools') is always passed. If the SDK's built-in default ever differed from this value, callers who relied on the SDK default would silently switch endpoints. In practice these should match and the SDK ≥0.6 makes baseUrl required, so this is the correct fix — just flagging the behavioral delta.
🟡 LOW No integration test for buildPreset composition in bin.ts — src/mcp/bin.ts
Lines 439-443: buildPreset composes singleFactory({task, harness, model}) + applyRouterEnv. The unit test (researcher-provisioning.test.ts:148-171) explicitly notes this peer-call half is not exercised. A mock singleFactory stub in a bin-level test would verify the full composition (harness/model passed to factory + router env applied to the resulting spec). Low risk because the individual functions are well-tested and the composition is 3 lines, but the integration seam is unverified.
🟡 LOW Duplicate CSV parsing in parseHarnesses vs csv helper — src/mcp/researcher-provisioning.ts
researcher-provisioning.ts:41-47 defines a
csv()helper used for fanout harness + model lists. bin.ts:595-602 definesparseHarnesses()with identical logic used for coder fanout harnesses. The coder path does not reuse the researcher's csv helper, creating two copies of the same string→array parser. Impact: maintenance only — a behavior change in one won't reach the other. Fix: extract shared csv parser or have parseHarnesses delegate to the provisioning module's csv.
🟡 LOW CSV parser tested only with multi-entry + whitespace entries; single-entry not covered — tests/mcp/researcher-provisioning.test.ts
The csv() helper (researcher-provisioning.ts:41-47) is tested with two comma-separated entries and whitespace-only entries. A single entry without commas (e.g. 'opencode') is implicitly covered by the helper's logic (split still works), but no explicit test asserts the single-entry array result. Low severity — the helper is simple enough that the risk is negligible.
🟡 LOW Composition test skips the singleFactory integration point — tests/mcp/researcher-provisioning.test.ts
The final describe block (line 143) explicitly notes it does not exercise singleFactory(harness, model) — the actual production composition at bin.ts:440-441 where resolveResearcherProvisioning output feeds singleFactory then applyRouterEnv. The test comment is honest about this gap. This is the one path where a shape mismatch between ProvisionableSpec and the real agentRunSpec returned by singleFactory would surface. Not blocking since the type contract is narrow, but a mock singleFactory returning a spec-shaped object would close the last integration seam.
🟡 LOW Missing edge case: base URL with query parameters breaks /v1 normalization — tests/mcp/researcher-provisioning.test.ts
The regex //v\d+/?$/ at researcher-provisioning.ts:65 assumes the version segment is the last path component. A value like MCP_RESEARCHER_ROUTER_BASE_URL=https://r.example.com/v1?format=openai would fail the regex test (ends with '?format=openai', not '/v1'), causing a double-append: 'https://r.example.com/v1?format=openai/v1'. No test covers this. Low severity because the env var name implies a bare base URL, not a request URL with query params.
🟡 LOW No test for applyRouterEnv overwriting pre-existing OPENAI_API_KEY — tests/mcp/researcher-provisioning.test.ts
applyRouterEnv is documented as 'intentionally authoritative' for OPENAI_API_KEY/OPENAI_BASE_URL (researcher-provisioning.ts:73-74). The tests verify that non-conflicting keys (KEEP_ME, NS_TOKEN) are preserved, but no test asserts that a pre-existing OPENAI_API_KEY in the spec's env is overwritten by the router key. This is the one behavioral guarantee that distinguishes 'merge' from 'replace' semantics for these two specific keys. Add: pass a spec with sandboxOverrides.env.OPENAI_API_KEY already set, apply, assert it equals the new routerKey.
🟡 LOW Untested: MCP_RESEARCHER_ROUTER_BASE_URL with trailing slash but no version — tests/mcp/researcher-provisioning.test.ts
The test suite covers trailing-slash with /v1 (line 74-81) and no-trailing-slash without version (line 100-106), but not the combination of trailing slash without a version path (e.g. 'https://r.example.com/'). The source's .replace(//$/, '') handles this correctly, but there's no explicit assertion proving it.
tangletools · 2026-06-14T23:35:27Z · trace
Problem
agent-runtime-mcp'sdelegate_researchcould not complete a run in a fresh environment:SANDBOX_BASE_URL—@tangle-network/sandbox≥0.6 makesbaseUrlrequired, butloadSandboxClientonly passed it when the env var was set.opencode/zai-coding-plan/glm-5.1) isn't broadly provisionable, and the sandbox SDK does not wirebackend.model.apiKeyinto the in-box agent's OpenAI-compatible provider, so the agent made zero LLM calls.Fix
SANDBOX_BASE_URLto the public Tangle sandbox so the server starts.OPENAI_API_KEY/OPENAI_BASE_URL). All env-overridable:MCP_RESEARCHER_HARNESS,MCP_RESEARCHER_MODEL(MCP_WORKER_MODELfallback),MCP_RESEARCHER_ROUTER_BASE_URL;TANGLE_API_KEYdoubles as the router key.Evidence
With these, a researcher run gets past the instant "no winner" and runs a real multi-minute agent turn; the equivalent wiring produced a cited research dossier end-to-end. Typecheck clean.
Note
Full e2e via the sibling-sandbox executor is still gated on worker-box provisioning (the known deployed-sandbox/gateway issue, separate from this PR). This PR fixes the code-level layers (startup, auth, harness/model).
🤖 Generated with Claude Code