docs: agent shell consolidation — one shell, AgentProfile-as-contract#64
Design spec to lift the 5 still-hand-rolled agent-shell concerns (skill registry +
~/.claude/skills mount, AgentProfile assembly, sandbox provisioning, per-turn model
resolution, system-prompt assembly) into agent-app, so each product (gtm/creative/
tax/legal/insurance) collapses to one defineAgentProfile({...}) + a thin
ShellRuntimeConfig. The shell's input contract IS the sandbox SDK AgentProfile type
(not a new invented seam). Grounded in a 6-repo surface map; documents measured
duplication (~3000 lines x5) and confirmed drift (tax evals grade a 58-line-stale
profile; creative app-tool layer 12.8x gtm). Staged additive rollout, flag-gated.
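As a rough illustration of the product shape this description aims for, the sketch below shows one profile declaration plus a thin runtime config. Everything in it is an assumption made for illustration: the `AgentProfile` interface is a local stand-in rather than the sandbox SDK's real type, the `ShellRuntimeConfig` fields are invented, and `defineAgentProfile` is assumed to be a plain identity helper.

```typescript
// Hypothetical stand-in for the sandbox SDK's AgentProfile. The real type
// lives in the SDK and may differ in field names and shapes.
interface AgentProfile {
  prompt: string;
  model?: string;
  subagents?: Record<string, { prompt: string }>;
  resources?: { files: Record<string, string> };
}

// Hypothetical runtime knobs; the spec does not confirm these field names.
interface ShellRuntimeConfig {
  product: string;
  allowedModels: string[];
}

// Assumed to be an identity helper that exists only for type checking.
function defineAgentProfile(profile: AgentProfile): AgentProfile {
  return profile;
}

// Skills and knowledge become mounted files; specialists become subagents.
const taxProfile = defineAgentProfile({
  prompt: "You are a tax preparation assistant.",
  model: "example-model",
  resources: {
    files: {
      "~/.claude/skills/deductions/SKILL.md": "# Deductions\n(placeholder body)",
    },
  },
  subagents: {
    reviewer: { prompt: "Check the return for arithmetic errors." },
  },
});

const taxRuntime: ShellRuntimeConfig = {
  product: "tax",
  allowedModels: ["example-model"],
};
```

Under these assumptions the per-product surface is data only; provisioning, mounting, and prompt assembly would sit behind the shell.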
tangletools
left a comment
✅ Auto-approved PR — 25d39ef0
Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-15T10:37:38Z
tangletools
left a comment
🟠 Value Audit — better-approach-exists
| Field | Value |
|---|---|
| Verdict | better-approach-exists |
| Concerns | 4 (1 medium-concern, 3 weak-concern) |
| Heuristic | 0.0s |
| Duplication | 0.0s |
| Interrogation | 180.4s (2 bridge agents) |
| Total | 180.4s |
💰 Value — better-approach-exists
Adds a well-grounded design spec to lift 5 duplicated agent-shell concerns into agent-app using the sandbox SDK's AgentProfile as the contract; the plan is coherent but does not reconcile with the existing AgentAppConfig product contract, so a human should decide the single source of truth before im
- What it does: This PR adds `docs/agent-shell-consolidation.md` (265 lines), a staff architecture proposal. It identifies ~3,000 lines of duplicated plumbing across gtm/creative/tax/legal/insurance (skill registry + `~/.claude/skills` mounts, AgentProfile assembly, sandbox provisioning, per-turn model resolution, system-prompt assembly) and proposes lifting each concern into `@tangle-network/agent-app` as additi…
- Goals it achieves: 1) Eliminate measured duplication and drift (e.g., tax eval scoring a 58-line stale profile, creative's 2,345-line hand-rolled app-tool layer, byte-identical `model-resolution.ts` copies). 2) Make the shell's input contract the existing sandbox SDK `AgentProfile` type rather than inventing a new seam. 3) Enable incremental, additive, non-breaking rollout so products migrate one concern at a time b…
- Assessment: The analysis is coherent and grounded in real repo evidence: it correctly maps the existing lifted pieces (`/runtime` tool loop/model catalog, `/tools` capability auth, `/delegation` MCP, `/config` data contract, etc.) and accurately identifies the gaps (no skill mount, no AgentProfile composer, no sandbox provisioning, no per-turn model picker, no prompt assembler). It respects the engine/shell l…
- Better / existing approach: The codebase already has a product config contract: `AgentAppConfig` defined in `src/config/index.ts:164-200` and exported via `defineAgentApp` (`src/config/index.ts:198-200`), with tests in `tests/config.test.ts` and a scaffolder template in `create-agent-app/template/agent.config.ts`. The proposed design makes products author a separate `defineAgentProfile({...})` + `ShellRuntimeConfig` instead…
🎯 Usefulness — sound-with-nits
A coherent, well-grounded design proposal that identifies real fleet-wide duplication and fits agent-app's additive-subpath/structural style; only minor concerns about reconciling it with the existing AgentAppConfig surface and declaring the optional sandbox peer dependency.
- Integration: This PR is docs-only: it adds only docs/agent-shell-consolidation.md and no code. The proposed subpaths (@tangle-network/agent-app/skills, /profile, /sandbox, /model-resolution, /prompt) do not exist in package.json exports (package.json:34-184), tsup.config.ts entries (tsup.config.ts:4-35), or src/index.ts re-exports (src/index.ts:9-36). Searches in src/ for the proposed key symbols (skillRegistr
- Fit with existing patterns: The design aligns with how agent-app is built. Existing modules already avoid direct @tangle-network/sandbox imports by staying structural (src/delegation/index.ts:11-14; src/tools/mcp.ts:15-17 reference AgentProfileMcpServer structurally), and the codebase already exports additive subpaths for runtime, tools, delegation, etc. The gaps named in the doc are real: there is no skill-mount loader, no
- Real-world viability: The proposal explicitly addresses realistic failure modes: fail-closed model allowlisting and catalog validation, severed-stream detection lifted from creative, workspace- vs user-bound capability tokens, and a hard ordering constraint to repoint tax evals before deleting the dead api-worker package. It also proposes additive subpaths so non-sandbox consumers are not forced to adopt the sandbox pe
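The fail-closed model allowlisting and catalog validation mentioned under real-world viability could look roughly like the sketch below. The function name, the parameter list, and the result shape are all hypothetical; the review names the behaviour but not an API. The point illustrated is that a rejected model produces an explicit rejection rather than a silent fallback.

```typescript
// Hypothetical shapes: the review names fail-closed allowlisting and catalog
// validation but does not give these types.
interface ModelCatalogEntry {
  id: string;
}

type ModelResolution =
  | { kind: "resolved"; model: string }
  | { kind: "rejected"; reason: string };

function resolveTurnModel(
  requested: string | undefined,
  profileDefault: string,
  allowlist: readonly string[],
  catalog: readonly ModelCatalogEntry[],
): ModelResolution {
  // Use the per-turn request when present, otherwise the profile's default.
  const candidate = requested !== undefined ? requested : profileDefault;

  // Fail closed: a model outside the allowlist is rejected, never swapped.
  if (allowlist.indexOf(candidate) === -1) {
    return { kind: "rejected", reason: `model "${candidate}" is not allowlisted` };
  }
  // An allowlisted model must also exist in the catalog.
  if (!catalog.some((entry) => entry.id === candidate)) {
    return { kind: "rejected", reason: `model "${candidate}" is not in the catalog` };
  }
  return { kind: "resolved", model: candidate };
}
```

Returning a tagged result instead of throwing keeps the decision visible to the caller, which can then surface the rejection reason to the user or to logs.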
🎯 Usefulness Audit
🟡 Design does not reconcile with the existing AgentAppConfig surface [problem-fit]
agent-app already ships AgentAppConfig / defineAgentApp (src/config/index.ts:164-200) as the canonical product declaration, used by create-agent-app/template/agent.config.ts:15-17 and tests/config.test.ts. The doc’s example product collapses to defineAgentProfile + ShellRuntimeConfig without showing how it relates to AgentAppConfig. Confirm before implementation whether AgentAppConfig becomes the source of truth that a composer turns into AgentProfile, or whether the greenfield template should s
🟡 Sandbox subpath needs an optional peer-dependency declaration [integration]
The doc plans an agent-app/sandbox subpath that imports @tangle-network/sandbox, but package.json:216-249 does not list @tangle-network/sandbox in peerDependencies or peerDependenciesMeta. To keep agent-app usable without the sandbox SDK for edge/browser paths (as src/runtime/agent.ts:24-26 stays substrate-free), add it as an optional peer dependency when that subpath lands.
🟡 Vite import.meta.glob loader needs a testable Node fallback [robustness]
The proposed loadMarkdownCorpus must preserve the literal import.meta.glob string for Vite static analysis (docs/agent-shell-consolidation.md:143). agent-app currently has no build-time glob pattern; ensure the Node fs fallback is exercised in vitest so the same source runs in tests and non-Vite consumers without a runtime fork per environment.
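One way to keep a single code path across Vite and plain Node is to inject both sources and let the loader pick, as in the sketch below. The signature is hypothetical (the doc's actual `loadMarkdownCorpus` shape is not shown in this review), and the sketch assumes the bundler side yields a path-to-raw-markdown map, which is what an eager raw glob import would produce. The literal glob string would stay at the Vite call site so static analysis can still see it.

```typescript
// Map of file path -> raw markdown text.
type MarkdownModules = Record<string, string>;

interface CorpusEntry {
  name: string;
  body: string;
}

// Hypothetical signature. `fromBundler` wraps the literal import.meta.glob
// call at the call site; `fromDisk` is the Node fs fallback used in tests.
function loadMarkdownCorpus(
  fromBundler: (() => MarkdownModules) | undefined,
  fromDisk: () => MarkdownModules,
): CorpusEntry[] {
  const modules = fromBundler ? fromBundler() : fromDisk();
  return Object.keys(modules)
    .sort() // stable order regardless of which source supplied the files
    .map((path) => ({
      // "./skills/deductions.md" -> "deductions"
      name: path.slice(path.lastIndexOf("/") + 1).replace(/\.md$/, ""),
      body: modules[path],
    }));
}
```

Because both sources return the same map shape, a vitest suite can exercise the fallback by passing `undefined` for the bundler source, with no per-environment fork inside the loader.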
💰 Value Audit
🟠 Proposed AgentProfile contract is parallel to existing AgentAppConfig product surface [better-architecture]
The repo already ships `AgentAppConfig`/`defineAgentApp` (`src/config/index.ts:164-200`) as the declarative product contract, validated by `tests/config.test.ts` and the `create-agent-app` scaffolder (`create-agent-app/template/agent.config.ts`). The design doc proposes `defineAgentProfile({...})` + `ShellRuntimeConfig` as the new product surface without reconciling these two contracts. This risks two config languages, divergent scaffolder guidance, and confused future agents. A better approach…
What this audit checks
It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.
| Pass | What it asks |
|---|---|
| Heuristic | Vague title? Whitespace-only or cruft-bearing diff? (content signals only) |
| Duplication | Do added function/class names already exist elsewhere in the repo? |
| Value Audit | What does it do? What goal does it achieve? Is it good? Better architecture or already-exists? |
| Usefulness Audit | Does it integrate and fit? Will it hold up in real use and actually get used? |
Findings are concerns, not blocks — the human reviewer decides what to do with them.
✅ No Blockers
| | deepseek | glm | aggregate |
|---|---|---|---|
| Readiness | 89 | 92 | 89 |
| Confidence | 65 | 65 | 65 |
| Correctness | 89 | 92 | 89 |
| Security | 89 | 92 | 89 |
| Testing | 89 | 92 | 89 |
| Architecture | 89 | 92 | 89 |
Full multi-shot audit completed 1/1 planned shots over 1 changed files. Global verifier still owns final merge decision.
🟡 LOW Hashed dist filenames will rot on next build — docs/agent-shell-consolidation.md
Multiple references cite tsup-generated hashed dist filenames (e.g. 'dist/sandbox-Dyf07Ckv.d.ts:190' at line 58, 'dist/model-CKzniMMr.d.ts:108' at line 38). These hashes change on every build and will be stale within one release. For a proposal doc this is harmless, but if the doc is expected to persist as architectural reference, consider citing source file paths or stable export names instead.
🟡 LOW No trailing newline at EOF — docs/agent-shell-consolidation.md
The file ends with `\ No newline at end of file` (git diff line 265). POSIX convention and most editors/linters expect a final newline. No prettier/markdownlint config exists in this repo to enforce it, so non-blocking. Fix: append a single newline. One-character change.
🟡 LOW Terminology conflict: 'shell default' vs 'opt-in' for severed-stream classifier — docs/agent-shell-consolidation.md
§4.3 (line 169) says 'lift creative's severed-stream + model-call-failure classifiers ... as a shell default' (implying opt-out / always-on), but §7 (lines 244-245) says 'ship it opt-in via ShellRuntimeConfig.streamFailureClassifier defaulting to creative's implementation.' The words 'opt-in' and 'default-on' conflict — if the field defaults to creative's implementation, it's opt-out, not opt-in. The mechanism described (a configurable ShellRuntimeConfig field) is clear enough, but stakeholders reading this doc will interpret
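The distinction this finding draws can be made concrete with a small sketch. The field name `streamFailureClassifier` comes from the finding; the types, the placeholder classifier, and the use of `null` as an explicit "off" value are assumptions for illustration only. With a field that defaults to an implementation, a product that sets nothing still gets classification, which is opt-out behaviour.

```typescript
type StreamFailureKind = "severed-stream" | "model-call-failure" | "none";
type StreamFailureClassifier = (error: unknown) => StreamFailureKind;

// Placeholder standing in for creative's classifier; the real logic is not
// shown in this review.
const defaultClassifier: StreamFailureClassifier = (error) =>
  error instanceof Error && /stream/i.test(error.message) ? "severed-stream" : "none";

interface ShellRuntimeConfig {
  // Omitted -> default applies (opt-out semantics).
  // null    -> classification explicitly disabled (assumed convention).
  streamFailureClassifier?: StreamFailureClassifier | null;
}

function resolveClassifier(config: ShellRuntimeConfig): StreamFailureClassifier | null {
  return config.streamFailureClassifier === undefined
    ? defaultClassifier
    : config.streamFailureClassifier;
}
```

A genuinely opt-in design would instead return `null` when the field is omitted, so the two readings lead to different behaviour for every product that leaves the field unset.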
tangletools · 2026-06-15T10:42:38Z · trace
Design spec (review-only, no code) for making all 5 agent products the same shell. Drew chose plan-first; this is that plan.
The finding
agent-app already owns most of the shell (chat tool-loop, model-config resolution, capability auth, billing, hub, SSO, side-channel tools, `defineAgentApp` config seam). Five core-loop concerns are still hand-rolled in every product (~3,000 lines copied 5×, already drifting):
- skill registry + `~/.claude/skills` mount
- AgentProfile assembly
- sandbox provisioning (`ensureWorkspaceSandbox`)
- per-turn model resolution
- system-prompt assembly
The key architectural decision (Drew's call)
The shell's input contract is the sandbox SDK's `AgentProfile` type, not a new invented seam. Verified `AgentProfile` already carries every field needed (prompt/model/permissions/tools/mcp/subagents/resources.files/hooks/extensions). So skills+knowledge → `resources.files`, specialists → `subagents`, model hints → `model`. The shell becomes `shell(profile: AgentProfile, runtimeConfig)`. A product collapses to one `defineAgentProfile({...})` + a thin `ShellRuntimeConfig` (~10 lines + config).
Evidence the doc surfaced (drift is real, not theoretical)
- Tax evals grade `packages/api-worker`'s profile, which diverged 58 lines from the deployed `apps/web` copy; evals score a profile users never get.
- Creative's `sandbox/index.ts` is 1474 L and the app-tool layer 2345 L (12.8× gtm) purely because it's behind on the lift, not structurally heavier.
What's in it
Grounded in a 6-repo surface map (agent-app + 5 products). Sections: problem + measured duplication → already-provided vs the gap → target architecture (AgentProfile contract + the ~10-line product shape) → per-concern lift plan in dependency order (skill-mount first, tax monorepo last) → per-product migration notes + outliers (gtm specialists/Intelligence, creative design-canvas, tax monorepo, insurance market-pack corpus) → additive flag-gated rollout → risks/out-of-scope → decisions to confirm.
Review the doc, mark up §8 (decisions to confirm) and the lift order, and I'll turn it into the substrate-release + per-product migration work.