feat(tools): proposal_created carries the proposal body (content) by drewstone · Pull Request #55 · tangle-network/agent-app

drewstone · 2026-06-14T11:18:54Z

Why

AppToolProducedEvent's proposal_created variant carried only {proposalId, title, status} — its artifact variant already carries content. So submit_proposal's description (the deliverable body) was dropped at the framework boundary, forcing every product on agent-app to re-fetch the body from its own DB to grade the proposal's content in produced-state evals.

Pairs with agent-runtime#292 (the proposal_created RuntimeStreamEvent gains the same content field). Together: the proposal body flows in-band end to end — produced event → runtime event → extractProducedState → verifyCompletion — so no consumer reaches into the product database for it.

What

types.ts — AppToolProducedEvent proposal_created gains content?: string.
dispatch.ts — emit content: description ?? undefined at the side-effect site (the body is already in scope from validation).
eval/index.ts — producedFromToolEvents threads content onto the RuntimeEventLike the completion oracle reads.
dispatch.test.ts — pins the in-band flow: dispatch emits content, a title-only proposal omits it, producedFromToolEvents threads it through.
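The four changes above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `ProposalCreatedEvent`, `SubmitProposalArgs`, `emitProposalCreated`, and `toRuntimeEvent` are hypothetical stand-ins, and the local `RuntimeEventLike` is an assumed loose shape rather than the library's actual type.

```typescript
// Simplified stand-in for the proposal_created variant; field names follow the PR text.
interface ProposalCreatedEvent {
  type: 'proposal_created';
  proposalId: string;
  title: string;
  status: string;
  content?: string; // the submit_proposal description, new in this PR
}

// Hypothetical argument shape; description may be missing or null.
interface SubmitProposalArgs {
  title: string;
  description?: string | null;
}

// Assumed loose event shape standing in for RuntimeEventLike.
interface RuntimeEventLike {
  type: string;
  [key: string]: unknown;
}

// Sketch of the emit site: the body is already in scope after validation.
function emitProposalCreated(proposalId: string, args: SubmitProposalArgs): ProposalCreatedEvent {
  return {
    type: 'proposal_created',
    proposalId,
    title: args.title,
    status: 'pending',
    content: args.description ?? undefined, // null and missing both become undefined
  };
}

// Sketch of the bridge: copy content onto the runtime-shaped event.
function toRuntimeEvent(e: ProposalCreatedEvent): RuntimeEventLike {
  return { type: e.type, proposalId: e.proposalId, title: e.title, status: e.status, content: e.content };
}

const withBody = toRuntimeEvent(emitProposalCreated('prop-1', { title: 'Proposal A', description: 'body' }));
const titleOnly = toRuntimeEvent(emitProposalCreated('prop-2', { title: 'Bare filing' }));
console.log(withBody.content, titleOnly.content);
```

Because `content` is optional, consumers that only read `proposalId`, `title`, and `status` keep working unchanged in this sketch.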

Additive + backward-compatible. Full suite green.

The AppToolProducedEvent proposal variant carried only {proposalId, title, status}; its artifact variant already carries content. So submit_proposal's `description` (the deliverable body) was dropped at the framework boundary, forcing every product to re-fetch it from its own DB for produced-state grading.

- types.ts: AppToolProducedEvent proposal_created gains content?: string.
- dispatch.ts: emit content: description ?? undefined at the side-effect site (the body is already in scope).
- eval/index.ts: producedFromToolEvents threads content onto the RuntimeEventLike the completion oracle reads.
- dispatch.test.ts: pins the in-band flow: dispatch emits content, a title-only proposal omits it, and producedFromToolEvents threads it through.

Additive + backward-compatible.

tangletools

✅ Auto-approved PR — `49e91018`

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

_{tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T11:19:01Z}

tangletools

🟡 Value Audit — sound-with-nits


Verdict	sound-with-nits
Concerns	3 (1 medium-concern, 2 weak-concern)
Heuristic	0.0s
Duplication	0.0s
Interrogation	456.1s (2 bridge agents)
Total	456.1s

💰 Value — sound-with-nits

The change threads the proposal body (submit_proposal description) through the app-tool produced event and into the agent-eval RuntimeEventLike shape, mirroring the existing artifact pattern and removing the need for products to re-fetch proposal content from their DB during produced-state grading.

What it does: Adds an optional content?: string field to AppToolProducedEvent's proposal_created variant (src/tools/types.ts:125), populates it from the already-validated description in dispatchAppTool (src/tools/dispatch.ts:74), and forwards it through producedFromToolEvents into the RuntimeEventLike that agent-eval's extractProducedState/verifyCompletion consumes (src/eval/index.ts:47). A ne
Goals it achieves: 1) Keep the assessable proposal deliverable in-band with the produced event so completion oracles can grade proposal content without an out-of-band product DB read. 2) Align proposal_created with the artifact variant, which already carried content. 3) Pair with the corresponding runtime event change (agent-runtime#292) so the body flows end-to-end.
Assessment: Good change. It is small, additive, backward-compatible (content is optional), and consistent with the codebase's substrate-free bridge design and the existing artifact event pattern. The seam stays clean: /tools knows nothing about agent-eval, and /eval only adds the bridge mapping.
Better / existing approach: none for the functional design — this is the right approach. One organizational improvement: the new src/tools/dispatch.test.ts overlaps with existing coverage. tests/tools.test.ts already exercises submit_proposal produced events (lines 82–93, 95–113) and tests/eval.test.ts already exercises producedFromToolEvents (lines 16–21). The in-band flow assertion could have been added to one of

🎯 Usefulness — sound-with-nits

A small, grain-aligned addition that carries the proposal body in-band through the app-shell boundary, but the locked engine peers don't consume it yet and one existing test expectation needs updating.

Integration: The new content field is emitted at the single side-effect site in dispatchAppTool (src/tools/dispatch.ts:69-75), bridged onto RuntimeEventLike in producedFromToolEvents (src/eval/index.ts:44-49), and surfaced through createAgentRuntime/createAppToolRuntimeExecutor via onProduced (src/runtime/agent.ts:96,142; src/tools/runtime.ts:24-25). So the wiring is reachable. The caveat is that
Fit with existing patterns: It mirrors the existing artifact variant, which already carries content (src/tools/types.ts:127). It stays in the app-shell eval-bridge role described in AGENTS.md (engine = peer, app-shell owns the side-channel→RuntimeEventLike mapping) and does not duplicate engine scoring logic. No established pattern competes with it.
Real-world viability: Happy path and title-only edge are covered (src/tools/dispatch.test.ts:28-56). description is coerced to string and emitted as content only when present (src/tools/dispatch.ts:51,74). Error paths are unchanged: a handler throw skips onProduced entirely. The callback is synchronous and receives a fresh object, so concurrency is fine. The realistic limitation is that products on the current lo

💰 Value Audit

🟡 New test file overlaps existing dispatch and bridge coverage [duplication]

src/tools/dispatch.test.ts tests dispatch-produced events and producedFromToolEvents, but tests/tools.test.ts already covers submit_proposal produced events (lines 82–113) and tests/eval.test.ts already covers producedFromToolEvents (lines 16–21). The new in-band flow test is valuable, but it could have extended an existing test file rather than adding another file to the test matrix.

🎯 Usefulness Audit

🟠 Locked engine peers ignore proposal content, so the in-band grading claim is incomplete without a peer bump [integration]

The PR's stated end-to-end path — producedFromToolEvents → extractProducedState → verifyCompletion — needs the engine to read content from proposal_created. The current checkout's @tangle-network/agent-eval 0.83.0 extractProducedState pushes only {id, title, status} (node_modules/@tangle-network/agent-eval/dist/chunk-YGYXHNAQ.js:267-269), and @tangle-network/agent-runtime 0.52.0's proposal_created type has no content field (node_modules/@tangle-network/agent-runtime/dist/ty

🟡 Existing tools test asserts the pre-change proposal_created shape [integration]

tests/tools.test.ts:92 expects [{ type: 'proposal_created', proposalId: 'prop-1', title: 'Proposal A', status: 'pending' }] but now receives content: 'body' as well, causing a failing assertion. Update the expectation to include the new optional field so the suite stays green.
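To make the mismatch concrete, here is a small sketch of why the three-field expectation no longer matches. The `sameShape` helper is a hypothetical stand-in for a test runner's deep-equal on flat objects; the real suite would use its own `expect(...).toEqual(...)`.

```typescript
type Produced = Record<string, unknown>;

// Order-insensitive comparison for flat objects; a stand-in for a runner's deep-equal.
function sameShape(a: Produced, b: Produced): boolean {
  const keysA = Object.keys(a).sort();
  const keysB = Object.keys(b).sort();
  return keysA.length === keysB.length && keysA.every((k, i) => k === keysB[i] && a[k] === b[k]);
}

// What dispatch now emits when the call carries description: 'body'.
const produced: Produced = {
  type: 'proposal_created',
  proposalId: 'prop-1',
  title: 'Proposal A',
  status: 'pending',
  content: 'body',
};

// The pre-change expectation pinned three fields plus the type tag.
const oldExpectation: Produced = {
  type: 'proposal_created',
  proposalId: 'prop-1',
  title: 'Proposal A',
  status: 'pending',
};

// The updated expectation includes the new optional field.
const newExpectation: Produced = { ...oldExpectation, content: 'body' };

const matchesOld = sameShape(produced, oldExpectation);
const matchesNew = sameShape(produced, newExpectation);
console.log({ matchesOld, matchesNew });
```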

What this audit checks

It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.

Pass	What it asks
Heuristic	Vague title? Whitespace-only or cruft-bearing diff? (content signals only)
Duplication	Do added function/class names already exist elsewhere in the repo?
Value Audit	What does it do? What goal does it achieve? Is it good? Better architecture or already-exists?
Usefulness Audit	Does it integrate and fit? Will it hold up in real use and actually get used?

Findings are concerns, not blocks — the human reviewer decides what to do with them.

_{value-audit · 20260614T113317Z}

tangletools · 2026-06-14T11:40:43Z

❌ Needs Work — `49e91018`

Readiness 44/100 · Confidence 70/100 · 7 findings (1 high, 3 medium, 3 low)

	deepseek	glm	aggregate
Readiness	82	44	44
Confidence	70	70	70
Correctness	82	44	44
Security	82	44	44
Testing	82	44	44
Architecture	82	44	44

Full multi-shot audit completed 2/2 planned shots over 4 changed files. Global verifier still owns final merge decision.

Blocking

🔴 HIGH Existing test regression: tests/tools.test.ts:92 toEqual fails on new content field — src/tools/dispatch.ts

dispatch.ts:74 adds content: description ?? undefined to the proposal_created produced event. The existing test at tests/tools.test.ts:82-93 passes description: 'body' (line 87) and asserts expect(produced).toEqual([{ type: 'proposal_created', proposalId: 'prop-1', title: 'Proposal A', status: 'pending' }]) (line 92). The actual now includes content: 'body', causing toEqual to fail: 'expected [{type:'proposal_created', …(4)}] to deeply equal [{type:'proposal_created', …(3)}]'. Verified: this test passes at the base commit and fails at head. The

Other

🟠 MEDIUM Bridge wires proposal content, but engine drops it — feature incomplete — src/eval/index.ts

Line 47 adds content: e.content to the proposal_created mapping, which correctly threads the proposal body from AppToolProducedEvent into RuntimeEventLike. However, agent-eval's extractProducedState (v0.83+, confirmed in 0.85.0) ignores content on proposal events — it only populates ProducedProposal with { id, title, status }, leaving content always undefined. proposalCandidates() then reads p.content ?? '' which is always '' because content is never populated, causing correctness to always resolve as 'not assessed — matched item carries no content'. The bridge is forward-compatible (correct wiring), but the PR as shipped doesn't actually enable proposal body assessment by the

🟠 MEDIUM Proposal content is silently dropped by extractProducedState — feature is functionally dead — src/eval/index.ts

Line 47 adds content: e.content to the proposal_created event mapping. But agent-eval@0.83.0's extractProducedState (compiled source chunk-YGYXHNAQ.js:269) builds proposals as { id: p.proposalId, title: p.title, status: p.status ?? "pending" } — it does not read content from the event. The ProposalEventLike type interface also lacks content, so the field passes typecheck only via the { type: string } catch-all in RuntimeEventLike. Impact: the proposal body never reaches ProducedProposal.content, so verifyCompletion's proposalCandidates ([line 65](

agent-app/src/eval/index.ts

Line 6 in 49e9101

* `createLlmCorrectnessChecker`, and the `CompletionRequirement` / `TaskGold` /

🟠 MEDIUM Proposal content silently dropped by agent-eval's extractProducedState — feature goal unmet — src/tools/types.ts

types.ts:122-124 comment states content is 'carried in-band so produced-state grading reads it from the event, not the product database.' Verified against installed agent-eval source (dist/chunk-YGYXHNAQ.js): extractProducedState builds proposals as { id: p.proposalId, title: p.title, status: p.status ?? 'pending' } — content is not extracted. The ProposalEventLike type (agent-profile-D0PBIWlV.d.ts:309-314) has no content field. Meanwhile ProducedProposal (d.ts:339-344) DOES have content?: string ('Optional persisted body — when present, enables a correctness check') and proposalCandidates reads p.content ?? ''. So the pipeline has a dead leg: agent-app emits content → producedFromToolEvents carries it → extractProducedState drops it → proposalCandidates gets undefined → correctness
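The dead leg described in these findings can be illustrated with a short sketch. It paraphrases the behavior the audit reports and is not the agent-eval source: `extractProposalAsReported`, `extractProposalWithContent`, and `gradedText` are hypothetical names, and the second extractor only shows what forwarding `content` might look like if the engine were updated.

```typescript
interface RuntimeEventLike {
  type: string;
  [key: string]: unknown;
}

interface ProducedProposal {
  id: string;
  title: string;
  status: string;
  content?: string;
}

// Paraphrase of the reported behavior: only id, title, and status are read.
function extractProposalAsReported(e: RuntimeEventLike): ProducedProposal {
  return {
    id: String(e.proposalId),
    title: String(e.title),
    status: (e.status as string | undefined) ?? 'pending',
  };
}

// Hypothetical variant that also forwards content when it is a string.
function extractProposalWithContent(e: RuntimeEventLike): ProducedProposal {
  const base = extractProposalAsReported(e);
  return typeof e.content === 'string' ? { ...base, content: e.content } : base;
}

// Mirrors the `p.content ?? ''` read the findings attribute to proposalCandidates.
const gradedText = (p: ProducedProposal): string => p.content ?? '';

const event: RuntimeEventLike = {
  type: 'proposal_created',
  proposalId: 'prop-1',
  title: 'Proposal A',
  status: 'pending',
  content: 'body',
};

const dropped = gradedText(extractProposalAsReported(event));
const carried = gradedText(extractProposalWithContent(event));
console.log({ dropped, carried });
```

Under these assumptions the bridge's output is correct in both cases; whether the body is graded depends entirely on which extractor the engine uses.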

🟡 LOW No end-to-end test proving proposal content reaches verifyCompletion — src/eval/index.ts

The dispatch.test.ts:59-68 test validates that producedFromToolEvents maps content onto the output object shape, and eval.test.ts:23-37 tests the artifact path through extractProducedState → verifyCompletion. But no test exercises the proposal path end-to-end with a satisfiedBy: 'proposal' requirement and proposal content. Had such a test existed, it would have caught that extractProducedState drops the content (and that proposalCandidates requires status === 'approved', which agent-app never produces). Note also that eval.test.ts:12 uses an event without content, and line 19's toEqual still passes because vitest ignores undefined properties — so the pre-ex

🟡 LOW Test description overstates end-to-end coverage — src/tools/dispatch.test.ts

dispatch.test.ts:60 — describe block 'producedFromToolEvents — body threads to the runtime event shape' and test name 'maps proposal content onto the RuntimeEventLike the completion oracle reads.' The test only verifies that producedFromToolEvents carries content to RuntimeEventLike (lines 65-67). It does not verify extractProducedState or verifyCompletion reads it — and per the installed agent-eval, extractProducedState drops it (see medium finding). The test name implies the completion oracle reads the content, which is currently false. Recommend renaming to accurately reflect scope, e.g. 'maps proposal content onto the RuntimeEventLike shape' without the 'compl

🟡 LOW Test doesn't cover explicit description: null input — src/tools/dispatch.test.ts

The 'omits content for a title-only proposal' test (line 46) sends { type: 'other', title: 'Bare filing' } (missing key) but the SubmitProposalArgs type also allows description: null. The == null guard at dispatch.ts:51 makes both equivalent, so this is not a real bug, but a covered null case would prevent future regressions if the guard shape changes. Suggested: add a test case with { type: 'other', title: 'x', description: null } asserting content is undefined.
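A minimal sketch of the suggested case, assuming the guard behaves as described (a `== null` check followed by string coercion). `contentFrom` is a hypothetical helper standing in for the logic at the emit site, not a function from the repository.

```typescript
// Hypothetical helper: == null catches both null and undefined, anything else is coerced to string.
function contentFrom(description: unknown): string | undefined {
  return description == null ? undefined : String(description);
}

// Inputs a submit_proposal call might carry: missing key, explicit null, and a real body.
const missingKey: { title: string; description?: unknown } = { title: 'Bare filing' };
const explicitNull: { title: string; description?: unknown } = { title: 'x', description: null };
const withBody: { title: string; description?: unknown } = { title: 'x', description: 'body' };

const results = {
  missingKey: contentFrom(missingKey.description),
  explicitNull: contentFrom(explicitNull.description),
  withBody: contentFrom(withBody.description),
};
console.log(results);
```

Pinning the explicit `null` case alongside the missing-key case would flag a regression if the guard were later narrowed to `=== undefined`.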

_{tangletools · 2026-06-14T11:40:41Z · trace}

tangletools

❌ 1 Blocking Finding — `49e91018`

Full multi-shot audit completed 2/2 planned shots over 4 changed files. Global verifier still owns final merge decision.

Full immutable report for this review: trace

Summary comment for this run: full summary

_{tangletools · 2026-06-14T11:40:41Z · immutable trace}

The proposal_created produced event now carries content (the submit_proposal description). Update the two deep-equal assertions that pinned the old 3-field shape: drive a real description through the runtime tool loop (agent.test) and the dispatch executor (tools.test) and assert it threads as content end-to-end.

tangletools

✅ Auto-approved PR — `e054f96c`

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

_{tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T12:55:42Z}

…dy (content) (#57) Releases #55: proposal_created produced event gains content (the submit_proposal description), threaded at dispatch + through producedFromToolEvents. Additive.

tangletools approved these changes Jun 14, 2026

tangletools reviewed Jun 14, 2026

tangletools requested changes Jun 14, 2026

tangletools approved these changes Jun 14, 2026

drewstone merged commit d270659 into main Jun 14, 2026
1 check passed

drewstone mentioned this pull request Jun 14, 2026

chore(release): agent-app 0.15.0 — AppToolProducedEvent body #57

Merged

drewstone merged 2 commits into main from feat/proposal-body-inband