feat(tools): proposal_created carries the proposal body (content) #55
The AppToolProducedEvent proposal variant carried only {proposalId, title,
status}; its artifact variant already carries content. So submit_proposal's
`description` (the deliverable body) was dropped at the framework boundary,
forcing every product to re-fetch it from its own DB for produced-state grading.
- types.ts: AppToolProducedEvent proposal_created gains content?: string.
- dispatch.ts: emit content: description ?? undefined at the side-effect site
(the body is already in scope).
- eval/index.ts: producedFromToolEvents threads content onto the RuntimeEventLike
the completion oracle reads.
- dispatch.test.ts: pins the in-band flow — dispatch emits content, a title-only
proposal omits it, and producedFromToolEvents threads it through.
Additive + backward-compatible.
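A minimal sketch of the shape change listed above, assuming simplified stand-ins for the real types. `ProposalCreatedEvent`, `ProposalArgs`, and `emitProposalCreated` are hypothetical names used for illustration; only the `proposal_created` fields and the `content: description ?? undefined` expression come from this description.

```typescript
// Hypothetical, simplified stand-in for the proposal variant of AppToolProducedEvent.
type ProposalCreatedEvent = {
  type: "proposal_created";
  proposalId: string;
  title: string;
  status: string;
  content?: string; // new: the submit_proposal description, when present
};

// Hypothetical argument shape; the real SubmitProposalArgs may differ.
type ProposalArgs = { title: string; description?: string | null };

// Illustrative helper (not the real dispatch code): builds the produced event
// at the side-effect site, where the description is already in scope.
function emitProposalCreated(proposalId: string, args: ProposalArgs): ProposalCreatedEvent {
  return {
    type: "proposal_created",
    proposalId,
    title: args.title,
    status: "pending",
    // `?? undefined` folds a null description into undefined, so the
    // optional field never carries null.
    content: args.description ?? undefined,
  };
}

const withBody = emitProposalCreated("prop-1", { title: "Proposal A", description: "body" });
const titleOnly = emitProposalCreated("prop-2", { title: "Bare filing" });
```

Because `content` is optional, consumers that only read `proposalId`, `title`, and `status` keep working unchanged, which is what makes the change additive.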
tangletools
left a comment
✅ Auto-approved PR — 49e91018
Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T11:19:01Z
tangletools
left a comment
🟡 Value Audit — sound-with-nits
| Verdict | sound-with-nits |
|---|---|
| Concerns | 3 (1 medium-concern, 2 weak-concern) |
| Heuristic | 0.0s |
| Duplication | 0.0s |
| Interrogation | 456.1s (2 bridge agents) |
| Total | 456.1s |
💰 Value — sound-with-nits
The change threads the proposal body (submit_proposal description) through the app-tool produced event and into the agent-eval RuntimeEventLike shape, mirroring the existing artifact pattern and removing the need for products to re-fetch proposal content from their DB during produced-state grading.
- What it does: Adds an optional `content?: string` field to `AppToolProducedEvent`'s `proposal_created` variant (src/tools/types.ts:125), populates it from the already-validated `description` in `dispatchAppTool` (src/tools/dispatch.ts:74), and forwards it through `producedFromToolEvents` into the `RuntimeEventLike` that agent-eval's `extractProducedState`/`verifyCompletion` consumes (src/eval/index.ts:47). A ne
- Goals it achieves: 1) Keep the assessable proposal deliverable in-band with the produced event so completion oracles can grade proposal content without an out-of-band product DB read. 2) Align `proposal_created` with the `artifact` variant, which already carried `content`. 3) Pair with the corresponding runtime event change (agent-runtime#292) so the body flows end-to-end.
- Assessment: Good change. It is small, additive, backward-compatible (content is optional), and consistent with the codebase's substrate-free bridge design and the existing artifact event pattern. The seam stays clean: `/tools` knows nothing about agent-eval, and `/eval` only adds the bridge mapping.
- Better / existing approach: none for the functional design; this is the right approach. One organizational improvement: the new `src/tools/dispatch.test.ts` overlaps with existing coverage. `tests/tools.test.ts` already exercises `submit_proposal` produced events (lines 82–93, 95–113) and `tests/eval.test.ts` already exercises `producedFromToolEvents` (lines 16–21). The in-band flow assertion could have been added to one of
🎯 Usefulness — sound-with-nits
A small, grain-aligned addition that carries the proposal body in-band through the app-shell boundary, but the locked engine peers don't consume it yet and one existing test expectation needs updating.
- Integration: The new `content` field is emitted at the single side-effect site in `dispatchAppTool` (src/tools/dispatch.ts:69-75), bridged onto `RuntimeEventLike` in `producedFromToolEvents` (src/eval/index.ts:44-49), and surfaced through `createAgentRuntime`/`createAppToolRuntimeExecutor` via `onProduced` (src/runtime/agent.ts:96,142; src/tools/runtime.ts:24-25). So the wiring is reachable. The caveat is that
- Fit with existing patterns: It mirrors the existing `artifact` variant, which already carries `content` (src/tools/types.ts:127). It stays in the app-shell eval-bridge role described in AGENTS.md (engine = peer, app-shell owns the side-channel→RuntimeEventLike mapping) and does not duplicate engine scoring logic. No established pattern competes with it.
- Real-world viability: Happy path and title-only edge are covered (src/tools/dispatch.test.ts:28-56). `description` is coerced to string and emitted as `content` only when present (src/tools/dispatch.ts:51,74). Error paths are unchanged: a handler throw skips `onProduced` entirely. The callback is synchronous and receives a fresh object, so concurrency is fine. The realistic limitation is that products on the current lo
💰 Value Audit
🟡 New test file overlaps existing dispatch and bridge coverage [duplication]
src/tools/dispatch.test.ts tests dispatch-produced events and producedFromToolEvents, but tests/tools.test.ts already covers submit_proposal produced events (lines 82–113) and tests/eval.test.ts already covers producedFromToolEvents (lines 16–21). The new in-band flow test is valuable, but it could have extended an existing test file rather than adding another file to the test matrix.
🎯 Usefulness Audit
🟠 Locked engine peers ignore proposal content, so the in-band grading claim is incomplete without a peer bump [integration]
The PR's stated end-to-end path, `producedFromToolEvents` → `extractProducedState` → `verifyCompletion`, needs the engine to read `content` from `proposal_created`. The current checkout's `@tangle-network/agent-eval` 0.83.0 `extractProducedState` pushes only `{id, title, status}` (node_modules/@tangle-network/agent-eval/dist/chunk-YGYXHNAQ.js:267-269), and `@tangle-network/agent-runtime` 0.52.0's `proposal_created` type has no `content` field (node_modules/@tangle-network/agent-runtime/dist/ty
🟡 Existing tools test asserts the pre-change proposal_created shape [integration]
`tests/tools.test.ts:92` expects `[{ type: 'proposal_created', proposalId: 'prop-1', title: 'Proposal A', status: 'pending' }]` but now receives `content: 'body'` as well, causing a failing assertion. Update the expectation to include the new optional field so the suite stays green.
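To illustrate why the old expectation stops matching, here is a standalone sketch. `flatEqual` is a hypothetical hand-rolled comparison standing in for vitest's `toEqual` (an assumption made so the snippet runs without a test runner); like `toEqual`, it skips keys whose value is `undefined`.

```typescript
type Flat = Record<string, string | undefined>;

// Hypothetical stand-in for a deep-equality matcher on flat objects.
// Keys holding undefined are ignored, as toEqual does.
function flatEqual(a: Flat, b: Flat): boolean {
  const defined = (o: Flat) => Object.keys(o).filter((k) => o[k] !== undefined).sort();
  const ka = defined(a);
  const kb = defined(b);
  return ka.length === kb.length && ka.every((k, i) => k === kb[i] && a[k] === b[k]);
}

const oldExpectation: Flat = {
  type: "proposal_created",
  proposalId: "prop-1",
  title: "Proposal A",
  status: "pending",
};
const newExpectation: Flat = { ...oldExpectation, content: "body" };

// What dispatch now produces when the call passes description: 'body'.
const actual: Flat = { ...oldExpectation, content: "body" };
// What it produces for a title-only proposal.
const titleOnly: Flat = { ...oldExpectation, content: undefined };

const oldPasses = flatEqual(actual, oldExpectation);         // false: extra defined key
const newPasses = flatEqual(actual, newExpectation);         // true
const titleOnlyPasses = flatEqual(titleOnly, oldExpectation); // true: undefined is ignored
```

The last line shows why assertions on events without a description keep passing: only tests that supply a body need their expectation updated.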
What this audit checks
It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.
| Pass | What it asks |
|---|---|
| Heuristic | Vague title? Whitespace-only or cruft-bearing diff? (content signals only) |
| Duplication | Do added function/class names already exist elsewhere in the repo? |
| Value Audit | What does it do? What goal does it achieve? Is it good? Better architecture or already-exists? |
| Usefulness Audit | Does it integrate and fit? Will it hold up in real use and actually get used? |
Findings are concerns, not blocks — the human reviewer decides what to do with them.
❌ Needs Work —
| | deepseek | glm | aggregate |
|---|---|---|---|
| Readiness | 82 | 44 | 44 |
| Confidence | 70 | 70 | 70 |
| Correctness | 82 | 44 | 44 |
| Security | 82 | 44 | 44 |
| Testing | 82 | 44 | 44 |
| Architecture | 82 | 44 | 44 |
Full multi-shot audit completed 2/2 planned shots over 4 changed files. Global verifier still owns final merge decision.
Blocking
🔴 HIGH Existing test regression: tests/tools.test.ts:92 toEqual fails on new content field — src/tools/dispatch.ts
dispatch.ts:74 adds `content: description ?? undefined` to the proposal_created produced event. The existing test at tests/tools.test.ts:82-93 passes `description: 'body'` (line 87) and asserts `expect(produced).toEqual([{ type: 'proposal_created', proposalId: 'prop-1', title: 'Proposal A', status: 'pending' }])` (line 92). The actual now includes `content: 'body'`, causing toEqual to fail: 'expected [{type:'proposal_created', …(4)}] to deeply equal [{type:'proposal_created', …(3)}]'. Verified: this test passes at the base commit and fails at head. The
Other
🟠 MEDIUM Bridge wires proposal content, but engine drops it — feature incomplete — src/eval/index.ts
Line 47 adds content: e.content to the proposal_created mapping, which correctly threads the proposal body from AppToolProducedEvent into RuntimeEventLike. However, agent-eval's extractProducedState (v0.83+, confirmed in 0.85.0) ignores content on proposal events — it only populates ProducedProposal with { id, title, status }, leaving content always undefined. proposalCandidates() then reads p.content ?? '' which is always '' because content is never populated, causing correctness to always resolve as 'not assessed — matched item carries no content'. The bridge is forward-compatible (correct wiring), but the PR as shipped doesn't actually enable proposal body assessment by the
🟠 MEDIUM Proposal content is silently dropped by extractProducedState — feature is functionally dead — src/eval/index.ts
Line 47 adds `content: e.content` to the proposal_created event mapping. But agent-eval@0.83.0's `extractProducedState` (compiled source chunk-YGYXHNAQ.js:269) builds proposals as `{ id: p.proposalId, title: p.title, status: p.status ?? "pending" }`; it does not read `content` from the event. The `ProposalEventLike` type interface also lacks `content`, so the field passes typecheck only via the `{ type: string }` catch-all in `RuntimeEventLike`. Impact: the proposal body never reaches `ProducedProposal.content`, so `verifyCompletion`'s `proposalCandidates` ([line 65](Line 6 in 49e9101
🟠 MEDIUM Proposal content silently dropped by agent-eval's extractProducedState — feature goal unmet — src/tools/types.ts
types.ts:122-124 comment states content is 'carried in-band so produced-state grading reads it from the event, not the product database.' Verified against installed agent-eval source (dist/chunk-YGYXHNAQ.js): extractProducedState builds proposals as `{ id: p.proposalId, title: p.title, status: p.status ?? 'pending' }`; content is not extracted. The ProposalEventLike type (agent-profile-D0PBIWlV.d.ts:309-314) has no content field. Meanwhile ProducedProposal (d.ts:339-344) DOES have `content?: string` ('Optional persisted body — when present, enables a correctness check') and proposalCandidates reads `p.content ?? ''`. So the pipeline has a dead leg: agent-app emits content → producedFromToolEvents carries it → extractProducedState drops it → proposalCandidates gets undefined → correctness
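The dead leg described in these findings can be sketched as follows. All names here are hypothetical simplifications: `extractAsQuoted` reproduces only the object literal the review quotes from the installed build, `extractWithContent` is an illustrative variant and not the library's code, and `candidateText` stands in for the `p.content ?? ''` read attributed to `proposalCandidates`.

```typescript
// Hypothetical simplified shapes; the real agent-eval types are richer.
type ProposalEvent = {
  type: "proposal_created";
  proposalId: string;
  title: string;
  status?: string;
  content?: string;
};
type ProducedProposal = { id: string; title: string; status: string; content?: string };

// Mirrors the mapping quoted in the review: content is never read.
function extractAsQuoted(p: ProposalEvent): ProducedProposal {
  return { id: p.proposalId, title: p.title, status: p.status ?? "pending" };
}

// Illustrative variant that also copies content (an assumption about what a
// peer bump would need to do, not the actual fix).
function extractWithContent(p: ProposalEvent): ProducedProposal {
  return { ...extractAsQuoted(p), content: p.content };
}

// Stand-in for the downstream read of the proposal body.
const candidateText = (p: ProducedProposal): string => p.content ?? "";

const event: ProposalEvent = {
  type: "proposal_created",
  proposalId: "prop-1",
  title: "Proposal A",
  status: "pending",
  content: "body",
};

const dropped = candidateText(extractAsQuoted(event));    // "" : body lost at extraction
const carried = candidateText(extractWithContent(event)); // "body"
```

Under these assumptions the bridge change is necessary but not sufficient: the event carries the body, yet nothing downstream can grade it until the extraction step copies it across.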
🟡 LOW No end-to-end test proving proposal content reaches verifyCompletion — src/eval/index.ts
The dispatch.test.ts:59-68 test validates that `producedFromToolEvents` maps `content` onto the output object shape, and eval.test.ts:23-37 tests the artifact path through `extractProducedState → verifyCompletion`. But no test exercises the proposal path end-to-end with a `satisfiedBy: 'proposal'` requirement and proposal content. Had such a test existed, it would have caught that `extractProducedState` drops the content (and that `proposalCandidates` requires `status === 'approved'`, which agent-app never produces). Note also that eval.test.ts:12 uses an event without `content`, and line 19's `toEqual` still passes because vitest ignores `undefined` properties, so the pre-ex
🟡 LOW Test description overstates end-to-end coverage — src/tools/dispatch.test.ts
dispatch.test.ts:60 — describe block 'producedFromToolEvents — body threads to the runtime event shape' and test name 'maps proposal content onto the RuntimeEventLike the completion oracle reads.' The test only verifies that producedFromToolEvents carries content to RuntimeEventLike (lines 65-67). It does not verify extractProducedState or verifyCompletion reads it — and per the installed agent-eval, extractProducedState drops it (see medium finding). The test name implies the completion oracle reads the content, which is currently false. Recommend renaming to accurately reflect scope, e.g. 'maps proposal content onto the RuntimeEventLike shape' without the 'compl
🟡 LOW Test doesn't cover explicit description: null input — src/tools/dispatch.test.ts
The 'omits content for a title-only proposal' test (line 46) sends `{ type: 'other', title: 'Bare filing' }` (missing key) but the `SubmitProposalArgs` type also allows `description: null`. The `== null` guard at dispatch.ts:51 makes both equivalent, so this is not a real bug, but a covered null case would prevent future regressions if the guard shape changes. Suggested: add a test case with `{ type: 'other', title: 'x', description: null }` asserting `content` is `undefined`.
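A small sketch of the guard behaviour this finding relies on. `ProposalArgs` and `contentFrom` are hypothetical names; the real code at dispatch.ts:51 may be structured differently, and only the `== null` check and the string coercion are taken from the review text.

```typescript
// Hypothetical argument shape modelled on the inputs quoted in the finding.
type ProposalArgs = { type: string; title: string; description?: string | null };

// Hypothetical helper modelling the described guard.
function contentFrom(args: ProposalArgs): string | undefined {
  // Loose `== null` is true for both null and undefined, so a missing key
  // and an explicit null take the same branch.
  return args.description == null ? undefined : String(args.description);
}

const missingKey = contentFrom({ type: "other", title: "Bare filing" });
const explicitNull = contentFrom({ type: "other", title: "x", description: null });
const present = contentFrom({ type: "other", title: "x", description: "body" });

console.log(missingKey === explicitNull); // true: both are undefined
```

If the guard were later rewritten as a strict `=== undefined` check, the explicit-null case would start emitting the string `"null"`, which is the regression a dedicated test case would catch.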
tangletools · 2026-06-14T11:40:41Z · trace
tangletools
left a comment
❌ 1 Blocking Finding — 49e91018
Full multi-shot audit completed 2/2 planned shots over 4 changed files. Global verifier still owns final merge decision.
Full immutable report for this review: trace
Summary comment for this run: full summary
tangletools · 2026-06-14T11:40:41Z · immutable trace
The proposal_created produced event now carries content (the submit_proposal description). Update the two deep-equal assertions that pinned the old 3-field shape: drive a real description through the runtime tool loop (agent.test) and the dispatch executor (tools.test) and assert it threads as content end-to-end.
tangletools
left a comment
✅ Auto-approved PR — e054f96c
Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T12:55:42Z
Why
`AppToolProducedEvent`'s `proposal_created` variant carried only `{proposalId, title, status}`, while its `artifact` variant already carries `content`. So `submit_proposal`'s `description` (the deliverable body) was dropped at the framework boundary, forcing every product on agent-app to re-fetch the body from its own DB to grade the proposal's content in produced-state evals.
Pairs with agent-runtime#292 (the `proposal_created` `RuntimeStreamEvent` gains the same `content` field). Together, the proposal body flows in-band end to end (produced event → runtime event → `extractProducedState` → `verifyCompletion`), so no consumer reaches into the product database for it.
What
- types.ts: `AppToolProducedEvent` `proposal_created` gains `content?: string`.
- dispatch.ts: emit `content: description ?? undefined` at the side-effect site (the body is already in scope from validation).
- eval/index.ts: `producedFromToolEvents` threads `content` onto the `RuntimeEventLike` the completion oracle reads.
- dispatch.test.ts: pins the in-band flow: dispatch emits content, a title-only proposal omits it, and `producedFromToolEvents` threads it through.
Additive + backward-compatible. Full suite green.