feat(tools): proposal_created carries the proposal body (content)#55

Merged
drewstone merged 2 commits into main from feat/proposal-body-inband
Jun 14, 2026

Conversation

@drewstone

Contributor

Why

AppToolProducedEvent's proposal_created variant carried only {proposalId, title, status}, whereas its artifact variant already carries content. As a result, submit_proposal's description (the deliverable body) was dropped at the framework boundary, forcing every product on agent-app to re-fetch the body from its own DB to grade the proposal's content in produced-state evals.

Pairs with agent-runtime#292 (the proposal_created RuntimeStreamEvent gains the same content field). Together, the proposal body flows in-band end to end (produced event → runtime event → extractProducedState → verifyCompletion), so no consumer reaches into the product database for it.

What

  • types.ts: AppToolProducedEvent proposal_created gains content?: string.
  • dispatch.ts: emit content: description ?? undefined at the side-effect site (the body is already in scope from validation).
  • eval/index.ts: producedFromToolEvents threads content onto the RuntimeEventLike the completion oracle reads.
  • dispatch.test.ts: pins the in-band flow (dispatch emits content, a title-only proposal omits it, producedFromToolEvents threads it through).

Additive + backward-compatible. Full suite green.
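
The change listed above can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the repository's code: only the proposal_created fields named in this PR are taken from the source, while the artifact variant's `artifactId` field, the helper `proposalCreatedEvent`, and the mapper `toRuntimeEvents` (standing in for producedFromToolEvents) are hypothetical names.

```typescript
// Simplified shapes. Only the proposal_created fields come from the PR text.
type ProposalCreated = {
  type: 'proposal_created';
  proposalId: string;
  title: string;
  status: string;
  content?: string; // new: the submit_proposal description, carried in-band
};

type ArtifactProduced = {
  type: 'artifact';
  artifactId: string; // assumed field name, for illustration only
  content: string;
};

type AppToolProducedEvent = ProposalCreated | ArtifactProduced;

// Loose runtime-event shape: a type tag plus arbitrary extra fields.
type RuntimeEventLike = { type: string; [key: string]: unknown };

// Hypothetical stand-in for the side-effect site in dispatch.ts.
function proposalCreatedEvent(
  proposalId: string,
  args: { title: string; description?: string | null },
): ProposalCreated {
  return {
    type: 'proposal_created',
    proposalId,
    title: args.title,
    status: 'pending',
    // A missing or null description yields undefined content.
    content: args.description ?? undefined,
  };
}

// Hypothetical stand-in for producedFromToolEvents: copies content across.
function toRuntimeEvents(events: AppToolProducedEvent[]): RuntimeEventLike[] {
  return events.map((e): RuntimeEventLike =>
    e.type === 'proposal_created'
      ? {
          type: e.type,
          proposalId: e.proposalId,
          title: e.title,
          status: e.status,
          content: e.content,
        }
      : { type: e.type, artifactId: e.artifactId, content: e.content },
  );
}
```

Because content is optional, consumers that only read {proposalId, title, status} keep working unchanged, which is what makes the change additive.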

The AppToolProducedEvent proposal variant carried only {proposalId, title,
status}; its artifact variant already carries content. So submit_proposal's
`description` (the deliverable body) was dropped at the framework boundary,
forcing every product to re-fetch it from its own DB for produced-state grading.

- types.ts: AppToolProducedEvent proposal_created gains content?: string.
- dispatch.ts: emit content: description ?? undefined at the side-effect site
  (the body is already in scope).
- eval/index.ts: producedFromToolEvents threads content onto the RuntimeEventLike
  the completion oracle reads.
- dispatch.test.ts: pins the in-band flow — dispatch emits content, a title-only
  proposal omits it, and producedFromToolEvents threads it through.

Additive + backward-compatible.

@tangletools tangletools left a comment

✅ Auto-approved PR — 49e91018

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T11:19:01Z

@tangletools tangletools left a comment

🟡 Value Audit — sound-with-nits

Verdict sound-with-nits
Concerns 3 (1 medium-concern, 2 weak-concern)
Heuristic 0.0s
Duplication 0.0s
Interrogation 456.1s (2 bridge agents)
Total 456.1s

💰 Value — sound-with-nits

The change threads the proposal body (submit_proposal description) through the app-tool produced event and into the agent-eval RuntimeEventLike shape, mirroring the existing artifact pattern and removing the need for products to re-fetch proposal content from their DB during produced-state grading.

  • What it does: Adds an optional content?: string field to AppToolProducedEvent's proposal_created variant (src/tools/types.ts:125), populates it from the already-validated description in dispatchAppTool (src/tools/dispatch.ts:74), and forwards it through producedFromToolEvents into the RuntimeEventLike that agent-eval's extractProducedState/verifyCompletion consumes (src/eval/index.ts:47). A ne
  • Goals it achieves: 1) Keep the assessable proposal deliverable in-band with the produced event so completion oracles can grade proposal content without an out-of-band product DB read. 2) Align proposal_created with the artifact variant, which already carried content. 3) Pair with the corresponding runtime event change (agent-runtime#292) so the body flows end-to-end.
  • Assessment: Good change. It is small, additive, backward-compatible (content is optional), and consistent with the codebase's substrate-free bridge design and the existing artifact event pattern. The seam stays clean: /tools knows nothing about agent-eval, and /eval only adds the bridge mapping.
  • Better / existing approach: none for the functional design — this is the right approach. One organizational improvement: the new src/tools/dispatch.test.ts overlaps with existing coverage. tests/tools.test.ts already exercises submit_proposal produced events (lines 82–93, 95–113) and tests/eval.test.ts already exercises producedFromToolEvents (lines 16–21). The in-band flow assertion could have been added to one of

🎯 Usefulness — sound-with-nits

A small, grain-aligned addition that carries the proposal body in-band through the app-shell boundary, but the locked engine peers don't consume it yet and one existing test expectation needs updating.

  • Integration: The new content field is emitted at the single side-effect site in dispatchAppTool (src/tools/dispatch.ts:69-75), bridged onto RuntimeEventLike in producedFromToolEvents (src/eval/index.ts:44-49), and surfaced through createAgentRuntime/createAppToolRuntimeExecutor via onProduced (src/runtime/agent.ts:96,142; src/tools/runtime.ts:24-25). So the wiring is reachable. The caveat is that
  • Fit with existing patterns: It mirrors the existing artifact variant, which already carries content (src/tools/types.ts:127). It stays in the app-shell eval-bridge role described in AGENTS.md (engine = peer, app-shell owns the side-channel→RuntimeEventLike mapping) and does not duplicate engine scoring logic. No established pattern competes with it.
  • Real-world viability: Happy path and title-only edge are covered (src/tools/dispatch.test.ts:28-56). description is coerced to string and emitted as content only when present (src/tools/dispatch.ts:51,74). Error paths are unchanged: a handler throw skips onProduced entirely. The callback is synchronous and receives a fresh object, so concurrency is fine. The realistic limitation is that products on the current lo

💰 Value Audit

🟡 New test file overlaps existing dispatch and bridge coverage [duplication]

src/tools/dispatch.test.ts tests dispatch-produced events and producedFromToolEvents, but tests/tools.test.ts already covers submit_proposal produced events (lines 82–113) and tests/eval.test.ts already covers producedFromToolEvents (lines 16–21). The new in-band flow test is valuable, but it could have extended an existing test file rather than adding another file to the test matrix.

🎯 Usefulness Audit

🟠 Locked engine peers ignore proposal content, so the in-band grading claim is incomplete without a peer bump [integration]

The PR's stated end-to-end path (producedFromToolEvents → extractProducedState → verifyCompletion) needs the engine to read content from proposal_created. The current checkout's @tangle-network/agent-eval 0.83.0 extractProducedState pushes only {id, title, status} (node_modules/@tangle-network/agent-eval/dist/chunk-YGYXHNAQ.js:267-269), and @tangle-network/agent-runtime 0.52.0's proposal_created type has no content field (node_modules/@tangle-network/agent-runtime/dist/ty

🟡 Existing tools test asserts the pre-change proposal_created shape [integration]

tests/tools.test.ts:92 expects [{ type: 'proposal_created', proposalId: 'prop-1', title: 'Proposal A', status: 'pending' }] but now receives content: 'body' as well, causing a failing assertion. Update the expectation to include the new optional field so the suite stays green.
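
A minimal sketch of why the pinned expectation breaks and what the updated one looks like. The event values are the ones quoted in this finding. The `sameShape` helper is a hypothetical stand-in for the suite's deep-equal matcher and compares via JSON, which is only adequate here because the keys appear in the same order.

```typescript
// The event dispatch now emits for the test input (values quoted in the finding).
const produced = [
  {
    type: 'proposal_created',
    proposalId: 'prop-1',
    title: 'Proposal A',
    status: 'pending',
    content: 'body',
  },
];

// Expectation pinned before the change: three data fields, no content.
const oldExpectation = [
  { type: 'proposal_created', proposalId: 'prop-1', title: 'Proposal A', status: 'pending' },
];

// Updated expectation: the same event plus the new optional field.
const newExpectation = [
  {
    type: 'proposal_created',
    proposalId: 'prop-1',
    title: 'Proposal A',
    status: 'pending',
    content: 'body',
  },
];

// Hypothetical helper: a crude structural comparison for illustration only.
const sameShape = (a: unknown, b: unknown): boolean =>
  JSON.stringify(a) === JSON.stringify(b);

const oldMatches = sameShape(produced, oldExpectation);
const newMatches = sameShape(produced, newExpectation);
```

Here `oldMatches` is false and `newMatches` is true, which mirrors the reported failure and the suggested fix.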


What this audit checks

It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.

| Pass | What it asks |
|---|---|
| Heuristic | Vague title? Whitespace-only or cruft-bearing diff? (content signals only) |
| Duplication | Do added function/class names already exist elsewhere in the repo? |
| Value Audit | What does it do? What goal does it achieve? Is it good? Better architecture or already-exists? |
| Usefulness Audit | Does it integrate and fit? Will it hold up in real use and actually get used? |

Findings are concerns, not blocks — the human reviewer decides what to do with them.

value-audit · 20260614T113317Z

@tangletools

❌ Needs Work — 49e91018

Readiness 44/100 · Confidence 70/100 · 7 findings (1 high, 3 medium, 3 low)

| | deepseek | glm | aggregate |
|---|---|---|---|
| Readiness | 82 | 44 | 44 |
| Confidence | 70 | 70 | 70 |
| Correctness | 82 | 44 | 44 |
| Security | 82 | 44 | 44 |
| Testing | 82 | 44 | 44 |
| Architecture | 82 | 44 | 44 |

Full multi-shot audit completed 2/2 planned shots over 4 changed files. Global verifier still owns final merge decision.

Blocking

🔴 HIGH Existing test regression: tests/tools.test.ts:92 toEqual fails on new content field — src/tools/dispatch.ts

dispatch.ts:74 adds content: description ?? undefined to the proposal_created produced event. The existing test at tests/tools.test.ts:82-93 passes description: 'body' (line 87) and asserts expect(produced).toEqual([{ type: 'proposal_created', proposalId: 'prop-1', title: 'Proposal A', status: 'pending' }]) (line 92). The actual now includes content: 'body', causing toEqual to fail: 'expected [{type:'proposal_created', …(4)}] to deeply equal [{type:'proposal_created', …(3)}]'. Verified: this test passes at the base commit and fails at head. The

Other

🟠 MEDIUM Bridge wires proposal content, but engine drops it — feature incomplete — src/eval/index.ts

Line 47 adds content: e.content to the proposal_created mapping, which correctly threads the proposal body from AppToolProducedEvent into RuntimeEventLike. However, agent-eval's extractProducedState (v0.83+, confirmed in 0.85.0) ignores content on proposal events — it only populates ProducedProposal with { id, title, status }, leaving content always undefined. proposalCandidates() then reads p.content ?? '' which is always '' because content is never populated, causing correctness to always resolve as 'not assessed — matched item carries no content'. The bridge is forward-compatible (correct wiring), but the PR as shipped doesn't actually enable proposal body assessment by the

🟠 MEDIUM Proposal content is silently dropped by extractProducedState — feature is functionally dead — src/eval/index.ts

Line 47 adds content: e.content to the proposal_created event mapping. But agent-eval@0.83.0's extractProducedState (compiled source chunk-YGYXHNAQ.js:269) builds proposals as { id: p.proposalId, title: p.title, status: p.status ?? "pending" } — it does not read content from the event. The ProposalEventLike type interface also lacks content, so the field passes typecheck only via the { type: string } catch-all in RuntimeEventLike. Impact: the proposal body never reaches ProducedProposal.content, so verifyCompletion's proposalCandidates ([line 65](

🟠 MEDIUM Proposal content silently dropped by agent-eval's extractProducedState — feature goal unmet — src/tools/types.ts

types.ts:122-124 comment states content is 'carried in-band so produced-state grading reads it from the event, not the product database.' Verified against installed agent-eval source (dist/chunk-YGYXHNAQ.js): extractProducedState builds proposals as { id: p.proposalId, title: p.title, status: p.status ?? 'pending' } — content is not extracted. The ProposalEventLike type (agent-profile-D0PBIWlV.d.ts:309-314) has no content field. Meanwhile ProducedProposal (d.ts:339-344) DOES have content?: string ('Optional persisted body — when present, enables a correctness check') and proposalCandidates reads p.content ?? ''. So the pipeline has a dead leg: agent-app emits content → producedFromToolEvents carries it → extractProducedState drops it → proposalCandidates gets undefined → correctness
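
The dead leg described in the three medium findings can be sketched as follows. This is a simplified illustration, not the engine's source: the object shape in `extractDroppingContent` follows the expression quoted in the findings, while `extractKeepingContent` is a hypothetical variant showing what reading the field would look like, and `bodyOf` stands in for the `p.content ?? ''` read attributed to proposalCandidates.

```typescript
type ProposalEvent = {
  type: 'proposal_created';
  proposalId: string;
  title: string;
  status?: string;
  content?: string;
};

type ProducedProposal = {
  id: string;
  title: string;
  status: string;
  content?: string;
};

// Mirrors the behavior the findings report: content is never read.
function extractDroppingContent(events: ProposalEvent[]): ProducedProposal[] {
  return events.map((p): ProducedProposal => ({
    id: p.proposalId,
    title: p.title,
    status: p.status ?? 'pending',
  }));
}

// Hypothetical variant that forwards the body. Not the engine's actual code.
function extractKeepingContent(events: ProposalEvent[]): ProducedProposal[] {
  return events.map((p): ProducedProposal => ({
    id: p.proposalId,
    title: p.title,
    status: p.status ?? 'pending',
    content: p.content,
  }));
}

// Stand-in for the downstream read the findings describe.
const bodyOf = (p: ProducedProposal): string => p.content ?? '';

const sampleEvents: ProposalEvent[] = [
  { type: 'proposal_created', proposalId: 'prop-1', title: 'Proposal A', content: 'body' },
];

const droppedBody = bodyOf(extractDroppingContent(sampleEvents)[0]);
const keptBody = bodyOf(extractKeepingContent(sampleEvents)[0]);
```

With the dropping extractor the body read downstream is the empty string, so a correctness check has nothing to assess. With the forwarding variant it is the original description.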

🟡 LOW No end-to-end test proving proposal content reaches verifyCompletion — src/eval/index.ts

The dispatch.test.ts:59-68 test validates that producedFromToolEvents maps content onto the output object shape, and eval.test.ts:23-37 tests the artifact path through extractProducedState → verifyCompletion. But no test exercises the proposal path end-to-end with a satisfiedBy: 'proposal' requirement and proposal content. Had such a test existed, it would have caught that extractProducedState drops the content (and that proposalCandidates requires status === 'approved', which agent-app never produces). Note also that eval.test.ts:12 uses an event without content, and line 19's toEqual still passes because vitest ignores undefined properties — so the pre-ex

🟡 LOW Test description overstates end-to-end coverage — src/tools/dispatch.test.ts

dispatch.test.ts:60 — describe block 'producedFromToolEvents — body threads to the runtime event shape' and test name 'maps proposal content onto the RuntimeEventLike the completion oracle reads.' The test only verifies that producedFromToolEvents carries content to RuntimeEventLike (lines 65-67). It does not verify extractProducedState or verifyCompletion reads it — and per the installed agent-eval, extractProducedState drops it (see medium finding). The test name implies the completion oracle reads the content, which is currently false. Recommend renaming to accurately reflect scope, e.g. 'maps proposal content onto the RuntimeEventLike shape' without the 'compl

🟡 LOW Test doesn't cover explicit description: null input — src/tools/dispatch.test.ts

The 'omits content for a title-only proposal' test (line 46) sends { type: 'other', title: 'Bare filing' } (missing key) but the SubmitProposalArgs type also allows description: null. The == null guard at dispatch.ts:51 makes both equivalent, so this is not a real bug, but a covered null case would prevent future regressions if the guard shape changes. Suggested: add a test case with { type: 'other', title: 'x', description: null } asserting content is undefined.
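
A minimal sketch of the guard this finding refers to, assuming the description is coerced to a string only when present, as the audit describes. The function name `contentFromDescription` is hypothetical. The point is that `== null` is true for both `null` and `undefined`, so a missing key and an explicit `null` produce the same result.

```typescript
// Hypothetical helper mirroring the `== null` guard described in the finding.
function contentFromDescription(description: unknown): string | undefined {
  // Loose equality with null matches both null and undefined.
  if (description == null) return undefined;
  return String(description);
}

// The two inputs the finding compares, plus the happy path.
const missingKeyArgs: { type: string; title: string; description?: string | null } = {
  type: 'other',
  title: 'Bare filing',
};
const explicitNullArgs = { type: 'other', title: 'x', description: null };
const withBodyArgs = { type: 'other', title: 'x', description: 'body' };

const fromMissing = contentFromDescription(missingKeyArgs.description);
const fromNull = contentFromDescription(explicitNullArgs.description);
const fromBody = contentFromDescription(withBodyArgs.description);
```

A test for the explicit `null` input would pin this equivalence, so a later switch to a stricter check such as `=== undefined` would be caught.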


tangletools · 2026-06-14T11:40:41Z · trace

@tangletools tangletools left a comment

❌ 1 Blocking Finding — 49e91018

Full multi-shot audit completed 2/2 planned shots over 4 changed files. Global verifier still owns final merge decision.

Full immutable report for this review: trace

Summary comment for this run: full summary


tangletools · 2026-06-14T11:40:41Z · immutable trace

The proposal_created produced event now carries content (the submit_proposal
description). Update the two deep-equal assertions that pinned the old 3-field
shape: drive a real description through the runtime tool loop (agent.test) and
the dispatch executor (tools.test) and assert it threads as content end-to-end.

@tangletools tangletools left a comment

✅ Auto-approved PR — e054f96c

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-14T12:55:42Z

@drewstone drewstone merged commit d270659 into main Jun 14, 2026
1 check passed
drewstone added a commit that referenced this pull request Jun 14, 2026
…dy (content) (#57)

Releases #55: proposal_created produced event gains content (the submit_proposal
description), threaded at dispatch + through producedFromToolEvents. Additive.

2 participants