docs(eval): clarify eval authoring contracts #1545

Merged
christso merged 2 commits into main from docs-eval-authoring-skill-data on Jun 27, 2026

@christso

Collaborator

Summary

Eval authors now get the corrected contract shape before they draft repo-state self-evals: assertion strings can carry the grading contract without a duplicate criteria field, expected_output is framed as a golden/reference answer, and historical repo-state tests are shown with a real workspace.repos[] commit pin.

The agent-facing skill data now teaches and reviews the same conventions, including the issue #18 pattern of using shorthand assertions instead of redundant named llm-grader blocks.

Related handoff: https://github.com/EntityProcess/agentv-beads/issues/18

Verification

  • git diff --check
  • bunx biome check apps/web/src/content/docs/docs/evaluation/eval-cases.mdx apps/web/src/content/docs/docs/evaluation/eval-files.mdx apps/web/src/content/docs/docs/guides/eval-authoring.mdx skills-data/agentv-bench/references/eval-yaml-spec.md skills-data/agentv-eval-review/SKILL.md skills-data/agentv-eval-writer/SKILL.md
  • No eval YAML files were changed, so eval schema validation was not applicable.

An additional repo-local CLI readback was attempted but was blocked by the local build setup: bun apps/cli/src/cli.ts skills get agentv-eval-writer hit a stale @agentv/core export, and bun run build could not proceed because tsup is not installed in this worktree.


Compound Engineering
GPT_5

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jun 27, 2026

Deploying agentv with Cloudflare Pages

Latest commit: a0c90cc
Status: ✅  Deploy successful!
Preview URL: https://36296e5d.agentv.pages.dev
Branch Preview URL: https://docs-eval-authoring-skill-da.agentv.pages.dev

@christso

Collaborator Author

Coordinator review follow-up:

I found one docs consistency issue before merge: apps/web/src/content/docs/docs/tools/validate.mdx still said that validation always requires criteria, while this PR correctly updates the eval/test docs to say that criteria is conditional: it is no longer mandatory when expected_output, assertions, or turns are present.
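
For readers who want the conditional rule spelled out, here is a minimal TypeScript sketch of one way to read it. It is not agentv's actual validator. Only the field names (criteria, expected_output, assertions, turns) come from this PR. The EvalCase type, the function names, and the choice to treat an empty assertions or turns array as absent are assumptions made for illustration.

```typescript
// Hypothetical shape of an eval test case. Only the field names are taken
// from the PR text; the value types are assumptions.
interface EvalCase {
  criteria?: string;
  expected_output?: unknown;
  assertions?: unknown[];
  turns?: unknown[];
}

// Returns true when `criteria` must be supplied, i.e. when none of the other
// fields that can carry the grading contract is present.
// Assumption: an empty `assertions` or `turns` array counts as absent.
function criteriaRequired(testCase: EvalCase): boolean {
  const hasExpectedOutput = testCase.expected_output !== undefined;
  const hasAssertions = (testCase.assertions?.length ?? 0) > 0;
  const hasTurns = (testCase.turns?.length ?? 0) > 0;
  return !(hasExpectedOutput || hasAssertions || hasTurns);
}

// Returns a list of human-readable problems; an empty list means the case
// passes this one check.
function checkCriteria(testCase: EvalCase): string[] {
  if (criteriaRequired(testCase) && !testCase.criteria?.trim()) {
    return [
      "criteria is required when expected_output, assertions, and turns are all absent",
    ];
  }
  return [];
}
```

Under this reading, a case that has only assertion strings passes without a criteria field, while a completely empty case is flagged.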

Fixed on the PR branch in a0c90cc5 (docs(eval): align validate field requirements). Local verification after the fix:

  • git diff --check
  • bunx biome check on the touched docs and skill-data files

Waiting for refreshed GitHub Actions before merge.

christso merged commit f2d11a4 into main on Jun 27, 2026
8 checks passed
christso deleted the docs-eval-authoring-skill-data branch on Jun 27, 2026, 11:00