docs(results): codify eval result identity contract #1539

Merged: christso merged 2 commits into main from docs-contract-after-result-dir on Jun 27, 2026

Conversation

@christso

Collaborator

Summary

  • add ADR 0009 for the default experiment bucket and eval_path result identity contract
  • update CONCEPTS.md with result source identity and result_dir terminology
  • update running-evals docs with experiment precedence and manifest-driven result paths
  • mark ADR 0006 as superseded for result bucket/path identity by ADR 0009

Dependency

This PR documents the final result_dir field name and should land after the av-504.2 result_dir rename branch/PR. No code changes are included.

Verification

  • bun install
  • bun --filter @agentv/web build
  • git diff --check
  • git diff --cached --check
  • rg -n "artifact_dir" CONCEPTS.md apps/web/src/content/docs/docs/evaluation/running-evals.mdx docs/adr/0009-eval-path-result-identity-and-default-experiment.md (no matches)

Biome note: bunx biome check <changed md/mdx files> reports no files processed because Markdown/MDX are outside the configured Biome scope.

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jun 27, 2026

Deploying agentv with Cloudflare Pages

Latest commit: da48f18
Status: ✅  Deploy successful!
Preview URL: https://56cdeb06.agentv.pages.dev
Branch Preview URL: https://docs-contract-after-result-d.agentv.pages.dev

@christso left a comment

Collaborator Author

Findings:

  • P2 docs terminology drift: apps/web/src/content/docs/docs/evaluation/running-evals.mdx:113 still describes the generated task bundle as living inside a "per-test artifact directory", and apps/web/src/content/docs/docs/evaluation/running-evals.mdx:363 says the wizard remembers the last run's "artifact directory". This PR is codifying result_dir as the opaque per-row allocation and the run directory as the resume/output boundary, while avoiding artifact_dir as the public term. These two public-doc phrases keep teaching workers to use "artifact directory" for exactly the directories the new contract is trying to name as result/run directories. Suggested fix: change line 113 to "per-test result directory" or "result directory", and line 363 to "last run directory".
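The exact-match search for artifact_dir does not catch the prose form "artifact directory", which is how these two phrases slipped through. A minimal sketch of a broader check, assuming a POSIX-style grep with -E support; the `find_term_drift` helper name is hypothetical and not part of the repository:

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration, not a repo script):
# list lines that still use the retired "artifact" wording, whether written
# as the identifier (artifact_dir) or as prose ("artifact directory").
find_term_drift() {
  # -n prints line numbers, -i ignores case, -E enables extended regex.
  # "[ _-]?" allows a space, underscore, hyphen, or no separator.
  grep -n -i -E 'artifact[ _-]?dir(ectory)?' "$@"
}
```

It could be run against the changed docs, for example `find_term_drift apps/web/src/content/docs/docs/evaluation/running-evals.mdx`. grep exits non-zero when nothing matches, so a clean file produces no output.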

Verification performed:

  • Fetched current main and PR head with the provided fallback refs (origin/main, refs/remotes/pr/1539), without rewriting remotes.
  • Reviewed diff against origin/main for CONCEPTS.md, running-evals.mdx, ADR 0006, and ADR 0009.
  • Checked against AGENTS.md, STRATEGY.md, ROADMAP.md, .agents/product-boundary.md, .agents/workflow.md, and .agents/verification.md.
  • Ran git diff --check origin/main..refs/remotes/pr/1539.
  • Ran git grep -n "artifact_dir" refs/remotes/pr/1539 -- CONCEPTS.md apps/web/src/content/docs/docs/evaluation/running-evals.mdx docs/adr/0009-eval-path-result-identity-and-default-experiment.md docs/adr/0006-separate-experiments-from-eval-definitions.md and found no exact artifact_dir matches in the changed docs.
  • Checked GitHub Actions for PR #1539: Build, Typecheck, Lint, Test, Check Links, Validate Evals, Validate Marketplace, and Cloudflare Pages are passing.

Verdict: Ready with a small docs terminology fix; not merging while the finding is unresolved.

@christso

Collaborator Author

Unblocked: PR #1540 / av-504.2 has merged into main as c053a9e after live-provider + real LLM-grader dogfood passed.

Please refresh/rebase this branch against current main, rerun checks if GitHub does not do so automatically, and move out of draft when review/verification is complete.

@christso marked this pull request as ready for review June 27, 2026 09:11
@christso merged commit d8cf3d9 into main Jun 27, 2026
8 checks passed
@christso deleted the docs-contract-after-result-dir branch June 27, 2026 09:11