fix(eval): stop surfacing provider staging logs #1561

Merged: christso merged 1 commit into result-row-id-sidecars from provider-stream-logs on Jun 29, 2026

christso (Collaborator) commented Jun 29, 2026

Summary

Stacked on #1558 (result-row-id-sidecars) because av-dmob depends on the target-aware result layout work.

  • Copy provider-native stream evidence into run-N/transcript-raw.jsonl and then clean up AgentV-owned staging files under /tmp/agentv-provider-streams after a successful artifact write.
  • Stop CLI progress from printing temporary Provider log: staging paths; durable raw evidence remains discoverable through transcript_raw_path.
  • Preserve legacy/imported raw provider log pointer behavior by only cleaning AgentV-owned staging paths.
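The copy-then-clean-up rule in the bullets above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `isUnderRoot` and `persistRawTranscript` are hypothetical names, the ownership test (a path-prefix check against the staging root) is an assumption about how "AgentV-owned" might be decided, and the demo runs in a throwaway temp directory rather than the real /tmp/agentv-provider-streams.

```typescript
import { copyFileSync, existsSync, mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, resolve, sep } from "node:path";

// Hypothetical helper: true only when `candidate` lies strictly inside `root`.
function isUnderRoot(candidate: string, root: string): boolean {
  return resolve(candidate).startsWith(resolve(root) + sep);
}

// Hypothetical helper: copy the staged stream log into the run directory, then
// delete the staged file only when it sits under the AgentV-owned staging root.
function persistRawTranscript(stagedPath: string, runDir: string, stagingRoot: string): string {
  const dest = join(runDir, "transcript-raw.jsonl");
  mkdirSync(runDir, { recursive: true });
  copyFileSync(stagedPath, dest); // throws on failure, so the cleanup below never runs
  if (isUnderRoot(stagedPath, stagingRoot)) {
    rmSync(stagedPath, { force: true });
  }
  return dest;
}

// Usage against a throwaway directory standing in for the real staging root.
const sandbox = mkdtempSync(join(tmpdir(), "staging-sketch-"));
const stagingRoot = join(sandbox, "agentv-provider-streams");
mkdirSync(stagingRoot, { recursive: true });

const owned = join(stagingRoot, "stream.log"); // hypothetical file name
const legacy = join(sandbox, "imported-provider.log"); // hypothetical file name
writeFileSync(owned, "owned stream evidence\n");
writeFileSync(legacy, "legacy stream evidence\n");

const ownedDest = persistRawTranscript(owned, join(sandbox, "run-1"), stagingRoot);
const legacyDest = persistRawTranscript(legacy, join(sandbox, "run-2"), stagingRoot);

console.log(existsSync(ownedDest), existsSync(owned)); // copied; staging file removed
console.log(existsSync(legacyDest), existsSync(legacy)); // copied; legacy file kept
```

Deleting only after `copyFileSync` returns means a failed artifact write leaves the staged evidence in place, and the prefix check keeps legacy or imported log pointers untouched.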

Validation

  • bun test packages/core/test/evaluation/orchestrator.test.ts
  • bun test apps/cli/test/commands/eval/artifact-writer.test.ts
  • bun test apps/cli/test/commands/eval/progress-display.test.ts
  • bun run build
  • Live dogfood: bun apps/cli/src/cli.ts eval run /tmp/agentv-av-dmob-dogfood.eval.yaml --targets /tmp/agentv-av-dmob-targets.yaml --target codex-live --workers 1 passed 1/1 against the local OpenAI-compatible proxy after correcting the temp target/grader base URL to http://127.0.0.1:10531/v1.

Dogfood artifact inspected: .agentv/results/av-dmob-provider-log-dogfood/2026-06-29T04-48-25-104Z/codex-live/index.jsonl points to transcript_raw_path, omits raw_provider_log_path, has no durable provider.log, and transcript-raw.jsonl starts with the Codex SDK stream log header. No /tmp/agentv-provider-streams file remained for that passing timestamp.
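A manual inspection like the one above could be automated with a small check over index.jsonl. The sketch below is an assumption-laden illustration: only the field names `transcript_raw_path` and `raw_provider_log_path` come from this PR, while the `IndexRow` shape, the `checkIndexRows` helper, and the sample values are hypothetical.

```typescript
// Hypothetical row shape: only the two field names come from the PR text.
type IndexRow = { transcript_raw_path?: string; raw_provider_log_path?: string };

// Return a list of problems found in the JSONL text. An empty list means every
// row points at durable evidence and none carries a staging-log pointer.
function checkIndexRows(jsonl: string): string[] {
  const problems: string[] = [];
  const lines = jsonl.split("\n").filter((line) => line.trim() !== "");
  lines.forEach((line, i) => {
    const row = JSON.parse(line) as IndexRow;
    if (typeof row.transcript_raw_path !== "string" || row.transcript_raw_path === "") {
      problems.push(`row ${i + 1}: missing transcript_raw_path`);
    }
    if ("raw_provider_log_path" in row) {
      problems.push(`row ${i + 1}: unexpected raw_provider_log_path`);
    }
  });
  return problems;
}

// Sample rows with made-up values, for illustration only.
const good = JSON.stringify({ transcript_raw_path: "run-1/transcript-raw.jsonl" });
const bad = JSON.stringify({ raw_provider_log_path: "/tmp/agentv-provider-streams/example.log" });

console.log(checkIndexRows(good + "\n")); // no problems reported
console.log(checkIndexRows(good + "\n" + bad + "\n")); // two problems, both on row 2
```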

christso changed the base branch from main to result-row-id-sidecars on June 29, 2026 04:51

cloudflare-workers-and-pages Bot commented Jun 29, 2026

Deploying agentv with Cloudflare Pages

Latest commit: e5571ba
Status: ✅  Deploy successful!
Preview URL: https://52863a1b.agentv.pages.dev
Branch Preview URL: https://provider-stream-logs.agentv.pages.dev

christso force-pushed the provider-stream-logs branch from e93f0c7 to e5571ba on June 29, 2026 04:54
christso marked this pull request as ready for review on June 29, 2026 05:01
christso merged commit f656fd4 into result-row-id-sidecars on Jun 29, 2026
1 check passed
christso deleted the provider-stream-logs branch on June 29, 2026 05:02
christso added a commit that referenced this pull request Jun 29, 2026
* fix(results): isolate row sidecars by target bundle

* fix(dashboard): split run experiment and target columns

* feat(dashboard): add hierarchical category taxonomy

Merge PR #1560 for Bead av-k0e after independent read-only code review reported no actionable issues and verification passed.

* fix(eval): stop surfacing provider staging logs (#1561)