Build Layers 5-8 Safety, Execution, and Governance Rail #100
Merged
…ts (#97)
- port _verification_checks: VERIFIED iff sort_removed AND selected_index_used AND metric_improved, else APPROVED; richer verify-trace summary
- add tests/unit/test_verification_rail.py: VERIFIED conjunction, failure path (APPROVED + FAILED trace + ledger outcome=failed), index-evidenced-in-plan, metric-improved, deterministic ESR wins over a conflicting agent proposal, write-tools phase-gated (diagnose/approve blocked, verify allowed), stale-ticket guard (no apply on hash mismatch)
- align test_orchestrator + test_ledger_store with the three-check rail
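The VERIFIED rule described in this commit can be sketched in a few lines. This is a minimal illustration, not the repository's `_verification_checks`: the `VerificationChecks` dataclass, the `Status` enum, its lowercase string values, and `status_after_verify` are hypothetical names introduced here. Only the three check names and the VERIFIED/APPROVED outcomes come from the commit message.

```python
from dataclasses import dataclass
from enum import Enum


class Status(str, Enum):
    # Assumed string values; the real status representation may differ.
    APPROVED = "approved"
    VERIFIED = "verified"


@dataclass(frozen=True)
class VerificationChecks:
    # Hypothetical container; the field names mirror the three checks in the commit.
    sort_removed: bool
    selected_index_used: bool
    metric_improved: bool

    def all_passed(self) -> bool:
        # Conjunction: a single failing check blocks VERIFIED.
        return self.sort_removed and self.selected_index_used and self.metric_improved


def status_after_verify(checks: VerificationChecks) -> Status:
    """Return VERIFIED iff every check passes; otherwise the run stays APPROVED."""
    return Status.VERIFIED if checks.all_passed() else Status.APPROVED
```

Under these assumptions, a run whose sort was removed but whose selected index does not appear in the after-plan stays APPROVED. That is the case the old sort-removal-only gate would have let through.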
…ed audit approver (#98)
- Ask the agent navigates to /runs/{run_id} for the produced run; real /run when the backend is configured, clearly-labeled local SIMULATION fallback (sim-001, read-only diagnosed pack) when not; never fake-live
- PackSource gains "simulation"; /api/run returns the sim pack with simulated:true and the token untouched on that path; fixtures add FIXTURE_SIMULATION
- audit page sources the approver from approval_gate.approver, no EvidencePack v1 change
- docs/safety-boundary-decisions.md records the deferred v1 items (policy-check records, Decision approver/timestamp) and the optional read-only tools, with the safe interim taken for each
…ent route
Re-running on /runs/{id} where the produced run_id equals the current route made router.push a no-op, leaving the spinner stuck. When the target path matches the current path, refresh in place and clear the running state. Scoped to AgentRunView.onAsk.
…the PR
docs/ is gitignored as internal/local, which silently dropped the safety-boundary decisions log from the branch. Un-ignore just that one governance artifact so the deferred-EvidencePack-v1 record (policy-check records, Decision approver/timestamp, optional read-only tools) ships with the Layers 5-8 PR.
Summary
Research-first pass on Layers 5–8 (#95–98). An assessment found the runtime is ~80% already implemented, so this PR delivers the genuine gaps rather than rebuilding shipped code:
- Ported the three-check verification rail (`_verification_checks` = `sort_removed` AND `selected_index_used` AND `metric_improved`; `VERIFIED` iff all pass, else `APPROVED`) and added lock-in tests. Previously `main` gated `VERIFIED` on sort-removal alone.
- Ask the agent navigates to `/runs/{run_id}` for the produced run; Overview/History reflect it via the existing `GET /packs`.
- When the backend is not configured, the local fallback returns the `sim-001` pack (`simulated: true`) with a distinct SIMULATION badge, never fake "live".
- The audit page sources the approver from `approval_gate.approver` (fixes a blank approver without changing the contract).
- Recorded the deferred items (`docs/safety-boundary-decisions.md`): policy-check records and `Decision.approved_by/at`, both of which require EvidencePack v1 changes (a hard-stop); plus the optional read-only inspection tools (low priority; `history_lookup` needs its own ledger-access review).
Safety boundaries preserved
`RUN_API_TOKEN` stays server-only (now sourced from Secret Manager; see deployment note). Agents stay read-only; the simulation only produces a read-only DIAGNOSED pack and never applies anything. `VERIFIED` is never derived in the client.
Test output
- `uv run pytest`: green (exit 0). Adds `tests/unit/test_verification_rail.py`: VERIFIED conjunction; failure path → `APPROVED` + FAILED verify trace + ledger `outcome=failed`; recommended index evidenced in the after-plan; metric improved; deterministic ESR wins over a conflicting agent proposal; write-tools phase-gated (diagnose/approve blocked, verify allowed); stale-ticket guard (no `apply_index` on hash mismatch). Aligns `test_orchestrator` + `test_ledger_store` with the rail.
- `tsc --noEmit` clean · `npm run lint` clean · `npm run build` success (`/`, `/audit`, `/history`, `/run-review`, `/runs/[run_id]`, `/system-map` dynamic; `/intake` static).
Browser QA (headless, fallback/simulation mode)
- `POST /api/run` (no backend) → `sim-001`, status `diagnosed`, `simulated: true`.
- `/run-review` → navigates to `/runs/sim-001`; source pill renders simulation (`data-source="simulation"`), not live.
- `/runs/sim-001` renders the labeled SIMULATION run (pending approval, full evidence hash, 5-stage indicator, 3-roles/4-tools grouped trace); no horizontal overflow at 1440px.
- `sim-001`; `/audit?run_id=fixture-verified` shows approver `dashboard-operator` sourced from the gate.
Live vs simulation
The deployed site is live-capable (`API_URL` + `RUN_API_TOKEN` configured), so Ask the agent runs a real diagnosis there. The simulation path exists only for local / credential-less contexts and is always labeled SIMULATION; it never claims to be live.
Deployment note
No separate deploy from this branch. The recent `RUN_API_TOKEN` rotation was config-only (moved to Secret Manager, dedicated dashboard SA, verified healthy). Per plan, production gets one combined deploy of `main` after this PR is reviewed and merged, which also makes PR #99's multipage console + verification-failed fix live, instead of two partial deploys.
Not claimed
Agents do not mutate the database and do not mark runs `VERIFIED`. Winner selection and verification are deterministic / controller-only; mutation is backend-only after a hash-bound human approval.
Closes #97
Closes #98
Note: #95 (specialist agents) and #96 (read-only tools) are already implemented and verified in the deployed runtime (see the assessment summary above); close them once you've confirmed.
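For reviewers unfamiliar with the "hash-bound human approval" and the stale-ticket guard (no apply on hash mismatch) mentioned in this PR, here is one way such a check could look. It is a sketch under stated assumptions, not the backend's code: `ApprovalTicket`, `evidence_hash`, `may_apply`, the SHA-256 choice, and the canonical-JSON encoding are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass


def evidence_hash(pack: dict) -> str:
    # Assumed scheme: SHA-256 over canonical JSON (sorted keys, compact separators).
    canonical = json.dumps(pack, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ApprovalTicket:
    # Hypothetical record of what the human approver actually reviewed.
    run_id: str
    approver: str
    evidence_hash: str


def may_apply(ticket: ApprovalTicket, current_pack: dict) -> bool:
    """Allow mutation only if the pack is byte-for-byte what was approved."""
    return ticket.evidence_hash == evidence_hash(current_pack)
```

If the pack changes after approval, the hashes diverge and `may_apply` returns `False`, so a stale ticket cannot authorize an apply against evidence the approver never saw.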