
feat: real two-persona workload to DBRE triage flow #104

Merged
d3v07 merged 3 commits into main from feat/two-persona-workload on Jun 7, 2026

d3v07 commented Jun 7, 2026


Summary

Replaces the single hardcoded demo query with a real two-persona loop:

  • Users (Dev Trivedi, Aakash Singh) sign in and run guided, read-only MongoDB workloads from a console. Each query's real explain evidence is captured and attributed to whoever ran it.
  • A DBRE signs in and triages the actual slowest captured queries — ranked by explain evidence (blocking sort, collection scan, over-scan ratio), not wall-clock — diagnoses one through the existing ESR analyzer, and approves a hash-bound index fix that the controller applies and verifies.
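
The evidence-based ranking described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's code: the `CapturedQuery` fields, the ordering of signals in `evidence_score`, and the definition of the over-scan ratio (documents examined per document returned) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapturedQuery:
    """Hypothetical shape of one captured explain record."""
    query_id: str
    user: str
    has_blocking_sort: bool   # plan contains an in-memory sort stage
    is_collection_scan: bool  # plan scans the whole collection
    docs_examined: int
    docs_returned: int


def over_scan_ratio(q: CapturedQuery) -> float:
    # Documents examined per document returned; guard against zero results.
    return q.docs_examined / max(q.docs_returned, 1)


def evidence_score(q: CapturedQuery) -> tuple:
    # Sort key: structural problems first, then the over-scan ratio.
    # Wall-clock time is deliberately not part of the key.
    return (q.has_blocking_sort, q.is_collection_scan, over_scan_ratio(q))


def rank_slowest(queries, limit=10):
    # Worst evidence first, bounded to `limit` entries.
    return sorted(queries, key=evidence_score, reverse=True)[:limit]
```

Ranking on plan evidence rather than elapsed time keeps the queue stable across runs, since a cold cache or a noisy neighbour changes wall-clock time but not the plan shape.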

The queries the DBRE fixes are the ones users really ran. EvidencePack v1 is unchanged, agents/tools stay read-only, and index mutation happens only after a matching hash-bound approval — with the approver identity derived from the verified session, never the browser.
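
One way to read "hash-bound approval" is that the approval records a digest of the exact index proposal, and the controller refuses to apply anything whose digest differs. The sketch below assumes that reading. The canonical JSON format, the function names, and the `orders` collection in the usage note are hypothetical and not taken from the codebase.

```python
import hashlib
import hmac
import json


def proposal_hash(collection: str, index_keys: list) -> str:
    """SHA-256 over a canonical encoding of the proposed index (assumed format)."""
    canonical = json.dumps(
        {"collection": collection, "keys": index_keys},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def approval_matches(approved_hash: str, collection: str, index_keys: list) -> bool:
    # Constant-time comparison; the index would only be created when this is True.
    return hmac.compare_digest(approved_hash, proposal_hash(collection, index_keys))
```

With keys given as an ordered list such as `[["status", 1], ["created_at", -1]]`, reordering the fields changes the digest, so an approval for one compound index cannot be replayed for another on the same `orders` collection.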

Key changes

  • Auth & roles: seeded role-based login (stdlib scrypt + HS256 httpOnly session cookie); middleware confines the user persona to the workload console and the DBRE to the triage + review planes. The read API re-verifies the session bearer on every data call.
  • Workload capture: guided, allowlist-validated, read-only query builder; real explain capture to a new query_log collection (kept separate from the diagnose-run ledger).
  • Triage queue: evidence-ranked GET /workload/slow-queries, attributed per user, bounded + sorted Mongo-side.
  • Diagnose: POST /run accepts a captured_query_id and diagnoses that query's natural plan (no forced index hint); the captured filter is re-validated before it reaches the backend. Approver is taken from the verified DBRE session.
  • Seed: seed/seed_workload.py resets a baseline index set so trap shapes stay genuinely slow and the ESR fix verifies as a real improvement (re-run reset between demos). seed/seed_users.py seeds the three accounts.
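
For readers unfamiliar with the auth building blocks named above, here is a minimal stdlib-only sketch of scrypt password hashing plus an HS256-signed session token. It is not the code in controller/auth.py or api/auth.py: the function names, claim names, and scrypt parameters are assumptions, and a real deployment would also set the cookie attributes (httpOnly and so on) at the HTTP layer.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def hash_password(password: str, salt: bytes) -> bytes:
    # scrypt cost parameters here are illustrative, not the project's settings.
    return hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)


def sign_session(claims: dict, secret: bytes) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode())
    payload = _b64url(json.dumps(claims, separators=(",", ":")).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(sig)}"


def verify_session(token: str, secret: bytes, now=None):
    """Return the claims if signature and expiry check out, else None."""
    try:
        header, payload, sig = token.split(".")
        expected = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig)):
            return None
        claims = json.loads(_b64url_decode(payload))
    except ValueError:  # bad shape, bad base64, bad JSON, or non-ASCII input
        return None
    if not isinstance(claims, dict):
        return None
    if claims.get("exp", 0) < (time.time() if now is None else now):
        return None
    return claims
```

Because the signature is an HMAC over the token bytes, two services can only accept each other's sessions when they hold exactly the same secret bytes; a token signed with one secret fails verification under any other.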

Deploy notes (read before deploying)

  • SESSION_SECRET must be byte-identical on BOTH services (read API + dashboard). Without it on the dashboard, middleware bounces every request to /login even after a successful sign-in — login silently fails. Both deploy paths now require it (deploy/deploy_cloudrun.sh + dashboard/DEPLOY.md).
  • The read API also fails closed at startup if SESSION_SECRET is unset in production.
  • Seed the accounts (seed_users.py) and the workload baseline (seed_workload.py verify) against the cluster before the demo.
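
The fail-closed startup behaviour could look something like the sketch below. Only `SESSION_SECRET` comes from this PR; the `APP_ENV` variable, the exception name, and the development fallback value are hypothetical.

```python
import os


class MissingSessionSecret(RuntimeError):
    """Raised at startup when production has no session secret."""


def load_session_secret(environ=os.environ) -> bytes:
    secret = environ.get("SESSION_SECRET", "")
    # APP_ENV is an assumed name for whatever flag marks a production deployment.
    in_production = environ.get("APP_ENV", "development") == "production"
    if not secret:
        if in_production:
            # Fail closed: refuse to start rather than sign sessions with a default.
            raise MissingSessionSecret("SESSION_SECRET must be set in production")
        return b"dev-only-insecure-secret"
    return secret.encode("utf-8")
```

Calling this once during service startup turns a missing secret into an immediate crash at deploy time instead of a login loop discovered later.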

Post-deploy smoke test

The local E2E ran the deterministic controller. Add one check at the deploy gate: diagnose a captured query through the Agent Engine path (production runs the three split Vertex roles with the new current_index=None) and confirm the DIAGNOSED → approve → VERIFIED loop.

Test plan / verification

  • Unit + contract suite green; 100% coverage on controller/auth.py + api/auth.py (auth is security-critical).
  • Live integration (gated on a Mongo connection): the preset explain contract (6 trap shapes produce a blocking sort, 3 healthy shapes are served in index order), the capture→queue service roundtrip, and a full captured-query diagnose→apply→verify with a measured improvement.
  • Full browser E2E: signed in as Dev → ran workloads → signed in as the DBRE → evidence-ranked queue → Diagnose → Approve → VERIFIED (before 100,073 docs examined → after 25; approver came from the session, not a client-supplied value).
  • A mappingproxy defect in apply+verify (only reachable through the live engine, bypassed by the orchestrator unit tests) was caught by the browser E2E and fixed with a regression test.

d3v07 added 3 commits June 7, 2026 17:33
Users run guided, read-only MongoDB workloads from a console; each query's real explain evidence is captured and attributed. The DBRE triages the actual slowest captured queries (ranked by evidence, not wall-clock), diagnoses one through the existing ESR analyzer, and approves a hash-bound index fix the controller applies and verifies - replacing the hardcoded demo query as the primary path.

Seeded role-based login (scrypt + HS256 httpOnly session); user vs DBRE planes. Guided query builder + evidence capture to query_log; evidence-ranked queue. POST /run accepts a captured query (natural-plan diagnosis); approver derives from the verified session. Workload-baseline seed keeps trap shapes slow so the ESR fix verifies. EvidencePack v1 unchanged; agents read-only; mutation backend-only after approval.
…age range

deploy/deploy_cloudrun.sh: write RUN_API_TOKEN + SESSION_SECRET into Secret Manager and reference via --set-secrets (no plaintext in the Cloud Run env-var config); grant the SA on all three secrets.

deploy/cloudrun.md: smoke tests obtain a DBRE bearer and diagnose a captured query; note the approver comes from the verified session.

controller/workload.py: assert_safe_query validates customer.age range bounds are ints within 16..75. Format tests/unit/test_auth.py.
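
A range check of the kind described for customer.age might be sketched as below. This is not the body of `assert_safe_query`: the helper name `check_age_range`, the allowed operator set, and the shape of the bounds dict are assumptions for illustration.

```python
AGE_MIN, AGE_MAX = 16, 75
RANGE_OPS = {"$gte", "$gt", "$lte", "$lt"}  # assumed allowlist of range operators


def check_age_range(bounds: dict) -> None:
    """Raise ValueError unless every bound is an int within 16..75."""
    if not bounds:
        raise ValueError("age range needs at least one bound")
    for op, value in bounds.items():
        if op not in RANGE_OPS:
            raise ValueError(f"operator not allowed: {op!r}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"bound must be an int: {value!r}")
        if not AGE_MIN <= value <= AGE_MAX:
            raise ValueError(f"bound out of range {AGE_MIN}..{AGE_MAX}: {value!r}")
```

For example, `check_age_range({"$gte": 18, "$lte": 30})` passes silently, while a string bound, a bound of 15, or a `$where` operator raises `ValueError`.
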
d3v07 merged commit 2de7b7c into main on Jun 7, 2026
4 checks passed
d3v07 deleted the feat/two-persona-workload branch on June 7, 2026 23:45