Multi-Agent Workflow — an operator hub for AI-driven business + personal task automation. Severity-routed LLM analysis on a streaming event pipeline, backed by a TOTP-auth'd web control panel.
Canonical specs live in
CLAUDE.md(rules for AI sessions editing the repo) andARCHITECTURE.md(the target architecture, 8 rendered Mermaid diagrams, multi-tenant schema, Day 1/2/3 build sequence). This README is the orientation for humans landing on the repo for the first time. Project labels ([ONLINE_MAW]onMaster,[LOCAL-MAW]onmain) are defined inPROJECTS.md.
Delta-9 MAW ingests events from a transactional source (today: Loyverse POS), classifies them by severity (P0–P3), and routes each to the right analysis tier — cheapest model first, paid model as last resort. The operator interacts via a single web control panel (TOTP login + remote PTY) that observes the pipeline rather than driving it. The design is sized for a single-node K3s homelab today and a hybrid GCP deployment once volume justifies it.
The system is governed by 7 Unbreakable Laws (see CLAUDE.md),
chief among them token thrift (Law #1), read-only-by-default operator
surface (Law #6), and cross-agent coordination via Redis ProjectState
(Law #7).
Two agent stacks under one observer panel — Delta-9 MAW runs a
severity-routed anomaly pipeline (core/) and a Brain/Hands SDK
pipeline (src/agents/) side by side; both are observed read-only
by delta9_panel/. See the workflow diagram in
ARCHITECTURE.md §2.0a.
- Seven Unbreakable Laws govern every change (
CLAUDE.md). Token thrift, Redis-as-state-transport, role partitioning, unattended automation, engine accuracy, read-only operator surface, and cross-agent coordination — enforced on every PR byscripts/audit.sh(laws-drift check). - Anomaly pipeline routes by severity, paid model as last resort
(
core/orchestrator.py,_dispatch_result()+ the severity switch in the consume loop). P0 → Claude Sonnet viaengine.js; P1 → Gemini Flash → Hermes/Ollama → Haiku cascade; P2/P3 → log only (zero LLM tokens). - Brain/Hands SDK has a single LLM-call interception point
(
src/agents/agents/base.py).BaseAgent._call()auto-attachescache_control: ephemeraland records anAgentCalltoMetricsTracker— no agent calls Anthropic directly. - Idempotency + dedup happen before any LLM tokens are spent.
IdempotencyStore(SHA-256 in-memory or Redis backend) andcore/ingest_v2.py'sSETNX delta9:seen:{txn_id}both reject duplicates with zero model spend. - Context retrieval is pluggable
(
src/agents/context/layer.py).ContextLayeraccepts a constructorembedder=callable with a bounded LRU query cache and a Redis-backedt:{tenant}:emb:7-day cache. - Per-agent metrics + cache-hit ratio surface to the panel.
MetricsTracker.summary()returns per-agent latency, tokens in/out,cache_read/cache_write, success rate — the panel's Pipeline Activity view consumes it. - Caching is layered (§8A): prompt-cache → Redis → LRU/ETag → Cloudflare. Anthropic ephemeral cache on every system block, Redis namespace cache for embeddings + monthly budget counters, LRU+ETag on hot panel routes, Cloudflare Cache Rules + R2 at the edge.
- Multi-tenant from day 1 with Postgres RLS
(
infra/sql/004_rls.sql). Every tenant-scoped table hastenant_id UUID NOT NULL+tenant_isolationpolicy readingcurrent_setting('app.current_tenant'). - Operator panel is read-only by default (6th Law)
(
delta9_panel/). FastAPI + Vite/React + MUI v6 + Tailwind v4, TOTP login; onlyrole=admin+ an authenticated PTY session can mutate host state. - Both stacks share one observation surface. Severity events
stream to
progress.md+ Sheets + Chat webhook + panel SSE; SDK runs land inagent_runs+audit_log; the panel renders both timelines side by side.
Delta-9 MAW's long-term goal is an Ultimate Personal & Business Assistant (UPBA): a unified system where the owner interacts naturally across phone, voice, web, and IDE, and the system routes each request to the right AI agent, workspace, and cloud service. The aim is to give back time — the system handles routine business analysis and personal automation while the owner does the high-value work.
Practically, that means:
- Catching anomalies in business data (POS receipts → fraud / refund / chargeback signals) before they require manual intervention.
- Producing structured operator reports under
/relib/<category>/that the owner can scan in a few minutes per day. - Costing as little per month as possible while preserving accuracy — hence Law #1 (token thrift) and the P0/P1/P2/P3 severity tiers.
| Subsystem | Path | Stack | Status |
|---|---|---|---|
| Anomaly pipeline | core/ |
Python + Node.js, Redis | shipped |
| Multi-agent SDK | src/agents/ |
Python + Anthropic SDK | shipped (Brain/Hands) |
| LLM-as-judge eval | evaluation/ |
Python + Anthropic SDK | shipped |
| Data definition | infra/sql/ |
Postgres 16 + SQLite | shipped (17 DDL files) |
| Legacy operator dashboard | web/ |
Express + EJS + SQLite | shipped — retiring |
| Operator Control Panel | delta9_panel/ |
FastAPI + Vite/React/TS + MUI v6 + Tailwind v4 | Day 1 shipped (PR #15); Tailwind palette layered on (PR #17) |
| Legacy dashboard polish | web/ |
Express + EJS + SQLite + SSE | Pipeline / Reports / Audit views shipped (PR #12); now permanent LAN companion to the panel (PR #19) |
| Pushed-event ingest | services/webhook-receiver/ |
Node.js 20 + Express + ioredis | HMAC-validated webhook receiver → Redis Streams (PR #19) |
| K3s deploy | deploy/ |
YAML + systemd | planned |
| GKE / Cloud Run deploy | k8s/, scripts/setup-gcp.sh |
YAML + bash | shipped |
| Member storefront (orthogonal) | backend/ + frontend/ |
FastAPI + Next.js | live on d9bkk.com — KYC, Wallet, PromptPay, TCG, marketplace, pawn, gasha |
| Pre-merge audit | scripts/audit.sh |
bash | shipped (0 LLM tokens) |
The eight target-architecture diagrams live in ARCHITECTURE.md: subsystem
map, event lifecycle, Brain/Hands SDK flow, TOTP+PTY auth, severity
routing, multi-tenant ER, outbox→DLQ flow, CI/CD pipeline.
| Component | Version | Purpose |
|---|---|---|
| Python | 3.11+ | All Python services (core/, src/agents/, backend/, evaluation/, delta9_panel/) |
| Node.js | 20+ | core/engine.js, web/ Express, frontend/ Vite/Next.js, delta9_panel/web/ Vite |
| Redis | 7+ | Dedup, work queue, pub/sub, embedding cache, idempotency, budget counters |
| SQLite | 3.40+ | Day-1 panel panel.db, legacy web/db/delta9.db, shop demo shop.db |
| PostgreSQL | 16 + pgvector | Day-30+ panel system-of-record; partitioning + RLS |
| Docker | 24+ | Local Ollama, optional Postgres for SQL validation |
| MUI | v6 | Operator panel component library |
| Tailwind | v4 | CSS-first design tokens layered alongside MUI (skip preflight; share palette via @theme) |
| Provider | Model | Used by | Severity tier |
|---|---|---|---|
| Anthropic | claude-sonnet-4-6 |
core/engine.js |
P0 (critical) |
| Anthropic | claude-haiku-4-5-20251001 |
core/engine.js |
P1 fallback |
gemini-2.0-flash |
core/gemini_engine.py |
P1 primary (cheapest) | |
| Ollama | nous-hermes2 (local) |
core/hermes.py |
P1 first fallback (free) |
| Anthropic | claude-sonnet-4-6 (Brain) + claude-haiku-4-5-20251001 (Hands) |
src/agents/ |
SDK pipeline |
All Anthropic calls use cache_control: {"type": "ephemeral"} on the
system prompt (Law #1). Per-model monthly budget caps are enforced
through the t:{tenant}:budget:{model}:{yyyy-mm} Redis counter.
| Component | Service | Replaces |
|---|---|---|
| Redis broker | Cloud Memorystore | self-hosted Redis PVC |
| Raw / staged / progress storage | Cloud Filestore (NFS) | self-hosted PVC |
| Engine | Cloud Run | GKE-hosted engine deployment |
| Object store | Google Cloud Storage | filesystem fallback |
| Secrets | Secret Manager | k8s/secrets.yaml |
| Observability (planned) | Cloud Logging + Cloud Trace + Managed Prometheus | none today |
| Image registry (planned) | Artifact Registry | none today |
Provisioned via scripts/setup-gcp.sh. See ARCHITECTURE.md §14 for
the Day-30+ migration items (Terraform IaC, KEDA, ESO, OpenTelemetry).
| Component | Service | Notes |
|---|---|---|
| Reverse proxy + TLS | Traefik v3 | Already bundled with K3s |
| Network | Tailscale | Tailnet hostname delta9.<tailnet>.ts.net |
| Process supervisor (Day 1) | systemd | PTY needs /dev/pts, host bash |
| Process supervisor (Day 30+) | K3s Deployment |
Pod-based PTY needs careful security review |
| Cert resolver | Tailscale certs / Let's Encrypt | Per-environment |
| Tool | Purpose |
|---|---|
| pytest 8+ | Python tests (698 collected today; pytest --collect-only -q) |
| flake8 | Lint (E9,F63,F7,F82 fatal; rest warn) |
scripts/audit.sh |
Pre-merge audit — 10 deterministic checks, 0 LLM tokens |
| Alembic (panel-skeleton phase) | SQL migrations generated from infra/sql/ |
| PM2 | Local dev process manager (ecosystem.config.cjs) |
| GitHub Actions | CI: flake8 + pytest on every PR |
| Mermaid CLI (optional) | Validate Mermaid blocks in ARCHITECTURE.md |
| Docker Compose | deploy/optional/docker-compose.hermes.yml for local Ollama |
- Severity-routed anomaly pipeline —
core/orchestrator.pyconsumes a Redis-backed work queue and dispatches by severity. P0 → Claude Sonnet; P1 → Gemini Flash → Hermes/Ollama → Claude Haiku cascade; P2/P3 → log only (no LLM). All paid calls cache-controlled. - Brain/Hands SDK pipeline (
src/agents/) —BaseAgent._callis the single LLM-call interception point; auto-attachescache_controland recordsMetricsTrackerentries.IdempotencyStoreandContextLayerare pluggable for the prototype. - Operator Control Panel — Day 1 (
delta9_panel/) — FastAPI + Vite/React/TS + MUI v6 + Tailwind v4. TOTP login (pyotp + Fernet encryption-at-rest), JWT cookies, mobile-responsive AppShell with a collapsingDrawer, Business tab (Pipeline / Reports / Audit log placeholders wired), Personal tab (PTY placeholder — real PTY lands in Day 3). One-time-login test admin (scripts/seed_test_admin.sh) for local dev; SQLite trigger burns the password after the first successful login. - Caching architecture §8A (5 layers) — Anthropic prompt cache
metrics (
cache_read/cache_writerecorded on everyAgentCall), Redis namespace cache for embeddings + monthly budget counters (t:{tenant}:emb:,t:{tenant}:budget:{model}:{yyyy-mm}) with in-memory LRU fallback whenREDIS_URLunset, app-level LRU + ETag on hot panel routes, Cloudflare Cache Rules + R2 backup scaffolds underdeploy/cloudflare/. MCP server stub atdelta9_panel/mcp/. - Legacy dashboard polish (
web/) — Pipeline view, Reports browser over/relib/, Audit log search, all live-updated via SSE.web/services/shared data layer;web/public/css/polish.cssadds ~260 lines of additive CSS keeping the existing EJS templates intact. - Multi-tenant schema source-of-truth (
infra/sql/) — 17 hand-written DDL files (Postgres 16 + SQLite Day-1) coveringtenants,users,events(partitioned),events_outbox,agent_runs/notes,documents(pgvector),reports,token_budgets, POS balance-cache + txn, webhook deliveries, ops queue, and loyalty geofence. Row-Level Security on every tenant-scoped table; RLS isolation tested against real Postgres. - GCP-managed infrastructure — Memorystore, Filestore, Cloud Run,
GCS lifecycle, Secret Manager + Workload Identity. One-shot via
scripts/setup-gcp.sh. - Legacy operator dashboards —
web/Express + EJS + SQLite (port 4000) serves live event push fromcore/orchestrator.pyto a server-rendered HTML view;frontend/Next.js (port 3000) consumes the Express/api/*for the modern dashboard. - Member storefront (
backend/FastAPI +frontend/Next.js, live on d9bkk.com) — grew from a shop demo into the production member surface: products + cart + orders + admin, plus a KYC gate (ID upload + staff review) that locks payment/sell/auction/pawn behind verification, a D9-Wallet hub (/account) with top-up, PromptPay QR payments with automatic slip verification, the TCG Lottery Pack (anti pack-search in-store draw), and marketplace / trades / pawn / gasha / tournaments. PromptPay is configured once in/admin/economy(site_settings.promptpay_id+ static QR) and shared by wallet top-up and the lottery; the landing page is bilingual EN/TH. Orthogonal to the anomaly pipeline; shares no state (nocore/↔backend/cross-import). SeeCLAUDE.md→ "Storefront". Backend tests need Python 3.11 (.venv311). - LLM-as-judge evaluator (
evaluation/judge.py) — 5-dimension rubric scoring ofsrc/agents/outputs. - Pre-merge audit (
scripts/audit.sh) — 10 deterministic checks (banned word, stale imports, env-var dedup, cross-references, pytest, gitignore, secrets, TODOs, branch sanity, SQL DDL syntax). 0 LLM tokens, ~5s default / ~12s with--validate-sql. - Tunneling helper (
scripts/tunnel.sh) — auto-detects Tailscale / cloudflared / ngrok / SSH-reverse for one-command public URL exposure during development. - 8 Mermaid workflow diagrams in
ARCHITECTURE.mdrendering natively on GitHub, alongside the original ASCII versions.
See ROADMAP.md for the full milestone list (v1.4.0
operator surfaces + secure PTY, v1.5.0 source-agnostic adapters,
v1.6.0 Postgres system of record, v2.0.0 hybrid runtime) and explicit
non-goals.
# Install Python deps
pip install -r requirements/test.txt -r requirements/orchestrator.txt
# Run the test suite (Python — ~700 tests)
python -m pytest --tb=short -q
# Run the Node engine tests
node --test tests/engine.test.js tests/engine.advanced.test.js tests/schemas.test.js
# Run the pre-merge audit (zero LLM tokens)
bash scripts/audit.sh # ~12s, full audit incl. DDL validation
bash scripts/audit.sh --no-validate-sql # ~5s, skip DDL (tight dev loops only)
# Local Redis (anomaly pipeline)
docker run -d -p 6379:6379 redis:alpine
# Run the orchestrator + engine locally
ANTHROPIC_API_KEY=… node core/engine.js # port 3000
REDIS_HOST=localhost ENGINE_URL=http://localhost:3000/analyze python core/orchestrator.py
# Run the Express dashboard + Next.js frontend together (PM2)
pm2 start ecosystem.config.cjs # web on :4000, frontend on :3000
# Run the operator panel (Day 1) — FastAPI on :8080, Vite dev on :5173
cd delta9_panel && pip install -e . && uvicorn delta9_panel.main:app --host 127.0.0.1 --port 8080
cd delta9_panel/web && npm install && npm run dev
# First-admin TOTP enrollment (writes QR to console)
python delta9_panel/scripts/bootstrap_admin.py
# Optional: one-time-login test admin for the legacy web/ dashboard
bash scripts/seed_test_admin.sh # creates tester/test; password burns on first login
# Optional: expose any local server publicly during a debugging session
bash scripts/tunnel.sh 4000 # auto-picks tailscale / cloudflared / ngrok / sshFor the full command reference see CLAUDE.md → "Build, Test, and Run
Commands".
Releases follow a loose SemVer cadence. The full release history —
with PR-by-PR notes, env-var changes, and test-suite growth — lives in
CHANGELOG.md. The list below is the at-a-glance
table; click through for detail.
| Version | State | Highlights |
|---|---|---|
payments-unify |
2026-06-19 | Unified PromptPay config — one backend site_settings.promptpay_id (number + static QR) set in /admin/economy, read backend-first by the TCG Lottery Pack (no more two-place drift). Static QR now renders via the /api/promptpay-qr-image proxy (render fix). EasySlip slip auto-credit for wallet top-ups; referral-milestone delete; bilingual EN/TH landing page. |
storefront |
2026-06-16 | d9bkk.com member surface — KYC gate (ID + staff review) over payment/sell/auction/pawn, D9-Wallet hub, operator-uploadable PromptPay QR + slip verification, TCG Lottery Pack (anti pack-search), marketplace/pawn/gasha. ADR-003 TCG pricing engine. |
v1.4.0 |
2026-06-03 | MAW v1.0 Release — Tiered Audit Gateway (ADR-002), LINE escalation valve, cross-service Audit Bus, manual agent trigger UI. |
v1.3.0 |
2026-05-30 | Operator panel Days 1–3, caching architecture §8A, webhook receiver, Console tab, cross-agent ProjectState. |
v1.2.0 |
Unreleased — PR stack #9/#10/#11 | Canonical ARCHITECTURE.md, multi-tenant infra/sql/ + RLS, pre-merge audit, runtime optimization, 8 Mermaid diagrams. |
v1.1.0 |
2026-05-12 | Security hardening (helmet CSP, rate limits, JWT cookies), PM2 ecosystem, session-start hook, CI workflow. |
v1.0.0 |
2026-04-26 | Initial unified release across anomaly pipeline, agent SDK, evaluator, shop demo, dashboards, GCP scaffolding. |
v0.1.0 |
Early 2026 | First prototype — Python orchestrator + Node engine + Redis broker; severity routing established. |
Upcoming milestones (v1.4.0 → v2.0.0) live in
ROADMAP.md.
This is currently a single-operator project. PRs welcome but please:
- Run
bash scripts/audit.shbefore opening a PR. - Keep new LLM calls cached (
cache_control: {"type": "ephemeral"}). - Don't reintroduce the banned word — the audit will block.
- Read
CLAUDE.mdend-to-end if you're an AI session editing this repo; readARCHITECTURE.mdif you're a human extending it.
TBD.