diff --git a/vision.md b/vision.md
index a73c687dc..780169eec 100644
--- a/vision.md
+++ b/vision.md
@@ -1,60 +1,390 @@
-# Introduction
+# Vision
 
-> This document is human written, meant to communicate how the project works and what its goals are
+> The product vision for Executor — what it is, the model it's built on, everything it
+> should eventually do, and the discipline for getting there. Implementations churn; this is
+> the destination they're heading toward, applied as incremental changes to the existing
+> codebase, not a rewrite.
 
-The vision of Executor is to be an open source layer for your integrations.
+## What Executor is
 
-It is not AI specific
+Executor is an **open-source layer for your integrations** — a catalog of every operation
+across a company's software, plus the auth, scoping, policy, history, and oversight to act on
+it safely.
 
-It is not code mode specific
+- It is **not AI-specific.** Agents are a primary consumer, but the layer is a general way to
+  represent and interop between sources of capability.
+- It is **not code-mode-specific.** Code mode is one way to call tools; it isn't the point.
+- It is a category of **source** and a way to **interop between them** through one set of
+  concepts.
 
-It is a category of source and a way to interop between them.
+The bet: as work shifts toward agents acting through software, every company needs one layer
+where anything a human could do is callable — auditable, across every account — and where new
+capability is *composed* on a shared substrate, not rebuilt per integration. One primitive
+unlocks the whole slice: catalog + auth + tools + scoping + UI + workflows + storage.
 
-Executor exposes 4 core concepts:
+It runs the same two ways from one codebase: an **in-process SDK** (no server, single-player,
+instant) and a **hosted service** (multiplayer, cloud). Single-player must feel great;
+multiplayer must be powerful; neither carries the other's weight.
 
-1. Tool
+## Core concepts
 
-This is represented via an id, optionally an input schema, and optionally an output schema.
+- **Tool** — an id, an optional input schema, and an optional output schema (JSON Schema today;
+  may grow richer types). The unit of capability. Addressed as
+  `<source>.<scope>.<connection>.<tool>`.
+- **Source** — contains tools. A source is produced by a plugin from some config (an OpenAPI
+  spec, a GraphQL endpoint, an MCP server, a CLI's install/run). The core never sees the raw
+  config — only a normalized manifest.
+- **Connection** — a credential, born wired, identified by `(scope, source, name)`. The `name`
+  is required and load-bearing (the account: `work`, `personal`, `prod`). One source holds many
+  connections.
+- **Secret** — never lives in Executor and never reaches the agent. A connection holds a
+  `SecretRef` (a pointer — `op://`, `keychain://`, `env://`, `vault://`); a provider resolves it
+  at call time, in trusted space, behind a proxy.
+- **Scope** — placement and identity. An ordered, merged set (see *Scopes*). Every record is
+  placed in a scope.
+- **Policy** — gates execution (allow / require-approval / block). Attached to sources and
+  connections (see *Policies*).
+- **Plugin** — registers sources, tools, providers, storage, surfaces. The one open extension
+  seam (see *The model*).
+- **Manager / Invoker** — the core runtime roles: a manager owns a source's tools and config; an
+  invoker executes a tool through the right connection and proxy.
 
-Input and Output schemas are JSON schemas, this may evolve in the future to support more complex data types
+## The model — core, first-party, one seam, surfaces
 
-Input and Output schemas are
+Executor is a **tiny core**, a set of **first-party capabilities** built on it, **one open
+extension seam**, and **surfaces** that expose it.
 
-3. Source
+**The core (the floor):** `execute(path, args)` (the one verb); the **catalog** of tools;
+**connections**; **scopes**; **policies/guardrails**; and `scope()` — narrow the executor to a
+subset, the seam everything composes through.
 
-A source contains tools
+**First-party capabilities** are a fixed, known set, included or omitted as modules — *not* a
+generic plugin registry: toolkits, the MCP host, generative UI, workflows, storage, skills,
+audit/runs, the internal-apps catalog. They're separable but they don't go through an abstract
+plugin contract, because nobody outside you needs to add a new *kind* of them.
 
-4. Secret
+**One open plugin seam: sources.** There are thousands of APIs; you'll never enumerate them; you
+want others to add them. That is where an open contract (`resolveTools` / `invokeTool`) earns its
+complexity. Secret **providers** are the one plausible second seam, for the same reason.
 
-Tools and secrets
+> The discipline: do not make "everything a plugin." Keep first-party capabilities first-party,
+> and keep exactly one open seam (sources, + maybe providers). Reversible — extract a new seam
+> only when several first-party capabilities demand the same one.
 
-5. Manager
+**Surfaces / hosts** serve the executor over a wire: MCP, HTTP, CLI, stdio, triggers, the web
+app. The SDK works fully in-process with none of them.
 
-6. Plugin
+## Composition — artifacts × surfaces, and the axes
 
-A plugin can register tools, sources, adapters
+Everything you build is a point in a small space of orthogonal axes. New things become
+coordinates, not bespoke subsystems — this is what keeps the breadth tractable.
 
-7. Invokers, Managers,
+- **kind** — tool / view (UI) / workflow / skill / store / connection
+- **lifetime** — ephemeral (a turn/session) / durable (deployed, addressable, re-openable)
+- **origin** — imported (a spec) / authored (code) / invoked-inline (produced by a call)
+- **surface** — in-process / MCP / HTTP / CLI / trigger / web
+- **delivery** — inline value / handle / embedded resource / deep link *(negotiated by client
+  capability)*
 
-## Features to ship
+Two rules fall out and dissolve most "how does X compose" confusion:
 
-- Dynamic plugin support
-  Run plugins in a v8 isolate, let your agent write whatever it needs, ship instructions as part of executor
-- Integrations registry
-- Configure executor via executor
-- MCP apps with dynamic UI
-  Use react flight / RSCs + code mode, enables
-- Scope merging
-- Workflows
-  Ideally built ontop of "use workflow"
-- SDK
-- Internal apps catalog
-- Store custom UI snippets
-- MCP channels support like how Claude Code over Discord works to enable talking back to the agent mid tool call
-- Storage
-  Every chat gets a temporary KV, SQLite, Filesystem the agent can use to interact with. The agent can also create these on scopes to persist data between tool calls
-- Scope merging
-  I should be able to add tools at a global, workspace, account level, override the secrets on them per one, override policies, create temporary scopes etc
-- Configure executor via executor
+1. **Artifacts vs surfaces.** What you *build* (kind) is separate from how it's *reached*
+   (surface). Build once; a surface projects it. A workflow defined once is reachable via cron,
+   MCP, or in-process. A view defined once renders standalone or returns over MCP. Not everything
+   fits every surface (a workflow isn't a slow tool; a UI isn't an HTTP handler).
+2. **Rich outputs negotiate delivery.** Too big or too rich for the channel → return a
+   *reference* the client resolves per its capability. A large result → a **storage handle**. A
+   UI → an **embedded UI resource** if the client renders it (MCP apps), else a **deep link**
+   into the web app. UI is not special; it's a rich output under the same rules as everything.
 
-The focus of Executor should be to ship the primitives that build an extendable product. Everything needs to be written with that in mind
+## Scopes (scope merging)
+
+Placement is an **ordered, merged set of scopes**, outer (authority) → inner (actor). You add
+tools, connections, and policies at a global, workspace, or account level; override secrets and
+policies per scope; and create temporary scopes. Single-player is one scope; multiplayer is the
+same model with more entries.
+
+Three fixed merge rules, nothing configurable:
+- **visibility = union** (you see everything placed in any scope you hold);
+- **guardrails/policy = deny-union, outer-wins** (an outer scope's `block` can't be weakened
+  inward; mandatory flows in);
+- **scalars/config/secret-overrides = innermost-wins** (the nearest scope to the actor wins).
+
+`org | user` is just the **two-element case** of this. The current codebase collapsed scopes to
+fixed org/user; the direction is to model it as an *ordered list that today has two entries*, so
+the resolver is already the general one and inserting a "team" or "environment" level later is
+additive, not a rewrite — added demand-driven, only when a real case needs the middle level.
+
+**Why scopes are a dimension, not indirection.** A past mistake (the credential-binding model:
+credential / slot / binding as separate objects wired together) caused an overcorrection that
+flattened scopes along with killing bindings. The lesson: **fuse indirection, keep dimensions.**
+A binding is a layer whose only job is to *connect* two things — fuse it away (born-wired
+connections did this). A scope is a property that varies independently and meaningfully — keep
+it; flatten it and it leaks back as workarounds. Placement stays a **direct, inline property** on
+each record (the way `scope` already sits on a connection); never reintroduce an object whose job
+is to *attach* a record to a scope.
+
+## Connections, secrets & auth
+
+- **Connection = a credential, born wired** (secret + account fused; no slot/binding split),
+  identified by `(scope, source, name)` with `name` required.
+- **Multiple accounts per source per scope** (distinct names). Per-member auth (each connects
+  their own) and team-level shared auth (one credential placed in an outer scope) are both just
+  placement.
+- **Auth template vs credential.** A source declares *how* it authenticates (OAuth / bearer /
+  API key — a discriminated set); each connection fills that template.
+- **Secrets never reach the agent.** Resolved by a provider at call time, in trusted space,
+  behind a **tool-proxy** — credentials never enter agent or sandbox code. **No escape hatch:** a
+  `SecretRef` never appears in a tool's I/O schema or any MCP/host response.
+- **Heterogeneous backends per member** (1Password / keychain / env / vault), declared
+  separately. **Credential lifecycle** (refresh, rotation, revocation, expiry) handled below the
+  agent.
+
+## Sources & source kinds
+
+Each kind is added through the **one open seam** (`resolveTools` + `invokeTool`): **OpenAPI**,
+**GraphQL**, **MCP servers**, **CLI** (install + run, no spec), **direct API / raw-fetch**, and
+later **direct database** and **email**. The plugin owns the opaque config and raw spec; the core
+reasons over a **normalized manifest** (schemas, side-effect class, auth template, data labels,
+required capabilities, an LLM-authored description). **Spec auto-refresh** tracks upstream drift
+and **must never auto-expand authority** — new operations appear *ungranted until reviewed*.
+
+An **integrations registry** lets people discover and add sources.
+
+## Policies
+
+Policies gate **execution** — `approve` / `require_approval` / `block` — and are distinct from
+toolkits (which gate visibility). Resolution is *most-restrictive-wins*; **plugin defaults** let
+a source ship a tool pre-marked `require_approval` (destructive detection), with `approve` as the
+override when an import wrongly flags a safe action.
+
+How people actually use them today (real usage): glob patterns over the address, where *every*
+rule targets a **source** and wildcards the scope/connection (`stripe_api.*.*.disputes.*`,
+`pscale.*.*.execute_write_query`) — i.e. faking structural targeting because there's nowhere to
+attach it. The direction:
+
+- **Kill the global glob list. Attach policy to the source and connection records** — the two
+  structural levels. **Connection overrides source** (innermost-wins), with **scope authority
+  layered on top** (outer/org mandatory, can't be weakened inward).
+- Fixing a mis-flagged tool becomes `approve` *on that source/connection*, directly.
+- **Toolkits carry no policy** — they're pure curation and inherit whatever source/connection
+  policies apply to the tools they expose. This dissolves the old global-vs-toolkit merge
+  conflict.
+
+## Toolkits & scoping
+
+A **toolkit is pure curation** — a named subset of tools (a `ToolSet`) that yields a **scoped
+view** of the executor. Used for **per-agent capability control**: lock down exactly what an
+added agent can reach (a background firewall-monitor agent gets *only* a monitor tool and an
+update-firewall tool, and nothing that builds). **`toolkits.list` feeds a consent screen** — the
+grantable slices an agent or OAuth flow requests. Single-player keeps toolkits too.
+
+## Authorization & capabilities
+
+One model over a **typed capability namespace**, three layers kept apart: **grants** (positive
+authority; toolkits are grant bundles), **policy/guardrails** (deny/approval), and **runtime**
+(object-capability handles only — no ambient executor, no raw secret, no unrestricted fetch).
+
+The tool axis stays a **predicate** (`ToolSet`); **meta-capability kinds are typed per-kind** —
+`author-tools` (effects ceiling), `generate-UI`, `allocate-storage` (quota), `deploy`,
+`trigger-register`, `egress`. **Capabilities compose transitively, enforced by the
+`scope()`-narrowed executor as a membrane:** an authored tool runs against its author's captured
+scope, so a toolkit-X agent's tool calling toolkit-Y fails because Y was never in scope. `scope()`
+strictly intersects (never widens); delegation routes through a granter; one enforcement surface,
+no advisory mode — the local isolate and the cloud worker enforce the same membrane.
+
+## Surfaces
+
+You build once; surfaces project.
+
+- **In-process** — `execute`, the SDK path. No server.
+- **MCP** — a scoped endpoint (`/mcp?toolkit=...`) over a *scoped executor*; the host
+  authenticates → resolves the toolkit → `scope()` → hands the narrowed executor to the MCP
+  server, which never sees a toolkit. Default shape is **meta-tools** (`search` + `describe` +
+  `execute` + `run_code`) so a huge catalog doesn't blow context, with a direct-tools opt-out.
+  **Authoring tools** (`render`, `create_workflow`, `author_tool`, `skills.create`) appear only
+  when the scope grants the matching meta-capability.
+- **MCP apps with dynamic UI** — React-flight / RSC + code mode for rich UI returned over MCP
+  (an embedded UI resource on capable clients, a deep link otherwise).
+- **MCP channels** — talk back to the agent mid-tool-call (the "Claude Code over Discord"
+  pattern), for human-in-the-loop and elicitation.
+- **HTTP** — any capability projected as a route, after policy/auth/versioning/CORS.
+- **CLI** — the same operations from the terminal (see *Distribution*).
+- **Triggers** — a schedule (cron) or an event that fires an artifact.
+- **Web app** — the UI-capable surface where views render and deep links land.
+
+## What you build
+
+- **Custom tools & API gateway.** Write a tool as a function and deploy it; it joins the catalog.
+  **Account-parametric** — a custom function calls a catalog op without baking in an account; the
+  account is passed at invocation. Executor doubles as an **API gateway** with a **generated typed
+  SDK** for everything added.
+- **Workflows.** Multi-step sequences of tool calls with **durable run semantics**, built ideally
+  on `use workflow` (or Cloudflare Workflows). User-defined entry points; **cron + event
+  triggers**; infrastructure-as-code. A step *is* a catalog tool call.
+- **Generative UI.** A **view** governed by lifetime × delivery: ephemeral (the agent calls
+  `render`) or durable (an authored dashboard). Rich UI runs in a sandbox; it calls only the
+  tools its scope grants, proxied server-side, never holding credentials.
+- **Skills.** Shared, code-authored knowledge — the company knowledge base as code — that agents
+  draw on, that **surfaces in tool search**, and can **run tools inline** to cut round-trips.
+  Served off MCP and from a `.well-known`. File-backed (see *Authoring model*).
+- **Internal apps catalog & custom UI snippets.** Store and share custom UI and internal apps.
+- **Configure Executor via Executor** — managing sources, connections, scopes, policies,
+  toolkits is itself done through tools.
+
+## Authoring model — two paths
+
+Every authored artifact (skill, custom tool, workflow, UI) has **two first-class authoring
+paths**, matched to two contexts:
+
+- **File + deploy** (developer, with the CLI): `executor/skills/`, `executor/tools/`, … authored
+  in a repo, `executor deploy` pushes them. Git gives versioning, review, rollback. The
+  canonical, org-shared tier.
+- **A tool over MCP** (an agent with no CLI, e.g. in a desktop client): `skills.create`,
+  `author_tool`, `create_workflow` write to the runtime directly. Zero friction, no filesystem.
+
+These aren't two sources of truth: each artifact has **one master** distinguished by origin and
+**scope/tier** (a runtime-authored artifact is personal/inner-scope; a deployed one is
+org/outer-scope, git-backed). **Promotion** is an explicit handoff — `executor pull` materializes
+a good runtime artifact into files for review and merge, graduating it to canonical. With a
+git-backed store (below), runtime authoring is literally a commit-from-a-worker and promotion is
+a branch/namespace merge.
+
+## Storage — two substrates
+
+Split by lifetime/kind, not one confused store:
+
+- **Authored artifacts** (skills, custom tools, workflows, UI — durable, versioned, reviewed) →
+  **git-backed**: local Git, and in the cloud **Cloudflare Artifacts** (git-over-HTTPS, push
+  commits from a Worker, read at runtime via a Workers binding / ArtifactFS, namespaces ≈ scopes,
+  versioning/pinning/rollback from git for free). "Git stands in for Cloudflare Artifacts" is
+  literal — local == remote with no translation. Both authoring paths commit to one repo.
+- **Agent state** (data — reactive, high-frequency) → **KV / SQLite / filesystem**. Every chat
+  gets a temporary KV + SQLite + filesystem the agent can use; the agent can also create stores
+  on scopes to persist between calls. Accessed **as ordinary tools** (no side-channel API).
+  Addressed by **handles** + a search-across-stores utility. **Reactive** (Convex-style: a
+  workflow writes, a UI/chat sees it live, with access rules). **Large results land in storage**
+  and return a handle — which is how the MCP context-overflow problem is actually solved.
+
+## The Run model
+
+Every execution, on any surface, produces or attaches to a **Run**: id, parent run, artifact
+version, principal chain, capability set, policy version, input hash, output handle, logs, spans,
+approvals, retries, cancellation, status. One record **collapses four features into views over
+it**: audit log, human-in-the-loop approvals, workflow runs, and resumability/debugging.
+
+## Human-in-the-loop
+
+Pause a call and require a human before it proceeds: **MCP elicitation**, a **gated `resume`
+tool** (the user simply doesn't always allow it), a **resume URL**, or an **MCP channel**
+talk-back mid-call. A gated call returns a *pending* Run with a resume reference.
+
+## Sandbox & runtime
+
+Untrusted code — user plugins, agent-authored tools, generative-UI bundles — runs sandboxed:
+locally a **V8 isolate** (and the quickjs / dynamic-worker / deno-subprocess runtimes), in the
+cloud a **Cloudflare Dynamic Worker**. **Default-deny.** Both enforce the **same capability
+membrane** (the scope-narrowed executor; `env` is the capability set; outbound closed except
+through the tool-execution binding). Credentials never enter the sandbox. Let the agent write
+whatever it needs; ship instructions (skills) as part of Executor.
+
+## Distribution, the CLI & remotes
+
+The `executor` CLI is **three things at once**: the **local runtime** (run a full instance and
+start using it), the **control plane** (local *and* remote, same commands), and an
+**agent-facing surface**. Distribution surfaces are all "local": `executor` (CLI), `executor
+service install` (daemon), `executor web` (web app) — one instance, delivered differently.
+
+- **Remotes ("cores").** Add remote instances; one shared sign-in; call them identically.
+  Execution **runs where the credential lives** — invoke from machine B, it runs on A with A's
+  credential; nothing crosses machines.
+- **Merge local + remote** (the "SSH for tools" / OpenTunnel model) — a tool on one machine is
+  callable from anywhere.
+- **Deploy & dev.** `deploy` (with `--check`) pushes authored artifacts; git-backed `sync`;
+  hot-reload `dev`; local filesystem access. **Configure Executor via Executor** from the CLI.
+
+## Cloud & user-generated extensions
+
+The hosted product runs on a **Cloudflare substrate** and lets users **write plugins and load
+them dynamically, sandboxed** — each in a Worker isolate with a **declared capability manifest**
+(a plugin asking for specific scopes can do exactly that and nothing else). User plugins ride the
+**source seam** and the capability membrane, opened **last and curated**. Local == remote via
+**substrate substitution**: a local V8 isolate stands in for the dynamic worker, Git for
+Cloudflare Artifacts, local SQLite for hosted storage.
+
+## Cross-cutting concerns
+
+- **Data provenance / tainting** — label data handles (source, connection, scope, sensitivity,
+  retention, allowed readers/egress) so policy can reason as data flows tool → storage → UI →
+  chat.
+- **Egress / allowed-hosts** — prevents **data** exfiltration (credentials are already safe via
+  the proxy), not credential leakage. A guardrail facet.
+- **Simulation / emulation** — run tools against the `@executor-js/emulate` emulators
+  (wire-level, stateful, real OAuth + credentials + request ledger) for side-effect-free testing.
+- **Multiple language targets** — JS now, Python later (data/analysis work).
+
+## Architecture
+
+Built on **Effect**. The same SDK is in-memory *and* client/server; SDK-only users need no
+server; serving is a standard Web `Request → Response` handler. The discipline that keeps breadth
+cheap is the **seam discipline**: the platform owns generic concerns (auth, caching, retries,
+schema, rendering, transport, the Run); a plugin owns only the source-specific resolve/invoke.
+
+## Principles
+
+- **One unifying primitive** — everything reduces to "invoke tool T → typed output."
+- **Tiny core; compose the rest.** Ship the primitives that build an extendable product.
+- **No silent defaults; explicit addressing.** The account is always in the path.
+- **Type the human surface, not the agent surface.**
+- **Secrets never reach the agent; authority only narrows.**
+- **Visibility ≠ permission** — curation (toolkits) is separate from governance (policy).
+- **Share the substrate aggressively; split the surfaces deliberately.** The failure mode of a
+  broad product is too many features in one undifferentiated human surface — keep that surface
+  minimal.
+- **Fuse indirection; keep dimensions.** (Born-wired connections, not credential bindings;
+  stacked scopes, not flattened ones.)
+- **One master per record** — file-backed or runtime-authored, never two live masters.
+
+## How we build it
+
+- **Don't rewrite — evolve the existing codebase.** This vision is a map you steer toward with
+  incremental, git-shaped changes, not a greenfield rebuild. The wedge already exists.
+- **The wedge:** be best-in-class at the catalog of imported tools + connections/secrets, exposed
+  through one invocation interface — with zero-token-exposure, the Run/audit record, human
+  approval, the capability membrane, and code-mode/meta-tool search baked in. Then sequence
+  nearest-to-substrate first: MCP → storage → workflows → generative UI → user plugins (curated,
+  last). Three substrate invariants keep composition cheap: one typed invocation primitive;
+  authorization inherited from the substrate, never re-implemented; a hard seam between
+  substrate-generic and source-specific code.
+- **Vision mode vs build mode.** This document is vision mode — generativity ("and then you'd
+  want X") is the point. In **build mode, every "and then you'd want X" is a YAGNI to defer**, not
+  a dependency to satisfy. Build the smallest thing useful on its own; let the next be *pulled* by
+  real pain, not *pushed* by imagined composition. The tell that you've slipped back to vision
+  mode: the next step needs two other things built first. (Example: v1 skills = read a folder,
+  expose `skills.search` + `skills.get`. No scopes, toolkits, environments, or git required.)
+
+## North-star scenarios
+
+- **The shared-data loop.** "Every morning at 9am, load my GitHub issues into SQLite": a
+  scheduled workflow writes to storage; a generative-UI front end reads it live; you chat with the
+  agent over the same data. One substrate, three readers.
+- **The scoped agent.** "An agent I can message that can *only* update my calendar." A toolkit
+  exposes a one-tool slice; the agent connects to a scoped MCP endpoint; it physically cannot
+  reach anything else.
+
+## Positioning
+
+The open-source alternative to closed integration layers; interops with adjacent projects; one
+primitive serving both backend-for-customers and internal-team use. The model is to nail a wedge,
+then grow many composing capabilities on a shared substrate — without letting the human-facing
+surface bloat with it.
+
+## Open questions
+
+- Data provenance/tainting — the label model and how policy consumes it.
+- Skills ↔ custom functions ↔ specs — where the boundaries sit and how the capability gate
+  follows references.
+- Typegen ownership — leaning "don't own it; expose as a utility."
+- Reconciliation UX for merged local+remote catalogs; filesystem overlay vs CLI-routed.
+- Storage allocation default — which class is the default and how handles are named.
+- Raw-fetch manifest — how an untyped tool kind declares its capabilities.