DreamLab-AI/agentbox

 
 


Agentbox

A manifest-driven, reproducible runtime for sovereign software agents.

One TOML manifest. One Nix flake. One runtime contract.

Maintainer: John O'Hare · Upstream IP: Melvin Carvalho (JSS, DID:Nostr) · MAINTAINERS.md

Quickstart · Why Agentbox · Capabilities · Sovereign Architecture · Docs


What is Agentbox?

Agentbox is a hardened, fully reproducible Linux container environment built specifically to host, orchestrate, and trace autonomous AI agents.

Instead of juggling custom Dockerfiles, scattered API keys, and brittle dependency scripts, everything in Agentbox is driven by a single agentbox.toml manifest. You declare the agents you want, the tools they need — from browser automation to 3D rendering — and the storage backends they use. Agentbox builds a byte-for-byte reproducible image using Nix, spins up the environment, and routes durable agent writes (memory, pods, beads, events) through a local privacy-redaction filter and cryptographic audit trails.

Why Agentbox?

Most agent runtimes are just a collection of tools with no provenance, privacy, or reproducible state. Agentbox is built differently:

  • 🚀 Batteries Included (via MCP): Out-of-the-box support for Claude Code, Codex, Gemini, DeepSeek, and ruflo. Instantly equip them with 90+ skills including Playwright, ComfyUI, QGIS, Blender, LaTeX, and Jupyter via the Model Context Protocol (MCP).
  • 🔒 Privacy by Default: An embedded openai/privacy-filter sidecar sits in the adapter-dispatch path, redacting PII and secrets before durable writes hit memory or pods. Policy is set per slot: strict (redact-then-write, fail-closed) for memory and pods, soft for events/beads, and off for the orchestrator control plane. It is not a universal interceptor on every tool call; see ADR-008.
  • 🛡️ Hardened & Reproducible: Built with Nix flakes. The pg Node module is baked into the image (no npm install pg at boot); a small set of npx -y CLI aliases is the one remaining runtime-fetch path, pending SRI pinning (tracked in lib/npm-cli.nix). Runs as non-root (uid 1000) with a read-only root filesystem, cap_drop: ALL, no-new-privileges:true, and a supplemental seccomp denylist (47 high-risk syscall denials layered on Docker's default profile — not a replacement allowlist; the container runtime is the security boundary). Published ports bind host-loopback only (ADR-027).
  • 🔗 Sovereign Data & Auditability: Agents own their data cryptographically. Every generated file, memory, and action is stamped with a did:nostr identity and stored in an embedded Solid Pod (solid-pod-rs). As of the solid-pod-rs 0.5.0-alpha.0 provenance release, the pod substrate makes agent actions traceable by construction: every write is a git-mark (write-as-commit + PROV-O sidecar), and high-value or disputed records can become block-trails — tamper-evident, hash-chained provenance trails with an optional Bitcoin (taproot) anchor. See The Sovereign Data Stack.
  • 🔌 Pluggable Adapters: Run entirely standalone on a laptop (SQLite + local JSONL), or effortlessly federate into a cloud mesh (Postgres pgvector + HTTP event sinks) by flipping a TOML switch.
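
The per-slot privacy policy in the list above can be pictured as a small dispatch step in front of each durable write. The following is a minimal sketch, not the actual adapter code: the `SLOT_POLICY` table, `prepareWrite`, and the regex redactor are hypothetical names, and the meaning of `soft` (best-effort redaction that still writes when the filter fails) is an assumption, since only `strict` is spelled out here.

```javascript
// Hypothetical per-slot policy table mirroring the description above.
const SLOT_POLICY = {
  memory: "strict",
  pods: "strict",
  events: "soft",
  beads: "soft",
  orchestrator: "off",
};

// Stand-in redactor: masks anything that looks like an email address.
// The real filter is a separate sidecar; this regex is illustration only.
function redactEmails(text) {
  return text.replace(/[\w.+-]+@[\w-]+(\.[\w-]+)+/g, "[REDACTED]");
}

// Apply the slot's policy before a durable write.
// strict: redaction must succeed or the write is refused (fail-closed).
// soft:   best-effort; if the filter fails the original is written (assumed).
// off:    payload passes through untouched.
function prepareWrite(slot, payload, redact = redactEmails) {
  const policy = SLOT_POLICY[slot];
  if (policy === undefined) throw new Error(`unknown slot: ${slot}`);
  if (policy === "off") return payload;
  try {
    return redact(payload);
  } catch (err) {
    if (policy === "strict") {
      throw new Error(`redaction failed for strict slot ${slot}; write refused`);
    }
    return payload;
  }
}
```

Passing the redactor as a parameter keeps the policy logic testable without the sidecar running: a throwing stub shows that `pods` refuses the write while `events` lets it through.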

Quickstart

Interactive onboarding (recommended)

Use the browser-based setup wizard to configure your manifest, select your tools, and boot the container:

git clone https://github.com/DreamLab-AI/agentbox.git
cd agentbox
./scripts/start-agentbox.sh

The wizard opens in your default browser — no dependencies beyond Python 3 (for the local HTTP server). It renders all agentbox.toml sections with schema-validated form controls and the DreamLab glassmorphism design system. Pass --tui to use the legacy terminal wizard instead.

Setup Wizard
Browser-based configuration wizard with schema-driven form controls

Fast path (pre-built image)

export AGENTBOX_IMAGE_REF=ghcr.io/dreamlab-ai/agentbox:latest
docker pull "$AGENTBOX_IMAGE_REF"
./agentbox.sh up --registry
./agentbox.sh health
./agentbox.sh shell

Build from source

git clone https://github.com/DreamLab-AI/agentbox.git
cd agentbox
./scripts/agentbox-config-validate.sh
./agentbox.sh up --build
./agentbox.sh health

Next steps:


Included Capabilities

Your agentbox.toml manifest toggles capabilities on or off. Disabled features add zero bloat to your final image.

| Category | Highlights |
| --- | --- |
| Agent toolchains | claude-code, ruflo, antigravity (agy), agentic-qe, openai-codex |
| Consultants | Meta-router for named external consultations: DeepSeek, Perplexity, Z.AI, Antigravity |
| Browser and web | External browsercontainer sidecar (chrome-devtools-mcp, Chrome Beta 149+, GPU-accelerated) |
| Media and design | Local ComfyUI (or external URL), ImageMagick, FFmpeg |
| Spatial and 3D | QGIS geospatial analysis, Blender modelling, 3D Gaussian Splatting |
| Data science and docs | PyTorch, Jupyter Lab, LaTeX, Mermaid rendering |
| Code-as-Harness | Persistent Python kernel MCP, ExpeL post-task lesson distillation, Voyager verified-skill library, SWE-agent ACI MCP, execution-gated tree-search (PRD-008) |
| Governance | Agent Control Surface Protocol (kinds 31400-31405) — cross-repo human-in-the-loop integration with the DreamLab forum and the host project's broker via the embedded relay. The agentbox producer (management-api/lib/agent-control-surface.js) mints and publishes the panel events; see sovereign mesh. |
| Consumer Economy | Governed outbound payment pipeline (PRD-015 Phase 1): lib/pay402.js pure 402-scheme classifier (agentbox-ledger/x402/l402/unknown), spend-policy middleware (fail-closed caps + allowlist), native payer (NIP-98 ledger debit, idempotent single retry), receipt + activity URNs minted on every spend attempt, /.well-known/x402.json discovery manifest, additive accepts[] in 402 challenges, skills/payment-router skill. Lightning-first: NWC/L402 is the only planned real-money rail; no native EVM/USDC. ADR-032, PRD-015. |
| Embodied agent loop | Bi-directional /wss/agent-events channel (ADR-014) — agents emit a canonical agent_action signal (identity preserved per ADR-013) that a host project renders as a live agent actor (coloured beam + transient attractive edge), and consume inbound user-interaction events so agents become user-aware. A privacy-safe memory-flash beacon (env-gated on VISIONCLAW_API_URL) fires the host's embedding-cloud visual on every RuVector access. See ADR-014, ADR-026, PRD-014. |
| Operations | OTLP tracing, Prometheus metrics (:9091/metrics), Tailscale VPN integration |

Consumer Economy Pipeline (PRD-015 Phase 1)

When an agent encounters an HTTP 402 from a peer node or external service, the consumer pipeline takes over: lib/pay402.js classifies the challenge as agentbox-ledger, x402, l402, or unknown — a pure function with fail-closed semantics (attacker-controlled bytes never become money). A spend-policy middleware checks per-call caps, daily budgets, origin allowlists, and approval thresholds from [payments.consumer] in the manifest before any rail is invoked. For agentbox-ledger challenges (in-mesh, Phase 1), the native payer debits the Web Ledger via NIP-98 with an idempotency key and retries the original request once. A receipt URN and a PROV-O activity URN are minted through lib/uris.js on every spend attempt — paid, denied, failed, or pending — so the audit trail has no gaps. The broadcast side emits an additive accepts[] block in 402 challenges (byte-compatible with existing clients) and generates /.well-known/x402.json at boot so external crawlers can discover gated services. The skills/payment-router skill wraps all of this as payFetch() — a 402-aware drop-in for fetch(). Configuration:

[payments.consumer]
enabled = true
max_sats_per_call = 100
daily_budget_sats = 1000
approval_threshold_sats = 50

[payments.broadcast]
well_known = true
accepts_block = true

[skills.payment_router]
enabled = true
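
To make the spend-policy step concrete, here is a minimal sketch of a fail-closed check that uses the three limits from the configuration above. It is an illustration under assumptions, not the project's middleware: `checkSpend`, the `allowed_origins` key, the decision names, and the choice of strict `>` comparisons are all hypothetical.

```javascript
// Limits copied from the [payments.consumer] example above.
const policy = {
  enabled: true,
  max_sats_per_call: 100,
  daily_budget_sats: 1000,
  approval_threshold_sats: 50,
  allowed_origins: ["https://peer.example"], // hypothetical allowlist key
};

// Returns { decision: "allow" | "needs_approval" | "deny", reason }.
// Anything malformed or out of policy is denied (fail-closed).
function checkSpend(policy, { origin, amountSats, spentTodaySats }) {
  const deny = (reason) => ({ decision: "deny", reason });
  if (!policy.enabled) return deny("consumer payments disabled");
  if (!Number.isInteger(amountSats) || amountSats <= 0) return deny("invalid amount");
  if (!Number.isInteger(spentTodaySats) || spentTodaySats < 0) return deny("invalid running total");
  if (!policy.allowed_origins.includes(origin)) return deny("origin not allowlisted");
  if (amountSats > policy.max_sats_per_call) return deny("per-call cap exceeded");
  if (spentTodaySats + amountSats > policy.daily_budget_sats) return deny("daily budget exceeded");
  if (amountSats > policy.approval_threshold_sats) {
    return { decision: "needs_approval", reason: "above approval threshold" };
  }
  return { decision: "allow", reason: "within policy" };
}
```

With these values a 10-sat call to an allowlisted origin is allowed, a 75-sat call is held for approval, and a 150-sat call is denied outright. Input validation runs before any limit check, so a non-numeric amount can never slip past a comparison.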

Phase 3 adds Lightning settlement via NWC (NIP-47) for L402 invoices — the only planned real-money rail. See economy-loop.md and PRD-015.
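
The classification step at the head of this pipeline is described as a pure function that never lets untrusted bytes select a payment rail. A sketch of that idea follows. The function name `classify402`, the `AgentboxLedger` auth-scheme name, and the exact header and body shapes are assumptions for illustration; the real lib/pay402.js may inspect different fields.

```javascript
// Classify an HTTP 402 response into one of four schemes.
// Input is plain data ({ headers, body }), so the function has no side effects.
// Header names are expected in lower case; all shapes here are assumed.
function classify402({ headers = {}, body = "" } = {}) {
  const auth = String(headers["www-authenticate"] || "");

  // L402 challenges conventionally use an "L402" auth scheme carrying an invoice.
  if (/^L402\s/i.test(auth) && /invoice="[^"]+"/.test(auth)) return "l402";

  // Hypothetical scheme name for the in-mesh ledger.
  if (/^AgentboxLedger\s/i.test(auth)) return "agentbox-ledger";

  // x402 challenges conventionally carry a JSON body with an accepts array.
  let parsed;
  try {
    parsed = JSON.parse(body);
  } catch (err) {
    return "unknown"; // unparseable bytes never select a rail
  }
  if (
    parsed !== null &&
    typeof parsed === "object" &&
    "x402Version" in parsed &&
    Array.isArray(parsed.accepts)
  ) {
    return "x402";
  }
  return "unknown";
}
```

Every path that does not match a known shape falls through to `"unknown"`, which is what makes the classifier fail-closed: the caller can refuse to pay on `"unknown"` without inspecting the response any further.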

Code-as-Harness (PRD-008)

A persistent IPython kernel MCP exposes six tools (kernel.exec, kernel.list_vars, kernel.inspect, kernel.reset, kernel.interrupt, kernel.install_pkg) so that variable state, imported modules, and computed DataFrames survive across tool calls within a session. An ExpeL post-task hook distils completed trajectories into reusable DistilledLesson records in RuVector. A Voyager verified-skill library accumulates assertion-passing Python functions for retrieval and injection at future task start. A SWE-agent-style ACI MCP provides bounded file viewing, compact-diff editing, budget-capped search, structured test execution, and task submission for autonomous repo-level bug-fixing. An execution-gated tree-search skill generates N candidates, executes each in a fresh kernel session, and scores by assertion-pass rate. Multi-tier memory uses OWL2-typed RuVector namespaces (semantic / procedural / episodic) with no schema changes. All records carry did:nostr identity and PROV-O action receipts. Phase 1 surfaces (code_interpreter, codeact, expel_lesson_extraction) are opt-in; Phase 2 surfaces (voyager_skill_library, aci_shell, tree_search_coder) are scaffolded and default off. See docs/developer/code-as-harness.md for the operator guide.
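
The scoring rule behind the execution-gated tree search (run each candidate, score by assertion-pass rate) can be shown in a few lines. This is a language-neutral sketch written in JavaScript rather than the Python the kernel runs; `passRate` and `rankCandidates` are hypothetical names, and the fresh-kernel isolation per candidate is not modelled.

```javascript
// Score one candidate: the fraction of assertions that pass.
// An exception inside a candidate counts as a failed assertion, not a crash.
function passRate(candidate, assertions) {
  let passed = 0;
  for (const check of assertions) {
    try {
      if (check(candidate) === true) passed += 1;
    } catch (err) {
      // treat exceptions as failures
    }
  }
  return assertions.length === 0 ? 0 : passed / assertions.length;
}

// Rank candidates by pass rate, best first; ties keep generation order.
function rankCandidates(candidates, assertions) {
  return candidates
    .map((fn, index) => ({ index, score: passRate(fn, assertions) }))
    .sort((a, b) => b.score - a.score || a.index - b.index);
}

// Three candidate implementations of "absolute value".
const candidates = [
  (x) => x,                          // wrong for negative input
  (x) => (x < 0 ? -x : x),           // correct
  () => { throw new Error("bug"); }, // always fails
];
const assertions = [
  (f) => f(3) === 3,
  (f) => f(-3) === 3,
  (f) => f(0) === 0,
];
const ranked = rankCandidates(candidates, assertions);
console.log(ranked.map((r) => r.index)); // [ 1, 0, 2 ]
```

The correct candidate passes all three assertions, the identity function passes two, and the throwing candidate passes none, so the ranking is by index 1, 0, 2.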


The Sovereign Data Stack

The core differentiator of Agentbox is the Identity and Tracing Mesh.

When an agent acts, how do you know which agent did it? How do you prove it later? Without an identity root, audit logs are meaningless.

Agentbox solves this by generating a BIP-340 secp256k1 keypair at bootstrap. The agent's public key becomes a did:nostr:<hex-pubkey> identity. Every resource, action, and event in the system is rooted in this cryptographic identity.

From that single root, 18 kinds of urn:agentbox:<kind>:[<scope>:]<local> identifiers name every entity: pods, credentials, receipts, activities, events, memories, skills, architecture docs, and more. Owner-scoped kinds embed the hex pubkey — urn:agentbox:credential:<hex-pubkey>:<sha256-12-…> means that credential was issued by that agent and no other. Content-addressed kinds are deterministic: the same payload always produces the same URN, so re-emitting never double-counts and signed credentials keep a stable @id across JCS canonicalisation.
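
The two properties described above, a DID derived from the x-only public key and a deterministic content-addressed URN, can be sketched with Node's built-in crypto module. This is not the code in management-api/lib/uris.js. The function names are hypothetical, the key-sorting `canonicalize` is a simplified stand-in for full JCS (RFC 8785), and reading `sha256-12` as "the first 12 hex characters of the SHA-256 digest" is an assumption.

```javascript
const crypto = require("node:crypto");

// Generate a secp256k1 keypair and derive the x-only public key.
// A compressed point is 33 bytes: one prefix byte plus the 32-byte x coordinate.
// BIP-340 signing also normalises key parity; signing is out of scope here.
function newIdentity() {
  const ecdh = crypto.createECDH("secp256k1");
  ecdh.generateKeys();
  const compressedHex = ecdh.getPublicKey("hex", "compressed");
  const hexPubkey = compressedHex.slice(2); // drop the prefix byte
  return { hexPubkey, did: `did:nostr:${hexPubkey}` };
}

// Simplified canonical JSON: object keys sorted recursively.
// Assumption: stands in for JCS, which also fixes number and string forms.
function canonicalize(value) {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const entries = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(value[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Mint an owner-scoped, content-addressed URN.
// Assumption: the local part is the first 12 hex chars of the SHA-256 digest.
function mintUrn(kind, hexPubkey, payload) {
  const digest = crypto
    .createHash("sha256")
    .update(canonicalize(payload))
    .digest("hex");
  return `urn:agentbox:${kind}:${hexPubkey}:sha256-12-${digest.slice(0, 12)}`;
}

const agent = newIdentity();
const a = mintUrn("credential", agent.hexPubkey, { scope: "read", subject: "notes" });
const b = mintUrn("credential", agent.hexPubkey, { subject: "notes", scope: "read" });
console.log(a === b); // true: key order does not change the URN
```

Because the digest is taken over the canonical form, re-emitting the same payload with a different key order yields the same URN, which is the no-double-counting property the paragraph above describes.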

Identity root diagram
flowchart TB
    KP[secp256k1 keypair\nBIP-340 x-only]
    HEX[64-char hex pubkey]
    DID[did:nostr:hex-pubkey\nPrimary agent DID]
    KP --> HEX
    HEX --> DID

    subgraph identity["Identity surfaces"]
        POD_ID[Solid pod identity\nWAC agent field]
        RELAY_ID[Nostr relay NIP-42\nNIP-98 HTTP auth]
        DID_DOC[DID Document\nGET /.well-known/did.json]
    end

    DID --> POD_ID
    DID --> RELAY_ID
    DID --> DID_DOC

    subgraph owned["Owner-scoped URNs — hex pubkey in scope"]
        CRED[urn:agentbox:credential\nhex-pubkey:sha256-12-...]
        RECEIPT[urn:agentbox:receipt\nhex-pubkey:sha256-12-...]
        ACTIVITY[urn:agentbox:activity\nhex-pubkey:sha256-12-...]
        BEAD[urn:agentbox:bead\nhex-pubkey:sha256-12-...]
        EVENT[urn:agentbox:event\nhex-pubkey:sha256-12-...]
        MANDATE[urn:agentbox:mandate\nhex-pubkey:sha256-12-...]
        AGENT[urn:agentbox:agent\nhex-pubkey:agent-name]
        ENVELOP[urn:agentbox:envelope\nhex-pubkey:sha256-12-...]
    end

    DID --> CRED
    DID --> RECEIPT
    DID --> ACTIVITY
    DID --> BEAD
    DID --> EVENT
    DID --> MANDATE
    DID --> AGENT
    DID --> ENVELOP
Request lifecycle and adapter dispatch pipeline

Every request through the management API follows a rigorous lifecycle: identity verification → adapter routing → privacy redaction → JSON-LD encoding → OTLP tracing.

Note: illustrative composite. Each stage (NIP-98 verification, the privacy filter, uris.mint, the JSON-LD encoder, OTLP spans) is implemented as adapter middleware, but POST /v1/pods/:id/resources is not itself a management-api route — pod resource writes go to solid-pod-rs directly or through the pods adapter from other routes.

sequenceDiagram
    participant AG as Agent did:nostr:hex
    participant MA as management-api
    participant AR as adapter resolver
    participant PF as privacy filter
    participant UM as uris.mint
    participant PO as solid-pod-rs
    participant OT as OTLP exporter

    AG->>MA: POST /v1/pods/:id/resources NIP-98 signed
    MA->>OT: span open agentbox.adapter.pods.write
    MA->>AR: resolve slot=pods
    AR->>PF: write(slot=pods payload=data)
    PF->>PF: policy=strict redact via opf-router
    PF-->>AR: redacted payload
    AR->>UM: mint kind=pod pubkey=hex payload=redacted
    UM-->>AR: urn:agentbox:pod:hex:sha256-12-abc
    AR->>PO: PUT resource atomic rename
    PO-->>AR: 201 ETag
    AR->>UM: mint kind=activity pubkey=hex action=write
    UM-->>AR: urn:agentbox:activity:hex:sha256-12-def
    MA->>OT: span close resource-urn=urn:agentbox:pod:...
    MA-->>AG: 201 JSON-LD @id=urn:agentbox:pod:hex:sha256-12-abc
Full URN kind taxonomy (18 kinds)
flowchart LR
    subgraph identity_k["Identity"]
        POD_K[pod]
        AGENT_K[agent]
    end

    subgraph comms["Communications"]
        ENVELOPE_K[envelope]
        EVENT_K[event]
        RECEIPT_K[receipt]
    end

    subgraph state["Durable state"]
        BEAD_K[bead]
        MEMORY_K[memory]
        DATASET_K[dataset]
        THING_K[thing]
    end

    subgraph auth["Auth and trust"]
        CRED_K[credential]
        MANDATE_K[mandate]
        MCP_K[mcp]
    end

    subgraph trace["Tracing"]
        ACTIVITY_K[activity]
        SKILL_K[skill]
    end

    subgraph docs["Governance docs"]
        ADR_K[adr]
        PRD_K[prd]
        DDD_K[ddd]
        META_K[meta]
    end
| Kind | Owner-scoped | Content-addressed | Example URN |
| --- | --- | --- | --- |
| pod | yes | yes | urn:agentbox:pod:hex:sha256-12-abc |
| envelope | yes | yes | urn:agentbox:envelope:hex:sha256-12-abc |
| credential | yes | yes | urn:agentbox:credential:hex:sha256-12-abc |
| mandate | yes | yes | urn:agentbox:mandate:hex:sha256-12-abc |
| receipt | yes | yes | urn:agentbox:receipt:hex:sha256-12-abc |
| activity | yes | yes | urn:agentbox:activity:hex:sha256-12-abc |
| event | yes | yes | urn:agentbox:event:hex:sha256-12-abc |
| bead | yes | yes | urn:agentbox:bead:hex:sha256-12-abc |
| agent | yes | no | urn:agentbox:agent:hex:agent-name |
| mcp | no | no | urn:agentbox:mcp:server-slug |
| memory | yes | no | urn:agentbox:memory:hex:name |
| skill | no | no | urn:agentbox:skill:slug |
| dataset | yes | no | urn:agentbox:dataset:hex:name |
| thing | yes | no | urn:agentbox:thing:hex:name |
| adr | no | no | urn:agentbox:adr:ADR-013 |
| prd | no | no | urn:agentbox:prd:PRD-006 |
| ddd | no | no | urn:agentbox:ddd:DDD-004 |
| meta | no | no | urn:agentbox:meta:slug |

Because Agentbox uses canonical URIs and Linked Data (JSON-LD), you can spin up the built-in Linked-Data browser at /lo/* to navigate the graph of your agent's memories, architectural decisions, and credentials. The /v1/uri/<urn> resolver maps any URN to its current HTTP representation.
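
The URN grammar and the kind table above are enough to write a small parser. The sketch below is illustrative and is not the /v1/uri resolver itself: `parseUrn` is a hypothetical helper, and it assumes the owner scope is always a 64-character lowercase hex pubkey.

```javascript
// Kinds whose URN embeds the owner's hex pubkey, per the table above.
const OWNER_SCOPED = new Set([
  "pod", "envelope", "credential", "mandate", "receipt", "activity",
  "event", "bead", "agent", "memory", "dataset", "thing",
]);
// Kinds with no owner scope.
const UNSCOPED = new Set(["mcp", "skill", "adr", "prd", "ddd", "meta"]);

// Split urn:agentbox:<kind>:[<scope>:]<local> into its parts.
function parseUrn(urn) {
  const parts = String(urn).split(":");
  if (parts[0] !== "urn" || parts[1] !== "agentbox") {
    throw new Error("not an agentbox URN");
  }
  const kind = parts[2];
  if (OWNER_SCOPED.has(kind)) {
    const scope = parts[3] || "";
    const local = parts.slice(4).join(":");
    if (!/^[0-9a-f]{64}$/.test(scope) || local === "") {
      throw new Error(`malformed ${kind} URN`);
    }
    return { kind, scope, local };
  }
  if (UNSCOPED.has(kind)) {
    const local = parts.slice(3).join(":");
    if (local === "") throw new Error(`malformed ${kind} URN`);
    return { kind, scope: null, local };
  }
  throw new Error(`unknown kind: ${kind}`);
}

console.log(parseUrn("urn:agentbox:adr:ADR-013"));
// { kind: 'adr', scope: null, local: 'ADR-013' }
```

Keeping the two kind sets as explicit tables means an unrecognised kind is rejected rather than guessed at, which suits identifiers that feed an audit trail.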

Verifiable provenance and value transfer (substrate)

The solid-pod-rs 0.5.0-alpha.0 provenance release upgrades the pod backend into a global trust ledger for the agentic mesh. Two substrate capabilities are now available beneath the identity layer:

  • git-marks — every pod write lands as a commit, with a PROV-O sidecar recording who wrote what, when. Provenance is the default, not an afterthought.
  • block-trails — tamper-evident, hash-chained provenance trails. The cheap git-mark/hash-chain holds always; a Bitcoin (taproot) anchor is opt-in for high-value or disputed records, so traceability scales from free to settlement-grade per record.

This makes the owner-scoped agent URNs minted by management-api/lib/uris.js eligible to become trail states, cheap by default and Bitcoin-anchored on demand: urn:agentbox:activity (what an agent did), urn:agentbox:receipt (what it was paid for), and urn:agentbox:credential (what it was authorised to do). Value transfer across the mesh rides the same substrate: the sovereign, Bitcoin-settled (sats / Lightning, no EVM) 402 economy (PRD-015 / ADR-032) now settles through the pod's routed web-ledger / order-book / AMM with replay protection.

The substrate capability is available now via the pin. The deeper wiring — receipts carrying git_commit_sha + block-height trailers, and crossing those trailers across the host-graph boundary — is the next increment, tracked in economy-loop.md.
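
The phrase "tamper-evident, hash-chained" has a simple core that is easy to demonstrate. The sketch below shows the generic hash-chain technique only. It is not the solid-pod-rs block-trail format: the entry fields, the all-zero genesis value, and the function names are hypothetical, and Bitcoin anchoring is left out.

```javascript
const crypto = require("node:crypto");

const sha256 = (text) => crypto.createHash("sha256").update(text).digest("hex");
const GENESIS = "0".repeat(64); // hypothetical starting value

// Append a record. Each entry commits to its payload and to the previous hash.
function appendEntry(trail, payload) {
  const prev = trail.length === 0 ? GENESIS : trail[trail.length - 1].hash;
  const payloadHash = sha256(JSON.stringify(payload));
  const hash = sha256(`${prev}:${payloadHash}`);
  return [...trail, { prev, payload, payloadHash, hash }];
}

// Recompute every link. An edited payload or a reordered entry breaks the chain.
function verifyTrail(trail) {
  let prev = GENESIS;
  for (const entry of trail) {
    const payloadHash = sha256(JSON.stringify(entry.payload));
    if (entry.prev !== prev) return false;
    if (entry.payloadHash !== payloadHash) return false;
    if (entry.hash !== sha256(`${prev}:${payloadHash}`)) return false;
    prev = entry.hash;
  }
  return true;
}

let trail = [];
trail = appendEntry(trail, { action: "write", resource: "notes.ttl" });
trail = appendEntry(trail, { action: "write", resource: "plan.ttl" });
console.log(verifyTrail(trail)); // true
```

The hash of the last entry commits to the whole history, so publishing that single value somewhere hard to rewrite is enough to make later edits to any earlier record detectable.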

Deeper reading:


Federation Transports

Agentbox participates in all three DreamLab federation transport strata. Each stratum is independently enabled via agentbox.toml and .env configuration.

graph LR
    subgraph "This Agentbox"
        TS["Tailscale\nuserspace-networking"]
        NR["nostr-rs-relay\n:7777"]
        MA["management-api\n:9090"]
    end

    TS <-->|"WireGuard\nMagicDNS"| OTHER["Other Agentboxes\nsolid-pod-rs hosts"]
    NR <-->|"NIP-01 WS"| RELAY["Private/Public\nNostr Relays"]
    MA -->|"CF Tunnel\nHTTPS"| CF["Cloudflare Edge"]

Stratum 1 — Tailscale (Private Mesh)

Each agentbox container joins the tailnet with its own identity using --tun=userspace-networking (no /dev/net/tun needed). The container's MagicDNS hostname (configured via [networking].hostname in agentbox.toml) becomes the service discovery address for other mesh participants.

# agentbox.toml
[networking]
tailscale = true
hostname = "agentbox-london"

# .env
TAILSCALE_AUTHKEY=tskey-auth-...

Security: Tailscale runs inside the container, isolated from host networking. Tailscale ACLs control access — did:nostr signatures are not evaluated at this layer.

Stratum 2 — Nostr Relays (All Components)

The embedded nostr-rs-relay (:7777) serves as both a local event store and a mesh relay. Peer relays are configured in agentbox.toml:

[mesh]
peer_relays = [
    "ws://agentbox-paris.tailnet-name.ts.net:7777",   # Tailscale peer
    "wss://relay.damus.io",                             # Public relay
]

All relay traffic is authenticated via NIP-98/NIP-42 did:nostr Schnorr signatures. Private relays keep governance events (kinds 31400-31405) within the organisation. Public relays provide censorship-resistant message passing when private infrastructure is unavailable.

Stratum 3 — Cloudflare Tunnels (Edge ↔ Local)

A Cloudflare tunnel exposes the management API and solid-pod-rs to CF Workers services (nostr-rust-forum, dreamlab-ai-website) without opening ports to the public internet. Configure via:

# .env
CLOUDFLARE_TUNNEL_TOKEN=eyJ...
AGENTBOX_PUBLIC_URL=https://pods-native.dreamlab-ai.com

CF Workers reach the local agentbox through the tunnel for pod provisioning, resource access, and NIP-05 federated resolution.

See Tailscale guide · Mesh deployment · Identity mesh


Documentation

For operators

For sovereign data and linked data

For developers

Canonical specs


Platforms

| Target | Build | Run | Notes |
| --- | --- | --- | --- |
| Linux x86_64 | Native | Native | Full support, richest local feature set |
| Linux aarch64 | Native | Native | Supported, subject to feature-specific gates |
| macOS | Compose/dev tooling | Docker Desktop/OrbStack/Colima | CPU or remote-GPU paths |
| Windows | Compose/dev tooling | Docker Desktop + WSL2 | WSL2 is the practical path |
| Remote Linux | Native or registry | Native | OCI/Fly/Hetzner/bare workflows supported |

Contributing

  1. Read docs/developer/architecture.md.
  2. Validate the manifest before changing build or runtime behavior.
  3. Prefer manifest-gated additions over ad hoc runtime mutation.
  4. Treat hardening, probe semantics, URI grammar, and linked-data surfaces as architectural changes — propose them via an ADR.

Part of VisionFlow

Agentbox is the harness engineering substrate of the VisionFlow coordination platform — a federated architecture for human–AI intelligence built on did:nostr identity, OWL 2 EL reasoning, and Nostr message passing. agentbox runs the agents; VisionClaw renders the embodied agent loop; solid-pod-rs stores sovereignly; the forum and website provide governance and operator surfaces.

| Substrate | Repository | Role |
| --- | --- | --- |
| VisionFlow | DreamLab-AI/VisionFlow | Umbrella canon — ecosystem guide and coordination architecture |
| VisionClaw | DreamLab-AI/VisionClaw | Knowledge engineering — OWL 2 EL, 92 CUDA kernels, XR; renders the embodied agent loop |
| Agentbox | DreamLab-AI/agentbox | Harness engineering — Nix, 90+ skills, sovereign pods; runs the agents |
| solid-pod-rs | DreamLab-AI/solid-pod-rs | Cryptographic foundation — JSS Rust port, DID:Nostr |
| nostr-rust-forum | DreamLab-AI/nostr-rust-forum | Forum kit — passkey auth, governance events |
| dreamlab-ai-website | DreamLab-AI/dreamlab-ai-website | Branded deployment — React, WASM, Cloudflare Workers |

Deeper reading: Ecosystem integration guide


License

Core project: AGPL-3.0.

Using agentbox as a hosted service — including running it on behalf of other users — requires you to make the full source (including any modifications) available to those users. Self-hosted and internal use carry no additional obligations beyond the standard copyleft terms.

Optional components (linkedobjects/browser, solid-pod-rs) are also AGPL-3.0 and therefore consistent with the project license. Other bundled components are MIT or Apache-2.0. See Licensing details for the full matrix.

