surroundapps/agent-provenance

Agent Provenance

Agent credentials and tamper-evident provenance replay — a reference implementation.

As AI agents take real actions, two questions stop being academic. First: what is this agent allowed to do, and who said so? Second: how do I know the record of what it did wasn't edited afterward? Agent Provenance is a small, runnable answer to both. An agent performs a task; every action it takes is authorized by a third-party-issued, revocable credential and sealed into a hash-chained, tamper-evident log. You can then replay the run, seeing each step, the credential that authorized it, and a cryptographic check proving the record is intact.

This is infrastructure, published as a reference. It is not a product.

It is built on a shared trust primitive, provenance-core — revocable verifiable credentials, a tamper-evident hash-chained ledger, and two-log reconciliation — vendored into this repo at core/. The same primitive underlies two sibling implementations: one for the good that moves (a checkpoint authority verifying provenance offline) and one for the person who presents (a privacy-preserving, revocable eligibility credential). This repository is the primitive applied to the agent that acts.

The boundary that matters. Agent Provenance is the credential and the receipt — the thing a zero-trust system consumes. It is not an access-control engine. It records and proves; it does not decide who-may-do-what at runtime. Everything here is about the record: issued, verifiable, revocable, and honest about what it can and cannot prove.

The 30-second story

  1. Run a document-verification agent. It walks five steps, presenting a capability credential for each and signing every action into the ledger.
  2. Replay the run: the chain verifies, and every step reads verified.
  3. Revoke the verify:signature credential (say the operator key was compromised). Replay again — the two steps that used it now read credential_since_revoked, while the chain still verifies. The record was never rewritten to hide the revocation. That is the whole thesis in one move: you can hold an agent accountable for a withdrawn authorization and still prove exactly what happened.
  4. Tamper with the record itself (a sandbox-only button) and the chain verification catches it, pointing at the exact entry where trust ends.
  5. Go offline (v0.1). A second node loses contact mid-task, keeps acting on a credential cached before the disconnection, and logs locally — while that credential is revoked on the issuer side. On reconnect, the two chains reconcile into one timeline that flags the actions taken in the dark on the since-revoked credential, while both records still verify. Revocation is a property of the connected world; this is what it looks like when an agent can't phone home.

Quickstart

Never used a terminal? Follow the step-by-step guide in RUNNING.md instead; it assumes no prior experience and takes about five minutes.

make dev          # editable install + test deps
make test         # 40 tests here (core ships its own 29), no services required
make run          # serve on http://localhost:8000
# in another terminal:
make demo         # drive the full flow above via curl

Then open http://localhost:8000/ for the replay UI — a guided walkthrough leads you through each step. Run the agent, watch the timeline build, revoke a credential and watch the affected steps flip while the record stays intact, and tamper with a step to watch the chain break. Below that, the offline scenario runs the two-node, revoke-during-blackout case and renders the disconnected region on a merged timeline. Or drive it from the terminal:

curl -X POST localhost:8000/run                 # returns run_id + issued credentials
curl localhost:8000/log/<run_id>                # signed log + per-entry replay status
curl -X POST localhost:8000/verify -H 'content-type: application/json' \
     -d '{"run_id":"<run_id>"}'                 # verify the chain
curl -X POST localhost:8000/revoke -H 'content-type: application/json' \
     -d '{"credential_id":"<id>"}'              # forward-acting revocation
curl -X POST localhost:8000/scenario/offline    # v0.1: stage + reconcile the blackout
curl localhost:8000/reconcile/<scenario_id>     # v0.1: re-reconcile vs current revocations
curl localhost:8000/api/info                     # service info as JSON
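The same calls can be scripted. The following is a minimal sketch using only the Python standard library, not part of the repository. The endpoint paths and the `run_id` request field come from the curl examples above; that the `/run` response is a JSON object with a `run_id` key is an assumption, and `build_request`, `call`, and `run_and_verify` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # where `make run` serves, per the Quickstart


def build_request(method, path, payload=None):
    """Build (but do not send) a request; JSON-encode the payload if given."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["content-type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(method, path, payload=None):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(method, path, payload)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_and_verify():
    """Start a run, then fetch its log and verify its chain."""
    run = call("POST", "/run")
    run_id = run["run_id"]  # assumed response field name
    log = call("GET", f"/log/{run_id}")
    verdict = call("POST", "/verify", {"run_id": run_id})
    return log, verdict
```

With the server running, `run_and_verify()` returns the decoded log and verification responses; their exact shape is whatever the service returns and is not assumed here.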

How it works

Credentials (provenance_core.credentials) follow the W3C Verifiable Credentials data model and are signed with Ed25519. Each grants one named capability (verify:signature, read:document, …). Three properties make a credential more than an assertion, and all three are implemented: it is issued by a key the holder does not control, independently verifiable offline against the issuer's public key, and revocable. Credentials are verified at the moment of action, not at deploy time — validity is re-decided per action, never assumed.
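To make "verified at the moment of action" concrete, here is a minimal sketch of a per-action check. It is not the `provenance_core.credentials` API: the function name, the result labels, and the `capability` field under `credentialSubject` are assumptions for illustration. Because the standard library has no Ed25519, the signature check is passed in as a callable and stubbed in the example.

```python
def verify_at_action(credential, capability, revoked_ids, verify_signature):
    """Re-check one credential for one action and return a result label.

    The labels are hypothetical; this reports a finding, it is not a gate.
    """
    if not verify_signature(credential):
        return "bad_signature"
    if credential["id"] in revoked_ids:
        return "revoked"
    if credential["credentialSubject"].get("capability") != capability:
        return "capability_mismatch"
    return "verified"


# A credential loosely shaped like a W3C Verifiable Credential.
# Identifiers and the proof value are placeholders.
credential = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "id": "urn:example:credential:1",
    "type": ["VerifiableCredential"],
    "issuer": "did:key:issuer-placeholder",
    "credentialSubject": {"id": "did:key:agent-placeholder", "capability": "verify:signature"},
    "proof": {"proofValue": "signature-placeholder"},
}

revoked_ids = set()


def signature_ok(cred):
    # Stand-in for a real Ed25519 check against the issuer's public key.
    return True


before = verify_at_action(credential, "verify:signature", revoked_ids, signature_ok)
revoked_ids.add(credential["id"])  # the issuer revokes the credential
after = verify_at_action(credential, "verify:signature", revoked_ids, signature_ok)
```

The same credential yields `"verified"` before the revocation and `"revoked"` after it, because nothing about its standing is cached between actions.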

The ledger (provenance_core.ledger) is an append-only hash chain. Each entry carries the previous entry's hash; altering any past entry breaks every hash after it. This gives tamper-evidence, not immutability — anyone with write access can still edit a row, but not without the chain catching it. (See core ADR-0002.)
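The chaining idea fits in a few lines. This is a minimal sketch, not the `provenance_core.ledger` implementation: the entry layout, the all-zero genesis value, SHA-256, and the sorted-key JSON canonicalization are assumptions, and signatures are left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def canonical_bytes(obj):
    """Deterministic JSON bytes: sorted keys, no optional whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def entry_hash(prev_hash, payload):
    """Hash a payload together with the hash of the entry before it."""
    return hashlib.sha256(canonical_bytes({"prev": prev_hash, "payload": payload})).hexdigest()


def append(chain, payload):
    """Append a payload, linking it to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "payload": payload, "hash": entry_hash(prev_hash, payload)})


def first_broken_index(chain):
    """Return the index of the first entry that fails verification, or None."""
    prev_hash = GENESIS
    for i, entry in enumerate(chain):
        if entry["prev"] != prev_hash or entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return i
        prev_hash = entry["hash"]
    return None


chain = []
for step, action in enumerate(["read:document", "verify:signature", "read:document"]):
    append(chain, {"step": step, "action": action})

intact = first_broken_index(chain)            # no entry fails
chain[1]["payload"]["action"] = "edited"      # edit a past row in place
broken_at = first_broken_index(chain)         # verification stops at the edited entry
```

The edit itself succeeds, which is the point of "tamper-evidence, not immutability": the row changes, and verification then reports the entry where trust ends.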

Replay status is computed on two independent axes — chain integrity and current credential standing — so a forged record and a genuine-but-revoked authorization are reported as the different things they are. (See ADR-0003.)
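One way to keep the two axes separate is to carry them as two fields and derive a display label only at the end. This is a sketch under assumptions: `verified` and `credential_since_revoked` are the labels this README uses, while `tampered` and the `ReplayStatus` type are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplayStatus:
    chain_intact: bool        # axis 1: does the hash chain still verify?
    credential_revoked: bool  # axis 2: is the authorizing credential revoked now?

    @property
    def label(self):
        """Collapse both axes into one display label without losing either field."""
        if not self.chain_intact:
            return "tampered"  # hypothetical label for a broken chain
        if self.credential_revoked:
            return "credential_since_revoked"
        return "verified"


clean = ReplayStatus(chain_intact=True, credential_revoked=False)
revoked = ReplayStatus(chain_intact=True, credential_revoked=True)
forged = ReplayStatus(chain_intact=False, credential_revoked=False)
```

A revoked-but-genuine entry and a forged entry end up with different labels, and the underlying booleans stay available for anything that needs both answers.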

The agent (agent_provenance/agent.py) performs a deliberately shallow document-verification task. The verification logic is set dressing — the credentials, the signed log, and the replay are the actual subject. It runs deterministically with no LLM dependency so the demo is reproducible; there is a marked seam where a model call would slot in.

Reconciliation (provenance_core.reconcile, staged by agent_provenance/offline.py) is the v0.1 addition. Two nodes each grow their own hash chain; one goes offline mid-task and keeps acting on a cached credential that is revoked, on the issuer side, during the blackout. reconcile verifies both chains independently, merges them into one deterministically-ordered timeline, and refines the two-axis status above into four cases — the headline being acted_during_blackout_on_revoked: an action that was already unauthorized when taken, by a node that could not have known. It is surfaced, not laundered, and both source records still verify. The merge re-signs nothing and rewrites nothing — it is a read-time projection over two real chains. (See core ADR-0003.)
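The merge can be sketched as a pure function over two entry lists. This is not the `provenance_core.reconcile` API: the `Entry` fields, the sort key, and the function signature are assumptions. Chain verification of each log is omitted, timestamps are assumed comparable across nodes (clock-skew handling is out of scope), and only the three status labels named in this README appear. The sketch also assumes the connected node never acts after a revocation, since it checks credentials at action time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    node: str           # which node logged the action
    seq: int            # position in that node's own chain
    timestamp: float    # assumed comparable across nodes
    credential_id: str
    action: str


def reconcile(online, offline, revoked_at, blackout_start):
    """Merge two logs into one ordered timeline of (entry, status) pairs.

    revoked_at maps credential_id -> revocation time. Neither input list
    is modified; the result is a read-time projection.
    """
    # Deterministic order: the same inputs always give the same timeline.
    merged = sorted(online + offline, key=lambda e: (e.timestamp, e.node, e.seq))
    offline_nodes = {e.node for e in offline}
    timeline = []
    for e in merged:
        revoked_time = revoked_at.get(e.credential_id)
        in_the_dark = e.node in offline_nodes and e.timestamp >= blackout_start
        if revoked_time is None:
            status = "verified"
        elif in_the_dark and e.timestamp >= revoked_time:
            status = "acted_during_blackout_on_revoked"
        else:
            status = "credential_since_revoked"
        timeline.append((e, status))
    return timeline


online = [
    Entry("node-a", 0, 1.0, "cred-read", "read:document"),
    Entry("node-a", 1, 2.0, "cred-read", "read:document"),
]
offline = [
    Entry("node-b", 0, 1.5, "cred-sig", "verify:signature"),  # before the blackout
    Entry("node-b", 1, 4.0, "cred-sig", "verify:signature"),  # in the dark, after revocation
]
timeline = reconcile(online, offline, revoked_at={"cred-sig": 3.0}, blackout_start=2.5)
statuses = [status for _, status in timeline]
```

In this example the offline node's second action is the only one flagged as taken in the dark on a revoked credential; its earlier action used the same credential while it was still valid and is reported as since revoked.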

POST /run  ──▶  agent runs, signs each step  ──▶  hash-chained ledger
                          │                                │
GET /events/{id} ◀────────┘ (SSE replay)                   │
GET /log/{id} ◀────────────────────────────────────────────┘  (+ replay status)
POST /verify   credential signature + revocation, and/or chain integrity
POST /revoke   forward-acting; never edits the record

POST /scenario/offline ──▶ two nodes, one goes dark, credential revoked in the gap
                                          │
GET /reconcile/{id} ◀─────────────────────┘  merge two chains → one timeline,
                                              surfacing the blackout actions

Scope

v0 — the single-node slice: single agent, single log, single issuer, all online; the full issue → act → sign → replay → revoke loop; tamper-evidence.

v0.1 — offline operation and reconciliation: a second agent goes offline mid-run, keeps acting on a credential it cached before losing contact, and logs locally. While it is offline, that credential is revoked on the issuer side. On reconnect, its local hash chain and the online node's chain are reconciled into one verifiable timeline that surfaces the blackout actions which ran on the since-revoked credential — rather than letting the gap launder them. This is the genuinely hard case (revocation is a property of the connected world; it breaks exactly when a node can't phone home), and it is the beat a connected-only demo can never show. (See core ADR-0003.)

The boundary v0.1 holds, on purpose. Reconciliation here is a deterministic merge of exactly two logs, not distributed consensus: two nodes and one disconnection, full stop. No N-node generalization, no consensus protocol (Raft/Paxos/BFT, leader election, Byzantine tolerance), no CRDT machinery. There are no conflicting writes to arbitrate — each node appends to its own chain, and the two chains are disjoint sequences of distinct actions, not competing edits to shared state. The merged timeline is a read-time projection over two real, independently-verifiable chains; it re-signs nothing and rewrites nothing.

v0.2 — forensic inspection (this release): the verifier already recomputed every hash and checked every signature, but the API and UI only ever exposed the verdict — a status label and a truncated hash — which asks to be trusted, the one thing a trust demonstration must not do. v0.2 exposes the intermediate evidence the verifier already computes and previously discarded: the exact signed bytes, the recomputed hash next to the stored hash character-for-character so a mismatch is visible rather than asserted, and field-level before/after on a tampered entry. It adds nothing to the trust model and makes no trust decision of its own — every value is read from the store or recomputed by calling the same provenance_core functions the verifier uses. Where an original cannot be reconstructed it says so rather than inventing a plausible "before." (See docs/DESIGN.md.)
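The kind of evidence described above can be sketched as a single inspection function. This is an illustration, not the v0.2 API: the function name, the report keys, and the hashing of the payload alone (a real chained entry would also cover the previous hash) are assumptions.

```python
import hashlib
import json


def canonical_bytes(obj):
    """Deterministic JSON bytes: sorted keys, no optional whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def inspect_entry(stored_payload, stored_hash, original_payload=None):
    """Return the intermediate evidence instead of only a verdict."""
    signed_bytes = canonical_bytes(stored_payload)
    recomputed = hashlib.sha256(signed_bytes).hexdigest()
    if original_payload is None:
        # Say so rather than inventing a plausible "before".
        field_diff = "original unavailable"
    else:
        keys = sorted(set(original_payload) | set(stored_payload))
        field_diff = {
            k: {"before": original_payload.get(k), "after": stored_payload.get(k)}
            for k in keys
            if original_payload.get(k) != stored_payload.get(k)
        }
    return {
        "signed_bytes": signed_bytes.decode("utf-8"),
        "stored_hash": stored_hash,
        "recomputed_hash": recomputed,
        "match": recomputed == stored_hash,
        "field_diff": field_diff,
    }


original = {"action": "verify:signature", "result": "ok"}
stored_hash = hashlib.sha256(canonical_bytes(original)).hexdigest()
tampered = dict(original, result="failed")  # the row as it now sits in the store

report = inspect_entry(tampered, stored_hash, original)
report_without_original = inspect_entry(tampered, stored_hash)
```

The report puts `stored_hash` and `recomputed_hash` side by side so the mismatch can be read directly, and the field-level diff names the one field that changed.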

Deliberately out of scope: sophisticated verification logic, multi-issuer trust hierarchies, key rotation, DID resolution, durable WORM storage, bounded clock-skew handling, a real transport for exchanging chains on reconnect, and — the hard line — anything resembling an access-control gate. Agent Provenance is the receipt, not the gate.

Project layout

core/                          provenance-core, vendored via git subtree
  provenance_core/
    canonical.py     deterministic JSON bytes for signing + hashing
    keys.py          Ed25519 keys, signatures, did:key derivation
    credentials.py   W3C VC issue / verify / revocation check (subject-type generic)
    ledger.py        append-only hash chain + verification
    reconcile.py     deterministic two-log merge + blackout findings
  docs/adr/          core decisions: canonicalization, ledger, reconciliation
  tests/             29 tests — the primitive in isolation, no services
agent_provenance/
  offline.py       v0.1: the two-node, revoke-during-blackout scenario
  storage.py       SQLite persistence
  agent.py         the document-verification fixture agent
  app.py           FastAPI — the endpoints + replay status, serves the UI
  fixtures/        static demo documents
web/               single-page replay UI (vanilla JS, no build step)
  index.html       markup, including the guided walkthrough
  styles.css       house design system, two themes
  app.js           SSE streaming, revoke/tamper replays, reconciliation, walkthrough
tests/             40 tests — the agent, the API, offline, forensics, and pinned invariants
  test_invariants.py  the trust guarantees, pinned as named regression checks
docs/
  DESIGN.md        architecture narrative, threat model, "why not a blockchain"
  adr/             ADR-0003: two-axis replay status (this implementation's own)
RUNNING.md         step-by-step run guide for a non-technical first-time user
scripts/demo.sh    terminal walkthrough of the full flow (online + offline)

The five trust primitives live in core/ (provenance-core) because they are domain-neutral; this repository builds the agent implementation on top of them. core/ is vendored with git subtree, so git clone of this repo is self-contained — there is no submodule init step, and make install installs the core from the local path before this package.

Documentation

  • RUNNING.md — step by step, assumes no terminal experience.
  • docs/DESIGN.md — the architecture narrative: what it solves, the threat model (what it defends and what it does not), the scope discipline, and a direct treatment of why this is a hash chain and not a blockchain — and exactly where a private or public ledger anchor would add trust.
  • docs/adr/ — the four architecture decision records, the authoritative per-decision detail.
  • tests/test_invariants.py — the trust guarantees the system promises, written as named, runnable regression checks (a forgery and a revocation are reported distinctly; revocation never mutates the record; reconciliation never rewrites a source chain; …).

Where this sits

The agent-accountability space consolidated through late 2025 and into 2026: the Microsoft Agent Governance Toolkit ships Ed25519 agent identity with cascade revocation; the NIST AI Agent Standards Initiative is standardizing the shape of agent identity and logging; the EU AI Act's logging obligations begin to bite from August 2026. In US banking, the April 2026 interagency revision of model-risk guidance (superseding SR 11-7) takes the opposite tack and places generative and agentic AI explicitly out of scope as too novel to govern prescriptively, leaving institutions to extend their own controls to exactly the systems this primitive is about. This reference deliberately implements the primitive directly rather than wiring any one vendor's SDK, for two reasons. First, the point is to make the mechanism legible (the credential, the signed record, the merge), not to demonstrate integration. Second, a lean artifact (git clone, then make run, with no external services) is a clearer signal than a dependency tree.

Where the artifact earns its keep is the edge the consolidated stacks mostly don't dramatize: clean, connected revocation is well covered; revocation that happens while a node is disconnected — and is surfaced honestly on reconnect rather than hidden — is the harder, more interesting case, and it is the one v0.1 builds. The primitive is horizontal on purpose; an agent asserting a compliance fact about goods crossing a border is one illustration among several (CI/CD attestation, instrument-data integrity, and financial reconciliation are others), not the thing the tool is about.

The family

Four repositories, one shared trust primitive — the same credential, ledger, and verification logic applied to each thing that shows up at a checkpoint:

  • provenance-core: the shared primitive. Revocable verifiable credentials, a tamper-evident hash-chained ledger, two-log reconciliation. Vendored into each implementation.
  • agent-provenance (this repository): the agent that acts. Every action authorized by a revocable credential and sealed into a tamper-evident log, with offline reconciliation and forensic replay.
  • border-authority: the good that moves. A checkpoint that verifies provenance offline and stays honest under a committed posture when the issuer is unreachable.
  • human-credential: the person who presents. Prove one attribute, reveal nothing else, consent by signing, checked at use time.

License

Apache-2.0. Copyright 2026 SurroundApps, Inc. Author: Zeeshan Khan.

About

An AI agent whose every action is authorized by a revocable credential and sealed into a tamper-evident log — and that reconciles actions taken offline on a credential revoked while it couldn't phone home. With a forensics view that shows the recomputed bytes, not just a verdict. A reference implementation on provenance-core.
