The AI agent memory layer you can audit. Local-first memory governance for AI agents — every context item cited, every exclusion explained, every mutation reversible.
Most AI agent memory systems compete on recall — remember more, retrieve better. openclaw-mem competes on a different axis: governance. It captures agent activity as durable local records (SQLite + JSONL, no external database), then assembles bounded ContextPack bundles where every included memory carries a citation, every excluded memory carries a written reason, and every memory mutation ships with a rollback receipt.
Built sidecar-first for OpenClaw, usable with Claude, Codex, Gemini, and generic agent harnesses.
Not bigger memory — safer, explainable context.
Long-running agents don't just forget. Their memory degrades silently:
- Stale notes still match queries long after they stop being true.
- Untrusted or hostile content — tool output, scraped web text, injected instructions — retrieves well and slips into the prompt. This is the memory poisoning path of prompt injection, and similarity search alone cannot stop it.
- Context bloat: prompts swell into unbounded memory dumps nobody can review.
- No accountability: when the agent goes wrong, nothing explains why a memory was included.
Recall-focused memory layers make these failures more likely as they get better at retrieving. openclaw-mem adds the missing control layer: trust policies decide what may enter context, receipts prove why, and rollback undoes what shouldn't have happened.
pip install openclaw-context-pack
openclaw-mem --db /tmp/openclaw-mem-demo.sqlite status --jsonThe PyPI distribution remains openclaw-context-pack; it installs the openclaw-mem CLI plus openclaw-mem-mcp, openclaw-mem-channel-a, and openclaw-mem-hooks for agent integration.
Or run the reproducible trust-policy proof from the repo — no OpenClaw config, no real memory store, synthetic fixture only:
git clone https://github.com/phenomenoner/openclaw-mem.git
cd openclaw-mem
uv sync --locked
uv run --python 3.13 --frozen -- python benchmarks/trust_policy_synthetic_proof.py --jsonThe proof runs the same query twice against the same synthetic memory:
- Vanilla pack — retrieval without a trust policy → a quarantined row gets selected, because its text matches the query.
- Trust-aware pack (
--pack-trust-policy exclude_quarantined_fail_open) → the quarantined row is excluded, with an explicit receipt reason, while citation coverage stays intact.
The assertion block it must pass:
{
"synthetic_fixture_only": true,
"no_real_memory_paths_used": true,
"quarantined_removed": true,
"citation_coverage_preserved": true,
"trust_policy_explains_exclusion": true
}That is the product in one JSON object: same memory, same query — but governed context, with evidence. Details: trust-policy synthetic proof.
flowchart LR
A[Agent activity] -->|store / ingest| DB[(SQLite + JSONL<br/>local, durable)]
DB -->|"pack (trust policy + citations)"| CP["ContextPack v1<br/>bounded bundle + receipts"]
CP -->|inject| P[Agent prompt]
DB -->|search / timeline / get| O[Observe & debug]
CP -.->|"trace receipts: why included, why excluded"| O
- Store — capture, ingest, and query observations with
store/ingest/search. Records keep backend/action annotations and provenance. - Pack —
packemits a boundedbundle_text+context_pack(schema: openclaw-mem.context-pack.v1) with citations, trust-policy decisions, and trace receipts. - Observe —
timeline,get, and artifact outputs explain what happened, support debugging, and back rollback.
When the optional mem-engine is active, Proactive Pack extends the same contract into live turns as a small, receipt-backed pre-reply bundle.
| Capability | What you get |
|---|---|
| Trust-aware packing | Quarantined/untrusted records are excluded by policy, with written reasons in the receipt — a defense-in-depth layer against memory poisoning |
| Citations everywhere | Every packed item traces back to its source record; citation coverage is measured |
| Trace receipts | Include/exclude decisions are structured JSON, not vibes — auditable after the fact |
| Rollback | Memory and skill mutations go through plan → checkpoint → apply → receipt → rollback |
| Hybrid recall | SQLite FTS + vector search, with scopes and auditable policies |
| Temporal facts | "What is currently true about X" — source-linked assertions, timelines, conflict/staleness lint |
| Graph query plane | graph query for upstream/downstream/lineage over a SQLite-derived graph |
| Local-first | JSONL + SQLite. No cloud service, no external vector DB required, data stays on your machine |
Advanced opt-in labs (graph routing, GBrain sidecar, governed continuity, Dream Lite, Self Curator engine) stay out of the first evaluation path: Core vs Advanced Labs.
Honest framing: if you want maximum recall benchmarks, the projects below are excellent — and openclaw-mem is not trying to beat them at that game. It governs what enters your context window.
| Recall-focused memory layers (mem0, supermemory, mempalace, claude-mem, memory-lancedb-pro…) |
openclaw-mem |
|
|---|---|---|
| Primary question | "Did the agent remember the right thing?" | "Should this memory be trusted — and can you prove why it's in the prompt?" |
| Inclusion logic | Similarity / relevance scores (opaque) | Explicit receipts with include & exclude reasons |
| Untrusted content | Retrieves whenever it matches | Quarantined by trust policy; exclusion documented |
| Mistake recovery | Delete and hope | Checkpointed mutations with rollback receipts |
| Storage default | Vector DB, often cloud | SQLite + JSONL, local-first |
| Best at | Recall quality, token savings | Auditability, safety, explainability |
They are complementary: openclaw-mem already pushes bounded metadata to LanceDB via its writeback loop, and the long-term direction is governance-as-a-layer over whatever recall engine you prefer.
| Time | Path | Where |
|---|---|---|
| 5 min | pip CLI + synthetic proof | Evaluator path |
| 30 min | Sidecar install, real capture, first governed pack | Install modes |
| Afternoon | OpenClaw plugin / mem-engine promotion, MCP/Channel A/hooks integration for Codex/Claude/Gemini | MCP integration, Channel A, Lifecycle hooks |
Memory governance means treating an agent's memory like a supply chain with controls: provenance for every record, trust tiers for every source, explicit policy decisions about what may enter the context window, receipts documenting those decisions, and rollback when something was wrong. Recall answers "what matches?"; governance answers "what is allowed in, and why?"
Those projects optimize recall quality and token efficiency — and do it well. openclaw-mem optimizes auditability: citations, trust policies, trace receipts, and rollback are the core contract, not add-ons. See How it compares. You can use them together.
It is a defense-in-depth layer, not a silver bullet. Content from tools, web pages, and skills starts untrusted; trust policies keep quarantined records out of packs even when they match the query, and the receipt documents the exclusion. The synthetic proof demonstrates exactly this behavior, reproducibly.
No. The default stack is SQLite + JSONL on your own machine. Hybrid recall (FTS + vector) works locally. There is no hosted service and no telemetry.
Yes. openclaw-mem harness install writes a managed persistent-memory instruction card for Codex, Claude, Gemini, or a generic agent surface. OpenClaw is the first-class host, not a requirement.
A bounded, injectable bundle (openclaw-mem.context-pack.v1) containing the selected memory text plus structured metadata: citations for every item, trust-policy decisions, and a trace receipt explaining the selection. It is designed to be small, inspectable, and stable as a contract.
Retrieval is hybrid (FTS + vector) with scopes and policy-aware ranking — solid, but openclaw-mem does not currently publish comparative recall benchmarks, and the reality check is candid about what is and isn't measured. The differentiated value is governance; broader public benchmarks are on the roadmap.
Mutations flow through explicit plans, checkpoints, diffs, and receipts. self-curator rollback --receipt <apply-receipt.json> restores the previous state from the receipt. The same posture applies to engine adoption: promotion is a one-line slot switch, and so is retreat.
It is a young, actively developed project (v1.9.x, single maintainer, 800+ commits). Core Store/Pack/Observe and the trust-policy path are shipped and tested; advanced lanes are explicitly labeled labs. Start with the reality check — it tells you what is automatic, what is partial, and what is opt-in.
OpenClaw's native memory got noticeably stronger by 2026.4.15 — good for the ecosystem, and good for this project. openclaw-mem doesn't duplicate native recall; it builds an opinionated governance layer on top of a stronger foundation: better packs, clearer evidence, safer memory maintenance. More: why openclaw-mem still exists · 2026.4.15 comparison.
- Docs site: https://phenomenoner.github.io/openclaw-mem/
- Quickstart · Architecture · Context pack · Install modes
- MCP integration · Channel A file contract · Lifecycle hooks
- Evaluator path · Core vs Advanced Labs · Reality check
- Temporal facts · Optional Mem Engine · Self Curator (review-gated)
- Agent memory SOP · Deployment patterns · Product positioning
- 繁體中文文件: docs/zh/index.md
- Release notes · Changelog
Actively developed by a single maintainer; issues and reproducible bug reports are welcome. The roadmap favors small, receipt-backed, rollbackable releases over big-bang features — see docs/roadmap.md.
Dual-licensed under MIT OR Apache-2.0 — see LICENSE (MIT) and LICENSE-APACHE.