HiveMem

Your second brain — and it stays yours. Forever. Local.

A sovereign personal knowledge system. The conversations, decisions, documents, and half-formed thoughts you produce across Claude, ChatGPT, Gemini, Copilot — and the files you accumulate in real life — all come home to one place that outlives any vendor and obeys only you.

Why HiveMem exists

When you think hard today, you often think with an LLM in the loop. School, work, authorities, court cases, taxes, family, health, relationships — these conversations contain your most private thinking. More intimate than any diary.

And then they evaporate:

Your subscription lapses or you switch providers → history gone
The provider retires a model or rewrites their ToS → answers no longer reproducible
An account ban, a provider going under, a country blocking the service → everything lost
The data sits on a vendor's servers, fed into training, served on subpoena, exposed in the next breach

HiveMem is built around the opposite stance:

Sovereignty — Your data lives in your instance. Postgres + SeaweedFS, on hardware you control. No vendor sees the contents unless you explicitly route a single LLM call through them.
Persistence — Everything is append-only with valid_from/valid_until. No subscription change can revoke access. No retention policy you didn't author can delete what's yours.
Portability — A HiveMem instance packs into one encrypted archive (Postgres dump + binary store + config) and restores anywhere. Vendor lock-in: zero.
Aggregation — What you write in Claude.ai, ChatGPT, Gemini, Claude Code, Copilot lands in HiveMem too. Those tools become front-ends; HiveMem holds the truth.
Privacy by realm — Strict separation per life area (legal, medical, private, work). Per-realm routing rules: anything touching authorities or health stays on local models, never reaches a cloud provider.
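The Persistence point above can be illustrated with a small in-memory sketch. This is not HiveMem's schema or code: the `Fact` and `FactLog` names and the `subject`/`value` fields are hypothetical, and only `valid_from`/`valid_until` come from the text. It assumes that superseding a fact closes the old version's validity interval and appends a new row, so nothing is ever removed.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Fact:
    # Hypothetical record shape; only valid_from/valid_until are named in the README.
    subject: str
    value: str
    valid_from: datetime
    valid_until: Optional[datetime] = None  # None means "still valid"


class FactLog:
    """Append-only log: rows are never removed, only closed and superseded."""

    def __init__(self) -> None:
        self._rows: list[Fact] = []

    def assert_fact(self, subject: str, value: str, at: datetime) -> None:
        # The only change to an existing row is setting valid_until
        # on the version that is currently open for this subject.
        for i, row in enumerate(self._rows):
            if row.subject == subject and row.valid_until is None:
                self._rows[i] = replace(row, valid_until=at)
        self._rows.append(Fact(subject, value, valid_from=at))

    def as_of(self, subject: str, at: datetime) -> Optional[str]:
        # Return the value whose interval [valid_from, valid_until) contains `at`.
        for row in self._rows:
            if row.subject != subject:
                continue
            if row.valid_from <= at and (row.valid_until is None or at < row.valid_until):
                return row.value
        return None

    def history(self, subject: str) -> list[Fact]:
        # Every version ever recorded for the subject, oldest first.
        return [row for row in self._rows if row.subject == subject]
```

Asking `as_of` for a date between two assertions returns the older value, which is the "query at any point in time" behavior in miniature; `history` shows that the superseded version is still there.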

Knowledge doesn't rot here

The long-term goal is a periodic agent — the Queen — that wakes on a schedule, surveys your knowledge, and dispatches specialized worker agents (Bees) to flag isolated cells, stale facts, duplicate candidates, and realms drifting from their blueprint. Everything risky stays a proposal that flows through the existing approval workflow; you keep the kill switch.

Today the Queen and the isolated-cell Bee already run on the Vistierie agent runtime — scheduled (cron), dispatched with per-run cost accounting and a per-tenant kill switch — and their proposals flow through the approval workflow as pending tunnels. An admin-only Queen log UI (/queen) shows run history, event timelines, and the proposal queue. Still to come: a conversation UI that teaches the Queen your per-realm preferences, and further Bee types (stale-fact, duplicate-cell, blueprint-drift).
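As a rough illustration of what an isolated-cell Bee might do, the sketch below treats a cell as isolated when it is not an endpoint of any tunnel. That definition is an assumption, as are the `TunnelProposal` shape and both function names; the only detail taken from the text is that proposals arrive as `pending` and wait for approval.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelProposal:
    # Hypothetical proposal shape; "pending" mirrors the status named in the README.
    source_cell: str
    target_cell: str
    status: str = "pending"


def find_isolated_cells(cells: set[str], tunnels: list[tuple[str, str]]) -> list[str]:
    """Return cells that are not an endpoint of any tunnel (assumed meaning of 'isolated')."""
    connected = {cell for tunnel in tunnels for cell in tunnel}
    return sorted(cells - connected)


def propose_tunnels(isolated: list[str], candidates: dict[str, str]) -> list[TunnelProposal]:
    """Turn candidate links into pending proposals; nothing is created as approved."""
    return [
        TunnelProposal(cell, candidates[cell])
        for cell in isolated
        if cell in candidates
    ]
```

How candidate targets are chosen is left out on purpose; in this sketch they are passed in as a plain mapping.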

→ Roadmap — what's planned, what's partial, and the order of work.

→ Scientific foundations — the cognitive-science and PKM theory HiveMem's design is built on (Working Memory, Cognitive Load, Extended Mind, Forgetting Curve, Zettelkasten, PARA).

Docker images: ghcr.io/visterion/hivemem:main for the rolling main branch, plus semver tags such as ghcr.io/visterion/hivemem:9.1.5 for cut releases.

Highlights

6-Signal Ranked Search — Semantic similarity, keyword, recency, importance, popularity, and graph proximity — combined into one ranked result.
Temporal Knowledge Graph — Facts with valid_from/valid_until and multi-hop graph traversal; query the graph as it stood at any point in time.
Progressive Summarization — Four layers per cell: content, summary, key points, and insight. Never lose nuance.
Document & Scan Pipeline — the end-to-end picture of how any file becomes searchable knowledge: two entry points (watched folder + REST upload), one shared ingest core (hash → parse → dedup → store → cell), and four async enrichment paths (OCR · Vision · Kroki · Summarizer). The map that ties the doc features below together.
Long cells stay searchable — auto-summarizer turns multi-page documents into curated summaries that are embedded for semantic search; cost-capped, opt-in.
Scanned PDFs become searchable — Tesseract OCR extracts text from scan-only PDFs; combined with the auto-summarizer, even paper-mailed documents are findable by semantic search.
Consumption folder — auto document separation — Drop a stack of mixed scans into a network folder (SMB); HiveMem OCRs each page and uses a Vistierie LLM agent to split multi-page batches into individual documents by content — no separator or barcode sheets. The USP over Paperless-ngx; live in production.
Document-Type Extraction — invoices, contracts, and other typed documents are auto-classified during summarization; typed facts (vendor, amount, parties, dates) land in the knowledge graph.
Kroki + Vision — Diagram thumbnails (Mermaid/PlantUML/Graphviz/D2) and image description via Claude Haiku — async, opt-in, budget-capped.
Append-Only Versioning + Time Machine — No data is ever deleted. Query your knowledge at any point in time.
Agent Fleet + Approval Workflow — Agents write pending suggestions; only admins approve. Every write is human-gated.
Auto-Inject Hook for Claude Code — Relevant memories injected into every session automatically, before you even ask.
Full instance portability — Export the entire HiveMem instance (Postgres + attachments + identity) into one tar.gz, restore it on another host with one command. Mission promise made provable.
Bilingual UI (German/English, German-first) with a backend-configured default language.
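To make the idea of combining six signals into one ranked result concrete, here is a minimal weighted-sum sketch. The README describes a single SQL ranker and does not state how signals are weighted, so the weights, the `Candidate` class, and the assumption that each signal is normalized to [0, 1] are all illustrative rather than HiveMem's actual method.

```python
from dataclasses import dataclass, field

# Assumed weights for illustration only; they sum to 1.0.
WEIGHTS = {
    "semantic": 0.35,
    "keyword": 0.20,
    "recency": 0.15,
    "importance": 0.10,
    "popularity": 0.10,
    "graph_proximity": 0.10,
}


@dataclass
class Candidate:
    # Hypothetical shape: signal name -> score, assumed normalized to [0, 1].
    cell_id: str
    signals: dict = field(default_factory=dict)


def combined_score(candidate, weights=WEIGHTS):
    # Missing signals count as 0 so partially scored cells still rank.
    return sum(w * candidate.signals.get(name, 0.0) for name, w in weights.items())


def rank(candidates, weights=WEIGHTS):
    # Highest combined score first.
    return sorted(candidates, key=lambda c: combined_score(c, weights), reverse=True)
```

A weighted sum is the simplest way to merge heterogeneous signals; other schemes, such as rank fusion, would slot into `combined_score` without changing `rank`.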

→ Get started

Feature Status

Honest snapshot of what is shipping today versus what the surrounding prose describes as the long-term shape. See the roadmap for details on every 🟡 / 🔴 row.

Feature	Status	Notes
6-Signal Ranked Search	✅ Stable	semantic + keyword + recency + importance + popularity + graph proximity, all wired into one SQL ranker
Progressive Summarization	✅ Stable	content / summary / key points / insight, all four populated automatically
Auto-Summarizer for long cells	✅ Stable	summary is embedded for semantic search, cost-capped per realm
OCR for scanned PDFs	✅ Stable	Tesseract, async backfill, Vision fallback
Document-Type Extraction	✅ Stable	invoices/contracts/etc → typed facts in the knowledge graph
Kroki + Vision	✅ Stable	diagram thumbnails + Claude Haiku image description, opt-in, budget-capped
Append-Only Versioning + Time Machine	✅ Stable	`time_machine` queries by event time and ingestion time
Agent Approval Workflow	✅ Stable	every agent write lands as `pending` until an admin approves
Auto-Inject Hook (Claude Code)	✅ Stable	6-stage filter pipeline, Bearer-token auth
Full Instance Portability	✅ Stable	one-command tar.gz of Postgres + attachments + identity
OAuth Custom Connector	✅ Stable	RFC 8414 / 9728 discovery, PKCE
Temporal Knowledge Graph	🟡 Partial	bi-temporal facts and multi-hop traversal ship; automatic contradiction detection is not yet implemented
Privacy by Realm — model routing	🟡 Partial	data segregation by realm works; per-realm enforcement of "stays on local models" is not yet wired into the LLM call path
Queen + Bees periodic agent	🟡 Partial	Queen + isolated-cell-Bee run on Vistierie's agent runtime (cron, subagent dispatch, run/cost audit, kill switch); proposals land as `pending` tunnels via the approval workflow. An admin-only Queen-log UI (`/queen`) shows runs + event timelines and the proposal approval queue. Still missing: preference UI, further Bee types.
Consumption folder — auto document separation	✅ Stable	Drop a stack of mixed scans into a network folder; HiveMem ingests off a bounded worker pool, OCRs each page (auto-oriented), and uses a Vistierie LLM agent to split by content — no separator/barcode sheets. High-confidence splits → `committed`, low-confidence → `pending`. The HiveMem→Vistierie run contract is reconciled; live in production. Reassembly of non-contiguous/shuffled pages is a separate roadmap item.
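The committed-versus-pending rule in the consumption-folder row can be sketched as a simple threshold check. The table only says "high-confidence" and "low-confidence", so the 0.9 cut-off, the `SplitProposal` shape, and the function names below are assumptions for illustration.

```python
from dataclasses import dataclass

# Assumed cut-off; the README does not give a number.
CONFIDENCE_THRESHOLD = 0.9


@dataclass(frozen=True)
class SplitProposal:
    # Hypothetical shape: an inclusive page range plus the agent's confidence in [0, 1].
    first_page: int
    last_page: int
    confidence: float


def status_for(split, threshold=CONFIDENCE_THRESHOLD):
    # High-confidence splits are committed directly; the rest wait for review.
    return "committed" if split.confidence >= threshold else "pending"


def route(splits, threshold=CONFIDENCE_THRESHOLD):
    # Group proposed splits by the status they would receive.
    routed = {"committed": [], "pending": []}
    for split in splits:
        routed[status_for(split, threshold)].append(split)
    return routed
```

Keeping the threshold as a parameter makes it easy to be stricter for sensitive realms, though whether HiveMem exposes such a setting is not stated here.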

Documentation


Vision	Cognitive-science and PKM foundations behind HiveMem's design
Getting Started	Prerequisites, embedding service, token creation, connect to Claude
The Structure	Realms, signals, topics, cells, tunnels — the knowledge hierarchy
Architecture	System diagram, data model, security matrix
Tools	All 46 MCP tools, the parallel REST attachment API, search signals, progressive summarization
Authentication	Roles, token management, security details
OAuth + Custom Connector	Add HiveMem as a Claude.ai/ChatGPT Custom Connector
Backup + Portability	Export and restore entire instances, disaster recovery, cloning
Hook Integration	Auto-inject context into Claude Code sessions
Operations	Deployment, migrations, debugging
Roadmap	What's planned, what's partial, order of work
Document & Scan Pipeline	End-to-end overview: entry points, shared ingest core, the four enrichment paths
Consumption Folder	Scan-to-folder ingest, automatic content-based document separation, config reference

License

HiveMem is fair-code licensed under the Sustainable Use License. Free for personal and internal business use. See LICENSING.md for details.
