peopleworks/codeboarding-mcp

🗺️ codeboarding-mcp

Living architecture docs for your codebase, as an MCP server.

Detect when your architecture actually changed, cheaply and with no LLM call, and regenerate the map only then.

.NET 9 · MCP · Built on CodeBoarding · License: MIT · PRs Welcome


The problem

A coding agent or RAG pipeline is only as good as its mental model of your codebase, and architecture docs go stale the moment they're written. That leaves two bad options:

  • 🔥 Regenerate the map on every commit → you pay for an LLM run thousands of times, mostly to re-discover that nothing structural changed.
  • 🧊 Never regenerate → the agent reasons about a codebase that no longer exists.

CodeBoarding generates beautiful architecture maps (markdown + mermaid) and ships a CLI, a GitHub Action, and a VS Code extension, but it has no MCP server and no way to know when a remap is even worth it.

The solution

codeboarding-mcp wraps the CodeBoarding CLI in a Model Context Protocol server and adds the missing piece: a cheap, no-LLM drift detector. It keeps a tiny fingerprint of each repo's architectural surface and only triggers an expensive remap when that surface shifts past a threshold you control.

   ┌─────────────┐   cheap, no LLM    ┌──────────────┐   expensive, only if stale   ┌───────────┐
   │   status    │ ─────────────────▶ │     map      │ ───────────────────────────▶ │    get    │
   │ drift score │   "is it stale?"   │  regenerate  │   runs CodeBoarding + LLM    │  read map │
   └─────────────┘                    └──────────────┘                              └───────────┘

The payoff: architecture documentation that stays fresh on its own, at near-zero cost when nothing important changed, and a full remap exactly when something does. Point any MCP host (Claude Code, Codex, …) at it and the analysis runs locally; with Ollama, your code never leaves the machine.


✨ Highlights

  • 🧠 No-LLM drift detection: a Roslyn-based fingerprint of your public API surface. Comment and method-body edits don't move it; signature changes do.
  • 💸 Pay only when it matters: remap when drift crosses your threshold, not on every commit.
  • 🔌 Provider-agnostic: local Ollama (private code), Anthropic (top quality), or any OpenAI-compatible endpoint (DeepSeek, OpenRouter, LiteLLM). Configured per repo and persisted; keys are never stored.
  • 🔒 Single-provider isolation: scrubs every provider env var from the child process, then sets exactly one, so CodeBoarding never errors on an ambiguous environment.
  • 🧩 Composable: 4 small tools any agent can call mid-conversation. Stdout is sacred (MCP protocol); all logs go to stderr.
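
The stdout rule in the last bullet matters because a stdio MCP host reads protocol messages from the server's stdout, so any stray print lands in that stream. A minimal sketch of the idea in Python (the server itself is .NET; the logger name and the `make_stderr_logger` helper are hypothetical, for illustration only):

```python
import logging
import sys

def make_stderr_logger(name: str = "codeboarding-mcp") -> logging.Logger:
    """Return a logger that writes only to stderr, leaving stdout for protocol traffic."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records away from any root handler
    if not logger.handlers:   # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger

log = make_stderr_logger()
log.info("drift check started")  # written to stderr; stdout stays untouched
```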

🛠️ The four tools

| Tool | Cost | What it does |
|---|---|---|
| codeboarding_status | 🟢 cheap (no LLM) | Drift score + stale flag + a recommendation. Call this before mapping. |
| codeboarding_map | 🔴 expensive (LLM) | Runs CodeBoarding → writes analysis.json to <repo>/.codeboarding/; updates the drift baseline on success. |
| codeboarding_get | 🟢 cheap | Reads the map back: parses analysis.json into a markdown overview + a rebuilt mermaid graph, or drills into one component. |
| codeboarding_configure | 🟢 cheap | Sets a repo's LLM provider, model, and drift threshold (persisted in the manifest). |

Example: codeboarding_status on a never-mapped repo (instant, zero LLM cost)
{
  "repoPath": "/path/to/repo",
  "stale": true,
  "neverMapped": true,
  "driftScore": 1,
  "threshold": 0.15,
  "reason": "No prior map - a full analysis is needed.",
  "recommendation": "Run codeboarding_map with mode=\"full\".",
  "provider": "Ollama · qwen2.5-coder:7b",
  "totalComponents": 13,
  "added": ["src/.../Fingerprint.cs", "src/.../CodeboardingTools.cs", "..."],
  "removed": [],
  "changed": []
}
Example: codeboarding_get output (markdown + a mermaid graph rebuilt from analysis.json)
# CodeBoarding architecture map

A sample service that ingests records, validates them, and persists results.

## Components (3)
### Ingestor
Reads incoming records from the source and hands them to the Validator.
### Validator
Checks record shape and business rules; rejects bad input.
### Record Store
Persists validated records and exposes them for query.

## Relations
```mermaid
graph LR
    C0["Ingestor"]
    C1["Validator"]
    C2["Record Store"]
    C0 -->|"sends records to"| C1
    C1 -->|"writes to"| C2
```
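
The output above can be rebuilt from analysis.json along these lines. This is a rough sketch, not the server's actual reader: the top-level keys (`description`, `components`, `components_relations`) are the ones this README names, while the per-item fields (`name`, `description`, `src_name`, `dst_name`, `relation`) are assumptions made for the example.

```python
FENCE = "`" * 3  # built at runtime so this example does not nest literal fences

def render_map(analysis: dict) -> str:
    """Render an analysis dict as a markdown overview plus a mermaid graph."""
    components = analysis.get("components", [])
    relations = analysis.get("components_relations", [])
    # Short node ids (C0, C1, ...) assigned in component order.
    ids = {c["name"]: f"C{i}" for i, c in enumerate(components)}

    lines = ["# CodeBoarding architecture map", "", analysis.get("description", ""), ""]
    lines.append(f"## Components ({len(components)})")
    for c in components:
        lines += [f"### {c['name']}", c.get("description", "")]
    lines += ["", "## Relations", FENCE + "mermaid", "graph LR"]
    for c in components:
        name = c["name"]
        lines.append(f'    {ids[name]}["{name}"]')
    for r in relations:
        src, dst = ids.get(r["src_name"]), ids.get(r["dst_name"])
        if src is None or dst is None:
            continue  # skip edges that point at unknown components
        label = r["relation"]
        lines.append(f'    {src} -->|"{label}"| {dst}')
    lines.append(FENCE)
    return "\n".join(lines)

sample = {
    "description": "A sample service that ingests records, validates them, and persists results.",
    "components": [
        {"name": "Ingestor", "description": "Reads incoming records from the source."},
        {"name": "Validator", "description": "Checks record shape and business rules."},
        {"name": "Record Store", "description": "Persists validated records."},
    ],
    "components_relations": [
        {"src_name": "Ingestor", "dst_name": "Validator", "relation": "sends records to"},
        {"src_name": "Validator", "dst_name": "Record Store", "relation": "writes to"},
    ],
}
print(render_map(sample))
```

Rebuilding the graph from structured data, rather than storing rendered mermaid, keeps node ids stable and lets a caller drill into a single component without parsing markdown.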

🧬 How drift detection works

codeboarding-mcp builds a per-file architectural surface hash, cheaply and without an LLM, and compares it against the last-mapped baseline. The drift score is changed components / union: the count of added, removed, or changed components over the count of components seen in either the baseline or the current snapshot. A repo is stale once that score reaches your threshold (default 0.15).
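
As a worked illustration of that formula, here is a small Python sketch (not the server's C# code). It assumes each snapshot is a plain `{path: surface_hash}` mapping; the function names `drift` and `is_stale` are hypothetical.

```python
def drift(baseline: dict, current: dict) -> dict:
    """Compare two {path: surface_hash} snapshots and score how much changed."""
    added = sorted(current.keys() - baseline.keys())
    removed = sorted(baseline.keys() - current.keys())
    changed = sorted(p for p in baseline.keys() & current.keys()
                     if baseline[p] != current[p])
    union = len(baseline.keys() | current.keys())
    # An empty union means there is nothing to compare; treat it as no drift.
    score = (len(added) + len(removed) + len(changed)) / union if union else 0.0
    return {"driftScore": score, "added": added, "removed": removed, "changed": changed}

def is_stale(score: float, threshold: float = 0.15) -> bool:
    """A repo is stale once the score reaches the threshold."""
    return score >= threshold

# Ten baseline files; one signature changes and one file is added.
baseline = {f"src/File{i}.cs": "hash-a" for i in range(10)}
current = dict(baseline)
current["src/File0.cs"] = "hash-b"
current["src/New.cs"] = "hash-a"

result = drift(baseline, current)
print(round(result["driftScore"], 3))   # 0.182  (2 changed / 11 in the union)
print(is_stale(result["driftScore"]))   # True
```

With an empty baseline every file counts as added, so the score is 1, which matches the never-mapped example earlier in this README.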

| File type | What counts as a change |
|---|---|
| C# (.cs) | public / protected / internal type & member signatures (via Roslyn). Comments and method bodies are ignored; only the API surface moves the hash. |
| Dependency manifests (*.csproj, package.json, requirements.txt, go.mod, Cargo.toml, pom.xml, …) | full content hash; dependency changes are architectural. |
| Other source files | path-only hash; add / remove / rename counts, content edits don't (yet). |

v1 note: true API-surface drift is C#-only today. Other languages use a structural (path) signal; a stronger exported-symbol extractor is on the roadmap.
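
The two non-C# rows can be sketched in a few lines of Python. This is an illustration under assumptions, not the project's implementation: the choice of SHA-256 and the `surface_hash` helper are hypothetical, the manifest set lists only the names this README gives (the real list is open-ended), and the Roslyn step for C# is deliberately left out.

```python
import hashlib
from pathlib import PurePosixPath

# Manifest names taken from the table; the README's list continues with "…".
MANIFEST_NAMES = {"package.json", "requirements.txt", "go.mod", "Cargo.toml", "pom.xml"}

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def surface_hash(path: str, content: bytes) -> str:
    """Hash only what counts as a change for this file type."""
    p = PurePosixPath(path)
    if p.name in MANIFEST_NAMES or p.suffix == ".csproj":
        return sha256_hex(content)              # full content: dependency edits count
    if p.suffix == ".cs":
        # The real server extracts signatures with Roslyn; out of scope for this sketch.
        raise NotImplementedError("C# surface extraction needs a real parser")
    return sha256_hex(path.encode("utf-8"))     # path only: content edits are ignored

# Editing a Python file leaves its hash alone; editing a manifest does not.
print(surface_hash("src/app.py", b"v1") == surface_hash("src/app.py", b"v2"))      # True
print(surface_hash("package.json", b"v1") == surface_hash("package.json", b"v2"))  # False
```

Because the path-only hash never changes for an existing file, such files can only show up as added or removed (a rename is one of each), never as changed.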


🏗️ Architecture

A self-referential taste of what CodeBoarding maps. Here's codeboarding-mcp itself:

graph TD
    Host["MCP Host<br/>(Claude Code / Codex)"] -->|stdio JSON-RPC| Program["Program.cs<br/>MCP stdio host"]
    Program --> Tools["CodeboardingTools<br/>status · map · get · configure"]

    Tools -->|cheap, no LLM| Drift["Fingerprint + DriftCalculator<br/>Roslyn surface hash"]
    Tools -->|read map| Reader["AnalysisReader<br/>analysis.json → md + mermaid"]
    Tools -->|persist config / baseline| Manifest["RepoManifest<br/>.codeboarding/.manifest.json"]
    Tools -->|expensive, LLM| Runner["CodeboardingRunner<br/>shells out to the CLI"]

    Runner --> Env["ProviderEnvironment<br/>scrub-all → set one provider"]
    Runner -->|child process| CLI["codeboarding CLI<br/>(Python)"]
    CLI -->|writes| Analysis["analysis.json"]
    Reader -->|reads| Analysis

    Env -.-> Ollama["Ollama (local)"]
    Env -.-> Anthropic["Anthropic"]
    Env -.-> OpenAI["OpenAI-compatible<br/>DeepSeek · OpenRouter · LiteLLM"]

🚀 Quick start

1. Prerequisites

  • .NET 9 SDK (or newer).
  • Python 3.12 or 3.13 + the CodeBoarding CLI:
    pipx install codeboarding --python python3.13
    codeboarding-setup          # one-time: downloads language servers
  • An LLM backend, either local Ollama:
    ollama pull qwen2.5-coder:7b
    …or an API key for a cloud provider (Anthropic / DeepSeek / …).

2. Build

dotnet build src/CodeboardingMcp/CodeboardingMcp.csproj -c Release

3. Register with your MCP host

claude mcp add codeboarding -- \
  dotnet /path/to/codeboarding-mcp/src/CodeboardingMcp/bin/Release/net9.0/codeboarding-mcp.dll

If the CodeBoarding CLI isn't on PATH, point to it with the CODEBOARDING_CLI environment variable.

4. Use it

Just ask your agent, for example:

"Configure codeboarding for this repo with Ollama, check if the map is stale, and update it if so."

…or call the tools directly:

configure  →  status  →  (if stale) map  →  get
   set         cheap        expensive       read map
 provider      check        regen only      for RAG /
 per repo                  when it matters    agent
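
On the agent side, the flow above comes down to one decision based on the status result. A minimal sketch in Python, assuming the status fields shown in the earlier example; the `next_call` helper is hypothetical, and passing `repoPath` to `codeboarding_map` and `codeboarding_get` is an assumption (this README only shows it for `codeboarding_configure` and in the status output).

```python
def next_call(status: dict) -> tuple:
    """Pick the next tool call from a codeboarding_status result."""
    args = {"repoPath": status["repoPath"]}
    if status.get("neverMapped"):
        # The status recommendation for a never-mapped repo is mode="full".
        return "codeboarding_map", {**args, "mode": "full"}
    if status.get("stale"):
        return "codeboarding_map", args   # expensive: only when the drift check says so
    return "codeboarding_get", args       # cheap: the existing map is still current

fresh = {"repoPath": "/path/to/repo", "stale": False, "neverMapped": False}
unmapped = {"repoPath": "/path/to/repo", "stale": True, "neverMapped": True}

print(next_call(fresh))     # ('codeboarding_get', {'repoPath': '/path/to/repo'})
print(next_call(unmapped))  # ('codeboarding_map', {'repoPath': '/path/to/repo', 'mode': 'full'})
```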

🔌 Choosing a provider

Each repo picks its own backend, stored in <repo>/.codeboarding/.manifest.json. API keys are read from a named environment variable and never written to the manifest.

| Kind | Selects | Best for |
|---|---|---|
| ollama | local Ollama (OLLAMA_BASE_URL) | 🔒 private / client code; nothing leaves the machine |
| anthropic | ANTHROPIC_API_KEY | 🏆 public repos, highest-quality maps |
| openai-compatible | OPENAI_BASE_URL + key env | 🧩 DeepSeek, OpenRouter, LiteLLM, any proxy |

// codeboarding_configure arguments
{
  "repoPath": "/path/to/repo",
  "kind": "anthropic",            // or "ollama" / "openai-compatible"
  "model": "claude-sonnet-4-6",
  "apiKeyEnv": "ANTHROPIC_API_KEY",
  "driftThreshold": 0.15
}

Adding a new OpenAI-compatible provider (e.g. DeepSeek) is just configuration, with no code change.
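
To make the scrub-then-set idea concrete, here is a hedged Python sketch of how a child-process environment could be built from such a configuration. It is not the server's ProviderEnvironment code. `OLLAMA_BASE_URL`, `ANTHROPIC_API_KEY`, `OPENAI_BASE_URL` and `apiKeyEnv` come from this README; `OPENAI_API_KEY` as the target variable, the `baseUrl` config key, and the helper names are assumptions.

```python
import os
from typing import Optional

# Variables named in the provider table, plus OPENAI_API_KEY as an assumed extra.
PROVIDER_VARS = {"OLLAMA_BASE_URL", "ANTHROPIC_API_KEY", "OPENAI_BASE_URL", "OPENAI_API_KEY"}

def require(value: Optional[str], name: Optional[str]) -> str:
    if not value:
        raise RuntimeError(f"environment variable {name!r} is not set")
    return value

def child_env(config: dict, parent: Optional[dict] = None) -> dict:
    """Scrub every provider variable from the environment, then set exactly one provider."""
    parent = dict(os.environ) if parent is None else dict(parent)
    key_env = config.get("apiKeyEnv")
    secret = parent.get(key_env) if key_env else None   # read the key before scrubbing
    env = {k: v for k, v in parent.items() if k not in PROVIDER_VARS and k != key_env}

    kind = config["kind"]
    if kind == "ollama":
        # 11434 is Ollama's default port; "baseUrl" is a hypothetical config key.
        env["OLLAMA_BASE_URL"] = config.get("baseUrl", "http://localhost:11434")
    elif kind == "anthropic":
        env["ANTHROPIC_API_KEY"] = require(secret, key_env)
    elif kind == "openai-compatible":
        env["OPENAI_BASE_URL"] = config["baseUrl"]
        env["OPENAI_API_KEY"] = require(secret, key_env)  # assumed target variable
    else:
        raise ValueError(f"unknown provider kind: {kind}")
    return env

parent = {"PATH": "/usr/bin", "ANTHROPIC_API_KEY": "sk-a", "OPENAI_API_KEY": "sk-o"}
env = child_env({"kind": "anthropic", "apiKeyEnv": "ANTHROPIC_API_KEY"}, parent)
print(sorted(env))  # ['ANTHROPIC_API_KEY', 'PATH']
```

The key is looked up by name at launch time and only ever lives in the child's environment, which is what lets the manifest store `apiKeyEnv` without storing the secret itself.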

Quality caveat (honest): a local 7B model is fast and private but often too weak for clean component extraction (it can fail CodeBoarding's internal validation). For production-grade maps, use a cloud provider on a public repo, or a larger local model. Keep private code on local Ollama and accept the quality trade-off.


📂 What gets written

Everything lives under <repo>/.codeboarding/:

| Path | Written by | Purpose |
|---|---|---|
| analysis.json | CodeBoarding CLI | the architecture map (description, components, components_relations) |
| .manifest.json | this server | per-repo provider, threshold, and drift baseline |
| cache/, logs/, health/, static_analysis.pkl | CodeBoarding CLI | run artifacts |

Add .codeboarding/ to your .gitignore (or commit analysis.json if you want the map in version control).


🗺️ Roadmap

  • 4 MCP tools, Roslyn drift, generic provider, single-provider env scrubbing
  • Parse analysis.json → markdown + rebuilt mermaid in codeboarding_get
  • Stronger non-C# drift (exported-symbol extraction beyond path-only)
  • Auto-triggers (git hook / scheduled sweep); v1 is in-session / agent-driven
  • CI + published binaries

🀝 Contributing

Issues and PRs are very welcome β€” new language fingerprinters, provider presets, and quality reports are especially appreciated. Please keep stdout clean (MCP protocol) and route all logging to stderr.

πŸ“„ License

MIT. Built on CodeBoarding (MIT) β€” please ⭐ them too.

Made for agents that deserve to know what your codebase actually looks like.
