Talk to it in plain French: by voice, by chat, or just by walking out the door. It remembers, plans, budgets and gently hands things back to you when they matter, and it runs entirely on your own hardware. No cloud account, no data broker, no feed fighting for your attention.
Screenshots: Dashboard (your day at a glance) · Chat (streamed, in plain French) · « Pour toi » (it hands thoughts back, gently)
A busy mind drops things: a worry you can't park, an idea you'll forget, a bill you meant to log, an appointment that clashes with another. Most "productivity" apps answer that by adding to the pile: more notifications, more badges, more inbound noise.
copain takes the opposite stance. Every feature has to pass one filter: does this get something out of the user's head, or does it add to it? It's a backup brain, not a to-do tyrant.
- It absorbs, it doesn't nag. No unsolicited pushes. The dashboard is pull-only: information surfaces when you reach for it, never the other way around.
- One assistant, three doors. A PWA dashboard, a Siri voice command, and silent geofence automations, all served by the same FastAPI core.
- It's yours. Self-hosted on a Raspberry Pi over Tailscale. Your profile, finances, location and calendar never leave your network, and never enter the git history.
In one line: a single-user assistant (natural-language pipeline, semantic memory, calendar/budget/weather integrations, an installable PWA and a Pi deployment), designed, built and shipped solo.
- Drop a parasitic thought (a worry, an idea, a note) and it's acknowledged in 1 to 3 words, stored and embedded. A « Pour toi » ("For you") card later surfaces what's worth a second look (a worry now closeable against a past event, a rumination loop, a stale idea): pulled, never pushed.
- A PWA dashboard (Safari "Add to Home Screen"), a Siri voice shortcut ("Dis à Copain…", TTS-friendly answers), and geofence automations that post arrival/departure events. Same FastAPI core, transport-agnostic pipeline.
- Every reply ends with a strict JSON <meta> block.
- Semantic memory (ChromaDB + embeddings) recalls past context, and a hand-edited profile (name, family, work, routines) is injected as stable facts into every prompt, so the assistant doesn't have to re-discover who you are on each turn.
- iCloud calendar (CalDAV, fuzzy calendar match, overlap warnings), budget anchored on your salary cycle (natural-language or form entry: same engine, zero drift), weather that follows you (between home and work, via geofence), RSS/news curation, fuel prices.
- Tailscale-only access + shared-secret X-API-Key.
| Layer | Choice | Why |
|---|---|---|
| Core | Python 3.12 · async/await throughout · FastAPI + uvicorn | Single async HTTP core behind every entry point |
| LLM | Ollama: gemma4:31b-cloud (multimodal) + local fallback | One model, routed by a <meta> block; degrades gracefully |
| Memory | ChromaDB (HNSW) · nomic-embed-text embeddings | Semantic recall without a managed vector DB |
| Data | SQLAlchemy 2 async · aiosqlite · APScheduler | Tasks, thoughts, budget cycles, persisted reminders |
| Integrations | CalDAV (iCloud) · Open-Meteo · SearXNG · Pushover · Sentry | Real third-party services, real fail-soft handling |
| Frontend | Vanilla-JS PWA (ES6 modules, zero build step) | Installable, iOS-native feel, no toolchain to rot |
| Quality | pytest (690+ tests, fully mocked) · Ruff · mypy strict · pre-commit | Typed, linted, green on every push |
| Deploy | Docker · Raspberry Pi 5 · Tailscale | Self-hosted, private by construction |
Everything flows through one pipeline: the LLM decides the intent, the code executes the side effects, then a text reply comes back. Proactive notifications run on a separate autonomous job: no LLM, no routing.

    iOS Shortcut · Siri · PWA ──► FastAPI core (X-API-Key) ──► Pipeline (transport-agnostic)
          (over Tailscale)           │                            │
                                     │                            ├─ <meta> intent router (10 intents)
                                     │                            └─ side effects ──┐
                                     │                                              ▼
                                     ├─ Memory (ChromaDB + embeddings)     Tasks · Thoughts · Budget
                                     ├─ Calendar (CalDAV) · Weather · Search · RSS · Fuel
                                     └─ Proactivity job (autonomous, no LLM) ──► Pushover / PWA queue


    bot/
    ├── api.py            # FastAPI app: every endpoint behind X-API-Key
    ├── pipeline/         # transport-agnostic core: intent routing + side effects + streaming
    ├── llm/              # Ollama client, system prompt, <meta> parsing, TTL cache
    ├── memory/           # ChromaDB semantic memory + embeddings
    ├── thoughts/         # cognitive deposits + « Pour toi » restitution heuristics
    ├── finance/          # budget cycles, expense manager, CSV export, reminder cron
    ├── calendar/ weather/ search/ rss/ news/ fuel/ locations/   # real-life integrations
    ├── tasks/ notifications/ proactivity/                       # reminders + opt-in pushes
    └── static/           # vanilla-JS PWA (ES6 modules, zero build step)

The choices below are where the design effort went: the part worth a conversation.
Why route through a <meta> JSON block instead of function-calling?
Native function-calling locks you to a specific API and degrades unpredictably across
models. By having the LLM emit a strict <meta> block at the end of a normal reply,
the routing logic stays in my code: one model, no vendor glue, and the same pipeline
serves voice, chat and image inputs. The block is parsed out before the user ever sees
the reply, and an invalid block fails soft rather than crashing the turn.
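
A minimal sketch of the parse-and-strip step, assuming the block is wrapped as `<meta>{...}</meta>` at the very end of the reply. The `intent` key, the `chat` default and the `split_reply` name are illustrative assumptions, not copain's actual schema.

```python
import json
import re

# Assumption: the block is the last thing in the reply, wrapped in <meta>...</meta>.
META_RE = re.compile(r"<meta>(.*?)</meta>\s*$", re.DOTALL)
DEFAULT_META = {"intent": "chat"}  # hypothetical fail-soft default


def split_reply(raw: str) -> tuple[str, dict]:
    """Return (user-visible text, routing metadata) for one LLM reply."""
    match = META_RE.search(raw)
    if match is None:
        # No block at all: treat the whole reply as plain chat.
        return raw.strip(), dict(DEFAULT_META)
    text = raw[: match.start()].strip()  # the block never reaches the user
    try:
        meta = json.loads(match.group(1))
    except json.JSONDecodeError:
        # Invalid JSON fails soft: keep the text, fall back to the default intent.
        return text, dict(DEFAULT_META)
    if not isinstance(meta, dict) or "intent" not in meta:
        return text, dict(DEFAULT_META)
    return text, meta
```

Because the router only ever sees a plain `dict`, the same function can serve voice, chat and image turns, and swapping the model does not touch the routing code.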
Why is the assistant strictly "pull-only", with no morning briefing?
This is the product's backbone, expressed as a constraint: an assistant meant to reduce mental load can't be a new source of interruptions. So spontaneous pushes are off by default, the morning briefing was deliberately removed, and even the restitution card ("here's a thought worth revisiting") is fetched on tap, never pushed. Proactivity exists, but it's opt-in and wrapped in five safeguards (time window, daily budget, per-kind cooldown, dedup, feature flag).
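
The five safeguards can be sketched as a single gate that every candidate push must pass. This is an illustration only: the `PushGate` name, the 09:00 to 20:00 window, the budget of 3 and the 6-hour cooldown are assumed values, not the project's real settings.

```python
from dataclasses import dataclass, field
from datetime import datetime, time, timedelta


@dataclass
class PushGate:
    enabled: bool = False                                   # feature flag, off by default
    window: tuple[time, time] = (time(9, 0), time(20, 0))   # assumed allowed hours
    daily_budget: int = 3                                   # assumed max pushes per day
    cooldown: timedelta = timedelta(hours=6)                # assumed per-kind cooldown
    _sent: list[tuple[datetime, str, str]] = field(default_factory=list)

    def allow(self, kind: str, key: str, now: datetime) -> bool:
        """Record and allow the push only if all five safeguards pass."""
        if not self.enabled:
            return False
        if not (self.window[0] <= now.time() <= self.window[1]):
            return False
        if sum(1 for t, _, _ in self._sent if t.date() == now.date()) >= self.daily_budget:
            return False
        if any(k == kind and now - t < self.cooldown for t, k, _ in self._sent):
            return False
        if any(d == key for _, _, d in self._sent):  # dedup on a content key
            return False
        self._sent.append((now, kind, key))
        return True
```

Checking the flag first means a default install never evaluates the other rules at all, which matches the "off by default" stance.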
Why one shared LLM with a local fallback instead of several cloud APIs?
A personal assistant has to keep working when the network doesn't. A single cloud model
(gemma4:31b-cloud) handles the rich path; when it's unreachable, a small local model
(gemma3:4b) takes over so the bot still answers. Fallback replies are never cached, so
quality recovers the moment the cloud is back. One provider also means one prompt to tune
and one cost to reason about.
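
A sketch of the "answer from the fallback, but never cache it" rule. It is synchronous for brevity (the real code is described as async), and `CachedLLM`, the 300-second TTL and the use of `ConnectionError` as the "cloud unreachable" signal are assumptions made for the example.

```python
import time
from typing import Callable

Model = Callable[[str], str]  # stand-in for a call to a model


class CachedLLM:
    def __init__(self, primary: Model, fallback: Model, ttl: float = 300.0) -> None:
        self.primary, self.fallback, self.ttl = primary, fallback, ttl
        self._cache: dict[str, tuple[float, str]] = {}

    def ask(self, prompt: str) -> str:
        hit = self._cache.get(prompt)
        if hit and time.monotonic() - hit[0] < self.ttl:
            return hit[1]
        try:
            reply = self.primary(prompt)
        except ConnectionError:
            # Degraded path: still answer, but skip the cache so the next
            # call retries the primary model.
            return self.fallback(prompt)
        self._cache[prompt] = (time.monotonic(), reply)
        return reply
```

Only primary replies are stored, so the first request after the cloud model comes back gets the richer answer instead of a stale fallback.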
Why let budget be entered both by chat and by form?
Natural language is great for "j'ai dépensé 12 € de café" ("I spent €12 on coffee"), but a form is faster for
deliberate entry, so copain offers both. The trick: both channels call the exact same
ExpenseManager methods, so there is no second code path and no way for the two to
disagree on the budget math. The form simply skips the LLM intent step.
Why a vanilla-JS PWA with zero build step?
The frontend is native ES6 modules served straight by FastAPI: no bundler, no
node_modules, no build to rot. Assets are cache-busted with ?v=N bumped on deploy. The
payoff is an installable, iOS-native-feeling app (splash screen, fullscreen, glass scroll
edges) that I can still understand and ship in five years without resurrecting a toolchain.
copain is single-user and built to keep your life on your own network:
- Network layer: reachable only over Tailscale; the public internet never sees it.
- Auth layer: every endpoint requires a shared-secret `X-API-Key`; anything else is a logged 403.
- Repo layer: your profile, finances, location history, memory store and calendar
  credentials are all gitignored and never committed. The repo ships templates
  (`profile.example.yaml`, `.env.example`), never real data.
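
The auth layer boils down to one comparison per request. The sketch below is framework-free and only illustrates the idea; `check_api_key` and the logger name are hypothetical, and in the real app this logic would sit behind FastAPI rather than a plain function.

```python
import hmac
import logging

logger = logging.getLogger("copain.auth")  # hypothetical logger name


def check_api_key(headers: dict[str, str], expected: str) -> int:
    """Return the HTTP status to use: 200 to proceed, 403 to reject."""
    supplied = headers.get("X-API-Key", "")
    # compare_digest runs in constant time, so response timing does not
    # reveal how much of the key matched. An empty expected key never passes.
    if expected and hmac.compare_digest(supplied.encode(), expected.encode()):
        return 200
    logger.warning("rejected request: missing or invalid X-API-Key")
    return 403
```

A real server would look the header up case-insensitively; the plain `dict` here keeps the example short.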
- 690+ tests across 43 modules, fully mocked: no external services, no network, no flakiness.
- mypy strict + Ruff (lint & format) enforced via pre-commit and CI.
- `async/await` end to end; pure heuristics (restitution, budget math) isolated and unit-tested.

    make test            # 690+ tests
    make lint typecheck  # ruff + mypy strict

Setup, configuration & deployment


    cp .env.example .env                            # fill in the variables
    cp data/profile.example.yaml data/profile.yaml  # edit with your info
    make install                                    # .venv + deps + pre-commit
    make test                                       # 690+ tests, fully mocked
    make run                                        # uvicorn (needs Ollama + SearXNG)

See .env.example for the full list. The essentials:

- `API_KEY`: shared secret for `X-API-Key` (generate something random).
- `ICLOUD_USERNAME` / `ICLOUD_APP_PASSWORD`: Apple ID + an App-Specific Password.
- `ICLOUD_CALENDAR_NAME`: default calendar (fuzzy match: `Personnel` also matches a calendar whose name carries an emoji prefix).
- `HOME_LAT` / `HOME_LON` / `HOME_CITY` and `WORK_*`: weather + location context.
- `PUSHOVER_TOKEN` / `PUSHOVER_USER`, `SENTRY_DSN`: optional (push notifs, monitoring).

Two Shortcuts on the iPhone (see docs/ios-shortcuts.md):
- "Dis à Copain": Siri voice shortcut for hands-free interaction.
- Geofence automations: 4 silent automations posting to `/event/location`.
The PWA needs no setup: open https://<pi-tailscale-host>:8000/ in Safari and
"Add to Home Screen".

    make docker-build
    make docker-up
    docker logs -f copain-bot-1

Ollama runs outside Docker on the Pi (for ARM GPU/NPU access); the container uses
`network_mode: host` and reaches Ollama on `localhost:11434`.
- `CLAUDE.md`: detailed architecture, conventions, system-prompt structure, full tree.
- `docs/ios-shortcuts.md`: Siri voice command + geofence automations.
- `.env.example`: environment variable template.

This project is the kind of work I enjoy most: owning a product end to end, from the natural-language pipeline to a polished iOS PWA, with a strong opinion on what it should refuse to do. Always happy to talk shop about local LLMs, assistant design, or self-hosted, privacy-first products.
Find me on my GitHub profile.


