arnaudstdr/copain


copain


A self-hosted personal assistant designed to absorb mental load, not pile more on.

Talk to it in plain French: by voice, by chat, or just by walking out the door. It remembers, plans, budgets and gently hands things back to you when they matter, and it runs entirely on your own hardware. No cloud account, no data broker, no feed fighting for your attention.


Python FastAPI Ollama ChromaDB PWA Docker
CI Ruff mypy: strict Tests Raspberry Pi 5


Screenshots: Dashboard (your day at a glance: weather, next event, tasks, budget cards) · Chat (streamed, in plain French, with web search) · « Pour toi » (it hands a recurring thought back, gently)

The problem it solves

A busy mind drops things: a worry you can't park, an idea you'll forget, a bill you meant to log, an appointment that clashes with another. Most "productivity" apps answer that by adding to the pile: more notifications, more badges, more inbound noise.

copain takes the opposite stance. Every feature has to pass one filter: does this get something out of the user's head, or does it add to it? It's a backup brain, not a to-do tyrant.

  • 🧠 It absorbs, it doesn't nag. No unsolicited pushes. The dashboard is pull-only: information surfaces when you reach for it, never the other way around.
  • 🗣️ One assistant, three doors. A PWA dashboard, a Siri voice command, and silent geofence automations, all served by the same FastAPI core.
  • 🔒 It's yours. Self-hosted on a Raspberry Pi over Tailscale. Your profile, finances, location and calendar never leave your network, and never enter the git history.

In one line: a single-user assistant (natural-language pipeline, semantic memory, calendar/budget/weather integrations, an installable PWA and a Pi deployment), designed, built and shipped solo.


Highlights

🧠 Cognitive offloading, by design

Drop a parasitic thought (a worry, an idea, a note) and it's acknowledged in 1–3 words, stored and embedded. A « Pour toi » card later surfaces what's worth a second look (a worry now closeable against a past event, a rumination loop, a stale idea): pulled, never pushed.

🚪 One brain, three entry points

A PWA dashboard (Safari "Add to Home Screen"), a Siri voice shortcut ("Dis à Copain…", TTS-friendly answers), and geofence automations that post arrival/departure events. Same FastAPI core, transport-agnostic pipeline.

🧭 LLM routing via a <meta> block

Every reply ends with a strict JSON <meta> block routing the request into one of 10 intents (task, event, expense, depot, weather…). The code runs the side effects; the model only decides. One model, no brittle function-calling glue.

🗂️ Memory that knows you

Semantic memory (ChromaDB + embeddings) recalls past context, and a hand-edited profile (name, family, work, routines) is injected as stable facts into every prompt, so the assistant doesn't have to re-discover who you are on each turn.

📆 Real-life integrations

iCloud calendar (CalDAV, fuzzy calendar match, overlap warnings), budget anchored on your salary cycle (natural-language or form entry: same engine, zero drift), weather that follows you (home ↔ work via geofence), RSS/news curation, fuel prices.
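The overlap warning mentioned above comes down to an interval check. The sketch below is only an illustration: the `Event` type and `find_overlaps` helper are hypothetical names, not taken from the repository, and events are assumed to be half-open intervals (an event ending at 10:00 does not clash with one starting at 10:00).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """Hypothetical minimal event; the real CalDAV objects carry more fields."""
    title: str
    start: datetime
    end: datetime


def find_overlaps(new: Event, existing: list[Event]) -> list[Event]:
    """Return every existing event whose time range intersects the new one."""
    # Two half-open intervals intersect when each starts before the other ends.
    return [e for e in existing if new.start < e.end and e.start < new.end]


dentist = Event("Dentist", datetime(2025, 3, 4, 9, 0), datetime(2025, 3, 4, 10, 0))
standup = Event("Stand-up", datetime(2025, 3, 4, 10, 0), datetime(2025, 3, 4, 10, 15))
new_event = Event("School run", datetime(2025, 3, 4, 9, 30), datetime(2025, 3, 4, 10, 0))

clashes = find_overlaps(new_event, [dentist, standup])
```

Here `clashes` contains only the dentist appointment, which is what a warning would name before the event is created.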

🛡️ Private, resilient, observable

Tailscale-only access + shared-secret X-API-Key. A local LLM fallback when the cloud is unreachable. Opt-in Sentry and Pushover. A TTL cache to spare the LLM. Strictly opt-in proactivity, with five layered safeguards.


Tech stack

| Layer | Choice | Why |
| --- | --- | --- |
| Core | Python 3.12 · async/await throughout · FastAPI + uvicorn | Single async HTTP core behind every entry point |
| LLM | Ollama: gemma4:31b-cloud (multimodal) + local fallback | One model, routed by a <meta> block; degrades gracefully |
| Memory | ChromaDB (HNSW) · nomic-embed-text embeddings | Semantic recall without a managed vector DB |
| Data | SQLAlchemy 2 async · aiosqlite · APScheduler | Tasks, thoughts, budget cycles, persisted reminders |
| Integrations | CalDAV (iCloud) · Open-Meteo · SearXNG · Pushover · Sentry | Real third-party services, real fail-soft handling |
| Frontend | Vanilla-JS PWA (ES6 modules, zero build step) | Installable, iOS-native feel, no toolchain to rot |
| Quality | pytest (690+ tests, fully mocked) · Ruff · mypy strict · pre-commit | Typed, linted, green on every push |
| Deploy | Docker · Raspberry Pi 5 · Tailscale | Self-hosted, private by construction |

Architecture

Everything flows through one pipeline: the LLM decides the intent, the code executes the side effects, then a text reply comes back. Proactive notifications run on a separate autonomous job: no LLM, no routing.

iOS Shortcut · Siri · PWA   ──►  FastAPI core (X-API-Key)  ──►  Pipeline (transport-agnostic)
   (over Tailscale)                     │                            │
                                        │                            ├─ <meta> intent router (10 intents)
                                        │                            └─ side effects ──┐
                                        │                                              ▼
                                        ├─ Memory (ChromaDB + embeddings)   Tasks · Thoughts · Budget
                                        ├─ Calendar (CalDAV) · Weather · Search · RSS · Fuel
                                        └─ Proactivity job (autonomous, no LLM) ──► Pushover / PWA queue

bot/
├── api.py            # FastAPI app: every endpoint behind X-API-Key
├── pipeline/         # transport-agnostic core: intent routing + side effects + streaming
├── llm/              # Ollama client, system prompt, <meta> parsing, TTL cache
├── memory/           # ChromaDB semantic memory + embeddings
├── thoughts/         # cognitive deposits + « Pour toi » restitution heuristics
├── finance/          # budget cycles, expense manager, CSV export, reminder cron
├── calendar/ weather/ search/ rss/ news/ fuel/ locations/   # real-life integrations
├── tasks/ notifications/ proactivity/                       # reminders + opt-in pushes
└── static/           # vanilla-JS PWA (ES6 modules, zero build step)

Engineering decisions

The choices below are where the design effort went: the part worth a conversation.

Why route through a <meta> JSON block instead of function-calling?

Native function-calling locks you to a specific API and degrades unpredictably across models. By having the LLM emit a strict <meta> block at the end of a normal reply, the routing logic stays in my code: one model, no vendor glue, and the same pipeline serves voice, chat and image inputs. The block is parsed out before the user ever sees the reply, and an invalid block fails soft rather than crashing the turn.
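As a rough illustration of that parse-and-fail-soft step, here is a minimal sketch. It is not the project's parser: the closing `</meta>` tag, the `intent` key, the `ParsedReply` type and the `"none"` fallback are all assumptions, and only five of the ten intents are named in this README.

```python
import json
import re
from dataclasses import dataclass, field

# Intents named in the README; the real router has ten. Fallback name is hypothetical.
KNOWN_INTENTS = {"task", "event", "expense", "depot", "weather"}
FALLBACK_INTENT = "none"

# Assumes the block is wrapped as <meta>...</meta> at the very end of the reply.
META_RE = re.compile(r"<meta>(.*?)</meta>\s*$", re.DOTALL)


@dataclass
class ParsedReply:
    text: str                                   # what the user actually sees
    intent: str = FALLBACK_INTENT               # what the code should execute
    payload: dict = field(default_factory=dict)


def parse_reply(raw: str) -> ParsedReply:
    """Strip the trailing <meta> block and extract the intent, failing soft."""
    match = META_RE.search(raw)
    if match is None:
        return ParsedReply(text=raw.strip())
    visible = raw[: match.start()].strip()
    try:
        meta = json.loads(match.group(1))
    except json.JSONDecodeError:
        return ParsedReply(text=visible)        # invalid JSON: plain reply, no side effect
    if not isinstance(meta, dict):
        return ParsedReply(text=visible)
    intent = meta.get("intent")
    if not isinstance(intent, str) or intent not in KNOWN_INTENTS:
        return ParsedReply(text=visible)        # unknown intent: plain reply, no side effect
    payload = {k: v for k, v in meta.items() if k != "intent"}
    return ParsedReply(text=visible, intent=intent, payload=payload)


ok = parse_reply('C\'est noté.\n<meta>{"intent": "task", "title": "call the dentist"}</meta>')
broken = parse_reply("Bonjour ! <meta>{not json}</meta>")
```

In both cases the user-facing text comes back without the block; only the well-formed reply carries an intent for the pipeline to act on.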

Why is the assistant strictly "pull-only", with no morning briefing?

This is the product's backbone, expressed as a constraint: an assistant meant to reduce mental load can't be a new source of interruptions. So spontaneous pushes are off by default, the morning briefing was deliberately removed, and even the restitution card ("here's a thought worth revisiting") is fetched on tap, never pushed. Proactivity exists, but it's opt-in and wrapped in five safeguards (time window, daily budget, per-kind cooldown, dedup, feature flag).
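One way to picture the five safeguards is as a single gate that every candidate push must clear. The sketch below is an assumption-laden illustration: `PushGate`, its default values (09:00 to 20:00 window, 3 pushes a day, 6-hour cooldown) and the in-memory history are all hypothetical, not the project's settings.

```python
from dataclasses import dataclass, field
from datetime import datetime, time, timedelta


@dataclass
class PushGate:
    """Hypothetical gate combining the five safeguards; all defaults are illustrative."""
    enabled: bool = False                                   # feature flag, off by default
    window: tuple[time, time] = (time(9, 0), time(20, 0))   # allowed time window
    daily_budget: int = 3                                   # max pushes per calendar day
    cooldown: timedelta = timedelta(hours=6)                # per-kind cooldown
    _sent: list[tuple[datetime, str, str]] = field(default_factory=list)

    def allow(self, kind: str, dedup_key: str, now: datetime) -> bool:
        """Return True, and record the push, only if every safeguard passes."""
        if not self.enabled:
            return False
        if not (self.window[0] <= now.time() <= self.window[1]):
            return False
        if sum(1 for ts, _, _ in self._sent if ts.date() == now.date()) >= self.daily_budget:
            return False
        if any(k == kind and now - ts < self.cooldown for ts, k, _ in self._sent):
            return False
        if any(key == dedup_key for _, _, key in self._sent):
            return False
        self._sent.append((now, kind, dedup_key))
        return True


gate = PushGate(enabled=True)
morning = datetime(2025, 1, 6, 10, 0)
first = gate.allow("budget", "budget-2025-01", morning)
too_soon = gate.allow("budget", "budget-other", morning + timedelta(hours=1))
```

The checks are ordered from cheapest to most specific, and a push is recorded only once all five have passed, so a rejected candidate never consumes budget.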

Why one shared LLM with a local fallback instead of several cloud APIs?

A personal assistant has to keep working when the network doesn't. A single cloud model (gemma4:31b-cloud) handles the rich path; when it's unreachable, a small local model (gemma3:4b) takes over so the bot still answers. Fallback replies are never cached, so quality recovers the moment the cloud is back. One provider also means one prompt to tune and one cost to reason about.
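The interplay between the fallback and the TTL cache can be sketched as follows. This is an illustration under assumptions, not the project's client: `CachedLLM`, the injected `cloud`/`local` callables, the 300-second TTL and the choice of exceptions that trigger the fallback are all hypothetical.

```python
import asyncio
import time
from typing import Awaitable, Callable

Ask = Callable[[str], Awaitable[str]]


class CachedLLM:
    """Hypothetical wrapper: cloud first, local fallback, cache cloud replies only."""

    def __init__(self, cloud: Ask, local: Ask, ttl_seconds: float = 300.0) -> None:
        self._cloud = cloud
        self._local = local
        self._ttl = ttl_seconds
        self._cache: dict[str, tuple[float, str]] = {}

    async def ask(self, prompt: str) -> str:
        hit = self._cache.get(prompt)
        if hit is not None and time.monotonic() - hit[0] < self._ttl:
            return hit[1]
        try:
            reply = await self._cloud(prompt)
        except (ConnectionError, TimeoutError):
            # Degraded path: answer with the local model and skip the cache,
            # so the next call tries the cloud again.
            return await self._local(prompt)
        self._cache[prompt] = (time.monotonic(), reply)
        return reply


# Stub models standing in for the two Ollama calls.
state = {"cloud_up": False, "cloud_calls": 0}


async def cloud(prompt: str) -> str:
    state["cloud_calls"] += 1
    if not state["cloud_up"]:
        raise ConnectionError("cloud unreachable")
    return f"cloud:{prompt}"


async def local(prompt: str) -> str:
    return f"local:{prompt}"


llm = CachedLLM(cloud, local)
degraded = asyncio.run(llm.ask("météo"))
state["cloud_up"] = True
recovered = asyncio.run(llm.ask("météo"))
cached = asyncio.run(llm.ask("météo"))
```

Because the degraded answer never enters the cache, the second call reaches the cloud as soon as it is back, and only the third call is served from the cache.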

Why let budget be entered both by chat and by form?

Natural language is great for "j'ai dépensé 12 € de café" ("I spent €12 on coffee"), but a form is faster for deliberate entry, so copain offers both. The trick: both channels call the exact same ExpenseManager methods, so there is no second code path and no way for the two to disagree on the budget math. The form simply skips the LLM intent step.
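A minimal sketch of that "two doors, one engine" idea follows. Only the `ExpenseManager` name comes from this README; the `add_expense` and `remaining` methods, the two handler functions and the payload keys are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class ExpenseManager:
    """Stand-in for the real class; method names here are hypothetical."""
    cycle_budget: Decimal
    _spent: list[tuple[str, Decimal]] = field(default_factory=list)

    def add_expense(self, label: str, amount: Decimal) -> Decimal:
        """Record an expense and return what is left in the current cycle."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._spent.append((label, amount))
        return self.remaining()

    def remaining(self) -> Decimal:
        return self.cycle_budget - sum((a for _, a in self._spent), Decimal("0"))


def handle_chat_expense(manager: ExpenseManager, meta: dict) -> Decimal:
    """Chat path: the fields were extracted by the LLM into the <meta> block."""
    return manager.add_expense(meta["label"], Decimal(str(meta["amount"])))


def handle_form_expense(manager: ExpenseManager, form: dict[str, str]) -> Decimal:
    """Form path: same method, no LLM step in between."""
    return manager.add_expense(form["label"], Decimal(form["amount"]))


manager = ExpenseManager(cycle_budget=Decimal("100"))
after_chat = handle_chat_expense(manager, {"intent": "expense", "label": "café", "amount": 12})
after_form = handle_form_expense(manager, {"label": "pain", "amount": "3.50"})
```

Both handlers are thin adapters that only normalise their input; the arithmetic lives in one place, which is what keeps the two channels from drifting apart.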

Why a vanilla-JS PWA with zero build step?

The frontend is native ES6 modules served straight by FastAPI: no bundler, no node_modules, no build to rot. Assets are cache-busted with ?v=N bumped on deploy. The payoff is an installable, iOS-native-feeling app (splash screen, fullscreen, glass scroll edges) that I can still understand and ship in five years without resurrecting a toolchain.


Privacy & data

copain is single-user and built to keep your life on your own network:

  • Network layer: reachable only over Tailscale; the public internet never sees it.
  • Auth layer: every endpoint requires a shared-secret X-API-Key; anything else is a logged 403.
  • Repo layer: your profile, finances, location history, memory store and calendar credentials are all gitignored and never committed. The repo ships templates (profile.example.yaml, .env.example), never real data.
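The auth layer described above can be pictured as one small check run before any handler. The sketch below uses only the standard library so it stays self-contained; `check_api_key` and the logger name are hypothetical, and in the real app the equivalent logic would presumably sit in a FastAPI dependency.

```python
import hmac
import logging

logger = logging.getLogger("copain.auth")  # logger name is an assumption


def check_api_key(headers: dict[str, str], expected: str) -> int:
    """Return the HTTP status an endpoint should answer with: 200 or 403."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    supplied = {k.lower(): v for k, v in headers.items()}.get("x-api-key", "")
    # compare_digest avoids leaking how many leading characters matched.
    if expected and hmac.compare_digest(supplied.encode(), expected.encode()):
        return 200
    logger.warning("rejected request: missing or invalid X-API-Key")
    return 403


status_ok = check_api_key({"X-API-Key": "s3cret"}, "s3cret")
status_bad = check_api_key({"X-API-Key": "guess"}, "s3cret")
```

An empty configured secret is treated as a rejection rather than an open door, so a missing `API_KEY` in `.env` fails closed.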

Quality & rigor

  • 690+ tests across 43 modules, fully mocked: no external services, no network, no flakiness.
  • mypy strict + Ruff (lint & format) enforced via pre-commit and CI.
  • async/await end to end; pure heuristics (restitution, budget math) isolated and unit-tested.
make test            # 690+ tests
make lint typecheck  # ruff + mypy strict

Run it yourself

Setup, configuration & deployment

Local (dev)

cp .env.example .env                              # fill in the variables
cp data/profile.example.yaml data/profile.yaml    # edit with your info
make install                                      # .venv + deps + pre-commit
make test                                         # 690+ tests, fully mocked
make run                                          # uvicorn (needs Ollama + SearXNG)

Essential .env variables

See .env.example for the full list. The essentials:

  • API_KEY: shared secret for X-API-Key (generate something random).
  • ICLOUD_USERNAME / ICLOUD_APP_PASSWORD: Apple ID + an App-Specific Password.
  • ICLOUD_CALENDAR_NAME: default calendar (fuzzy match: Personnel → 🧘 Personnel).
  • HOME_LAT / HOME_LON / HOME_CITY and WORK_*: weather + location context.
  • PUSHOVER_TOKEN / PUSHOVER_USER, SENTRY_DSN: optional (push notifs, monitoring).

iOS configuration

Two Shortcuts on the iPhone (see docs/ios-shortcuts.md):

  1. "Dis à Copain": Siri voice shortcut for hands-free interaction.
  2. Geofence automations: 4 silent automations posting to /event/location.

The PWA needs no setup: open https://<pi-tailscale-host>:8000/ in Safari and "Add to Home Screen".

Docker (Raspberry Pi 5)

make docker-build
make docker-up
docker logs -f copain-bot-1

Ollama runs outside Docker on the Pi (for ARM GPU/NPU access); the container uses network_mode: host and reaches Ollama on localhost:11434.


Documentation

  • CLAUDE.md: detailed architecture, conventions, system-prompt structure, full tree.
  • docs/ios-shortcuts.md: Siri voice command + geofence automations.
  • .env.example: environment variable template.

About

This project is the kind of work I enjoy most: owning a product end to end, from the natural-language pipeline to a polished iOS PWA, with a strong opinion on what it should refuse to do. Always happy to talk shop about local LLMs, assistant design, or self-hosted, privacy-first products.

📫 Find me on my GitHub profile.
