Drop your bank statement PDFs in a folder → get an interactive dashboard of where your money went. 100% local: parsing, database, and AI semantic search all run on your machine. No cloud, no API keys, no data leaves your computer.
npm install
npm run demo # generates 24 months of fake sample statements + starts the app
Open http://localhost:3789 and enter the folder path shown below (or the path to your real statements folder):
<project>/sample-statements
For real data: npm start, then point it at your Bank of America PDF statements folder.
- Startup scan + live watcher — on every app start it scans the configured folders for new PDFs (deduped by file hash, so nothing is imported twice), then keeps watching while running. Drop a new statement in and it's ingested within seconds.
- Multi-folder settings (⚙️) — add or remove statement folders anytime; removing a folder deletes exactly the data that came from it. The settings panel also shows the storage paths in use (SQLite DB file, config file, RAG model cache) with sizes.
- Global clear — one button wipes all folder paths, all imported data, and all embeddings (your PDF files are never touched). The cached embedding model is kept so you don't re-download it.
- PDF parsing — section-aware parser for Bank of America statement layouts (deposits / withdrawals / fees), with statement-period detection so transaction dates get the right year.
- SQLite storage — uses Node's built-in node:sqlite, zero native dependencies. DB lives at data/money-lens.db.
- Auto-categorization — 18 categories via a merchant rules engine (Groceries, Dining, Subscriptions, Rent, …).
- Dashboards with filters — date range (with 3M/6M/1Y/All quick buttons), category, money in/out, amount range, free-text search. Category donut, monthly in/out/net flow, top merchants. Click a donut slice to drill in.
- 🔁 Recurring-charge detector — finds subscriptions and regular bills, shows cadence and lifetime total.
- ⚠️ Anomaly cards — unusually large charges (vs. your typical spend in that category) and possible duplicate charges.
- 🤖 AI analyst (optional) — point the app at a local LLM (Ollama at http://localhost:11434, or any OpenAI-compatible server such as LM Studio) in the AI Model page. Your question → the model writes SQL → the SQL runs read-only against the local DB → the model explains the results, showing the exact SQL and a LOCAL/REMOTE badge. Setup instructions are on the page itself.
- 💬 "Ask my money" (local RAG) — transaction descriptions are embedded with a local MiniLM model (transformers.js, ~25 MB, downloaded once and cached). Questions are answered by hybrid semantic + keyword retrieval; every number in the answer is computed from the matched rows. Works offline after the first model download; degrades to keyword-only if the model can't be fetched.
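To make the "hybrid semantic + keyword retrieval" idea concrete, here is a minimal sketch of one way such a ranking could work. It is not the code in src/rag.js: the function names, the 0.7/0.3 weighting, and the toy 3-dimensional vectors (standing in for real MiniLM embeddings) are all assumptions made for illustration.

```javascript
// Hypothetical sketch: blend a semantic (cosine) score with a keyword-overlap score.
// All names and the default 0.7 semantic weight are illustrative assumptions.

function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Fraction of query terms that appear as whole words in the text.
function keywordScore(query, text) {
  const terms = query.toLowerCase().split(/\W+/).filter(Boolean);
  if (terms.length === 0) return 0;
  const words = new Set(text.toLowerCase().split(/\W+/).filter(Boolean));
  return terms.filter((t) => words.has(t)).length / terms.length;
}

function hybridRank(query, queryVec, rows, wSemantic = 0.7) {
  return rows
    .map((row) => {
      const kw = keywordScore(query, row.description);
      // Without a query vector or a row embedding, score on keywords alone.
      const score = queryVec && row.vec
        ? wSemantic * cosine(queryVec, row.vec) + (1 - wSemantic) * kw
        : kw;
      return { ...row, score };
    })
    .sort((x, y) => y.score - x.score);
}

// Toy vectors stand in for real embeddings.
const rows = [
  { description: 'NETFLIX.COM SUBSCRIPTION', amount: -15.49, vec: [0.9, 0.1, 0.0] },
  { description: 'WHOLE FOODS MARKET', amount: -82.10, vec: [0.0, 0.2, 0.9] },
  { description: 'SPOTIFY USA', amount: -10.99, vec: [0.8, 0.3, 0.1] },
];
const ranked = hybridRank('streaming subscription', [1, 0, 0], rows);

// Any number reported back is computed from the matched rows, not generated.
const total = ranked.slice(0, 2).reduce((sum, r) => sum + r.amount, 0);
console.log(ranked.map((r) => r.description), total.toFixed(2));
```

With these toy inputs the Netflix row ranks first (it matches both semantically and on the word "subscription"), Spotify second on semantic similarity alone, and the top two rows total -26.48. Passing `null` as the query vector exercises the keyword-only path.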
- Node.js ≥ 22.5 (uses the built-in node:sqlite)
src/server.js                 Express API + startup scan trigger
src/ingest.js                 folder scan, chokidar watcher, hash dedupe, embedding backfill
src/parser.js                 BofA PDF → transactions (section-aware state machine)
src/categorizer.js            rules-based categories + merchant normalization
src/embeddings.js             MiniLM local embeddings (lazy-loaded, graceful fallback)
src/rag.js                    hybrid semantic+keyword retrieval and grounded answers
src/insights.js               recurring-charge + anomaly detection
public/                       dashboard (Chart.js)
scripts/generate-samples.js   fake statement generator for demos
data/                         SQLite DB, config.json, cached embedding model (gitignored)
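The "hash dedupe" step listed for src/ingest.js can be pictured roughly as follows. This is a sketch under assumptions, not the project's code: the README does not say which hash is used, so SHA-256 is a guess, and `fileHash`, `selectNewFiles`, and the in-memory `seen` set (standing in for hashes stored in the database) are hypothetical names.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical sketch of content-hash dedupe. SHA-256 is an assumed choice.
function fileHash(buffer) {
  return createHash('sha256').update(buffer).digest('hex');
}

// `seen` stands in for the set of hashes already recorded in the database.
function selectNewFiles(files, seen) {
  const fresh = [];
  for (const file of files) {
    const hash = fileHash(file.bytes);
    if (seen.has(hash)) continue; // identical bytes were imported before, under any name
    seen.add(hash);
    fresh.push({ ...file, hash });
  }
  return fresh;
}

const seen = new Set();
const firstScan = selectNewFiles(
  [
    { path: 'statements/2024-01.pdf', bytes: Buffer.from('January statement') },
    { path: 'statements/2024-02.pdf', bytes: Buffer.from('February statement') },
  ],
  seen,
);
// A renamed copy of January plus one file that is actually new.
const secondScan = selectNewFiles(
  [
    { path: 'statements/copy-of-jan.pdf', bytes: Buffer.from('January statement') },
    { path: 'statements/2024-03.pdf', bytes: Buffer.from('March statement') },
  ],
  seen,
);
console.log(firstScan.length, secondScan.map((f) => f.path));
```

Because the key is the file's content rather than its name or path, the renamed January copy is skipped on the second scan and only the March file is returned for import.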
All processing is local. The only network access ever attempted is the one-time embedding-model download from Hugging Face; if that download is blocked, the app still runs, with search limited to keyword matching.
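One way to get this kind of graceful fallback is to attempt the model load once and record which search mode is available. The sketch below is an illustration only: `createSearchMode` and the injected `loadModel` function are hypothetical stand-ins, and the real loading logic in src/embeddings.js may look quite different.

```javascript
// Hypothetical sketch: attempt the model load once, fall back to keyword-only on failure.
// `loadModel` is an injected stand-in for whatever actually fetches the embedding model.
function createSearchMode(loadModel) {
  let pending = null; // memoized so the load is attempted only once
  return function getMode() {
    if (!pending) {
      pending = Promise.resolve()
        .then(loadModel)
        .then((embed) => ({ mode: 'hybrid', embed }))
        .catch((err) => ({ mode: 'keyword-only', embed: null, reason: err.message }));
    }
    return pending;
  };
}

// Simulate a blocked download.
const offline = createSearchMode(() => {
  throw new Error('network unreachable');
});
// Simulate a successful load with a toy embedder.
const online = createSearchMode(async () => (text) => [text.length]);

const results = Promise.all([offline(), online()]).then(([a, b]) => {
  console.log(a.mode, b.mode);
  return [a, b];
});
```

The failed load resolves to `keyword-only` instead of rejecting, so callers never need their own error handling, and the memoized promise means a blocked download is not retried on every query.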
Delete data/money-lens.db* and restart to re-import everything from scratch.