Local AI-powered assistant for STM32 microcontrollers
RAG over official ST reference manuals · Deterministic calculators · C, C++, and embedded Rust
Release status: v0.1.0-alpha.1 is an alpha technical preview for
developers running STM32 Assistant from a local Python environment with
Ollama or LM Studio.
STM32 Assistant is a local, offline AI tool that answers questions about STM32 peripheral configuration. Instead of relying on the LLM's memory (which hallucinates register values and clock math), it:
- Retrieves relevant pages from official ST reference manual PDFs (hybrid keyword + embedding search)
- Computes exact register values using deterministic calculators (PLL divisors, baud rates, I2C timing, etc.)
- Injects both into the LLM prompt so the generated code is grounded in real documentation and correct math
- Returns three parallel code solutions — HAL, LL/PAC, and direct register-level — plus expert pillars and a pedagogy explanation
The desktop application also keeps short conversation context, provides structured peripheral lessons, and includes deterministic troubleshooting and visualization tools.
Why? LLMs are great at code structure but terrible at STM32 arithmetic.
PLLM=4, PLLN=180, PLLP=2 isn't something you want an LLM to guess; it should be calculated. This tool does the math deterministically and lets the LLM focus on code generation.
┌─────────────────────────────────────────────────────────────────────────┐
│ STM32 Assistant Local AI-powered embedded systems expert ● API│
│ qwen2.5-coder:14b│
├─────────────────────────────────────────────────────────────────────────┤
│ Chip: [F446RE ▾] (•) C ( ) C++ ( ) Rust LLM: [Ollama ▾] │
├─────────────────────────────────────────────────────────────────────────┤
│ ┌─────────────────────────────────────────────────────────┐ [ Ask ]│
│ │ Ask about STM32 peripherals... │ ● ● ● │
│ │ (e.g. 'configure clock for 180 MHz') │ │
│ └─────────────────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────────────────────────┤
│ ┌── HAL ──┬── LL ──┬── Register ──┐ │
│ │ │ │
│ │ // Using HAL functions │ │
│ │ RCC_OscInitTypeDef RCC_OscInitStruct = {0}; │
│ │ RCC_ClkInitTypeDef RCC_ClkInitStruct = {0}; │
│ │ │ │
│ │ RCC_OscInitStruct.PLL.PLLM = 4; │ │
│ │ RCC_OscInitStruct.PLL.PLLN = 180; │ │
│ │ RCC_OscInitStruct.PLL.PLLP = RCC_PLLP_DIV2; │
│ │ │ │
│ │ if (HAL_RCC_OscConfig(&RCC_OscInitStruct) != HAL_OK) { │
│ │ Error_Handler(); │ │
│ │ } │ │
│ └───────────────────────────────────────┘ │
├─────────────────────────────────────────────────────────────────────────┤
│ ┌─ Expert Pillars ──────────────┬─ Pedagogy ───────────────────────────┐│
│ │ Side Effects: │ The code configures the STM32F446RE's ││
│ │ Flash latency must be set │ system clock by enabling the HSE ││
│ │ to 5 wait states at 180 MHz │ oscillator, configuring the PLL with ││
│ │ │ the exact M/N/P divisors, and ││
│ │ Resource Cost: │ switching the system clock source... ││
│ │ ~2KB Flash, 0 SRAM │ ││
│ │ │ ││
│ │ Verification: │ ││
│ │ Check RCC->CR & RCC_CR_PLLRDY│ ││
│ └───────────────────────────────┴───────────────────────────────────────┘│
├─────────────────────────────────────────────────────────────────────────┤
│ Status: Done │
└─────────────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────┐
│ User Query │
│ "configure system clock for 180 MHz" │
└──────────────────┬───────────────────────────┘
│
┌──────────────────▼───────────────────────────┐
│ ExpertEngine │
│ (src/engine/expert_engine.py) │
└──────┬───────────────────────┬────────────────┘
│ │
┌────────────▼──────────┐ ┌─────────▼──────────────┐
│ Retrieval Layer │ │ Calculators Layer │
│ (hybrid search) │ │ (deterministic math) │
├───────────┬───────────┤ ├─────────────────────────┤
│ Keyword │ Embedding │ │ clock_pll │ usart_baud │
│ (FTS5) │ (sqlite- │ │ i2c_timing│ adc_sample │
│ │ vec) │ │ timer_pwm │ │
└─────┬─────┴─────┬─────┘ └──────────┬──────────────┘
│ │ │
┌─────▼─────┐ ┌───▼──────┐ ┌─────▼──────────────┐
│ pages │ │ vec_pages│ │ data/profiles/ │
│ table │ │ table │ │ F446RE.json │
│ (FTS5) │ │ (768-dim)│ │ WL5x.json │
└───────────┘ └──────────┘ └────────────────────┘
│ │ │
└──────┬────┴────────────────────┘
│
┌──────▼───────────────────────────────────────┐
│ Jinja2 Prompt Template │
│ "Context + Calculator output → JSON schema" │
└──────┬───────────────────────────────────────┘
│
┌──────▼───────────────────────────────────────┐
│ LLM Client (OpenAI-compatible) │
├──────────────────────┬───────────────────────┤
│ Ollama (:11434/v1) │ LM Studio (:1234/v1) │
│ qwen2.5-coder:14b │ any loaded model │
└──────────────────────┴───────────────────────┘
│
┌──────▼───────────────────────────────────────┐
│ JSON Response │
│ { hal, ll, register, expert_pillars, │
│ pedagogy } │
└──────────────────────────────────────────────┘
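The prompt-assembly step in the diagram above can be sketched as follows. This is only an illustration: it uses the standard library's `string.Template` as a stand-in for Jinja2, and the template wording, the `build_prompt` helper, and the sample context chunk are hypothetical rather than taken from `src/engine/prompts.py`.

```python
from string import Template

# Hypothetical prompt layout; the project's real Jinja2 template is not shown here.
PROMPT = Template(
    "You are an STM32 expert for the $chip.\n"
    "Reference manual context:\n$context\n\n"
    "Calculator output (use these values verbatim):\n$calc\n\n"
    "Answer as JSON with keys: hal, ll, register, expert_pillars, pedagogy.\n"
)


def build_prompt(chip: str, context_chunks: list[str], calc_lines: list[str]) -> str:
    """Merge retrieved pages and calculator results into one prompt string."""
    context = "\n---\n".join(context_chunks)
    calc = "\n".join(calc_lines)
    return PROMPT.substitute(chip=chip, context=context, calc=calc)


prompt = build_prompt(
    "STM32F446RE",
    ["RM0390: RCC PLL configuration register (RCC_PLLCFGR) ..."],
    ["PLLM = 4", "PLLN = 180", "PLLP = 2", "Flash Latency = 5 wait states"],
)
```

Keeping the calculator lines in their own labeled section makes it easy to check that the generated code reuses them unchanged.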
The headline feature. LLMs can't do STM32 clock math reliably — so we don't ask them to.
| Calculator | Trigger Keywords | What It Computes |
|---|---|---|
| Clock PLL | clock, pll, mhz, frequency | PLLM, PLLN, PLLP, VCO frequency, Flash latency |
| USART Baud | baud, usart, uart | BRR divisor, actual baud rate, error % |
| I2C Timing | i2c, twi | CCR + TRISE (v1) or TIMINGR register (v2) |
| ADC Sample | adc, sample time | Conversion cycles, total time, sample rate |
| Timer PWM | pwm, timer, duty | PSC, ARR, CCR, actual frequency, duty cycle |
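One plausible way to map a query to calculators from the trigger keywords in the table is plain substring matching. The keyword lists below are copied from the table; the `TRIGGERS` mapping, the `select_calculators` helper, and the matching rule are assumptions for illustration, not the project's registry in `src/calculators/base.py`.

```python
# Keyword lists copied from the table above; the matching logic is an assumption.
TRIGGERS: dict[str, tuple[str, ...]] = {
    "Clock PLL": ("clock", "pll", "mhz", "frequency"),
    "USART Baud": ("baud", "usart", "uart"),
    "I2C Timing": ("i2c", "twi"),
    "ADC Sample": ("adc", "sample time"),
    "Timer PWM": ("pwm", "timer", "duty"),
}


def select_calculators(query: str) -> list[str]:
    """Return the calculators whose trigger keywords occur in the query."""
    text = query.lower()
    return [name for name, words in TRIGGERS.items() if any(w in text for w in words)]


clock_hits = select_calculators("configure clock for 180 MHz on F446RE")
usart_hits = select_calculators("configure USART2 for 115200 baud")
```

Substring matching is deliberately crude (a short keyword such as "twi" would also match "twice"), so a real dispatcher would probably tokenize the query first.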
Example: Querying "configure clock for 180 MHz on F446RE" produces:
Calculator output (injected into prompt):
PLLM = 4 (VCO input = 8 MHz / 4 = 2 MHz)
PLLN = 180 (VCO = 2 MHz × 180 = 360 MHz)
PLLP = 2 (SYSCLK = 360 / 2 = 180 MHz ✓)
Flash Latency = 5 wait states
The LLM receives these exact values and writes code around them, so the arithmetic itself is never left to the model.
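The calculation above can be reproduced with a small exhaustive search. The sketch below is not the project's `clock_pll.py`: the limit constants are assumed F446RE values that should be checked against RM0390, the search only considers PLLM values that divide HSE evenly, and the one-wait-state-per-30-MHz rule assumes a 2.7 to 3.6 V supply.

```python
# Assumed F446RE PLL limits (check against RM0390 before relying on them).
VCO_IN_HZ = (1_000_000, 2_000_000)
VCO_OUT_HZ = (100_000_000, 432_000_000)
PLLM_RANGE = range(2, 64)
PLLN_RANGE = range(50, 433)
PLLP_VALUES = (2, 4, 6, 8)


def find_pll(hse_hz: int, sysclk_hz: int) -> tuple[int, int, int]:
    """Return (PLLM, PLLN, PLLP) hitting sysclk_hz exactly, or raise ValueError."""
    for m in PLLM_RANGE:  # ascending M tries the highest VCO input first
        if hse_hz % m:
            continue  # simplification: only integer-Hz VCO inputs
        vco_in = hse_hz // m
        if not VCO_IN_HZ[0] <= vco_in <= VCO_IN_HZ[1]:
            continue
        for p in PLLP_VALUES:
            vco_out = sysclk_hz * p
            if not VCO_OUT_HZ[0] <= vco_out <= VCO_OUT_HZ[1]:
                continue
            if vco_out % vco_in == 0 and vco_out // vco_in in PLLN_RANGE:
                return m, vco_out // vco_in, p
    raise ValueError("no exact PLL configuration found")


def flash_latency_ws(sysclk_mhz: int) -> int:
    """Wait states assuming one per 30 MHz step (2.7-3.6 V supply)."""
    return (sysclk_mhz - 1) // 30


pll = find_pll(8_000_000, 180_000_000)
latency = flash_latency_ws(180)
```

With an 8 MHz HSE and a 180 MHz target, the search lands on the same M/N/P triple and wait-state count as the calculator output shown above.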
User Query
│
▼
┌─────────────────────────────────────────────────────┐
│ Keyword Search (FTS5) │
│ - BM25 ranking across all indexed pages │
│ - Exact register-name boost (e.g. RCC_PLLCFGR) │
│ - Filtered to the chip's reference manual PDF │
└──────────────────────┬──────────────────────────────┘
│ top 20 candidates
▼
┌─────────────────────────────────────────────────────┐
│ Embedding Rerank (sqlite-vec) │
│ - nomic-embed-text (768-dim) via Ollama │
│ - Cosine distance reranking │
│ - Falls back to keyword-only if embeddings offline │
└──────────────────────┬──────────────────────────────┘
│ top 5 context chunks
▼
Injected into LLM prompt
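The two-stage flow above (keyword candidates, then embedding rerank, with a keyword-only fallback) can be sketched in pure Python. The `hybrid_rank` and `cosine_distance` helpers and the toy two-dimensional vectors are hypothetical; the real pipeline uses FTS5 and sqlite-vec with 768-dimensional embeddings.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def hybrid_rank(keyword_hits, page_vecs, query_vec, top_k=5, candidates=20):
    """Rerank BM25-ordered page ids by embedding distance.

    keyword_hits: page ids, best keyword match first.
    page_vecs: page id -> embedding; query_vec: None when embeddings are offline.
    """
    pool = keyword_hits[:candidates]
    if query_vec is None:
        return pool[:top_k]  # keyword-only fallback
    # sorted() is stable, so pages without a vector keep keyword order at the end
    ranked = sorted(
        pool,
        key=lambda pid: cosine_distance(page_vecs[pid], query_vec)
        if pid in page_vecs
        else math.inf,
    )
    return ranked[:top_k]


hits = [101, 102, 103]  # toy page ids in keyword order
vecs = {101: [0.0, 1.0], 102: [1.0, 0.0], 103: [0.7, 0.7]}
```

Passing `query_vec=None` models the case where the embedding model is unreachable, in which case the keyword order is returned unchanged.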
Every response includes three levels of abstraction so you can pick what fits your project:
- C/C++: HAL, LL, and direct register writes.
- Rust HAL: chip-family no_std HAL crate, PAC access, and unsafe raw-register access.
- Rust Embassy: async Embassy APIs, PAC access, and unsafe raw-register access.
Rust dependency versions, target triples, chip features, and device assumptions come from the selected chip profile and are included in the pedagogy panel.
Plus:
- Expert Pillars — side effects, resource cost, and verification steps
- Pedagogy — 2-3 sentence technical explanation of how it works
No cloud calls. Everything runs on your machine:
- LLM: Ollama or LM Studio (OpenAI-compatible API)
- Embeddings: nomic-embed-text via Ollama (274 MB)
- Storage: SQLite + sqlite-vec (single file, no server)
- PDFs: Parsed locally with pypdf + pdfplumber
git clone <repository-url> stm32-assistant
cd stm32-assistant
# Create venv and install
python3 -m venv .venv
source .venv/bin/activate
pip install -e '.[dev]'

# For Ollama
ollama pull qwen2.5-coder:14b
ollama pull nomic-embed-text

# First run builds the local page and register indexes automatically
./run.sh
# Or bootstrap and run separately:
stm32-assistant-bootstrap
stm32-assistant-api # API server on :8000
stm32-assistant-gui # Desktop GUI (needs API running)

Build semantic embeddings after Ollama is running:

stm32-assistant-bootstrap --embeddings

Source checkouts include the current reference manuals. Wheel installations
keep writable indexes under the platform user-data directory and require the
manual PDFs to be placed in its pdfs/ subdirectory before bootstrapping.
On Linux the default is ~/.local/share/stm32-assistant.
curl -X POST http://localhost:8000/api/query \
-H "Content-Type: application/json" \
-d '{
"chip_part": "F446RE",
"query": "configure system clock for 180 MHz using HSE 8 MHz",
"language": "C",
"backend": "ollama"
}' | jq .

For Rust:
{
"chip_part": "F446RE",
"query": "configure USART2 baud 115200",
"language": "Rust",
"rust_framework": "embassy"
}

The response remains backward-compatible: hal contains Rust HAL or Embassy
code, ll contains PAC code, and register contains raw-register Rust.
Response:
{
"hal": "RCC_OscInitTypeDef RCC_OscInitStruct = {0};\n...",
"ll": "LL_RCC_PLL_ConfigDomain_SYS(LL_RCC_PLLSOURCE_HSE, LL_RCC_PLLM_DIV_4, 180, LL_RCC_PLLP_DIV_2);\n...",
"register": "RCC->PLLCFGR = 0x24402D04;\n...",
"expert_pillars": {
"side_effect": "Flash latency must be set to 5 wait states",
"resource_cost": "~2KB Flash, 0 SRAM",
"verification": "Check RCC->CR & RCC_CR_PLLRDY"
},
"pedagogy": "The PLL multiplies HSE by N/M and divides by P..."
}

| Query | Calculator | Chip |
|---|---|---|
| configure system clock for 180 MHz using HSE 8 MHz | Clock PLL | F446RE |
| configure system clock for 48 MHz using HSE 32 MHz | Clock PLL | WL5x |
| configure USART2 for 115200 baud | USART Baud | F446RE |
| configure I2C1 for 400 kHz fast mode | I2C Timing | F446RE |
| configure TIM1 for 1 kHz PWM at 50% duty cycle | Timer PWM | F446RE |
| configure ADC1 for 12-bit with 15 cycle sample time | ADC Sample | F446RE |
| what are the bit fields of RCC_PLLCFGR | Register lookup | F446RE |
| explain how the PLL works | Conceptual | Any |
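Any of the queries in the table can also be sent from Python instead of curl. The following is a minimal standard-library sketch that mirrors the request fields shown earlier; the `build_payload` and `ask` helpers are hypothetical, and always sending `backend` is an assumption (the Rust example omits it).

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/query"  # default host/port from this README


def build_payload(chip_part, query, language="C", backend="ollama", rust_framework=None):
    """Assemble the JSON body; rust_framework is only sent for Rust queries."""
    payload = {"chip_part": chip_part, "query": query, "language": language, "backend": backend}
    if language == "Rust" and rust_framework:
        payload["rust_framework"] = rust_framework
    return payload


def ask(payload, url=API_URL, timeout=120):
    """POST the payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


payload = build_payload("F446RE", "configure USART2 for 115200 baud")
# answer = ask(payload)  # requires the API server to be running
```

The generous timeout is a guess; local 14B models can take a while to produce all three code variants.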
| Chip | Family | Core | Max SysClk | Manual |
|---|---|---|---|---|
| STM32F446RE | F4 | Cortex-M4F | 180 MHz | RM0390 |
| STM32WL5x | WL | Cortex-M4F | 48 MHz | RM0453 |
- Copy the reference manual PDF to data/pdfs/
- Create a JSON profile in data/profiles/:
{
"part": "G474RE",
"family": "G4",
"manual_pdf": "rm0440-stm32g4xx-*.pdf",
"max_sysclk_mhz": 170,
"vco": { "min_mhz": 96, "max_mhz": 344 },
"pll": {
"m_range": [1, 8],
"n_range": [8, 127],
"p_values": [2, 3, 5, 7]
},
"hse_default_mhz": 24,
"flash_latency_table": [
{ "max_sysclk_mhz": 34, "latency_ws": 0 },
{ "max_sysclk_mhz": 170, "latency_ws": 4 }
],
"peripherals": { "usart": [...], "i2c": [...], "adc": [...], "timer": [...] }
}
- Rebuild local data:
stm32-assistant-bootstrap
See docs/adding_a_chip.md for full details.
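To show how a calculator might consume such a profile, here is a small sketch that resolves Flash wait states from `flash_latency_table`. The `flash_latency` helper is hypothetical, and the embedded JSON is a trimmed copy of the example profile above rather than a complete, validated G474RE profile.

```python
import json

# Trimmed copy of the example profile above (only the fields used here).
PROFILE = json.loads("""
{
  "part": "G474RE",
  "max_sysclk_mhz": 170,
  "flash_latency_table": [
    {"max_sysclk_mhz": 34, "latency_ws": 0},
    {"max_sysclk_mhz": 170, "latency_ws": 4}
  ]
}
""")


def flash_latency(profile: dict, sysclk_mhz: float) -> int:
    """Return the wait states of the first table row that covers sysclk_mhz."""
    if sysclk_mhz > profile["max_sysclk_mhz"]:
        raise ValueError(f"{sysclk_mhz} MHz exceeds {profile['part']} maximum")
    rows = sorted(profile["flash_latency_table"], key=lambda r: r["max_sysclk_mhz"])
    for row in rows:
        if sysclk_mhz <= row["max_sysclk_mhz"]:
            return row["latency_ws"]
    raise ValueError("flash_latency_table does not cover this frequency")
```

Because the lookup is table-driven, supporting another chip is a matter of supplying the right rows rather than changing code.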
| Endpoint | Method | Description |
|---|---|---|
| /api/query | POST | Submit a query, get structured JSON response |
| /api/query/stream | POST | Streaming response (SSE) |
| /api/health | GET | Check LLM backend health |
| /api/manuals | GET | List indexed reference manual PDFs |
| /api/registers/{name} | GET | Structured register lookup (bit-fields, offset, reset value) |
| /api/chips | GET | List supported chip profiles |
Interactive Swagger docs at http://localhost:8000/docs.
stm32-assistant/
├── run.sh # One-command launcher (API + GUI)
├── data/
│ ├── pdfs/ # ST reference manual PDFs
│ ├── profiles/ # Chip JSON profiles (F446RE, WL5x)
│ └── index.db # SQLite: pages + registers + embeddings
├── src/
│ ├── config.py # Pydantic settings (env-driven)
│ ├── profiles.py # Chip profile loader
│ ├── parser/
│ │ ├── bootstrap.py # Pages + registers + optional embeddings
│ │ ├── db.py # SQLite schema + connection
│ │ ├── indexer.py # PDF → pages table (pypdf, incremental)
│ │ └── register_extractor.py # PDF → registers table (pdfplumber)
│ ├── retrieval/
│ │ ├── keyword.py # FTS5 + BM25 + register-name boost
│ │ ├── embeddings.py # nomic-embed-text + sqlite-vec
│ │ └── hybrid.py # Keyword candidates → embedding rerank
│ ├── calculators/
│ │ ├── base.py # Calculator ABC + registry
│ │ ├── clock_pll.py # PLL M/N/P + Flash latency
│ │ ├── usart_baud.py # BRR divisor + baud error
│ │ ├── i2c_timing.py # CCR/TRISE (v1) or TIMINGR (v2)
│ │ ├── adc_sample.py # Conversion time + sample rate
│ │ └── timer_pwm.py # PSC/ARR/CCR for PWM
│ ├── engine/
│ │ ├── llm_client.py # OpenAI-compatible (Ollama + LM Studio)
│ │ ├── lesson_engine.py # Structured peripheral lessons
│ │ ├── prompts.py # Jinja2 templates (JSON schema)
│ │ └── expert_engine.py # Orchestrates retrieval + calc + LLM
│ ├── api/
│ │ ├── app.py # FastAPI app
│ │ ├── routes.py # API endpoints
│ │ └── schemas.py # Pydantic request/response models
│ └── gui/
│ ├── lesson_dialog.py # Learning hub
│ ├── main_window.py # PySide6 QMainWindow (dark theme)
│ ├── workers.py # QThread HTTP worker
│ ├── syntax.py # Pygments C syntax highlighter
│ └── theme.py # Catppuccin Mocha QSS stylesheet
├── tests/ # Mocked default suite + opt-in live tests
└── docs/
├── architecture.md # Full architecture doc
├── adding_a_chip.md # How to add chip support
└── user_manual.md # Complete user guide
All settings are read from environment variables (prefix STM32_ASSISTANT_) or from a .env file:
| Variable | Default | Description |
|---|---|---|
| STM32_ASSISTANT_LLM_BACKEND | ollama | ollama or lmstudio |
| STM32_ASSISTANT_LLM_MODEL | qwen2.5-coder:14b | LLM model name |
| STM32_ASSISTANT_EMBED_MODEL | nomic-embed-text | Embedding model |
| STM32_ASSISTANT_USE_EMBEDDINGS | true | Rerank with vectors when available |
| STM32_ASSISTANT_LLM_TEMPERATURE | 0.2 | Sampling temperature; lower values give more repeatable output |
| STM32_ASSISTANT_LLM_MAX_TOKENS | 4096 | Max response tokens |
| STM32_ASSISTANT_RETRIEVAL_TOP_K | 5 | Context chunks to retrieve |
| STM32_ASSISTANT_API_HOST | 127.0.0.1 | API bind address |
| STM32_ASSISTANT_API_PORT | 8000 | API port |
See docs/user_manual.md for the full list.
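The prefix convention can be illustrated with a standard-library sketch. The project itself uses pydantic-settings in `src/config.py`; the `Settings` dataclass and `load_settings` helper below are a simplified, hypothetical stand-in that covers only a few of the variables in the table.

```python
import os
from dataclasses import dataclass

PREFIX = "STM32_ASSISTANT_"


@dataclass(frozen=True)
class Settings:
    llm_backend: str = "ollama"
    llm_model: str = "qwen2.5-coder:14b"
    use_embeddings: bool = True
    llm_temperature: float = 0.2
    retrieval_top_k: int = 5
    api_port: int = 8000


def load_settings(env=os.environ) -> Settings:
    """Override each default with STM32_ASSISTANT_<FIELD> when it is set."""
    values = {}
    for name, default in vars(Settings()).items():
        raw = env.get(PREFIX + name.upper())
        if raw is None:
            continue
        if isinstance(default, bool):  # check bool first: bool is a subclass of int
            values[name] = raw.strip().lower() in ("1", "true", "yes", "on")
        else:
            values[name] = type(default)(raw)
    return Settings(**values)
```

For example, exporting `STM32_ASSISTANT_LLM_BACKEND=lmstudio` before calling `load_settings()` would switch the backend while leaving every other default untouched.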
# Default: mocked LLM (no Ollama needed)
pytest
# Live tests: requires Ollama running
pytest -m live
# Verify pinned Rust crate and target combinations
cargo check --manifest-path tests/rust_fixtures/Cargo.toml --target thumbv7em-none-eabihf --features f446-hal
cargo check --manifest-path tests/rust_fixtures/Cargo.toml --target thumbv7em-none-eabihf --features f446-embassy
cargo check --manifest-path tests/rust_fixtures/Cargo.toml --target thumbv7em-none-eabihf --features wl55-hal
cargo check --manifest-path tests/rust_fixtures/Cargo.toml --target thumbv7em-none-eabihf --features wl55-embassy
# Lint + type check
ruff check .
ruff format --check .
mypy src

| Layer | Technology |
|---|---|
| Language | Python 3.12 |
| LLM | Ollama / LM Studio (OpenAI-compatible API) |
| Embeddings | nomic-embed-text (768-dim) via Ollama |
| PDF Parsing | pypdf (text) + pdfplumber (register tables) |
| Storage | SQLite + FTS5 + sqlite-vec |
| Web Framework | FastAPI + Uvicorn |
| GUI | PySide6 (Qt6) + Pygments syntax highlighting |
| Templates | Jinja2 |
| Validation | Pydantic v2 + pydantic-settings |
| Testing | pytest + pytest-asyncio |
| Linting | Ruff + Mypy |
- User asks: "configure system clock for 180 MHz on F446RE"
- Retrieval: Hybrid search finds relevant pages from RM0390 (RCC chapter, PLL section)
- Calculator: ClockPllCalculator computes exact PLL values from the F446RE profile:
  - HSE 8 MHz → ÷4 (PLLM) → 2 MHz VCO input
  - 2 MHz × 180 (PLLN) → 360 MHz VCO
  - 360 MHz ÷ 2 (PLLP) → 180 MHz SYSCLK ✓
  - Flash latency = 5 wait states (from the latency table)
- Prompt: Jinja2 template combines context + calculator output + JSON schema
- LLM: qwen2.5-coder:14b generates HAL, LL, and register code using the exact values
- Response: Validated JSON with hal, ll, register, expert_pillars, pedagogy
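The final validation step can be approximated with the standard library. The project validates with Pydantic models in `src/api/schemas.py`; the `validate_response` helper below is a hypothetical stand-in that only checks for the keys shown in the example response.

```python
import json

REQUIRED = ("hal", "ll", "register", "expert_pillars", "pedagogy")
PILLARS = ("side_effect", "resource_cost", "verification")


def validate_response(raw: str) -> dict:
    """Parse the LLM output and check the keys shown in the example response."""
    data = json.loads(raw)
    missing = [k for k in REQUIRED if k not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    missing = [k for k in PILLARS if k not in data["expert_pillars"]]
    if missing:
        raise ValueError(f"missing expert_pillars keys: {missing}")
    return data
```

Rejecting incomplete output early lets the caller retry the LLM call instead of showing a partially filled panel.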
MIT
The source code is MIT licensed. The bundled STMicroelectronics manuals remain subject to STMicroelectronics' own terms; verify redistribution rights before publishing a release that includes those PDFs.
Built for embedded engineers who want AI assistance without arithmetic hallucinations.
