Document AI = any content requiring IDP (Intelligent Document Processing) — scans / photos / PDF images / Office files / digital-born documents → trustworthy structured data. A channel layer, not an end-product. It doesn't consume, doesn't own, doesn't dive into business — it hands Markdown + structured metadata to downstream RAG platforms, business systems, and AI clients via REST / EventBus / MCP server / Webhook.
For the full positioning, architecture rules, OUT-of-scope list, Markdown-first contract, multi-stage ETO event contract, and security covenant, see CLAUDE.md. It is the source of truth; this README covers only the operational entry points.
content requiring IDP: scans / photos / PDF images / Office files / digital-born documents
↓
[Document AI channel]: OCR + Markdown + system metadata + type-bound field extraction
↓ (REST / EventBus / MCP server / Webhook)
├─→ downstream RAG platform
├─→ business systems (finance / CLM / HR / ERP)
├─→ AI clients (Claude Desktop / Cursor / any MCP client)
└─→ any consumer (build your own subscriber)
document-ai/
├── core/ # Channel implementation — ABP layers (Abstractions / Domain.Shared / Domain / Application / EntityFrameworkCore / HttpApi / Mcp)
├── host/ # Host application — provider wiring (OCR + AI) and middleware (ASP.NET Core API)
├── angular/ # Angular SPA (operator UI)
└── docs/ # Operator-facing documentation (design decisions go to GitHub Issues, not here)
Business modules (contract management / invoice management / HR records / etc.) are not in this repo — they belong on the downstream consumer side per the channel philosophy.
| Requirement | Minimum version | Notes |
|---|---|---|
| .NET SDK | 10.0 | |
| Node.js | 20.19 | Required for the Angular frontend (Angular 21 needs Node 20.19+ or 22.12+) |
| SQL Server | 2019+ | LocalDB works for development; production runs full SQL Server |
| Docker Desktop | any recent | Optional but recommended — runs the PaddleOCR sidecar and the local OpenTelemetry dashboard |
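The version floors in the table can be checked before the first build. The following is a minimal sketch, not part of the repository: the helper names `version_ge` and `check_prereqs` are hypothetical, and the comparison relies on `sort -V` (GNU coreutils or a recent BSD `sort`). It only checks the lower bound, so it does not flag Node releases between 21.0 and 22.11.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when dotted version A is >= B.
# Assumption: `sort -V` (version sort) is available on this machine.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_prereqs: prints one line per tool that is missing or too old.
# Only the minimum versions from the table are checked.
check_prereqs() {
  local node_v dotnet_v
  node_v=$(node --version 2>/dev/null | sed 's/^v//')
  dotnet_v=$(dotnet --version 2>/dev/null)
  version_ge "${node_v:-0}" 20.19 || echo "Node.js 20.19+ required (found: ${node_v:-none})"
  version_ge "${dotnet_v:-0}" 10.0 || echo ".NET SDK 10.0+ required (found: ${dotnet_v:-none})"
}
```

Running `check_prereqs` after sourcing the file prints nothing when both tools meet the minimum.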
The host currently wires the Vision LLM OCR provider by default (see Choosing an OCR provider), which needs no sidecar — it reuses the DocumentAI AI-provider configuration below. If you switch the host to the PaddleOCR provider, start its Docker container first:
cd host
docker compose up -d paddleocr
First run downloads ~600 MB of model weights and takes 30–60 seconds. Subsequent starts are instant.
Create host/src/appsettings.Development.json with your local SQL Server connection string and an LLM provider key:
{
"Serilog": { "MinimumLevel": { "Default": "Debug" } },
"ConnectionStrings": {
"Default": "Server=YOUR_DB_SERVER;Database=Document AI-Dev;User ID=YOUR_USER;Password=YOUR_PASSWORD;TrustServerCertificate=true"
},
"StringEncryption": {
"DefaultPassPhrase": "any-random-string-here"
},
"DocumentAI": {
"Endpoint": "https://api.openai.com/v1",
"ApiKey": "YOUR_REAL_API_KEY",
"ChatModelId": "gpt-4o-mini",
"VisionOcrModelId": "gpt-4o-mini"
}
}
This file is git-ignored. In Development mode, the application automatically generates temporary OpenIddict certificates, so no .pfx file is needed. For LocalDB, the committed appsettings.json default (Server=(LocalDb)\MSSQLLocalDB;...) already works without any override.
An LLM provider is mandatory — classification and field extraction have no non-LLM fallback, and the host fails fast at startup while DocumentAI:ApiKey is still the committed placeholder. Any OpenAI-compatible endpoint works; with the default Vision LLM OCR provider, VisionOcrModelId must point at a vision-capable model. See docs/ai-provider.md.
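Since the host refuses to start with a placeholder key, it can be convenient to catch that before `dotnet run`. The sketch below is illustrative only: `check_api_key` is a hypothetical helper, it uses a plain text match rather than a JSON parser, and its default placeholder is the template value from the example configuration, which may differ from the placeholder committed in appsettings.json.

```shell
#!/usr/bin/env bash
# check_api_key FILE [PLACEHOLDER]
# Prints "ok" when FILE contains a non-empty "ApiKey" value that differs from
# PLACEHOLDER. Returns 1 when the key is empty or a placeholder, 2 when FILE is missing.
# Assumption: PLACEHOLDER defaults to the template value from the example config.
check_api_key() {
  local file="$1" placeholder="${2:-YOUR_REAL_API_KEY}" key
  [ -f "$file" ] || { echo "missing: $file" >&2; return 2; }
  # Take the first "ApiKey": "..." value found (simple text match, not a JSON parser).
  key=$(sed -n 's/.*"ApiKey"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$file" | head -n 1)
  if [ -z "$key" ] || [ "$key" = "$placeholder" ]; then
    echo "DocumentAI:ApiKey is empty or still a placeholder in $file" >&2
    return 1
  fi
  echo "ok"
}

# Example (run from the repository root):
#   check_api_key host/src/appsettings.Development.json
```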
cd host/src
abp install-libs
cd host/src
dotnet run
API: https://localhost:44348. Swagger: https://localhost:44348/swagger.
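To confirm the host is actually listening, the Swagger URL above can be polled until it responds. This is a rough sketch under assumptions: `wait_until` is a hypothetical generic retry helper, and the `-k` flag in the usage comment assumes the local development certificate is not trusted by curl.

```shell
#!/usr/bin/env bash
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted, sleeping between tries.
wait_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    # Sleep only when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}

# Example: poll the Swagger endpoint for up to about a minute.
#   wait_until 30 2 curl -fsk -o /dev/null https://localhost:44348/swagger
```

Because the probe is an ordinary command, the same helper can wrap any other readiness check.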
The Angular SPA lives in the repository-root angular/ directory (an Nx workspace):
cd angular
npm install
npm start
SPA: http://localhost:4200. Default seeded credentials: admin / 1q2w3E*.
Document AI ships three OCR providers; the host enables exactly one ([DependsOn(...)] in host/src/DocumentAIHostModule.cs + the matching ProjectReference in host/src/Dignite.DocumentAI.Host.csproj):
- Vision LLM — the host's current default (#259). Sends images / rasterized PDF pages to a vision-capable IChatClient model; the strongest option for phone photos, thermal receipts, and image-only PDFs. No sidecar — only a vision model id. See docs/ocr-vision-llm.md.
- PaddleOCR — local Docker sidecar (PP-StructureV3, CPU); data never leaves the network. See docs/ocr-paddleocr.md.
- Azure Document Intelligence — cloud option (prebuilt-layout, high accuracy) when data is allowed to leave the network. See docs/ocr-azure-document-intelligence.md.
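Because the active provider is fixed by a project reference rather than a runtime setting, one quick way to see what the host is currently wired to is to list the ProjectReference entries in the host project file. This is a minimal sketch: `list_project_refs` is a hypothetical helper, and since this README does not state the provider project names, the output has to be read against the [DependsOn(...)] list in the module class.

```shell
#!/usr/bin/env bash
# list_project_refs CSPROJ
# Prints the Include path of every <ProjectReference> element, one per line.
# Simple line-based match; assumes each reference sits on a single line.
list_project_refs() {
  sed -n 's/.*<ProjectReference[^>]*Include="\([^"]*\)".*/\1/p' "$1"
}

# Example (run from the repository root):
#   list_project_refs host/src/Dignite.DocumentAI.Host.csproj
```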
Full selection guidance, configuration, and resource footprint: see docs/text-extraction.md.
For database connection strings, OpenIddict signing certificate, string-encryption key, and the Docker layout, see docs/deployment.md. For per-release smoke tests, see docs/deployment-checklist.md.
Feature docs (start here for any specific topic):
- Local development setup — prerequisites, Docker sidecars, configuration, troubleshooting
- Text extraction — Markdown-first contract, the two extraction paths, OCR provider comparison
- PaddleOCR — local OCR sidecar (PP-StructureV3, CPU); model choice and resource footprint
- Azure Document Intelligence — cloud OCR (prebuilt-layout); resource setup and F0 tier limits
- Vision-LLM OCR — multimodal-IChatClient OCR for photos / thermal receipts / image-only PDFs
- Classification — document-type pipeline and prompt tuning
- Reprocessing — bulk re-run of classification / field extraction over existing documents after a config change
- Export templates — per-tenant CSV / XLSX file egress: field projection, rename, ordering — zero business transformation
- MCP server — document resources + structured search tool over Streamable HTTP, OpenIddict Bearer auth
- AI provider — provider wiring for the two keyed chat clients (title generator + structured)
- Observability — OpenTelemetry pipeline, aspire-dashboard for local dev, switching OTLP backends
- Pipeline runs — run history and review-UI payloads
- Deployment — DB, certificate, Docker
- Deployment checklist — per-release smoke tests
External references:
- ABP Framework Documentation
- Application (Single Layer) Startup Template
- Configuring OpenIddict for Production
Dignite Document AI is licensed under the Apache License 2.0.