Natural language → validated, executable app schemas

A compiler-inspired AI pipeline that converts plain-English product descriptions into structured, cross-validated UI, API, database, and auth configurations, ready to power real applications.
⭐ If you find this useful, give it a star; it helps others discover it.

Solum AI treats app generation like a compiler treats source code:
Natural Language → Intent → System Design → Schemas → Validated → Executable

Instead of a single massive prompt, every input goes through a 5-stage pipeline in which each stage has a strict typed contract. Outputs are validated, inconsistencies are detected, and broken layers are surgically repaired rather than blindly retried.

    ┌─────────────────────────────────────────────────────────────────────┐
    │                          SOLUM AI PIPELINE                          │
    ├────────────┬────────────┬────────────────┬─────────────┬────────────┤
    │ Stage 1    │ Stage 2    │ Stage 3        │ Stage 4     │ Stage 5    │
    │ Intent     │ System     │ Schema         │ Refinement  │ Validate   │
    │ Extraction │ Design     │ Generation     │             │ + Repair   │
    │            │            │ (parallel)     │             │            │
    │ NL →       │ Intent →   │ → UI Schema    │ Cross-layer │ Pydantic   │
    │ Intent     │ Entities   │ → API Schema   │ consistency │ + Repair   │
    │ Schema     │ Pages      │ → DB Schema    │ check +     │ engine     │
    │            │ Flows      │ → Auth Schema  │ LLM fix     │            │
    └────────────┴────────────┴────────────────┴─────────────┴────────────┘
                                       ↓
                          Runtime Artifact Generation
                       (Prisma schema + Express routes)

| Stage | Name | Model | Description |
|---|---|---|---|
| 1 | Intent Extraction | Gemini 2.0 Flash | Parses NL → structured intent with entities, roles, features |
| 2 | System Design | DeepSeek R1 | Converts intent → app architecture, data model, user flows |
| 3 | Schema Generation | Gemini + DeepSeek | Generates UI, API, DB, Auth schemas in parallel |
| 4 | Cross-layer Refinement | Gemini 2.0 Flash | Detects and fixes inconsistencies across all 4 schemas |
| 5 | Validation + Repair | Gemma 3 27B | Pydantic validation + surgical per-layer repair |

Each stage has a strict Pydantic v2 contract. LLM output that doesn't conform is rejected immediately rather than silently passed along, so the system behaves like a typed compiler, not a chatbot.
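
A minimal sketch of what such a contract might look like, assuming Pydantic v2. The `IntentSchema` name appears in the project layout, but the fields used here (`app_name`, `entities`, `roles`, `features`) and the `parse_intent` helper are guesses for illustration, not the project's actual model.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class IntentSchema(BaseModel):
    """Hypothetical Stage 1 contract; the field names are illustrative."""

    # Reject keys the contract does not declare instead of ignoring them.
    model_config = ConfigDict(extra="forbid")

    app_name: str = Field(min_length=1)
    entities: list[str] = Field(min_length=1)
    roles: list[str]
    features: list[str]


def parse_intent(raw_llm_output: str) -> IntentSchema:
    """Validate raw LLM JSON against the contract, raising on any mismatch."""
    return IntentSchema.model_validate_json(raw_llm_output)


good = '{"app_name": "CRM", "entities": ["Contact"], "roles": ["admin"], "features": ["search"]}'
bad = '{"app_name": "CRM", "entities": [], "roles": "admin"}'

intent = parse_intent(good)

try:
    parse_intent(bad)
    rejected = False
except ValidationError as exc:
    rejected = True
    # Each error names the offending field, which can be fed back as repair context.
    failed_fields = sorted({str(err["loc"][0]) for err in exc.errors()})
```

Here the malformed payload fails on three fields at once (an empty `entities` list, a string where `roles` should be a list, and a missing `features` key), and nothing downstream ever sees it.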
Stage 3 runs all four schema generators concurrently via `asyncio.gather`. This cuts latency by ~60% compared to sequential generation while keeping validation independent per schema.
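
The concurrency pattern can be sketched as follows. `generate_schema` and `run_stage3` are hypothetical stand-ins: the real generators call an LLM, whereas this one just sleeps so the example stays self-contained.

```python
import asyncio
import time


async def generate_schema(layer: str, delay_s: float) -> dict:
    """Stand-in for one LLM-backed generator; sleeps instead of calling a model."""
    await asyncio.sleep(delay_s)
    return {"layer": layer, "schema": {}}


async def run_stage3() -> dict[str, dict]:
    layers = ["ui", "api", "db", "auth"]
    # gather() schedules all four coroutines at once and returns their
    # results in the same order as the inputs.
    results = await asyncio.gather(*(generate_schema(name, 0.1) for name in layers))
    return dict(zip(layers, results))


start = time.perf_counter()
schemas = asyncio.run(run_stage3())
elapsed = time.perf_counter() - start
```

With four equal 0.1 s calls, wall time is roughly that of one call rather than the sum of all four. Real savings depend on how uneven the model latencies are.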
When Stage 5 detects issues, it:
- Groups errors by layer (UI / API / DB / Auth)
- Repairs in dependency order: DB → Auth → API → UI
- Re-generates only the broken layer with errors as explicit context
- Re-validates after each repair
This avoids cascading failures and wasted LLM calls.
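
The repair steps above can be sketched as a small loop. This is an illustration under assumptions, not the project's engine: the error shape (`layer`, `message`), the `validate` and `regenerate` callables, and the toy data are all hypothetical. The cap of three attempts per layer matches the limit listed under the design tradeoffs.

```python
from collections import defaultdict
from typing import Callable

REPAIR_ORDER = ["db", "auth", "api", "ui"]  # dependency order: DB, Auth, API, UI
MAX_ATTEMPTS = 3  # per-layer cap


def group_by_layer(errors: list[dict]) -> dict[str, list[str]]:
    grouped: dict[str, list[str]] = defaultdict(list)
    for err in errors:
        grouped[err["layer"]].append(err["message"])
    return grouped


def repair_layers(
    schemas: dict[str, dict],
    validate: Callable[[dict], list[dict]],
    regenerate: Callable[[str, dict, list[str]], dict],
) -> tuple[list[str], list[dict]]:
    """Repair broken layers one at a time; return the attempts made and leftover errors."""
    attempts: list[str] = []
    for layer in REPAIR_ORDER:
        for _ in range(MAX_ATTEMPTS):
            # Re-validate before every attempt so earlier fixes are taken into account.
            layer_errors = group_by_layer(validate(schemas)).get(layer)
            if not layer_errors:
                break
            # Only the broken layer is regenerated, with its errors as context.
            schemas[layer] = regenerate(layer, schemas[layer], layer_errors)
            attempts.append(layer)
    return attempts, validate(schemas)


def toy_validate(schemas: dict) -> list[dict]:
    errors = []
    if "orders" not in schemas["db"]["tables"]:
        errors.append({"layer": "db", "message": "table 'orders' is missing"})
    if schemas["ui"]["endpoint"] not in schemas["api"]["endpoints"]:
        errors.append({"layer": "ui", "message": "unknown endpoint"})
    return errors


def toy_regenerate(layer: str, current: dict, layer_errors: list[str]) -> dict:
    # A real implementation would call the repair model here.
    if layer == "db":
        return {"tables": current["tables"] + ["orders"]}
    if layer == "ui":
        return {"endpoint": "/orders"}
    return current


schemas = {
    "db": {"tables": ["users"]},
    "auth": {"roles": ["admin"]},
    "api": {"endpoints": ["/orders"]},
    "ui": {"endpoint": "/order"},
}
attempts, remaining = repair_layers(schemas, toy_validate, toy_regenerate)
```

In this toy run only the DB and UI layers are regenerated, once each. The Auth and API layers are never touched because validation reports no errors for them.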
Before repair, a deterministic checker verifies:
- Every UI component's `api_endpoint` exists in the API schema
- Every API endpoint's `db_table` exists in the DB schema
- Every API field exists as a column in its table
- Every role referenced in UI/API is defined in Auth
- Every protected route in Auth has a matching UI page
- All DB foreign keys reference real tables
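
A rough sketch of how these six checks might be expressed without any LLM call. Only `api_endpoint` and `db_table` are names taken from the list above; every other key (`pages`, `components`, `fields`, `columns`, `foreign_keys`, `protected_routes`, and so on) is an assumed schema shape chosen for illustration.

```python
def check_cross_layer(ui: dict, api: dict, db: dict, auth: dict) -> list[str]:
    """Return one message per inconsistency, prefixed with the layer it belongs to."""
    errors: list[str] = []
    endpoints = {ep["path"] for ep in api["endpoints"]}
    tables = {t["name"]: t for t in db["tables"]}
    roles = set(auth["roles"])
    pages = {p["route"] for p in ui["pages"]}

    for comp in ui["components"]:
        if comp["api_endpoint"] not in endpoints:
            errors.append(f"ui: {comp['name']} calls unknown endpoint {comp['api_endpoint']}")

    for ep in api["endpoints"]:
        table = tables.get(ep["db_table"])
        if table is None:
            errors.append(f"api: {ep['path']} uses unknown table {ep['db_table']}")
            continue
        for field in ep["fields"]:
            if field not in table["columns"]:
                errors.append(f"api: {ep['path']} field {field} is not a column of {table['name']}")

    for layer, items in (("ui", ui["pages"]), ("api", api["endpoints"])):
        for item in items:
            for role in item.get("roles", []):
                if role not in roles:
                    errors.append(f"{layer}: role {role} is not defined in auth")

    for route in auth["protected_routes"]:
        if route not in pages:
            errors.append(f"auth: protected route {route} has no UI page")

    for table in db["tables"]:
        for fk in table.get("foreign_keys", []):
            if fk["references"] not in tables:
                errors.append(f"db: {table['name']}.{fk['column']} references missing table {fk['references']}")

    return errors


ui = {
    "pages": [{"route": "/dashboard", "roles": ["admin"]}],
    "components": [
        {"name": "OrderTable", "api_endpoint": "/api/orders"},
        {"name": "UserList", "api_endpoint": "/api/users"},
    ],
}
api = {
    "endpoints": [
        {"path": "/api/orders", "db_table": "orders",
         "fields": ["id", "total", "status"], "roles": ["admin", "manager"]},
    ]
}
db = {
    "tables": [
        {"name": "orders", "columns": ["id", "total", "user_id"],
         "foreign_keys": [{"column": "user_id", "references": "users"}]},
    ]
}
auth = {"roles": ["admin"], "protected_routes": ["/dashboard", "/settings"]}

errors = check_cross_layer(ui, api, db, auth)
```

Because the checker is plain set and dictionary lookups, it gives the same answer on every run, and the layer prefix on each message makes it straightforward to group errors per layer for repair.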
Output is not just JSON. The pipeline also generates runtime artifacts directly:
- A Prisma schema (`.prisma` file) from the DB schema
- Express.js route stubs with Zod validators from the API schema
- A `.env` template with all required variables
- Step-by-step setup instructions
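
To make the first item concrete, here is a minimal sketch of rendering a Prisma model from a DB schema entry. The input shape (`name`, `columns`, `type`, `primary_key`, `unique`, `nullable`), the `PRISMA_TYPES` mapping, and the `render_prisma_model` helper are assumptions for illustration; the project's actual generator may differ.

```python
# Hypothetical mapping from schema column types to Prisma scalar types.
PRISMA_TYPES = {
    "int": "Int",
    "string": "String",
    "bool": "Boolean",
    "datetime": "DateTime",
}


def render_prisma_model(table: dict) -> str:
    """Render one table definition as a Prisma `model` block."""
    lines = [f"model {table['name']} {{"]
    for col in table["columns"]:
        prisma_type = PRISMA_TYPES[col["type"]]
        if col.get("nullable"):
            prisma_type += "?"  # Prisma marks optional fields with a trailing ?
        attrs = ""
        if col.get("primary_key"):
            attrs = " @id @default(autoincrement())"
        elif col.get("unique"):
            attrs = " @unique"
        lines.append(f"  {col['name']} {prisma_type}{attrs}")
    lines.append("}")
    return "\n".join(lines)


user_table = {
    "name": "User",
    "columns": [
        {"name": "id", "type": "int", "primary_key": True},
        {"name": "email", "type": "string", "unique": True},
        {"name": "name", "type": "string", "nullable": True},
    ],
}

prisma_model = render_prisma_model(user_table)
```

Since the DB schema has already passed validation by this point, generation can be a plain template step with no model call involved.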

Backend:

| Component | Technology |
|---|---|
| Framework | FastAPI + uvicorn |
| LLM Access | OpenRouter (free tier) |
| Validation | Pydantic v2 |
| Streaming | Server-Sent Events (SSE) |
| Eval logging | SQLite + aiosqlite |
| Concurrency | Python asyncio |

Frontend:

| Component | Technology |
|---|---|
| Framework | Next.js 14 (App Router) |
| Language | TypeScript |
| Styling | CSS variables + Tailwind |
| Streaming | Fetch + ReadableStream |

AI models:

| Role | Model |
|---|---|
| Primary (fast stages) | google/gemini-2.0-flash-exp:free |
| Complex reasoning | deepseek/deepseek-r1:free |
| Surgical repair | google/gemma-3-27b-it:free |
| Fallback | meta-llama/llama-3.3-70b-instruct:free |
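
One way the fallback role could work is sketched below. The model IDs come from the table above; the `RateLimited` exception, the `call` signature, and the retry count and backoff values are hypothetical, and no real OpenRouter request is made.

```python
import time
from typing import Callable

PRIMARY = "google/gemini-2.0-flash-exp:free"
FALLBACK = "meta-llama/llama-3.3-70b-instruct:free"


class RateLimited(Exception):
    """Hypothetical error a transport layer might raise on HTTP 429."""


def call_with_fallback(
    call: Callable[[str, str], str],
    prompt: str,
    models: tuple[str, ...] = (PRIMARY, FALLBACK),
    retries_per_model: int = 2,
    backoff_s: float = 0.0,
) -> tuple[str, str]:
    """Try each model in order, retrying on rate limits before moving on."""
    last_error: Exception | None = None
    for model in models:
        for attempt in range(retries_per_model):
            try:
                return model, call(model, prompt)
            except RateLimited as exc:
                last_error = exc
                time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
    raise RuntimeError("all models exhausted") from last_error


calls: list[str] = []


def fake_call(model: str, prompt: str) -> str:
    # Simulates a primary model that is always rate limited.
    calls.append(model)
    if model == PRIMARY:
        raise RateLimited("429 from upstream")
    return f"[{model}] ok"


used_model, reply = call_with_fallback(fake_call, "Build a CRM")
```

In this run the primary model is tried twice, both attempts hit the simulated rate limit, and the fallback then answers on its first attempt.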

Deployment:

| Service | Platform |
|---|---|
| Backend | Render free tier |
| Frontend | Vercel hobby |
| Database | SQLite on disk |

    solum-ai/
    ├── backend/
    │   ├── main.py                  # FastAPI app, SSE endpoint, metrics API
    │   ├── requirements.txt
    │   ├── llm/
    │   │   ├── client.py            # OpenRouter client, retry + fallback logic
    │   │   └── prompts.py           # All stage prompts (strict JSON-only)
    │   ├── schemas/
    │   │   └── models.py            # Pydantic v2 contracts for all layers
    │   ├── pipeline/
    │   │   ├── stage1_intent.py     # NL → IntentSchema
    │   │   ├── stage2_design.py     # Intent → DesignSchema
    │   │   ├── stage3_schemas.py    # Parallel UI + API + DB + Auth generation
    │   │   ├── stage4_refine.py     # Cross-layer refinement
    │   │   ├── stage5_validate.py   # Validation + surgical repair engine
    │   │   └── runner.py            # Pipeline orchestrator + SSE emitter
    │   ├── validator/
    │   │   └── cross_layer.py       # Deterministic consistency checker
    │   ├── runtime/
    │   │   └── codegen.py           # Prisma schema + Express route generator
    │   └── eval/
    │       ├── db.py                # SQLite metrics logger
    │       └── dataset.json         # 20 evaluation prompts (10 real + 10 edge)
    └── frontend/
        ├── app/
        │   ├── page.tsx             # Main generate page
        │   ├── metrics/page.tsx     # Metrics dashboard page
        │   ├── layout.tsx           # Root layout + nav
        │   └── globals.css          # Design tokens + base styles
        ├── components/
        │   ├── PromptInput.tsx      # Prompt textarea + examples
        │   ├── PipelineProgress.tsx # Live stage-by-stage progress
        │   ├── SchemaViewer.tsx     # Tabbed output viewer
        │   └── MetricsDashboard.tsx # Eval metrics UI
        └── lib/
            ├── types.ts             # All TypeScript types
            └── api.ts               # SSE streaming + fetch helpers

Backend:

    cd backend
    python -m venv venv
    # Windows
    venv\Scripts\activate
    # Mac/Linux
    source venv/bin/activate
    pip install -r requirements.txt
    cp .env.example .env
    # Add your OPENROUTER_API_KEY to .env
    uvicorn main:app --reload --port 8000

Frontend:

    cd frontend
    npm install
    cp .env.example .env.local
    # Set NEXT_PUBLIC_API_URL=http://localhost:8000
    npm run dev

Solum AI includes a built-in evaluation dataset of 20 prompts, tracked across the following metrics:

| Metric | Description |
|---|---|
| Success rate | % of runs that complete all 5 stages |
| Avg latency | End-to-end pipeline time in seconds |
| Avg retries | LLM retries per run (rate limit handling) |
| Repairs performed | Surgical layer repairs per run |
| Failure types | Breakdown of what causes failures |

The dataset is split into:
- 10 real prompts: CRM, e-commerce, LMS, job board, booking system, etc.
- 10 edge cases: single word, vague, conflicting requirements, feature creep, jargon-only, etc.

Live metrics are available at /metrics.
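
As an illustration of how the metrics in the table above could be aggregated, here is a sketch using the standard-library `sqlite3` module with an in-memory database. The project itself lists aiosqlite, and the `runs` table with its columns is a hypothetical layout, as are the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE runs (
        run_id       TEXT PRIMARY KEY,
        success      INTEGER NOT NULL,  -- 1 if all 5 stages completed
        latency_s    REAL NOT NULL,     -- end-to-end pipeline time
        retries      INTEGER NOT NULL,  -- LLM retries in this run
        repairs      INTEGER NOT NULL,  -- layer repairs in this run
        failure_type TEXT               -- NULL for successful runs
    )
    """
)
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("r1", 1, 12.0, 0, 1, None),
        ("r2", 1, 18.0, 2, 0, None),
        ("r3", 0, 30.0, 3, 2, "validation"),
        ("r4", 0, 20.0, 1, 3, "rate_limit"),
    ],
)

# Success rate, average latency, average retries, average repairs.
success_rate, avg_latency, avg_retries, avg_repairs = conn.execute(
    "SELECT 100.0 * AVG(success), AVG(latency_s), AVG(retries), AVG(repairs) FROM runs"
).fetchone()

# Breakdown of failure causes.
failure_types = dict(
    conn.execute(
        "SELECT failure_type, COUNT(*) FROM runs "
        "WHERE failure_type IS NOT NULL GROUP BY failure_type"
    )
)
conn.close()
```

Storing `success` as 0 or 1 lets `AVG` double as the success rate, so every headline metric comes from a single query.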
| Decision | Tradeoff |
|---|---|
| Parallel Stage 3 | +speed, +cost (4 calls at once) vs sequential |
| DeepSeek R1 for design/DB | +quality on complex schemas, +latency |
| Gemma for repair only | Lower cost for targeted fixes vs using primary model |
| Max 2 refinement rounds | Caps cost, accepts minor residual inconsistencies |
| Max 3 repair attempts per layer | Prevents infinite loops, accepts rare failures |
| SQLite over managed DB | Zero cost, acceptable for demo scale |
Stream pipeline execution as SSE.
Request:

    { "prompt": "string", "run_id": "optional uuid" }

SSE Events:

    stage_start          → { stage, name }
    stage_complete       → { stage, name, latency_ms, preview }
    stage_error          → { stage, name, error }
    clarification_needed → { questions, assumptions }
    complete             → { output: SolumOutput }
    error                → { message }
    done                 → (stream end)

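
For readers unfamiliar with the SSE wire format, the events above might be framed and parsed roughly as follows. Whether the server uses named `event:` fields or puts the event type inside the JSON payload is not stated here, so the framing below is an assumption, and `format_sse` and `parse_sse` are hypothetical helpers.

```python
import json


def format_sse(event: str, data: dict) -> str:
    """Serialize one event as an SSE frame; a blank line terminates the frame."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def parse_sse(stream: str) -> list[tuple[str, dict]]:
    """Parse a complete SSE text stream into (event name, payload) pairs."""
    events: list[tuple[str, dict]] = []
    for frame in stream.split("\n\n"):
        if not frame.strip():
            continue
        name = "message"  # default event type when no event field is present
        payload: list[str] = []
        for line in frame.split("\n"):
            field, _, value = line.partition(":")
            value = value.removeprefix(" ")  # a single leading space is not part of the value
            if field == "event":
                name = value
            elif field == "data":
                payload.append(value)
        events.append((name, json.loads("\n".join(payload)) if payload else {}))
    return events


stream = (
    format_sse("stage_start", {"stage": 1, "name": "Intent Extraction"})
    + format_sse(
        "stage_complete",
        {"stage": 1, "name": "Intent Extraction", "latency_ms": 840, "preview": "..."},
    )
    + format_sse("done", {})
)
events = parse_sse(stream)
```

A real client would read the response incrementally and buffer partial frames. This sketch parses a complete string to keep the framing rules visible.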
Returns aggregated eval metrics from SQLite.
Returns the 20-prompt evaluation dataset.
Health check endpoint.
MIT © 2026 Solum AI