Resolver

Event-driven AI customer-support copilot. Per message: triage → RAG retrieve → grounded draft → policy/QA guard → escalate-or-suggest. Runs 100% locally at $0.

What it does

A customer message arrives → the Go API persists it and publishes a message.created event → a Python LangGraph worker consumes it, classifies intent, retrieves grounded knowledge (pgvector), drafts a cited reply, runs a safety/QA guard, and either suggests a draft to a human agent or escalates when confidence is low. Results stream back to a Next.js console via GraphQL subscriptions.

Grounding is mandatory, humans stay in the loop, and an eval harness gates quality in CI so nothing untrustworthy reaches a customer.

Screenshots

[Screenshots: Queue, Conversation, Draft panel, Human approval, Escalation, Sent]

Architecture

flowchart LR
    UI["Next.js console"]
    R["Go API · gqlgen<br/>(thin resolvers)"]
    DB[("Postgres<br/>+ pgvector")]
    BUS{{"Redis Streams<br/>(event bus)"}}
    LLM[("LLM<br/>Ollama / OpenAI")]
    TOOLS["MCP tools<br/>(read-only)"]

    subgraph WK["Python LangGraph worker"]
        direction LR
        T["triage"] --> RT["retrieve"] --> D["draft"] --> G["guard"] --> DEC{"decision"}
        DEC -->|repair ×1| D
    end

    UI -- "mutation" --> R
    R -- "persist" --> DB
    R == "message.created" ==> BUS
    BUS == "consume" ==> T
    WK == "draft.ready / escalated" ==> BUS
    BUS -- "bridge" --> R
    R -. "subscription (live)" .-> UI
    RT -. "vector + FTS" .-> DB
    D -. "grounded gen" .-> LLM
    D -. "order / policy" .-> TOOLS

Go API (services/api) — thin: validate, persist, publish. No LLM/agent logic.
Python worker (workers/agent) — LangGraph state machine; nodes are pure-ish and schema-validated.
Contract — API ↔ worker talk only via the typed event schema (packages/events/events.schema.json) on Redis Streams. The request's trace id rides on the event, so one OpenTelemetry trace spans API → worker → LLM.
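
As a rough illustration of the contract described above, the sketch below shows a worker-side encode/decode pair for a message.created event that carries the request's trace id. The field names, the single `payload` stream field, and the `MessageCreated` class are assumptions made for the example; the authoritative shape is whatever packages/events/events.schema.json defines.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical field names; the real contract lives in
# packages/events/events.schema.json and may differ.
REQUIRED_FIELDS = ("event_id", "type", "conversation_id", "message_id", "trace_id")


@dataclass(frozen=True)
class MessageCreated:
    event_id: str
    type: str
    conversation_id: str
    message_id: str
    trace_id: str  # copied from the API request's trace so the worker can continue it


def encode(event: MessageCreated) -> dict[str, str]:
    """Flatten the event into the string map a stream entry carries."""
    return {"payload": json.dumps(asdict(event))}


def decode(fields: dict[str, str]) -> MessageCreated:
    """Parse and validate a stream entry; raise ValueError on a bad event."""
    data = json.loads(fields["payload"])
    missing = [name for name in REQUIRED_FIELDS if not data.get(name)]
    if missing:
        raise ValueError(f"event missing fields: {missing}")
    if data["type"] != "message.created":
        raise ValueError(f"unexpected event type: {data['type']}")
    return MessageCreated(**{name: data[name] for name in REQUIRED_FIELDS})
```

Rejecting a malformed event at the boundary keeps the graph nodes free of defensive parsing, which is one way to read "talk only via the typed event schema".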

Tech stack

Go + gqlgen · Python + LangGraph · Postgres + pgvector · Redis Streams · MCP-style tools · Next.js + Tailwind + shadcn/ui · Ollama (default) / OpenAI-compatible · OpenTelemetry · GitHub Actions + Docker.

Quickstart

Prerequisites: Docker. (Local dev also: Go 1.25+, Python 3.12+, Node 22+.)

cd resolver_code
cp .env.example .env   # defaults run fully local / $0 — no keys required

make up                # build + boot full stack (postgres+pgvector, redis, ollama, migrate, api, worker, web)
make models            # pull qwen2.5:3b, qwen2.5:7b, nomic-embed-text (first run only)
make ingest            # Bitext -> KB + embeddings + held-out golden set
make eval              # run the eval harness -> report + eval_runs row

Console: http://localhost:3000 — queue, conversation view, live drafts, dashboard.
GraphQL playground / health: http://localhost:8080 · /healthz.
Traces (optional): docker compose --profile observability up jaeger, set OTEL_TRACES_EXPORTER=otlp, open http://localhost:16686.

On a CPU-only box, drafting with the local 3b/7b models is slow (minutes per draft). Point LLM_PROVIDER=openai at a hosted/compatible endpoint for fast responses — see Models & providers.

The gqlgen-generated Go files are not committed; the Docker build and make gqlgen regenerate them from packages/graphql/schema.graphql.

Repo layout

resolver_code/
├── apps/web              Next.js agent console (streaming)        [Phase 4]
├── services/api          Go + gqlgen GraphQL API                 [Phase 1]
├── workers/agent         Python LangGraph graph, rag/, tools/     [Phase 3]
│   ├── graph/nodes       triage · retrieve · draft · guard · decision · repair
│   ├── rag/              embeddings, hybrid (vector + FTS) search, RRF + re-rank
│   ├── tools/            read-only MCP tools (audited, allow-listed)
│   └── llm/              provider adapters (ollama / OpenAI-compatible)
├── pipeline/             ingest_bitext.py + eval/                 [Phase 2/5]
├── packages/graphql      shared schema + codegen TS types
├── packages/events       events.schema.json (event contract)
├── db/migrations         versioned SQL migrations
├── deploy/               docker-compose.yml
└── data/                 golden.jsonl, samples (large files gitignored)
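
The rag/ entry above mentions fusing vector and FTS results with RRF. A minimal sketch of Reciprocal Rank Fusion follows, assuming each retriever returns an ordered list of document ids. The function name and the constant k=60 (a commonly used default) are assumptions for the example, not values taken from this repo.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked id lists into one list, best first.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by id so the order is stable.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]  # hypothetical ids from the pgvector search
fts_hits = ["b", "d", "a"]     # hypothetical ids from Postgres full-text search
print(reciprocal_rank_fusion([vector_hits, fts_hits]))  # ['b', 'a', 'd', 'c']
```

Because RRF uses only ranks, it needs no score normalisation between cosine distance and FTS rank values.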

Models & providers

The LLM provider is an env switch behind one interface — no code change to swap.

Default (local, $0): Ollama. Model tiering reflects cost/quality: qwen2.5:3b for triage/classification, qwen2.5:7b for drafting and the eval judge, nomic-embed-text (768-dim) for embeddings. Pull them with make models.
Hosted: set LLM_PROVIDER=openai and OPENAI_API_KEY (optionally OPENAI_BASE_URL for any OpenAI-compatible endpoint). The worker's chat + embeddings switch with no code change; a missing key fails loudly at startup.

Tradeoffs: local 3b/7b on CPU is slow (minutes per draft) but free and private; grounding/guard are deterministic so safety holds regardless of model strength. A hosted model raises answer quality and speed at a per-token cost (tracked per draft as cost_cents). Generation length is bounded by DRAFT_NUM_PREDICT to cap latency/cost.

Observability

OpenTelemetry traces span the whole path: the API starts a trace per request and stamps its trace id into the message.created event, so the worker continues the same trace across the bus (API → worker → graph/LLM). Per-draft tokens, cost, and latency are recorded on the draft and as span attributes; structured JSON logs carry conversation/trace ids (no secrets/PII at info).

Exporter is env-controlled (OTEL_TRACES_EXPORTER): console (default — spans in logs, $0, no extra service), otlp (ships to OTEL_EXPORTER_OTLP_ENDPOINT), or none. For a trace UI: docker compose --profile observability up jaeger, set OTEL_TRACES_EXPORTER=otlp, and open Jaeger at localhost:16686.

Why it's built this way

The design choices, and what they demonstrate:

"LLM proposes, evals + guards dispose." Every generated answer must cite retrieved KB sources; a deterministic guard (grounding + tone + a forbidden-action allow-list) and a confidence threshold decide suggest vs escalate. An eval harness gates groundedness/routing/safety in CI. Quality is enforced by code, not vibes.
The LangGraph state machine is the source of truth for control flow (triage → retrieve → draft → guard → decision → {finalize | repair | escalate}). Nodes are pure-ish and schema-validated, so each is unit-testable and the whole graph is inspectable.
Event-driven Go ↔ Python contract. The thin Go API never calls an LLM; it validates, persists, and publishes a typed event. All AI work lives in the Python worker. They communicate only through the versioned event schema on Redis Streams — independently deployable, independently scalable.
Human-in-the-loop safety by construction. Nothing auto-sends below the confidence threshold; irreversible actions (refunds, cancellations) are never executed — only proposed as a human task. Tools are read-only, allow-listed, and audited.
Cost/model tiering and local-first. Small model for triage, stronger for drafting; embeddings cached; per-draft tokens/cost recorded. Runs 100% locally at $0 on Ollama, or switches to a hosted provider with one env var.
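
To make the suggest-vs-escalate rule above concrete, here is a small sketch of a deterministic decision step. The threshold value, the action names, the `Draft` fields, and the choice to trigger the single repair on a missing citation are all assumptions for illustration; the README only states that a guard, a confidence threshold, and one repair attempt exist.

```python
from dataclasses import dataclass, field

# Hypothetical values: the real threshold and action names live in the worker.
CONFIDENCE_THRESHOLD = 0.7
FORBIDDEN_ACTIONS = frozenset({"issue_refund", "cancel_order"})
MAX_REPAIRS = 1


@dataclass
class Draft:
    text: str
    citations: list[str]
    confidence: float
    proposed_actions: list[str] = field(default_factory=list)


def decide(draft: Draft, repairs_used: int = 0) -> str:
    """Return 'escalate', 'repair', or 'suggest' for a guarded draft."""
    # Forbidden actions are never finalized, whatever the confidence.
    if any(action in FORBIDDEN_ACTIONS for action in draft.proposed_actions):
        return "escalate"
    # An uncited draft gets one repair attempt, then goes to a human.
    if not draft.citations:
        return "repair" if repairs_used < MAX_REPAIRS else "escalate"
    # Grounded but low-confidence drafts are handed to a human as well.
    if draft.confidence < CONFIDENCE_THRESHOLD:
        return "escalate"
    return "suggest"
```

No branch here calls a model, so the outcome for a given draft is the same whichever LLM produced it.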

Status

Built phase-by-phase:

Phase 0 — foundation & local infra: monorepo skeleton, docker-compose stack, DB migrations (pgvector + HNSW), typed event contract. ✅
Phase 1 — Go GraphQL API: schema-first gqlgen API (thin resolvers → service → pgx store), Redis Streams pubsub bridge, ingestMessage persists + publishes message.created, draft subscription wiring, graceful shutdown, containerized via docker compose up. ✅
Phase 2 — Dataset → KB & RAG ingestion: make ingest loads Bitext, holds out a stratified golden set (data/golden.jsonl), builds deduped KB docs, embeds them with provider-agnostic embeddings (Ollama nomic-embed-text, 768-dim), and upserts to pgvector with an HNSW index. ✅
Phase 3 — LangGraph worker: consumes message.created (Redis consumer group, idempotent by event id, retries + dead-letter), runs the agent graph triage → retrieve → draft → guard → decision → {finalize | repair | escalate} with schema-validated node outputs, persists a grounded SUGGESTED draft (or ESCALATED) with citations + guard report + token cost, and publishes draft.ready / draft.escalated. Forbidden actions are blocked deterministically — never finalized. ✅
Phase 4 — Web agent console: Next.js (App Router) + Tailwind + shadcn/ui console with a typed urql GraphQL client (codegen off the shared schema). Queue with status filter and pagination, conversation view (message thread + full draft panel: confidence meter, grounding sources, guard report), live draftUpdates streaming over graphql-ws, and human-in-the-loop actions (approve/edit → SENT, reject, escalate). Verified live end-to-end: ingest → triage/draft streams in → approve. ✅
Phase 5 — Eval harness & CI gate: make eval runs the real agent graph over the held-out golden set and scores routing (category), retrieval recall@k, groundedness, LLM-judge answer quality, safety (zero forbidden actions), and cost/latency — writing pipeline/eval/reports/REPORT.md and an eval_runs row. Gated on the PRD §4 numbers (groundedness ≥90%, routing ≥85%, safety 0); the run exits non-zero otherwise. GitHub Actions CI runs Go/Python/web tests, builds all images, and runs the sampled eval gate. ✅
Phase 6 — P1 enhancements: hybrid retrieval (pgvector + Postgres FTS → Reciprocal Rank Fusion → lexical rerank); read-only, allow-listed, audited MCP tools wired into drafting; priority queue ordering (urgency/sentiment, composite-cursor pagination); a quality dashboard (auto-draft/escalation rates, cost, p95, eval trend); hosted/local LLM provider switch via env; and OpenTelemetry tracing end-to-end (the API trace id propagates onto the event so the worker continues the same trace). ✅
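
The Phase 5 gate can be pictured as a small threshold check whose result becomes the process exit code. The sketch below uses the thresholds quoted above (groundedness ≥90%, routing ≥85%, zero forbidden actions); the metric key names and function names are hypothetical, and the real harness may compute and report these differently.

```python
import sys

# Thresholds quoted from the Phase 5 description; metric key names are hypothetical.
MINIMUMS = {"groundedness": 0.90, "routing": 0.85}


def gate_failures(metrics: dict[str, float]) -> list[str]:
    """Return one human-readable line per failed gate (empty list means pass)."""
    failures = []
    for name, minimum in MINIMUMS.items():
        value = metrics.get(name, 0.0)  # a missing metric counts as a failure
        if value < minimum:
            failures.append(f"{name} {value:.2%} < {minimum:.0%}")
    # Safety is absolute: any forbidden action fails the run; so does a missing count.
    forbidden = metrics.get("forbidden_actions", 1)
    if forbidden != 0:
        failures.append(f"forbidden_actions {forbidden} != 0")
    return failures


def exit_code(metrics: dict[str, float]) -> int:
    """Print failures to stderr and return the status CI should see."""
    failures = gate_failures(metrics)
    for line in failures:
        print("GATE FAILED:", line, file=sys.stderr)
    return 1 if failures else 0
```

A CI step would then end with something like `sys.exit(exit_code(metrics))`, so a regression in any gated metric fails the build.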

License

MIT — see LICENSE.
