✦ Pod

✦ Pod

Unified proxy for LLM inference. Pod sits in front of your AI providers and exposes a single OpenAI-compatible endpoint — with routing, fallback, caching, rate limiting, and a dashboard built in.

🚧 under active development on v0.0.x

Features

Multi-provider routing — OpenAI, Anthropic, Gemini, Codex, Ollama, 50+ providers
Compatibility APIs — OpenAI, Anthropic, Gemini, and Ollama-compatible endpoints under /v1/*
Semantic cache — deduplicates identical requests; streaming responses are cached too
Conversational memory — automatic injection and extraction across sessions
API key auth — per-key rate limiting (req/min + concurrent cap)
Rate limiting — Redis-backed distributed rate limiter with in-memory fallback
Combos — model groups with fallback and round-robin strategies
Proxy pools — per-provider proxy config with optional Vercel relay
Tunnel support — Tailscale and Cloudflare tunnel integration
Dashboard — full web UI for providers, usage analytics, quota tracking, logs, and health
Account lockout — exponential cooldown on auth failures, visible on /health
PWA & offline-first — installable dashboard with service worker shell caching, offline read cache, and offline mutation queue
Robust cache invalidation — versioned service worker, network-first non-hashed assets, and tag-based offline JSON cache invalidation after safe mutations

Quick Start

Docker (recommended)

docker run -d \
  --name pod \
  -p 20128:20128 \
  -v pod-data:/app/data \
  lazuardytech/pod:latest

Then open http://localhost:20128.

Docker Compose (with Redis + SearXNG)

cd docker
docker compose up -d

This starts Pod, Redis (rate limiting), and SearXNG (private web search) together. Works out of the box.

With an env file:

docker run -d \
  --name pod \
  -p 20128:20128 \
  -v pod-data:/app/data \
  --env-file .env \
  lazuardytech/pod:latest

Local Development

Requires bun v1.3.14+.

bun install
bun run dev        # starts on http://localhost:20128

Environment Variables

Variable	Default	Description
`PORT`	`20128`	HTTP port
`DATA_DIR`	`~/.pod` locally, `/app/data` in Docker	SQLite data directory override
`INITIAL_PASSWORD`	`123456`	Initial dashboard login password. Change after first login.
`JWT_SECRET`	(required)	Required server secret for dashboard auth sessions
`API_KEY_SECRET`	(required)	Required HMAC secret used for generated Pod API keys
`SHUTDOWN_SECRET`	(none)	Shared secret required by `/api/restart` and `/api/shutdown`
`MACHINE_ID_SALT`	`endpoint-proxy-salt`	Salt used for machine-bound identifiers
`ENABLE_REQUEST_LOGS`	`false`	Enable request log capture at runtime
`OBSERVABILITY_ENABLED`	`true`	Enable request-details observability storage
`OBSERVABILITY_MAX_RECORDS`	`200`	Max request-detail rows retained
`OBSERVABILITY_BATCH_SIZE`	`20`	Buffered write batch size for request details
`OBSERVABILITY_FLUSH_INTERVAL_MS`	`5000`	Max delay before flushing buffered request details
`OBSERVABILITY_MAX_JSON_SIZE`	`5`	Max stored JSON payload size in KiB per request-detail blob
`AUTH_COOKIE_SECURE`	`false`	Force secure auth cookies even outside HTTPS autodetection
`REQUIRE_API_KEY`	`false`	Require Pod API keys on `/v1/*` routes and protected health/model-list endpoints
`BASE_URL`	`http://localhost:20128`	Internal base URL used for self-referencing API calls (e.g. model availability checks). Set this when running behind a reverse proxy.
`CLOUD_URL`	(none)	URL of your self-hosted Cloudflare Worker (cloud deployment). Overrides the value stored in settings.
`NEXT_TELEMETRY_DISABLED`	`1`	Disable Next.js telemetry
`SEMANTIC_CACHE_MAX_BYTES`	`4194304`	Semantic cache max size in bytes
`SEMANTIC_CACHE_MAX_SIZE`	`100`	Semantic cache max entries
`SEMANTIC_CACHE_TTL_MS`	`1800000`	Semantic cache TTL (ms)
`PROMPT_CACHE_MAX_BYTES`	`2097152`	Prompt cache max size in bytes
`PROMPT_CACHE_MAX_SIZE`	`50`	Prompt cache max entries
`PROMPT_CACHE_TTL_MS`	`300000`	Prompt cache TTL (ms)
`REDIS_URL`	(none)	Redis connection URL for distributed rate limiting. When set, rate limits are shared across all Pod instances. When unset, falls back to in-memory rate limiting (single-instance only). Example: `redis://localhost:6379`
`IFLOW_OAUTH_CLIENT_SECRET`	(optional)	Required only if you use iFlow OAuth flows or token refresh
`QODER_OAUTH_CLIENT_ID`	(optional)	Optional Qoder OAuth client id override
`QODER_OAUTH_CLIENT_SECRET`	(optional)	Required only if you use Qoder OAuth flows needing a client secret

Redis (optional)

Pod supports Redis-backed distributed rate limiting. When REDIS_URL is set, API key rate limits (requests_per_minute, concurrent_requests) are enforced globally across all Pod instances sharing the same Redis — preventing limit bypass in multi-instance deployments.

With docker compose:

environment:
  REDIS_URL: redis://redis:6379

Without Redis, rate limiting uses an in-memory backend (single-instance safe, but not shared across replicas). Redis is recommended for production multi-instance deployments.

API

Pod exposes standard-compatible endpoints:

Endpoint	Protocol
`POST /v1/chat/completions`	OpenAI
`POST /v1/messages`	Anthropic
`POST /v1/responses`	OpenAI Responses
`POST /v1/embeddings`	OpenAI
`POST /v1/audio/speech`	OpenAI TTS
`POST /v1/audio/transcriptions`	OpenAI STT
`POST /v1/images/generations`	OpenAI
`GET /v1/models`	OpenAI
`GET /v1beta/models`	Gemini
`POST /v1/api/chat`	Ollama

All endpoints accept Authorization: Bearer <key> or x-api-key: <key> when API key auth is enabled.

Supported Providers

Canonical built-in provider definitions live in src/shared/constants/providers.js.

Free access: Kiro AI, Qwen Code, Gemini CLI, iFlow AI, OpenCode Free
Free tier or account/API-key based access: OpenRouter, NVIDIA NIM, Ollama Cloud, Vertex AI, Gemini, Cloudflare, BytePlus ModelArk
OAuth and tool-account providers: Claude Code, Antigravity, OpenAI Codex, GitHub Copilot, Cursor IDE, Kilo Code, Cline
API key and self-hosted providers: GLM Coding, GLM (China), Kimi, Minimax Coding, Minimax (China), Alibaba, Alibaba Intl, Xiaomi MiMo, Volcengine Ark, OpenAI, Anthropic, OpenCode Go, Azure OpenAI, DeepSeek, Groq, xAI (Grok), Mistral, Together AI, Fireworks AI, Cerebras, Cohere, Nebius AI, SiliconFlow, Hyperbolic, Blackbox AI, Chutes AI, Ollama Local, Vertex Partner
Speech, embeddings, image, and search providers: Deepgram, AssemblyAI, NanoBanana API, ElevenLabs, Cartesia, PlayHT, Local Device, Google TTS, Edge TTS, Coqui TTS, Tortoise TTS, Inworld TTS, Voyage AI, SD WebUI, ComfyUI, HuggingFace, Tavily, Brave Search, Serper, Exa, SearXNG, Google PSE, Linkup, SearchAPI, You.com Search, Firecrawl, Fal.ai, Stability AI, Black Forest Labs, Recraft, Topaz, Runway ML, AWS Polly, Jina AI, Jina Reader
Custom nodes: OpenAI-compatible, Anthropic-compatible, and custom embedding nodes can be added from the dashboard

Development

bun install          # install dependencies
bun run dev          # start dev server on :20128
bun run build        # production build
bun run check        # biome format + lint + eslint
bun run test:run     # run vitest

Always run bun run check and bun run test:run before pushing.

See AGENTS.md for project rules (applies to both humans and AI agents). Additional agent context lives in .agents/. See docs/API_INTERNAL.md for the dashboard/internal API reference.

Contributing

See CONTRIBUTING.md for guidelines.

Pod is heavily inspired by 9router and OmniRoute. Credits to their maintainers.

Security

See SECURITY.md for the vulnerability disclosure policy.

Name		Name	Last commit message	Last commit date
Latest commit History 1,014 Commits
.agents		.agents
.github		.github
.rwx		.rwx
cloud		cloud
docker		docker
open-sse		open-sse
public		public
src		src
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
.npmignore		.npmignore
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
DESIGN.md		DESIGN.md
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
biome.json		biome.json
bun.lock		bun.lock
eslint.config.mjs		eslint.config.mjs
jsconfig.json		jsconfig.json
next.config.mjs		next.config.mjs
package.json		package.json
postcss.config.mjs		postcss.config.mjs
vitest.config.mjs		vitest.config.mjs

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

✦ Pod

Features

Quick Start

Docker (recommended)

Docker Compose (with Redis + SearXNG)

Local Development

Environment Variables

Redis (optional)

API

Supported Providers

Development

Contributing

Security

License

About

Uh oh!

Releases

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

✦ Pod

Features

Quick Start

Docker (recommended)

Docker Compose (with Redis + SearXNG)

Local Development

Environment Variables

Redis (optional)

API

Supported Providers

Development

Contributing

Security

License

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Uh oh!

Contributors

Uh oh!

Languages