DepthAPI

DepthAPI is a headless B2B API for Retrieval-Augmented Generation (RAG) and inference infrastructure. It serves as a backend for AI applications, with a focus on production-grade reliability, query routing across multiple LLM providers, and hybrid (vector plus keyword) retrieval.

Applications and Use Cases

B2B RAG Integration: Plug-and-play backend for applications requiring semantic search and document synthesis over large custom datasets.
AI Infrastructure Orchestration: Centralized query routing and rate limiting across multiple model providers (Groq, Gemini, OpenAI) to manage costs and ensure high availability.
Technical Documentation Retrieval: Optimized for parsing, chunking, and retrieving highly technical content with high accuracy using hybrid search.
Intelligent Query Routing: Automatically directing simple queries to fast, low-cost models and complex reasoning tasks to larger, more capable models based on intent classification.

Architecture

+--------------+       +-------------------+       +-----------------------+
|              |       |                   |       |                       |
| API Client   +------>+ FastAPI Router    +------>+ Intent Classifier     |
|              |       |                   |       |                       |
+--------------+       +---------+---------+       +-----------+-----------+
                                 |                             |
                                 v                             v
                       +---------+---------+       +-----------+-----------+
                       |                   |       | Model Routing         |
                       | Hybrid RAG Engine |       | (Alias Chains)        |
                       |                   |       |                       |
                       +---------+---------+       +-----------+-----------+
                                 |                             |
                                 v                             v
                       +---------+---------+       +-----------+-----------+
                       |                   |       | Circuit Breaker &     |
                       | FAISS + BM25      |       | Fallback Orchestrator |
                       | (Local / pgvector)|       | (Redis)               |
                       +-------------------+       +-----------------------+

Key Features

1. Multi-provider Fallback and Circuit Breakers

Implements a stateful circuit breaker using a Redis Lua script to manage rate limits and provider failures. The fallback orchestrator automatically reroutes traffic to alternate models in an alias chain upon detecting retryable errors.
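The breaker-plus-fallback behavior can be sketched as follows. This is a minimal in-memory illustration, not the project's implementation: DepthAPI keeps breaker state in Redis via a Lua script, whereas this sketch uses a plain Python object so it runs without dependencies. The names `RetryableError`, `CircuitBreaker`, and `call_with_fallback`, and the threshold and cooldown values, are assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


class RetryableError(Exception):
    """Hypothetical marker for errors worth retrying elsewhere (e.g. HTTP 429 or 503)."""


@dataclass
class CircuitBreaker:
    failure_threshold: int = 3       # consecutive failures before the circuit opens
    cooldown_seconds: float = 30.0   # how long an open circuit rejects calls
    failures: int = 0
    opened_at: Optional[float] = None

    def allow(self, now: float) -> bool:
        """Closed circuits always allow; open circuits allow again after the cooldown."""
        if self.opened_at is None:
            return True
        return now - self.opened_at >= self.cooldown_seconds

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now


def call_with_fallback(
    chain: List[str],
    providers: Dict[str, Callable[[str], str]],
    breakers: Dict[str, CircuitBreaker],
    prompt: str,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[str, str]:
    """Try each model in the alias chain in order, skipping models whose circuit is open."""
    for name in chain:
        breaker = breakers.setdefault(name, CircuitBreaker())
        if not breaker.allow(clock()):
            continue
        try:
            result = providers[name](prompt)
        except RetryableError:
            breaker.record_failure(clock())
            continue
        breaker.record_success()
        return name, result
    raise RuntimeError("all providers in the alias chain are unavailable")
```

In a multi-worker deployment the counters would need to live in shared storage, which is presumably why the project performs the check-and-update atomically in Redis.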

2. Hybrid Search RAG Pipeline

Retrieval uses a multi-stage pipeline that combines vector similarity (FAISS inner-product search, which equals cosine similarity when embeddings are L2-normalized) with keyword matching (BM25). The two result lists are fused using Reciprocal Rank Fusion (RRF) and diversified with Maximal Marginal Relevance (MMR), followed by an optional cross-encoder reranking step.
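The fusion and diversification stages can be illustrated in pure Python. This is a sketch under stated assumptions rather than the project's code: the function names, the RRF constant `k=60` (a commonly used default), and the MMR weight `lam` are all chosen for the example, and embeddings are plain lists instead of FAISS vectors.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each document scores sum(1 / (k + rank)) over the lists it appears in."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr(query_vec, candidates, embeddings, top_n, lam=0.7) -> List[str]:
    """Greedy MMR: pick the document maximizing lam * relevance - (1 - lam) * redundancy."""
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < top_n:
        def score(doc_id: str) -> float:
            relevance = cosine(query_vec, embeddings[doc_id])
            redundancy = max(
                (cosine(embeddings[doc_id], embeddings[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


vector_hits = ["a", "b", "c"]   # hypothetical FAISS ranking
keyword_hits = ["b", "d", "a"]  # hypothetical BM25 ranking
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Because RRF uses only ranks, it sidesteps the problem that BM25 scores and inner-product scores live on different scales. Lower values of `lam` in MMR trade relevance for diversity among the selected chunks.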

3. Intent-Based Model Escalation

Queries are scored on complexity, latency priority, reasoning, and explanation requirements. Based on these features, queries are routed to the most appropriate model chain:

Low-complexity/latency-sensitive: Groq (Llama 3.1 8B)
High-complexity/reasoning tasks: Gemini 1.5 Pro or equivalent
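A rough sketch of how such feature scores could map to a model chain is shown below. The `IntentFeatures` fields mirror the four signals named above, but the weights, the threshold, the chain names, and the model identifiers are illustrative assumptions, not the project's actual classifier or configuration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class IntentFeatures:
    complexity: float         # 0.0 (trivial) to 1.0 (hard)
    latency_priority: float   # 0.0 (can wait) to 1.0 (must be fast)
    needs_reasoning: bool
    needs_explanation: bool


# Hypothetical alias chains: first entry is preferred, later entries are fallbacks.
MODEL_CHAINS: Dict[str, List[str]] = {
    "fast": ["groq-llama-3.1-8b", "gemini-1.5-pro"],
    "deep": ["gemini-1.5-pro", "groq-llama-3.1-8b"],
}


def escalation_score(f: IntentFeatures) -> float:
    """Higher scores favor the larger model; latency pressure pulls the score down."""
    score = f.complexity
    if f.needs_reasoning:
        score += 0.3
    if f.needs_explanation:
        score += 0.2
    score -= 0.3 * f.latency_priority
    return score


def select_chain(f: IntentFeatures, threshold: float = 0.6) -> List[str]:
    """Return the alias chain to try, in order, for a classified query."""
    key = "deep" if escalation_score(f) >= threshold else "fast"
    return MODEL_CHAINS[key]


quick_lookup = IntentFeatures(0.2, 0.9, needs_reasoning=False, needs_explanation=False)
design_question = IntentFeatures(0.7, 0.1, needs_reasoning=True, needs_explanation=True)
```

Here `select_chain(quick_lookup)` starts with the small Groq-hosted model, while `select_chain(design_question)` starts with the larger one and keeps the small model only as a fallback.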

4. Privacy-Preserving Observability

Logging is handled by structlog with JSON rendering. User identifiers are anonymized with a salted SHA-256 hash, and sensitive data (API keys, prompts) is automatically redacted before logging.
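The anonymization and redaction step might look like the following processor. It uses only the standard library so it runs on its own; the function takes the `(logger, method_name, event_dict)` arguments that structlog passes to processors. The field names in `SENSITIVE_KEYS`, the `user_id` key, and the `[REDACTED]` placeholder are assumptions for illustration.

```python
import hashlib
from typing import Any, Callable, Dict

# Hypothetical names of log fields that must never be written in clear text.
SENSITIVE_KEYS = {"api_key", "authorization", "prompt"}


def anonymize_user_id(user_id: str, salt: str) -> str:
    """Return a stable, non-reversible identifier: SHA-256 over salt + user id."""
    return hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()


def make_privacy_processor(salt: str) -> Callable[[Any, str, Dict[str, Any]], Dict[str, Any]]:
    """Build a processor that hashes `user_id` and redacts sensitive fields."""

    def processor(logger: Any, method_name: str, event_dict: Dict[str, Any]) -> Dict[str, Any]:
        if "user_id" in event_dict:
            event_dict["user_id"] = anonymize_user_id(str(event_dict["user_id"]), salt)
        for key in list(event_dict):
            if key.lower() in SENSITIVE_KEYS:
                event_dict[key] = "[REDACTED]"
        return event_dict

    return processor


privacy = make_privacy_processor(salt="local-dev-salt")
record = privacy(None, "info", {"event": "query", "user_id": "u-42", "prompt": "secret text"})
```

With structlog installed, such a processor would typically be placed in the `processors` list passed to `structlog.configure`, ahead of `structlog.processors.JSONRenderer()`, so that redaction happens before rendering.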

5. Local-First Development Path

Supports a complete local development environment without requiring cloud dependencies for the core API path:

Authentication via environment variables
Local Redis via Docker Compose
Filesystem-backed RAG (FAISS/BM25)

Tech Stack

Layer                 Technologies
Core Engine           FastAPI (Python 3.11+), Pydantic v2, Structlog
Routing & Logic       Custom Intent Classifier, Provider Fallback Orchestrator
RAG Retrieval         FAISS, rank-bm25, SentenceTransformers
Persistence (Cloud)   Supabase (PostgreSQL), pgvector
Observability/State   Redis (Circuit Breaking, Rate Limiting)

Quick Start

Local Development Setup

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
python scripts/setup_local_dev.py
docker compose up -d redis
uvicorn main:app --reload

This local path uses:

AUTH_PROVIDER_MODE=env with DEV_API_KEYS for authentication.
Local Redis from Docker Compose for cache and rate-limit storage.
Filesystem-backed RAG data under data/rag/.

Example Query

curl -X POST "http://localhost:8000/api/query" \
  -H "Authorization: Bearer sk-depth-local-dev" \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "System Architecture",
    "prompt_spec": {
      "depth": "technical",
      "task": "explain",
      "reasoning": "direct"
    },
    "mode": "technical",
    "use_trusted_corpus": true
  }'
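The same request can be issued from Python. The sketch below mirrors the curl call above using only the standard library; the helper names `build_query_request` and `send_query` are made up for this example, and the assumption that the endpoint returns a JSON body is not confirmed by this README.

```python
import json
import urllib.request
from typing import Any, Dict


def build_query_request(
    base_url: str = "http://localhost:8000",
    api_key: str = "sk-depth-local-dev",
) -> urllib.request.Request:
    """Build the POST /api/query request with the same payload as the curl example."""
    payload: Dict[str, Any] = {
        "topic": "System Architecture",
        "prompt_spec": {
            "depth": "technical",
            "task": "explain",
            "reasoning": "direct",
        },
        "mode": "technical",
        "use_trusted_corpus": True,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_query(request: urllib.request.Request, timeout: float = 30.0) -> Dict[str, Any]:
    """Send the request and decode the response, assuming the API replies with JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the local server running, `send_query(build_query_request())` performs the call. The key must match one of the values configured in DEV_API_KEYS.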

Datasets

The project uses an open-source dataset, the Curated Dev Vertical Dataset, available on Hugging Face. It contains technical documentation and system design references optimized for RAG retrieval.

License

Apache License 2.0
