Skip to content

capitec/pg-proxy

PGProxy

A PostgreSQL proxy with query analysis and data masking capabilities.

Features

  • JWT/OIDC Authentication: Authenticate clients via Keycloak or Azure Entra ID JWT tokens instead of database passwords
  • Query Analysis: AI-powered query efficiency analysis via Portkey LLM gateway (Claude on AWS Bedrock)
  • Data Masking: Automatic masking of sensitive columns in SELECT query results
  • Three Proxy Modes:
    • blocking - Block inefficient queries
    • non-blocking - Log issues but allow all queries through
    • shunt - Route bad queries to a separate "slow" database node
  • Config Generator: Auto-generate masking configuration by scanning database schema with AI
  • Audit Trail: JWT user identity injected into application_name for backend visibility
  • RDS / Aurora IAM Authentication: Token-based auth with automatic refresh and cross-account STS support
  • ElastiCache / Valkey IAM Authentication: AWS SigV4 token auth with automatic refresh
  • Client TLS: Optional TLS termination with configurable enforcement; filters SCRAM-SHA-256-PLUS channel binding appropriately
  • Redshift Support: Specialized AI agent with Redshift system catalogs (stv_*, svv_*)
  • Health Checks: HTTP /health and /ready endpoints for Kubernetes / NLB probes
  • Metrics: Per-query LLM, backend, and total latency recorded to query_metrics table
  • TCP Keepalives: Prevents silent connection drops through NAT gateways and load balancers

Prerequisites

  • Go 1.26+
  • PostgreSQL database
  • Docker (for local development)
  • AWS credentials (for AI features)

Quick Start

1. Start the test infrastructure

docker compose up -d postgres-fast postgres-shunt postgres-reporter valkey

This starts:

  • PostgreSQL "fast" node on port 5433 (primary database)
  • PostgreSQL "shunt" node on port 5434 (for rejected queries in shunt mode)
  • PostgreSQL "reporter" node on port 5435 (stores query_issues + query_metrics)
  • Valkey on port 6379 (query decision cache)

2. Configure environment

Copy and edit the environment file:

cp .env.example .env
# Edit .env with your settings

Key configuration options:

Variable Description Default
LISTEN_ADDR Proxy listen address :5432
HEALTH_CHECK_ADDR HTTP health check endpoint address :8080
PG_BACKEND_URL Backend PostgreSQL URL required
SHUNT_NODE_URL Slow node URL (required for shunt mode) -
PROXY_MODE blocking, non-blocking, or shunt blocking
MASKING_ENABLED Enable data masking false
MASKING_CONFIG_PATH Path to masking config ./masking_config.yaml
ANALYSIS_ENABLED Enable AI query analysis false
CONFIG_GENERATOR_MODE Run config generator instead of proxy false
OIDC_AUTH Enable JWT/OIDC authentication false
OIDC_DISCOVERY_URL OIDC discovery endpoint URL -
OIDC_ISSUER_OVERRIDE Override expected issuer claim (for EntraID) -
OIDC_AUDIENCE Expected JWT audience (comma-separated) -
OIDC_ALLOWED_ROLES Required roles (comma-separated, empty = no role check) -
OIDC_CERT_PATH Custom CA certificate for OIDC provider -
TLS_ENABLED Enable TLS for client connections false
TLS_CERT_PATH Path to TLS certificate -
TLS_KEY_PATH Path to TLS private key -
REQUIRE_CLIENT_TLS Reject non-TLS client connections false
PG_IAM_AUTH Enable RDS IAM token authentication false
PG_IAM_USERNAME Database user with rds_iam role -
PG_AWS_REGION AWS region for RDS IAM tokens -
VALKEY_IAM_AUTH Enable ElastiCache/Valkey IAM authentication false
VALKEY_URL Valkey/Redis URL for query cache -
METADATA_CACHE_TTL TTL for schema metadata cache 1h
PORTKEY_BASE_URL Portkey LLM gateway URL -
PORTKEY_API_KEY Portkey API key -
PORTKEY_VIRTUAL_KEY Portkey virtual key -
BEDROCK_MODEL_ID Claude model ID via Bedrock eu.anthropic.claude-sonnet-4-20250514-v1:0
PG_BACKEND_CA_PATH PEM CA bundle for backend TLS verification (verify-full/verify-ca). Use AWS RDS CA bundle for RDS IAM deployments. -
MAX_CONNS_PER_IP_PER_MIN Max new connections per source IP per minute (0 = disabled) 60
MAX_MESSAGE_SIZE_MB Max PostgreSQL protocol message size in MiB 4
ALLOW_MD5_AUTH Allow MD5 password authentication (deprecated since PostgreSQL 10) false
STORE_RAW_QUERY Store raw SQL text in the reporter (default: fingerprint only) false
VALKEY_TLS_SKIP_VERIFY Disable Valkey TLS certificate verification (dev only) false
OIDC_ALLOWED_DATABASES Comma-separated list of databases JWT users may access (empty = allow all — not recommended) -

3. Build and run

# Build
go build -o pgproxy ./cmd/pgproxy

# Run proxy
./pgproxy

4. Connect through the proxy

PGPASSWORD=testpass psql -h localhost -p 5432 -U testuser -d testdb

Health Checks

PGProxy exposes HTTP endpoints on port 8080 (configurable via HEALTH_CHECK_ADDR) for load balancer health checks:

  • GET /health - Liveness probe. Returns 200 OK if the process is running.
  • GET /ready - Readiness probe. Returns 200 OK if the database connection is healthy, 503 Service Unavailable otherwise.

AWS NLB Configuration

For AWS Network Load Balancer health checks:

resource "aws_lb_target_group" "pgproxy" {
  port        = 5432
  protocol    = "TCP"
  target_type = "ip"
  vpc_id      = var.vpc_id

  health_check {
    enabled             = true
    protocol            = "HTTP"
    port                = 8080
    path                = "/ready"
    interval            = 30
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 2
  }
}

Testing Health Endpoints

# Liveness check
curl http://localhost:8080/health

# Readiness check
curl http://localhost:8080/ready

JWT Authentication

PGProxy supports JWT/OIDC authentication via Keycloak (or any OIDC-compliant provider). When enabled, clients authenticate using a JWT token instead of the database password.

How It Works

  1. Client connects to the proxy and sends a startup message (with any username)
  2. Proxy requests a password (cleartext)
  3. Client sends the JWT token as the password
  4. Proxy validates the token against the OIDC provider (signature, expiry, audience, roles)
  5. If valid, proxy replaces the client's username/database with the service account credentials from PG_BACKEND_URL
  6. Proxy authenticates to the backend using the service account credentials
  7. The JWT user's email is injected into application_name for audit visibility

Configuration

# Enable authentication
OIDC_AUTH=true

# OIDC discovery endpoint
# Keycloak: https://keycloak.example.com/auth/realms/myrealm/.well-known/openid-configuration
# EntraID:  https://login.microsoftonline.com/{tenant-id}/v2.0/.well-known/openid-configuration
OIDC_DISCOVERY_URL=https://keycloak.example.com/auth/realms/myrealm/.well-known/openid-configuration

# Override issuer validation (optional)
# Required for EntraID where discovery URL differs from token issuer
# EntraID example: https://sts.windows.net/{tenant-id}/
OIDC_ISSUER_OVERRIDE=

# Expected audience in the JWT
OIDC_AUDIENCE=my-app

# Required roles (comma-separated, leave empty to skip role check)
OIDC_ALLOWED_ROLES=db_reader,analyst

# Custom CA certificate (if using internal CA)
OIDC_CERT_PATH=./certs/ca-bundle.pem

Connecting with JWT

Use the JWT token as the password when connecting:

# Using psql
PGPASSWORD="eyJhbGciOiJSUzI1NiIs..." psql -h localhost -p 5432 -U anyuser -d anydb

# Using connection string
psql "postgres://anyuser:eyJhbGciOiJSUzI1NiIs...@localhost:5432/anydb"

Important: The username and database in the client connection are ignored when JWT auth is enabled. The proxy replaces them with the service account credentials from PG_BACKEND_URL (e.g., postgres://pgproxy_service:secret@backend:5433/production). The actual user identity comes from the JWT token's email or upn claim.

Audit Trail (application_name)

When JWT authentication is enabled, the proxy automatically sets the backend connection's application_name to the authenticated user's email, prefixed with jwt::

application_name = jwt:user@example.com

This provides visibility into who is running queries:

PostgreSQL:

SELECT usename, application_name, query
FROM pg_stat_activity
WHERE application_name LIKE 'jwt:%';

Redshift:

SELECT userid, application_name, querytxt
FROM stv_recents
WHERE application_name LIKE 'jwt:%';

This works with both PostgreSQL and Redshift backends, enabling DBAs to see which JWT user is executing each query in real-time.

Supported Authentication Methods

The proxy handles backend authentication automatically using credentials from PG_BACKEND_URL:

Backend Auth Method Supported
Trust Yes
Cleartext Password Yes
MD5 Yes (client-side requires ALLOW_MD5_AUTH=true)
SCRAM-SHA-256 Yes
SCRAM-SHA-256-PLUS Yes (via pgx v5.9+)
RDS IAM Token Yes (PG_IAM_AUTH=true)

Data Masking

Configuration

Create a masking_config.yaml file:

version: "1.0"
default_schema: public
unknown_table_behavior: passthrough  # or llm_inference

tables:
  - schema: public
    table: users
    columns:
      - name: email
        mask_type: email
      - name: password_hash
        mask_type: full
      - name: username
        mask_type: partial
        params:
          visible_chars: 3

Mask Types

Type Description Example
full Complete redaction ***MASKED***
partial Show first N characters joh***
email Mask email preserving domain j***@example.com
hash SHA-256 hash (deterministic) a1b2c3...

Auto-Generate Config

Use the config generator to scan your database and create a masking config:

# Set environment
export CONFIG_GENERATOR_MODE=true
export CONFIG_GENERATOR_SCHEMAS=public  # or "all" for all schemas
export MASKING_CONFIG_PATH=./masking_config.yaml

# Run generator
./pgproxy

The generator:

  1. Scans all tables in specified schemas
  2. Queries a sample row from each table
  3. Uses AI to identify sensitive columns
  4. Outputs YAML config with confidence comments for review

Load Testing

Build and run the load generator:

# Build
go build -o loadgen ./cmd/loadgen

# Run against proxy (port 5432)
./loadgen -conn "postgres://testuser:testpass@localhost:5432/testdb?sslmode=disable" \
  -workers 10 -duration 30s

# Run against direct database (port 5433) for comparison
./loadgen -conn "postgres://testuser:testpass@localhost:5433/testdb?sslmode=disable" \
  -workers 10 -duration 30s

Options:

  • -conn: PostgreSQL connection string
  • -workers: Number of concurrent workers (default: 10)
  • -duration: Test duration (default: 30s)
  • -rampup: Ramp-up time (default: 5s)

Development

Quick Commands (Makefile)

The project includes a Makefile for common development tasks:

# Run all tests (unit + BDD, excludes @real-bedrock tests)
make test

# Run only unit tests (no Docker needed)
make test-unit

# Run only BDD integration tests
make test-bdd

# Run ALL tests including real AWS Bedrock tests (requires AWS creds, costs money)
make test-all

# Generate combined test coverage report (unit + BDD)
make coverage

# Start test infrastructure (PostgreSQL + Valkey)
make infra-up

# Stop and remove test infrastructure
make infra-down

# Clean up coverage files and test cache
make clean

# Show all available targets
make help

Pre-commit workflow:

# Before committing, run the full test suite
make test

# Or run coverage to see code coverage
make coverage

Project Structure

pgproxy/
├── cmd/
│   ├── pgproxy/       # Main proxy binary
│   └── loadgen/       # Load testing tool
├── internal/
│   ├── agent/         # AI/Bedrock client
│   ├── auth/          # JWT/OIDC authentication (Keycloak)
│   ├── cache/         # Query approval cache
│   ├── config/        # Configuration
│   ├── masking/       # Data masking logic
│   ├── metadata/      # Database metadata cache
│   ├── proxy/         # PostgreSQL proxy
│   └── reporter/      # Issue reporting (query_issues) + metrics (query_metrics)
├── docker-compose.yml # Local PostgreSQL
├── init.sql           # Test data
└── masking_config.yaml

Running Tests

Unit Tests

# Run all tests
go test ./...

# Run tests with verbose output
go test -v ./...

# Run tests for a specific package
go test -v ./internal/masking/...
go test -v ./internal/proxy/...
go test -v ./internal/config/...

# Run specific test by name
go test -v ./internal/masking/... -run TestQuoteIdentifier

Integration Tests (BDD)

We use godog for BDD-style integration tests. These tests spin up the actual proxy binary and run real queries against PostgreSQL.

Prerequisites:

  • Docker (for test databases)
  • The test databases running

1. Start the test infrastructure:

docker compose up -d postgres-fast postgres-shunt postgres-reporter valkey

This starts:

  • postgres-fast (port 5433) - Primary "fast" database
  • postgres-shunt (port 5434) - "Penalty box" for bad queries
  • postgres-reporter (port 5435) - Stores query issues & performance metrics
  • valkey (port 6379) - Cache for query decisions

Wait ~10 seconds for containers to be healthy:

docker compose ps

2. Run the tests:

# Run all tests EXCEPT real Bedrock (no AWS costs)
INTEGRATION_TESTS=true GODOG_TAGS=~@real-bedrock go test ./tests/... -v

# Run ALL tests including real Bedrock (requires AWS creds, costs money)
INTEGRATION_TESTS=true go test ./tests/... -v

3. Run specific test suites:

# Just blocking mode tests
INTEGRATION_TESTS=true GODOG_TAGS=@blocking go test ./tests/... -v

# Just shunt mode tests
INTEGRATION_TESTS=true GODOG_TAGS=@shunt go test ./tests/... -v

# Just masking tests
INTEGRATION_TESTS=true GODOG_TAGS=@masking go test ./tests/... -v

# Just real Bedrock tests (verifies AI analysis works)
INTEGRATION_TESTS=true GODOG_TAGS=@real-bedrock go test ./tests/... -v

4. Generate test reports:

# JUnit XML report (for CI systems)
INTEGRATION_TESTS=true GODOG_FORMAT=junit GODOG_TAGS=~@real-bedrock go test ./tests/... -v 2>&1 | tee test-report.xml

# Cucumber JSON report
INTEGRATION_TESTS=true GODOG_FORMAT=cucumber GODOG_TAGS=~@real-bedrock go test ./tests/... -v > test-report.json

5. Cleanup:

docker compose down -v

Test structure:

tests/
├── features/
│   ├── analysis/
│   │   ├── blocking.feature      # Blocking mode scenarios
│   │   ├── non_blocking.feature  # Non-blocking mode scenarios
│   │   ├── shunt.feature         # Shunt mode scenarios
│   │   └── real_bedrock.feature  # Real AWS Bedrock tests
│   ├── masking/
│   │   └── masking.feature       # Data masking scenarios
│   └── config_generation/
│       └── config.feature        # Config validation
├── steps/
│   └── common_steps.go           # Step definitions
├── support/
│   └── context.go                # Test helpers & mock Bedrock
└── integration_test.go           # Test runner

Writing new tests:

Tests use Gherkin syntax. Example:

@shunt @cartesian
Scenario: Cartesian product goes to shunt node
  Given the proxy is running in "shunt" mode
  And I am connected to the database through the proxy
  And the AI will reject queries containing "users, orders" with reason "Cartesian product"
  When I execute the query "SELECT * FROM users, orders"
  Then the query should succeed
  And the query should be routed to the shunt node

Mutation Testing

We use gremlins for mutation testing to verify test quality.

# Install gremlins
go install github.com/go-gremlins/gremlins/cmd/gremlins@latest

# Dry run (shows what mutations would be tested)
~/go/bin/gremlins unleash --dry-run ./internal/masking

# Run mutation tests (with longer timeout for reliability)
~/go/bin/gremlins unleash --timeout-coefficient 10 ./internal/masking
~/go/bin/gremlins unleash --timeout-coefficient 10 ./internal/config
~/go/bin/gremlins unleash --timeout-coefficient 10 ./internal/proxy

Mutation Testing Results Explained:

  • Killed: Tests caught the mutation (good)
  • Lived: Tests didn't catch the mutation (test gap)
  • Not Covered: Code not covered by tests
  • Test Efficacy: Percentage of covered mutations caught by tests

Target: >75% test efficacy, >50% mutator coverage.

Test Coverage

# Generate unit test coverage report
go test -coverprofile=coverage.out ./...

# View coverage in browser
go tool cover -html=coverage.out

# View coverage by function
go tool cover -func=coverage.out

Combined Coverage (Unit + BDD):

The BDD tests exercise the proxy binary end-to-end. To measure coverage from BDD tests:

# Create coverage directory
mkdir -p coverage/bdd

# Run BDD tests with instrumented binary
GOCOVERDIR=$(pwd)/coverage/bdd INTEGRATION_TESTS=true GODOG_TAGS=~@real-bedrock go test ./tests/... -v

# Convert to text format
go tool covdata textfmt -i=coverage/bdd -o=coverage/bdd.out

# View BDD coverage
go tool cover -func=coverage/bdd.out

The unit tests and BDD tests provide complementary coverage:

  • Unit tests: Test pure functions, mocked agent loops, auth validation
  • BDD tests: Test proxy startup, wire protocol, masking pipeline, query routing

Environment Variables

See .env for all available configuration options.

Architecture

Standard Mode (blocking/non-blocking)

Client -> PGProxy (port 5432) -> PostgreSQL (port 5433)
              |
              ├── JWT Authentication (optional)
              ├── Query Analysis (SELECT only)
              ├── Data Masking (optional)
              └── Query Blocking (blocking mode only)

With JWT Authentication

Client                    PGProxy                         Backend
  |                          |                               |
  |--- Startup Message ----->|                               |
  |<-- Password Request -----|                               |
  |--- JWT Token ----------->|                               |
  |                          |-- Validate JWT (Keycloak) --> |
  |                          |<- Token Valid ----------------|
  |                          |                               |
  |                          |-- Startup (user from URL) --->|
  |                          |   application_name=jwt:email  |
  |                          |<- Auth Request (SCRAM/MD5) ---|
  |                          |-- Auth Response ------------->|
  |                          |<- Auth OK --------------------|
  |<-- Auth OK --------------|                               |
  |                          |                               |
  |--- Query --------------->|--- Query -------------------->|
  |<-- Results --------------|<-- Results -------------------|

Shunt Mode

Client -> PGProxy (port 5432) --[good queries]--> Fast Node (port 5433)
              |                |
              |                └-[bad queries]--> Shunt Node (port 5434)
              |
              ├── Query Analysis (SELECT only)
              ├── Route decision cached
              └── Writes/transactions blocked

In shunt mode:

  • Good SELECT queries → routed to the fast (primary) node
  • Bad SELECT queries → routed to the shunt (slow) node
  • INSERT/UPDATE/DELETE → blocked entirely
  • Transactions (BEGIN) → blocked entirely

The proxy intercepts PostgreSQL wire protocol messages, analyzes/transforms queries, and forwards them to the appropriate backend database.

Documentation

Core Documentation

Technical Guides

Configuration Examples

About

AI-powered PostgreSQL proxy analyzing queries via Claude. Supports intelligent routing, data masking, and JWT authentication.

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages