AWF Configuration Specification

Abstract

This specification defines the configuration model, processing rules, and environment semantics for the Agentic Workflow Firewall (AWF). It is the normative reference for:

  • the awf CLI runtime (--config)
  • tooling that compiles workflows into AWF invocations (e.g., gh-aw)
  • IDE and static-analysis validation via JSON Schema

The machine-readable schema is published alongside this specification at docs/awf-config.schema.json (live, tracking main) and as a versioned release asset (e.g., https://github.com/github/gh-aw-firewall/releases/download/v0.23.1/awf-config.schema.json).

Status of This Document

This document is normative. Informative notes are marked with Note: or placed in blockquotes. All other text is normative unless stated otherwise.

1. Conformance

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

A conforming AWF configuration document is one that:

  1. is valid JSON or YAML;
  2. satisfies all constraints defined by docs/awf-config.schema.json; and
  3. contains no properties beyond those defined by the schema (closed-world assumption).

A conforming AWF implementation MUST accept every conforming configuration document and MUST reject every non-conforming one.

2. Processing Model

When the user invokes awf --config <path|-> -- <command>, a conforming implementation MUST execute the following steps in order:

  1. If <path> is -, read configuration bytes from standard input; otherwise, read them from the file at <path>.
  2. Determine the serialisation format:
    • If <path> ends with .json, parse as JSON.
    • If <path> ends with .yaml or .yml, parse as YAML.
    • Otherwise, attempt JSON first; if that fails, attempt YAML.
  3. Validate the parsed document against docs/awf-config.schema.json.
  4. On validation failure, abort with non-zero exit status (see §7).
  5. Map configuration fields to CLI-option semantics per §5.
  6. Apply precedence rules per §3.
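
The format-selection rule in step 2 can be sketched as a small helper. This is an illustrative sketch, not the AWF implementation: the function name `candidate_formats` is hypothetical, and it assumes the suffix comparison is case-sensitive, which the text above does not state.

```python
def candidate_formats(path: str) -> list[str]:
    """Return the parsers to try, in order, for a --config argument."""
    if path.endswith(".json"):
        return ["json"]
    if path.endswith((".yaml", ".yml")):
        return ["yaml"]
    # Everything else, including "-" (standard input), tries JSON then YAML.
    return ["json", "yaml"]


if __name__ == "__main__":
    for arg in ("awf.json", "ci/awf.yml", "-"):
        print(arg, "->", candidate_formats(arg))
```

Because `-` carries no extension, configuration piped through standard input always follows the JSON-then-YAML fallback path.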

3. Precedence Rules

The effective value for any configuration parameter SHALL be determined by the following precedence order (highest wins):

  1. Explicit CLI flags
  2. Config file (--config)
  3. AWF internal defaults

Note: This model enables reusable, checked-in configuration files with environment-specific CLI overrides.
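
A minimal sketch of the precedence order, assuming each source has already been normalised into a flat mapping keyed by parameter name. The function name, the `logLevel` key, and the sample values are illustrative only and are not taken from the AWF code base.

```python
from typing import Any, Mapping


def effective_value(
    name: str,
    cli_flags: Mapping[str, Any],
    config: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> Any:
    """Resolve one parameter: CLI flag, then config file, then default."""
    for source in (cli_flags, config, defaults):
        if name in source:
            return source[name]
    return None


if __name__ == "__main__":
    # Hypothetical values: the config file is checked in, the CLI overrides it.
    print(effective_value("logLevel", {"logLevel": "debug"},
                          {"logLevel": "info"}, {"logLevel": "warn"}))
```

Membership is tested with `in` rather than truthiness so that an explicit `False` or `0` on the command line still overrides the config file.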

4. Data Model

The root object of a conforming configuration document MAY contain the following top-level properties. All are OPTIONAL:

| Property | Type | Description |
|---|---|---|
| $schema | string | JSON Schema URI for IDE validation |
| network | object | Network egress configuration |
| apiProxy | object | API proxy sidecar configuration |
| security | object | Security and isolation settings |
| container | object | Container and Docker settings |
| environment | object | Environment variable propagation (see §8) |
| logging | object | Logging and diagnostics |
| rateLimiting | object | Egress rate limiting |

Property-level constraints, types, and descriptions are defined normatively by docs/awf-config.schema.json.
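
As a rough illustration of the closed-world rule at the root level, the sketch below reports unknown top-level properties. It covers only the property names listed above; the JSON Schema remains the normative check, and the helper name is hypothetical.

```python
ALLOWED_TOP_LEVEL = frozenset({
    "$schema", "network", "apiProxy", "security",
    "container", "environment", "logging", "rateLimiting",
})


def unknown_top_level_keys(document: dict) -> list[str]:
    """Return root properties that the data model does not define."""
    return sorted(key for key in document if key not in ALLOWED_TOP_LEVEL)


if __name__ == "__main__":
    doc = {"network": {"allowDomains": ["github.com"]}, "netwrok": {}}
    print(unknown_top_level_keys(doc))  # the misspelled key is reported
```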

5. CLI Mapping

This section is normative.

Tools generating AWF invocations (such as gh-aw) SHOULD use the mapping below. The left side is the configuration-document path; the right side is the corresponding CLI flag.

  • network.allowDomains[] → --allow-domains <csv>
  • network.blockDomains[] → --block-domains <csv>
  • network.dnsServers[] → --dns-servers <csv>
  • network.upstreamProxy → --upstream-proxy
  • apiProxy.enabled → --enable-api-proxy
  • apiProxy.enableOpenCode → --enable-opencode
  • apiProxy.enableTokenSteering → --enable-token-steering
  • apiProxy.anthropicAutoCache → --anthropic-auto-cache
  • apiProxy.anthropicCacheTailTtl → --anthropic-cache-tail-ttl <5m|1h>
  • apiProxy.maxEffectiveTokens → (config-only; no CLI equivalent)
  • apiProxy.modelMultipliers → --max-model-multiplier <model:multiplier,...>
  • apiProxy.maxRuns → (config-only; no CLI equivalent)
  • apiProxy.models → (config-only; model alias rewriting)
  • apiProxy.auth.type → (config-only; maps to AWF_AUTH_TYPE)
  • apiProxy.auth.provider → (config-only; maps to AWF_AUTH_PROVIDER)
  • apiProxy.auth.oidcAudience → (config-only; maps to AWF_AUTH_OIDC_AUDIENCE)
  • apiProxy.auth.azureTenantId → (config-only; maps to AWF_AUTH_AZURE_TENANT_ID)
  • apiProxy.auth.azureClientId → (config-only; maps to AWF_AUTH_AZURE_CLIENT_ID)
  • apiProxy.auth.azureScope → (config-only; maps to AWF_AUTH_AZURE_SCOPE)
  • apiProxy.auth.azureCloud → (config-only; maps to AWF_AUTH_AZURE_CLOUD)
  • apiProxy.auth.awsRoleArn → (config-only; maps to AWF_AUTH_AWS_ROLE_ARN)
  • apiProxy.auth.awsRegion → (config-only; maps to AWF_AUTH_AWS_REGION)
  • apiProxy.auth.awsRoleSessionName → (config-only; maps to AWF_AUTH_AWS_ROLE_SESSION_NAME)
  • apiProxy.auth.gcpWorkloadIdentityProvider → (config-only; maps to AWF_AUTH_GCP_WORKLOAD_IDENTITY_PROVIDER)
  • apiProxy.auth.gcpServiceAccount → (config-only; maps to AWF_AUTH_GCP_SERVICE_ACCOUNT)
  • apiProxy.auth.gcpScope → (config-only; maps to AWF_AUTH_GCP_SCOPE)
  • apiProxy.targets.<provider>.host → --<provider>-api-target
  • apiProxy.targets.openai.basePath → --openai-api-base-path
  • apiProxy.targets.anthropic.basePath → --anthropic-api-base-path
  • apiProxy.targets.gemini.basePath → --gemini-api-base-path
  • security.sslBump → --ssl-bump
  • security.enableDlp → --enable-dlp
  • security.enableHostAccess → --enable-host-access
  • security.allowHostPorts → --allow-host-ports
  • security.allowHostServicePorts → --allow-host-service-ports
  • security.difcProxy.host → --difc-proxy-host
  • security.difcProxy.caCert → --difc-proxy-ca-cert
  • container.memoryLimit → --memory-limit
  • container.agentTimeout → --agent-timeout
  • container.enableDind → --enable-dind
  • container.workDir → --work-dir
  • container.containerWorkDir → --container-workdir
  • container.imageRegistry → --image-registry
  • container.imageTag → --image-tag
  • container.skipPull → --skip-pull
  • container.buildLocal → --build-local
  • container.agentImage → --agent-image
  • container.tty → --tty
  • container.dockerHost → --docker-host
  • container.dockerHostPathPrefix → --docker-host-path-prefix
  • environment.envFile → --env-file
  • environment.envAll → --env-all
  • environment.excludeEnv[] → --exclude-env (repeatable)
  • logging.logLevel → --log-level
  • logging.diagnosticLogs → --diagnostic-logs
  • logging.auditDir → --audit-dir
  • logging.proxyLogsDir → --proxy-logs-dir
  • logging.sessionStateDir → --session-state-dir
  • rateLimiting.enabled: false → --no-rate-limit
  • rateLimiting.requestsPerMinute → --rate-limit-rpm
  • rateLimiting.requestsPerHour → --rate-limit-rph
  • rateLimiting.bytesPerMinute → --rate-limit-bytes-pm

The following CLI flag has no config-file equivalent by design:

  • -e, --env <KEY=VALUE> — inject a single environment variable into the agent container (repeatable; CLI-only)
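
The sketch below shows how a generating tool might translate a handful of the mappings above into an argument vector. It is deliberately partial and makes two assumptions that this section does not spell out: boolean flags are emitted without a value, and valued flags are passed as separate `flag value` arguments. The function name is hypothetical.

```python
def config_to_cli_args(config: dict) -> list[str]:
    """Translate a small subset of config fields into awf CLI arguments."""
    args: list[str] = []
    network = config.get("network", {})
    if network.get("allowDomains"):
        # Array-valued fields are joined into a single comma-separated value.
        args += ["--allow-domains", ",".join(network["allowDomains"])]
    if network.get("blockDomains"):
        args += ["--block-domains", ",".join(network["blockDomains"])]
    if config.get("apiProxy", {}).get("enabled"):
        args.append("--enable-api-proxy")
    for name in config.get("environment", {}).get("excludeEnv", []):
        # --exclude-env is repeatable: one occurrence per variable.
        args += ["--exclude-env", name]
    log_level = config.get("logging", {}).get("logLevel")
    if log_level is not None:
        args += ["--log-level", str(log_level)]
    # Only an explicit `enabled: false` maps to --no-rate-limit.
    if config.get("rateLimiting", {}).get("enabled") is False:
        args.append("--no-rate-limit")
    return args


if __name__ == "__main__":
    print(config_to_cli_args({
        "network": {"allowDomains": ["github.com", "api.openai.com"]},
        "rateLimiting": {"enabled": False},
    }))
```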

6. Standard Input Mode

A conforming implementation MUST accept --config - to read configuration from standard input, enabling programmatic and pipeline scenarios.

7. Error Reporting

On parse or validation failure, a conforming implementation MUST:

  1. exit with a non-zero status code;
  2. emit a diagnostic message identifying the location and nature of the error; and
  3. refrain from partial execution of the agent command.

8. Environment Merge Semantics

This section is normative.

The agent container's environment is constructed by merging variables from multiple sources. This section defines the merge order and exclusion rules.

Note: For usage guidance, examples, and troubleshooting, see docs/environment.md.

8.1 Merge Precedence

Variables from the following sources are merged in order of increasing precedence. A value set at a higher level MUST override the same-named value from any lower level.

| Level | Source | Description |
|---|---|---|
| 1 (lowest) | AWF-reserved | Proxy routing, DNS, container paths |
| 2 | --env-all | Inherited host environment (when enabled) |
| 3 | --env-file | Variables read from a file |
| 4 (highest) | -e / --env | Explicit CLI key-value pairs |

8.2 AWF-Reserved Variables

A conforming implementation MUST set the following variables in the agent container regardless of user configuration. Values from --env-all and --env-file MUST NOT override these variables.

| Variable | Value | Purpose |
|---|---|---|
| HTTP_PROXY | http://<squid-ip>:3128 | Squid forward proxy for HTTP |
| HTTPS_PROXY | http://<squid-ip>:3128 | Squid forward proxy for HTTPS |
| https_proxy | http://<squid-ip>:3128 | Lowercase alias (Yarn 4, undici, Corepack) |
| NO_PROXY | localhost,127.0.0.1,::1,... | Loopback and container IPs bypassing Squid |
| SQUID_PROXY_HOST | squid-proxy | Proxy hostname (for tools requiring host separately) |
| SQUID_PROXY_PORT | 3128 | Proxy port |
| PATH | (container default) | MUST use the container's PATH, not the host's |
| HOME | (host user's home) | Derived via sudo-aware detection |

Note: Lowercase http_proxy is intentionally NOT set. Certain curl builds on Ubuntu 22.04 ignore uppercase HTTP_PROXY for HTTP URLs (httpoxy mitigation), causing HTTP traffic to fall through to iptables DNAT interception — the intended defense-in-depth behavior.

8.3 Excluded Variables

The following variables MUST be excluded from --env-all and --env-file passthrough. A conforming implementation MUST NOT inherit them from the host:

| Category | Variables |
|---|---|
| System | PATH, PWD, OLDPWD, SHLVL, _, SUDO_COMMAND, SUDO_USER, SUDO_UID, SUDO_GID |
| Proxy | HTTP_PROXY, HTTPS_PROXY, http_proxy, https_proxy, NO_PROXY, no_proxy, ALL_PROXY, all_proxy, FTP_PROXY, ftp_proxy |
| Actions artifact tokens | ACTIONS_RUNTIME_TOKEN, ACTIONS_RESULTS_URL |
| AWF internal controls | AWF_PREFLIGHT_BINARY, AWF_GEMINI_ENABLED |

Note: Host proxy variables are read for upstream proxy auto-detection (see --upstream-proxy) but MUST NOT propagate into the agent container. AWF sets its own proxy variables pointing to Squid.

8.4 Selectively Forwarded Variables

When --env-all is NOT active, a conforming implementation SHOULD forward the following host variables into the agent container:

| Category | Variables |
|---|---|
| GitHub authentication | GITHUB_TOKEN, GH_TOKEN, GITHUB_PERSONAL_ACCESS_TOKEN |
| GitHub enterprise | GITHUB_SERVER_URL, GITHUB_API_URL |
| Actions OIDC | ACTIONS_ID_TOKEN_REQUEST_URL, ACTIONS_ID_TOKEN_REQUEST_TOKEN |
| Docker client | DOCKER_HOST, DOCKER_TLS, DOCKER_TLS_VERIFY, DOCKER_CERT_PATH, DOCKER_CONFIG, DOCKER_CONTEXT, DOCKER_API_VERSION, DOCKER_DEFAULT_PLATFORM |
| User environment | USER, XDG_CONFIG_HOME |

When --env-all IS active, all host variables not in the excluded set (§8.3) SHALL be forwarded, subject to credential isolation rules (§9).

8.5 Explicit Overrides

Variables passed via -e / --env MUST override all other sources, including AWF-reserved variables. This is the only mechanism by which proxy routing variables MAY be overridden.

Note: There is no config-file equivalent for -e / --env. Individual environment variable injection is a runtime concern, not a static configuration concern.
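
The merge order of §8.1 through §8.5 can be sketched as below. This is a simplified illustration under stated assumptions: it filters reserved names out of the --env-all and --env-file sources as one way to satisfy §8.2, it omits the selective forwarding of §8.4 and the credential rules of §9, and every function and parameter name is hypothetical.

```python
# Names from the excluded-variable list in §8.3.
EXCLUDED = frozenset({
    "PATH", "PWD", "OLDPWD", "SHLVL", "_",
    "SUDO_COMMAND", "SUDO_USER", "SUDO_UID", "SUDO_GID",
    "HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy",
    "NO_PROXY", "no_proxy", "ALL_PROXY", "all_proxy", "FTP_PROXY", "ftp_proxy",
    "ACTIONS_RUNTIME_TOKEN", "ACTIONS_RESULTS_URL",
    "AWF_PREFLIGHT_BINARY", "AWF_GEMINI_ENABLED",
})


def merge_agent_env(reserved, host_env, env_file, cli_env, env_all=False):
    """Merge environment sources from lowest to highest precedence."""
    def passthrough(source):
        # Excluded and AWF-reserved names never flow in from these sources.
        return {k: v for k, v in source.items()
                if k not in EXCLUDED and k not in reserved}

    merged = dict(reserved)                    # level 1: AWF-reserved
    if env_all:
        merged.update(passthrough(host_env))   # level 2: --env-all
    merged.update(passthrough(env_file))       # level 3: --env-file
    merged.update(cli_env)                     # level 4: -e / --env wins
    return merged


if __name__ == "__main__":
    print(merge_agent_env(
        reserved={"HTTPS_PROXY": "http://squid:3128"},
        host_env={"HTTPS_PROXY": "http://corp:8080", "FOO": "host"},
        env_file={"FOO": "file"},
        cli_env={},
        env_all=True,
    ))
```

Only the final `update` is unfiltered, which mirrors the rule that -e / --env is the sole way to override a reserved variable.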

9. Credential Isolation Semantics

This section is normative.

AWF implements defense-in-depth credential isolation for LLM API keys. Behavior is governed by the value of apiProxy.enabled.

Note: For architectural diagrams and protocol-level details, see docs/authentication-architecture.md.

9.1 Source Credentials

A conforming implementation MUST recognize the following environment variables as source credentials — real API keys read from the host:

| Variable | Provider |
|---|---|
| OPENAI_API_KEY | OpenAI |
| ANTHROPIC_API_KEY | Anthropic (Claude) |
| COPILOT_GITHUB_TOKEN | GitHub Copilot |
| COPILOT_API_KEY | GitHub Copilot (BYOK) |
| GEMINI_API_KEY | Google Gemini |

The following secondary aliases SHOULD also be recognized: OPENAI_KEY, CODEX_API_KEY, CLAUDE_API_KEY, COPILOT_PROVIDER_API_KEY.

9.2 API Proxy Enabled (apiProxy.enabled = true)

When the API proxy sidecar is enabled, the following rules apply:

  1. Source credentials (§9.1) MUST NOT be exposed in the agent container's environment. They SHALL be passed exclusively to the API proxy sidecar.

  2. The --env-all flag MUST NOT reintroduce excluded credentials into the agent environment.

  3. A conforming implementation MAY inject placeholder values into the agent container for tool compatibility (e.g., OPENAI_API_KEY=sk-placeholder-for-api-proxy). Placeholder values are not secrets and MUST NOT be treated as credentials.

  4. A conforming implementation MUST inject proxy-routing variables so that agent tools reach the sidecar rather than upstream APIs:

    | Agent variable | Value | Purpose |
    |---|---|---|
    | OPENAI_BASE_URL | http://172.30.0.30:10000 | Routes OpenAI calls to sidecar |
    | ANTHROPIC_BASE_URL | http://172.30.0.30:10001 | Routes Anthropic calls to sidecar |
    | COPILOT_API_URL | http://172.30.0.30:10002 | Routes Copilot calls to sidecar |
    | GOOGLE_GEMINI_BASE_URL | http://172.30.0.30:10003 | Routes Gemini calls to sidecar |
    | GEMINI_API_BASE_URL | http://172.30.0.30:10003 | Alias for compatibility |
  5. The API proxy sidecar SHALL inject the real credentials into upstream requests. Sidecar port assignments: 10000 (OpenAI), 10001 (Anthropic), 10002 (Copilot), 10003 (Gemini), 10004 (OpenCode).

  6. A conforming implementation MUST forward the following OpenTelemetry variables from the host into the api-proxy sidecar container so that the sidecar can participate in the distributed trace established by the workflow:

    | Variable | Description |
    |---|---|
    | OTEL_EXPORTER_OTLP_ENDPOINT | OTLP/HTTP collector URL. When present, activates span export via Squid proxy. |
    | OTEL_EXPORTER_OTLP_HEADERS | Comma-separated key=value auth headers for the OTLP endpoint. |
    | OTEL_SERVICE_NAME | Service name tag. Defaults to awf-api-proxy when not set. |
    | GITHUB_AW_OTEL_TRACE_ID | W3C trace-id of the parent workflow trace. |
    | GITHUB_AW_OTEL_PARENT_SPAN_ID | W3C span-id of the parent workflow span. |

    These variables are NOT forwarded to the agent container via this mechanism; the agent receives OTEL variables through the standard OTEL_* prefix forwarding described in §8.4.

    When OTEL_EXPORTER_OTLP_ENDPOINT is absent, the sidecar writes span NDJSON to /var/log/api-proxy/otel.jsonl as a local fallback. When GITHUB_AW_OTEL_TRACE_ID / GITHUB_AW_OTEL_PARENT_SPAN_ID are present and valid hex, each sidecar span is created as a child of the specified parent span, enabling end-to-end distributed tracing from the GitHub Actions workflow through the api-proxy to the LLM provider.

9.3 API Proxy Disabled (apiProxy.enabled = false)

When the API proxy sidecar is disabled (the default):

  1. Source credentials present in the host environment SHOULD be forwarded directly to the agent container.
  2. No proxy-routing variables or placeholder values SHALL be injected.
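
The contrast between §9.2 and §9.3 can be sketched as a function that splits host credentials between the agent and the sidecar. This is an illustration only: it handles just the credential-related variables, it chooses to inject the optional placeholders, and reusing the single placeholder string from the §9.2 example for every provider is an assumption. The function name is hypothetical.

```python
SOURCE_CREDENTIALS = (
    "OPENAI_API_KEY", "ANTHROPIC_API_KEY", "COPILOT_GITHUB_TOKEN",
    "COPILOT_API_KEY", "GEMINI_API_KEY",
)

# Proxy-routing variables from §9.2 item 4.
SIDECAR_ROUTES = {
    "OPENAI_BASE_URL": "http://172.30.0.30:10000",
    "ANTHROPIC_BASE_URL": "http://172.30.0.30:10001",
    "COPILOT_API_URL": "http://172.30.0.30:10002",
    "GOOGLE_GEMINI_BASE_URL": "http://172.30.0.30:10003",
    "GEMINI_API_BASE_URL": "http://172.30.0.30:10003",
}

# Assumption: the same placeholder is used for every provider.
PLACEHOLDER = "sk-placeholder-for-api-proxy"


def split_credentials(host_env: dict, api_proxy_enabled: bool):
    """Return (agent_env, sidecar_env) for the credential-related variables."""
    creds = {k: host_env[k] for k in SOURCE_CREDENTIALS if k in host_env}
    if not api_proxy_enabled:
        # §9.3: real keys go straight to the agent; nothing else is injected.
        return dict(creds), {}
    # §9.2: the agent sees routes and placeholders, the sidecar holds the keys.
    agent_env = dict(SIDECAR_ROUTES)
    agent_env.update({name: PLACEHOLDER for name in creds})
    return agent_env, creds


if __name__ == "__main__":
    agent, sidecar = split_credentials({"OPENAI_API_KEY": "sk-real"}, True)
    print(agent["OPENAI_API_KEY"], sorted(sidecar))
```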

9.4 Credential Exclusion Requires API Proxy

This constraint is normative for tools generating AWF configurations.

A conforming configuration MUST NOT exclude a source credential (§9.1) via environment.excludeEnv unless apiProxy.enabled is true. Excluding a credential without enabling the API proxy leaves the agent with no key and no placeholder, causing authentication failures at runtime.

Tools that compile AWF configurations (e.g., gh-aw) MUST ensure that when an LLM agent requires an API key (OpenAI, Anthropic, Gemini, etc.), one of the following holds:

  1. apiProxy.enabled = true — the real key is held by the sidecar, and a placeholder is injected for tool compatibility; or
  2. The key is forwarded directly to the agent container (non-proxy mode).

Emitting excludeEnv: ["OPENAI_API_KEY"] without apiProxy.enabled: true is a configuration error. A conforming implementation MAY emit a warning when this condition is detected.
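
A compiler such as gh-aw could detect this condition with a check along the following lines. The sketch is hypothetical and covers only the five primary source credentials from §9.1, not the secondary aliases.

```python
SOURCE_CREDENTIALS = frozenset({
    "OPENAI_API_KEY", "ANTHROPIC_API_KEY", "COPILOT_GITHUB_TOKEN",
    "COPILOT_API_KEY", "GEMINI_API_KEY",
})


def credential_exclusion_errors(config: dict) -> list[str]:
    """List source credentials excluded while the API proxy is not enabled."""
    if config.get("apiProxy", {}).get("enabled") is True:
        return []
    excluded = config.get("environment", {}).get("excludeEnv", [])
    return [name for name in excluded if name in SOURCE_CREDENTIALS]


if __name__ == "__main__":
    bad = {"environment": {"excludeEnv": ["OPENAI_API_KEY"]}}
    print(credential_exclusion_errors(bad))
```

An empty result means the configuration satisfies this constraint; a non-empty result names the credentials the agent would be left without.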

9.4 One-Shot Token Protection

Real credentials forwarded to the agent — whether source credentials in non-proxy mode (§9.3) or GitHub tokens (GITHUB_TOKEN, GH_TOKEN) — MUST be protected by the one-shot-token mechanism. Protected tokens are cached on first access and removed from /proc/self/environ to prevent environment variable inspection.

The default protected token list is:

COPILOT_GITHUB_TOKEN, GITHUB_TOKEN, GH_TOKEN, GITHUB_API_TOKEN,
GITHUB_PAT, GH_ACCESS_TOKEN, OPENAI_API_KEY, OPENAI_KEY,
ANTHROPIC_API_KEY, CLAUDE_API_KEY, CODEX_API_KEY, COPILOT_API_KEY,
COPILOT_PROVIDER_API_KEY

Placeholder compatibility values (§9.2 item 3) are not secrets and MUST NOT be subject to one-shot protection.

9.5 OIDC Authentication

When apiProxy.auth.type is set to github-oidc, the API proxy sidecar exchanges a GitHub Actions OIDC token for a provider-specific access token. The apiProxy.auth.provider field (default: azure) selects the token exchange protocol. A conforming implementation MUST:

  1. Forward the common OIDC configuration to the sidecar via the following environment variables:

    | Config path | Environment variable | Required | Default |
    |---|---|---|---|
    | apiProxy.auth.type | AWF_AUTH_TYPE | | |
    | apiProxy.auth.provider | AWF_AUTH_PROVIDER | No | azure |
    | apiProxy.auth.oidcAudience | AWF_AUTH_OIDC_AUDIENCE | No | (provider-specific) |
  2. Forward the GitHub Actions OIDC runtime tokens (ACTIONS_ID_TOKEN_REQUEST_URL, ACTIONS_ID_TOKEN_REQUEST_TOKEN) to the sidecar when AWF_AUTH_TYPE=github-oidc. These are injected automatically by the Actions runner when the workflow declares permissions: id-token: write.

  3. NOT expose the exchanged provider token in the agent container environment. The sidecar SHALL inject it into upstream request headers.

9.5.1 Azure Provider (provider: azure)

Exchanges the GitHub OIDC JWT for an Azure AD / Microsoft Entra access token via workload identity federation. The sidecar injects the resulting token as a Bearer Authorization header on upstream requests.

| Config path | Environment variable | Required | Default |
|---|---|---|---|
| apiProxy.auth.azureTenantId | AWF_AUTH_AZURE_TENANT_ID | | |
| apiProxy.auth.azureClientId | AWF_AUTH_AZURE_CLIENT_ID | | |
| apiProxy.auth.azureScope | AWF_AUTH_AZURE_SCOPE | No | https://cognitiveservices.azure.com/.default |
| apiProxy.auth.azureCloud | AWF_AUTH_AZURE_CLOUD | No | public |

Default OIDC audience: api://AzureADTokenExchange

Note: azureTenantId and azureClientId are required for Azure AD federated credential exchange but MAY be omitted when using managed identity. See docs/api-proxy-sidecar.md for protocol-level details.

9.5.2 AWS Provider (provider: aws)

Exchanges the GitHub OIDC JWT for temporary AWS credentials via sts.amazonaws.com AssumeRoleWithWebIdentity. The sidecar uses these credentials to sign upstream requests to AWS Bedrock using SigV4.

| Config path | Environment variable | Required | Default |
|---|---|---|---|
| apiProxy.auth.awsRoleArn | AWF_AUTH_AWS_ROLE_ARN | | |
| apiProxy.auth.awsRegion | AWF_AUTH_AWS_REGION | | |
| apiProxy.auth.awsRoleSessionName | AWF_AUTH_AWS_ROLE_SESSION_NAME | No | awf-oidc-session |

Default OIDC audience: sts.amazonaws.com

Note: AWS Bedrock uses IAM/SigV4 request signing rather than Bearer tokens. This means the sidecar MUST sign the complete request (method, path, headers, body hash) with the temporary credentials — it is not sufficient to inject a single Authorization header.

9.5.3 GCP Provider (provider: gcp)

Exchanges the GitHub OIDC JWT for a GCP access token via the Security Token Service (sts.googleapis.com), optionally followed by service account impersonation via iamcredentials.googleapis.com. The sidecar injects the resulting token as a Bearer Authorization header.

| Config path | Environment variable | Required | Default |
|---|---|---|---|
| apiProxy.auth.gcpWorkloadIdentityProvider | AWF_AUTH_GCP_WORKLOAD_IDENTITY_PROVIDER | | |
| apiProxy.auth.gcpServiceAccount | AWF_AUTH_GCP_SERVICE_ACCOUNT | No | |
| apiProxy.auth.gcpScope | AWF_AUTH_GCP_SCOPE | No | https://www.googleapis.com/auth/cloud-platform |

Default OIDC audience: the gcpWorkloadIdentityProvider value

When gcpServiceAccount is provided, the sidecar performs a two-step exchange:

  1. Exchange GitHub OIDC JWT for a federated access token via GCP STS
  2. Impersonate the service account to obtain a short-lived OAuth2 token

When gcpServiceAccount is omitted, only step 1 is performed and the federated token is used directly. This requires that the federated principal has direct access grants on the target resource.

9.6 DIFC Proxy Credential Isolation

When security.difcProxy.host is set, GITHUB_TOKEN and GH_TOKEN MUST be excluded from the agent environment. These tokens SHALL be held exclusively by the external DIFC proxy.

10. Effective Token Budget Enforcement

This section is normative.

When apiProxy.maxEffectiveTokens is configured, the API proxy MUST enforce a cumulative effective-token budget across all LLM API requests in a single run. The budget limits total weighted token consumption, not raw token counts.

10.1 Token Weighting

Each upstream response's usage object is decomposed into four categories, each with a fixed weight:

| Category | Weight | Usage field |
|---|---|---|
| Input | 1.0 | input_tokens / prompt_tokens |
| Cache read | 0.1 | cache_read_input_tokens / prompt_tokens_details.cached_tokens |
| Output | 4.0 | output_tokens / completion_tokens |
| Reasoning | 4.0 | reasoning_tokens / completion_tokens_details.reasoning_tokens |

The base weighted tokens for a single response are:

base = (1.0 × input) + (0.1 × cache_read) + (4.0 × output) + (4.0 × reasoning)

10.2 Model Multipliers

When apiProxy.modelMultipliers is configured, each model name MAY have an associated positive multiplier. The effective tokens for a response are:

effective_tokens = model_multiplier × base_weighted_tokens

If no multiplier is configured for a given model, the multiplier defaults to 1.
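
A worked example of the two formulas above, as a sketch. It assumes the four counts have already been extracted from the provider's usage object as separate, non-overlapping numbers; how overlapping fields are reconciled is not covered here. The function name is illustrative.

```python
def effective_tokens(input_tokens=0, cache_read=0, output_tokens=0,
                     reasoning=0, multiplier=1.0):
    """Apply the fixed category weights, then the per-model multiplier."""
    base = (1.0 * input_tokens + 0.1 * cache_read
            + 4.0 * output_tokens + 4.0 * reasoning)
    return multiplier * base


if __name__ == "__main__":
    # 100 input + 50 cache-read + 20 output + 10 reasoning tokens:
    # base = 100 + 5 + 80 + 40 = 225; a multiplier of 2 doubles it to 450.
    print(effective_tokens(100, 50, 20, 10))
    print(effective_tokens(100, 50, 20, 10, multiplier=2.0))
```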

10.3 Enforcement Behavior

The API proxy MUST enforce the budget as follows:

  1. Accumulation: After each successful upstream response, the proxy extracts the usage object, computes effective tokens, and adds them to a running total for the run.

  2. Pre-request check: Before forwarding each subsequent request to the upstream provider, the proxy checks whether the cumulative total has reached or exceeded maxEffectiveTokens.

  3. Rejection: When the budget is reached or exceeded, the proxy MUST reject the request with:

    • HTTP status: 429 Too Many Requests
    • Content-Type: application/json
    • Response body:
      {
        "error": {
          "type": "effective_tokens_limit_exceeded",
          "message": "Maximum effective tokens exceeded (1234.56 / 1000).",
          "total_effective_tokens": 1234.56,
          "max_effective_tokens": 1000
        }
      }
  4. WebSocket rejection: For WebSocket upgrade requests, the proxy MUST reject with HTTP/1.1 429 Too Many Requests and include the same JSON error body before destroying the socket.

  5. Finality: Once the budget is reached or exceeded, all subsequent requests in the same run MUST be rejected. The budget is not recoverable.
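
The accumulate, check, and reject cycle can be sketched as a small state holder. This is not the proxy's implementation: the class and method names are hypothetical, and the number formatting in the message simply relies on Python's default string conversion.

```python
class EffectiveTokenBudget:
    """Tracks cumulative effective tokens for one run."""

    def __init__(self, max_effective_tokens):
        self.max = max_effective_tokens
        self.total = 0.0

    def record(self, effective):
        """Accumulate after a successful upstream response."""
        self.total += effective

    def check(self):
        """Pre-request check: None to forward, or (status, body) to reject."""
        if self.total < self.max:
            return None
        # The total never decreases, so rejection is final for the run.
        return 429, {
            "error": {
                "type": "effective_tokens_limit_exceeded",
                "message": "Maximum effective tokens exceeded "
                           f"({self.total} / {self.max}).",
                "total_effective_tokens": self.total,
                "max_effective_tokens": self.max,
            }
        }


if __name__ == "__main__":
    budget = EffectiveTokenBudget(1000)
    budget.record(1234.56)
    print(budget.check())
```

Because the check runs before forwarding, the response that pushes the total past the limit is still delivered; only later requests are rejected.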

10.4 Threshold Tracking

The proxy MUST track when cumulative effective tokens cross the following percentage thresholds of maxEffectiveTokens:

Threshold
80%
90%
95%
99%

Each threshold MUST be recorded at most once per run.

10.5 Token Steering

Token steering is opt-in. It is active only when apiProxy.enableTokenSteering is true (CLI: --enable-token-steering). When disabled (the default), thresholds are still tracked (for introspection) but no warning messages are injected.

When token steering is enabled and a threshold is first crossed, the proxy MUST inject a budget-warning system message into the body of the very next eligible request sent by the agent, then discard the pending message so that it is injected at most once per threshold per run.

The injected message has the format:

[AWF TOKEN WARNING] <threshold-specific text>

| Threshold | Injected text |
|---|---|
| 80% | You have used 80% of your effective token budget. Begin planning to wrap up your current work. |
| 90% | You have used 90% of your effective token budget. Complete your current task and prepare final output. |
| 95% | You have used 95% of your effective token budget. Finalize and submit your work now. |
| 99% | You have used 99% of your effective token budget. You are about to be cut off. Submit immediately. |

If multiple thresholds are crossed simultaneously (e.g. a single large response crosses both 80% and 90%), the proxy MUST inject only the highest crossed threshold on the next request and queue the remaining thresholds for subsequent requests (one per request).
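
The once-per-threshold tracking and the queueing rule can be sketched as follows. Two points are assumptions rather than statements from this specification: a threshold counts as crossed when usage is greater than or equal to it, and the remaining queued thresholds are delivered highest-first. The class name is hypothetical.

```python
THRESHOLDS = (80, 90, 95, 99)


class ThresholdTracker:
    """Records each threshold once and hands out one warning per request."""

    def __init__(self, max_effective_tokens):
        self.max = max_effective_tokens
        self.crossed = []   # every threshold ever crossed, at most once each
        self.pending = []   # crossed thresholds not yet injected

    def update(self, total_effective_tokens):
        percent = 100.0 * total_effective_tokens / self.max
        for threshold in THRESHOLDS:
            if percent >= threshold and threshold not in self.crossed:
                self.crossed.append(threshold)
                self.pending.append(threshold)

    def next_warning(self):
        """Return the highest pending threshold, or None if nothing is queued."""
        if not self.pending:
            return None
        highest = max(self.pending)
        self.pending.remove(highest)
        return highest


if __name__ == "__main__":
    tracker = ThresholdTracker(1000)
    tracker.update(960)            # crosses 80, 90 and 95 in one response
    print(tracker.next_warning())  # highest first
```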

Provider-specific injection rules:

  • OpenAI / Copilot / OpenCode — the proxy inserts a { "role": "system", "content": "<message>" } entry into the messages array immediately after any pre-existing system messages.
  • Anthropic — the proxy appends the warning to the system field: if system is a string it is concatenated (separated by \n\n); if system is an array of content blocks a { "type": "text", "text": "<message>" } block is appended; if system is absent it is created as the warning string.
  • Gemini — the proxy appends a { "text": "<message>" } part to systemInstruction.parts; if systemInstruction is absent it is created.

If the request body cannot be parsed as JSON, or if the body format does not match the expected structure, the proxy MUST silently skip injection for that request and MUST NOT re-queue the message.
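
The provider-specific rules can be sketched as three small body transformers plus a wrapper that returns the body unchanged on any parse or shape error. This is an illustration, not the proxy's code: the function names and the provider keys are hypothetical, and "immediately after any pre-existing system messages" is interpreted here as after the leading run of system messages.

```python
import json


def inject_openai(body, message):
    messages = body["messages"]
    index = 0
    # Skip the leading run of system messages (an interpretation, see above).
    while index < len(messages) and messages[index].get("role") == "system":
        index += 1
    messages.insert(index, {"role": "system", "content": message})


def inject_anthropic(body, message):
    system = body.get("system")
    if system is None:
        body["system"] = message
    elif isinstance(system, str):
        body["system"] = system + "\n\n" + message
    elif isinstance(system, list):
        system.append({"type": "text", "text": message})
    else:
        raise TypeError("unexpected type for 'system'")


def inject_gemini(body, message):
    instruction = body.setdefault("systemInstruction", {})
    instruction.setdefault("parts", []).append({"text": message})


INJECTORS = {
    "openai": inject_openai,      # also used for Copilot and OpenCode
    "anthropic": inject_anthropic,
    "gemini": inject_gemini,
}


def inject_warning(raw_body: bytes, provider: str, message: str) -> bytes:
    """Return the body with the warning injected, or unchanged on failure."""
    try:
        body = json.loads(raw_body)
        INJECTORS[provider](body, message)
        return json.dumps(body).encode()
    except (ValueError, KeyError, TypeError, AttributeError):
        # Unparseable or unexpected shape: skip silently, body untouched.
        return raw_body


if __name__ == "__main__":
    raw = json.dumps({"system": "Be concise."}).encode()
    print(inject_warning(raw, "anthropic", "[AWF TOKEN WARNING] ..."))
```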

When token steering is enabled and container.agentTimeout is configured, the proxy MUST also inject runtime warnings at 80/90/95/99% of elapsed run time using the same queueing behavior (highest crossed threshold first, then one pending warning per subsequent request):

[AWF TIME WARNING] <threshold-specific text>

10.6 Introspection

The API proxy exposes a GET /reflect endpoint on every provider port (10000–10004). Each port returns the same aggregate reflection payload, whose endpoints array lists all provider adapters. Only the management port (10000, OpenAI) serves /metrics and the aggregate /health; non-management ports still serve provider-local /health responses.

When the /reflect endpoint is queried, the response MUST include the current effective-token state:

{
  "effective_tokens": {
    "enabled": true,
    "max_effective_tokens": 1000,
    "total_effective_tokens": 456.78,
    "remaining_effective_tokens": 543.22,
    "percent_used": 45.68,
    "thresholds_crossed": []
  }
}

When maxEffectiveTokens is not configured, the enabled field MUST be false and numeric fields MUST be 0 or null.

11. Max-Runs Enforcement

This section is normative.

When apiProxy.maxRuns is configured, the API proxy MUST enforce an absolute maximum number of LLM invocations per run.

11.1 Counting Invocations

An invocation is counted each time the proxy receives a successful (2xx) HTTP response from an upstream LLM provider. Each response increments a per-run counter by one, regardless of the number of tokens consumed.

11.2 Enforcement Behavior

The API proxy MUST enforce the max-runs limit as follows:

  1. Pre-request check: Before forwarding each request to the upstream provider, the proxy checks whether the invocation count has reached or exceeded maxRuns.

  2. Rejection: When the limit is reached or exceeded, the proxy MUST reject the request with:

    • HTTP status: 429 Too Many Requests
    • Content-Type: application/json
    • Response body:
      {
        "error": {
          "type": "max_runs_exceeded",
          "message": "Maximum LLM invocations exceeded (5 / 5).",
          "invocation_count": 5,
          "max_runs": 5
        }
      }
  3. WebSocket rejection: For WebSocket upgrade requests, the proxy MUST reject with HTTP/1.1 429 Too Many Requests and include the same JSON error body before destroying the socket.

  4. Finality: Once the limit is reached, all subsequent requests in the same run MUST be rejected. The counter is not recoverable.

11.3 Introspection

The /reflect endpoint (available on all provider ports 10000–10004; see §10.6) MUST include the current max-runs state:

{
  "runs": {
    "enabled": true,
    "max_runs": 5,
    "invocation_count": 3,
    "remaining_runs": 2
  }
}

When maxRuns is not configured, the enabled field MUST be false and max_runs and remaining_runs MUST be null.
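
The counting, rejection, and introspection rules of this section can be sketched together for the case where maxRuns is configured. The class and method names are hypothetical, and clamping remaining_runs at zero is an assumption.

```python
class RunLimiter:
    """Counts successful upstream responses against a configured maxRuns."""

    def __init__(self, max_runs):
        self.max_runs = max_runs
        self.count = 0

    def record_response(self, status_code):
        # Only successful (2xx) upstream responses count as invocations.
        if 200 <= status_code < 300:
            self.count += 1

    def rejection(self):
        """Pre-request check: None to forward, or (status, body) to reject."""
        if self.count < self.max_runs:
            return None
        return 429, {
            "error": {
                "type": "max_runs_exceeded",
                "message": "Maximum LLM invocations exceeded "
                           f"({self.count} / {self.max_runs}).",
                "invocation_count": self.count,
                "max_runs": self.max_runs,
            }
        }

    def reflect(self):
        return {
            "runs": {
                "enabled": True,
                "max_runs": self.max_runs,
                "invocation_count": self.count,
                # Assumption: never report a negative remainder.
                "remaining_runs": max(self.max_runs - self.count, 0),
            }
        }


if __name__ == "__main__":
    limiter = RunLimiter(5)
    for status in (200, 200, 500, 201):
        limiter.record_response(status)
    print(limiter.reflect())
```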

Normative References

  • RFC 2119 — Key words for use in RFCs to Indicate Requirement Levels
  • docs/awf-config.schema.json — Machine-readable JSON Schema for configuration documents (normative)

Runtime JSONL Schemas

AWF emits structured JSONL artifact files at runtime. Each record type has a corresponding JSON Schema in the schemas/ directory:

| Schema | JSONL file | Description |
|---|---|---|
| schemas/audit.schema.json | audit.jsonl | L7 HTTP/HTTPS traffic decisions (allowed/denied) from the Squid proxy |
| schemas/token-usage.schema.json | token-usage.jsonl | Per-API-call token usage records from the api-proxy sidecar |

Versioning

Schema files do not carry an independent version. The repository release tag serves as the version:

  • The $id field in each schema resolves to a stable release download URL.
  • Each JSONL record includes a _schema wire-format field encoding the record type and AWF version (e.g., "_schema": "audit/v0.26.0").
  • Consumers SHOULD use a prefix match (_schema.startsWith("audit/")) rather than an exact match to handle future versions gracefully.
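
A consumer written in Python might apply the recommended prefix match as in the sketch below. Apart from `_schema`, the field names in the sample records are invented for illustration and are not taken from the audit schema.

```python
import json


def audit_records(lines):
    """Yield parsed records whose _schema starts with "audit/"."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Prefix match, so records from any AWF version are accepted.
        if str(record.get("_schema", "")).startswith("audit/"):
            yield record


if __name__ == "__main__":
    sample = [
        '{"_schema": "audit/v0.26.0", "decision": "allowed"}',      # hypothetical field
        '{"_schema": "token-usage/v0.26.0", "tokens": 12}',         # hypothetical field
    ]
    print(list(audit_records(sample)))
```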

Published locations

Versioned (release assets):

https://github.com/github/gh-aw-firewall/releases/download/<tag>/awf-config.schema.json
https://github.com/github/gh-aw-firewall/releases/download/<tag>/audit.schema.json
https://github.com/github/gh-aw-firewall/releases/download/<tag>/token-usage.schema.json

Latest (main branch):

https://raw.githubusercontent.com/github/gh-aw-firewall/main/docs/awf-config.schema.json
https://raw.githubusercontent.com/github/gh-aw-firewall/main/schemas/audit.schema.json
https://raw.githubusercontent.com/github/gh-aw-firewall/main/schemas/token-usage.schema.json

Informative References