Focus Guard is a Rust dynamic module for Envoy that enforces deterministic retry guardrails before retry storms turn into operational and cognitive overload.
It is designed for high-pressure incident paths where teams need behavior that is predictable, inspectable, and easy to reason about.
- Rust toolchain (`rustc`, `cargo`)
- Built On Envoy CLI (`boe`)
boe run --local . --config '{
"retry_threshold": 3,
"overload_status_code": 429,
"overload_body": "Focus Guard: retry overload detected.",
"enable_tars": false
}'

curl -i -H "x-envoy-attempt-count: 1" http://localhost:10000/status/200
curl -i -H "x-envoy-attempt-count: 3" http://localhost:10000/status/200

cargo test

Use `STORYBOARD.md` for scene-by-scene visual documentation and YouTube-ready assembly instructions.
Important
Judge Checklist
- Deterministic guardrails are enforced in-path (threshold-based pass/throttle)
- Callback safety is preserved (no blocking outbound model calls in HTTP stream callbacks)
- Observability is explicit (headers, dynamic metadata, counters, histogram)
- Behavior is reproducible with simple CLI commands
- Validation is covered by unit tests across config parsing and decision paths
- Demo workflow is documented for submission via `STORYBOARD.md`
Important
Submission Checklist
- Clear incident-response problem statement
- Deterministic proxy-layer approach
- Distinct technical contribution and safety boundary
- Practical impact on operator clarity and focus
Incident response often degrades into noisy retry loops. That noise increases ambiguity, fragments operator attention, and slows debugging.
Focus Guard applies deterministic threshold-based retry controls first, then (optionally) invokes asynchronous TARS AI callouts for below-threshold traffic to produce explicit pass/throttle outcomes with strong observability signals.
- Reframes resilience controls around cognitive clarity as well as service reliability.
- Implements the control in-path as a native Envoy Rust dynamic module.
- Prioritizes deterministic behavior first; optional AI is intentionally gated behind callback-safe boundaries.
Focus Guard converts noisy retry churn into explicit, debuggable states (pass vs throttled), helping teams maintain focus while preserving predictable runtime behavior.
At request-header time, the filter:
- Reads `x-envoy-attempt-count` (defaults to `1` if missing or invalid).
- Records telemetry (`focus_guard.requests_total`, retry histogram).
- Applies deterministic throttling if `attempt >= retry_threshold`.
- If below threshold and `enable_tars=true`, sends an asynchronous TARS callout and buffers request progression.
- Continues decoding on TARS `pass`, or sends a local overload response on TARS `throttle`.
- Falls back to fail-open/fail-closed behavior when TARS callout initialization or response parsing fails.
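The deterministic part of the steps above can be sketched in plain Rust. This is a minimal illustration, not the module's actual code: the `Decision` enum and the `parse_attempt` and `decide` helpers are hypothetical names, and treating every parseable unsigned integer (including `0`) as a valid attempt count is an assumption.

```rust
/// Hypothetical outcome of the check at request-header time.
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    /// Below threshold with TARS disabled: continue the filter chain.
    Pass,
    /// At or above threshold: send the local overload response.
    Throttle,
    /// Below threshold with TARS enabled: wait for the async callout.
    AwaitTars,
}

/// Parses the `x-envoy-attempt-count` value, defaulting to 1 when the
/// header is missing or is not a valid unsigned integer.
fn parse_attempt(header: Option<&str>) -> u32 {
    header
        .and_then(|value| value.trim().parse::<u32>().ok())
        .unwrap_or(1)
}

/// Applies the threshold first; TARS is only consulted below threshold.
fn decide(attempt: u32, retry_threshold: u32, enable_tars: bool) -> Decision {
    if attempt >= retry_threshold {
        Decision::Throttle
    } else if enable_tars {
        Decision::AwaitTars
    } else {
        Decision::Pass
    }
}
```

Keeping the threshold check ahead of the TARS branch means that at-threshold traffic is throttled without any outbound call, which matches the deterministic-first ordering described above.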
At response-header time, it emits deterministic decision headers:
- `x-focus-guard`: `pass` or `throttled`
- `x-focus-guard-retry-attempt`: parsed attempt count
- `x-focus-guard-tars`: `disabled` or `active`
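As a rough sketch, the three decision headers could be derived from per-stream state as follows. The `decision_headers` helper is hypothetical, and mapping `active` to `enable_tars = true` is an assumption based on the header values listed above.

```rust
/// Builds the decision headers as (name, value) pairs.
/// Hypothetical helper: the real filter writes headers through the
/// Envoy dynamic-module API rather than returning a Vec.
fn decision_headers(
    throttled: bool,
    attempt: u32,
    enable_tars: bool,
) -> Vec<(&'static str, String)> {
    let decision = if throttled { "throttled" } else { "pass" };
    // Assumption: "active" means TARS callouts are enabled in the config.
    let tars = if enable_tars { "active" } else { "disabled" };
    vec![
        ("x-focus-guard", decision.to_string()),
        ("x-focus-guard-retry-attempt", attempt.to_string()),
        ("x-focus-guard-tars", tars.to_string()),
    ]
}
```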
- `RawFilterConfig`: JSON-deserializable user config.
- `FilterConfigImpl`: validated runtime config and metric handles.
- `FilterConfig`: global config factory and metric definition point.
- `PerRouteFilterConfig`: optional per-route behavior overrides.
- `Filter`: per-stream state (`last_retry_attempt`, `throttled`, `pending_tars_callout_id`) and async callout result handling via `on_http_callout_done`.
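To make the per-stream state concrete, here is a minimal sketch of how the three `Filter` fields might be laid out. The field names come from the list above, but the struct name `StreamState`, the field types, and the `take_pending` helper are assumptions made for illustration.

```rust
/// Hypothetical stand-in for the per-stream portion of `Filter`.
#[derive(Debug, Default)]
struct StreamState {
    /// Attempt count parsed from `x-envoy-attempt-count`.
    last_retry_attempt: u32,
    /// Whether this stream was throttled.
    throttled: bool,
    /// Id of the in-flight TARS callout, if any (type is an assumption).
    pending_tars_callout_id: Option<u32>,
}

impl StreamState {
    /// Returns true and clears the pending id only when `callout_id`
    /// matches the in-flight callout; stale or unknown ids are ignored.
    fn take_pending(&mut self, callout_id: u32) -> bool {
        if self.pending_tars_callout_id == Some(callout_id) {
            self.pending_tars_callout_id = None;
            true
        } else {
            false
        }
    }
}
```

A check like `take_pending` is one way a completion callback such as `on_http_callout_done` could avoid acting on a result that does not belong to the current callout.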
flowchart LR
A[Request headers] --> B[Parse x-envoy-attempt-count]
B --> C[Record metrics and dynamic metadata]
C --> D{attempt >= retry_threshold?}
D -- Yes --> E[Mark throttled]
E --> F[Send local overload response]
F --> G[StopIteration]
D -- No --> H{enable_tars?}
H -- No --> I[Mark pass]
I --> J[Continue filter chain]
H -- Yes --> K[Send async TARS callout]
K --> L[StopAllIterationAndBuffer]
L --> M[on_http_callout_done]
M --> N{TARS decision/result}
N -- pass --> O[Mark pass + continue_decoding]
N -- throttle --> P[Mark throttled + local overload response]
N -- error/unparseable --> Q{tars_fail_open?}
Q -- true --> O
Q -- false --> P
J --> R[Response headers callback]
G --> R
O --> R
P --> R
R --> S[Set x-focus-guard, retry-attempt, tars headers]
enable_tars is callback-safe and non-blocking:
- Uses Envoy async HTTP callouts from the stream filter context.
- Returns `StopAllIterationAndBuffer` while waiting on TARS, then resumes with `continue_decoding()` when appropriate.
- Applies explicit fail-open or fail-closed fallback when callouts fail or responses are unparseable.
- Never performs direct blocking network calls inside stream callbacks.
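The fallback rule for TARS results can be expressed as a small pure function. This is a sketch under stated assumptions: `TarsOutcome`, `Resolution`, and `resolve` are hypothetical names, and `Error` stands in for both failed callouts and unparseable responses.

```rust
/// Hypothetical summary of what came back from the TARS callout.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TarsOutcome {
    Pass,
    Throttle,
    /// Callout failed or the response could not be parsed.
    Error,
}

/// Hypothetical action the filter takes once the callout completes.
#[derive(Debug, PartialEq, Eq)]
enum Resolution {
    /// Mark pass and resume the buffered request.
    ContinueDecoding,
    /// Mark throttled and send the local overload response.
    SendOverloadResponse,
}

/// Maps a TARS outcome to an action, using `tars_fail_open` only for errors.
fn resolve(outcome: TarsOutcome, tars_fail_open: bool) -> Resolution {
    match outcome {
        TarsOutcome::Pass => Resolution::ContinueDecoding,
        TarsOutcome::Throttle => Resolution::SendOverloadResponse,
        TarsOutcome::Error if tars_fail_open => Resolution::ContinueDecoding,
        TarsOutcome::Error => Resolution::SendOverloadResponse,
    }
}
```

Because the function has no I/O, every branch, including the fail-open and fail-closed paths, can be covered by ordinary unit tests.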
Focus Guard accepts JSON config through the dynamic-module filter config.
Supported fields:
- `retry_threshold` (default `3`, minimum enforced `1`)
- `overload_status_code` (default `429`, validated to `100..=599`, fallback `429`)
- `overload_body` (default `"Focus Guard throttled request due to retry overload."`)
- `enable_tars` (default `false`; enables async TARS decision callouts)
- `tars_cluster` (default `"api.router.tetrate.ai:443"`)
- `tars_authority` (default `"api.router.tetrate.ai"`)
- `tars_path` (default `"/v1/chat/completions"`)
- `tars_model` (default `"o3-mini"`)
- `tars_timeout_milliseconds` (default `250`)
- `tars_fail_open` (default `true`)
- `tars_api_key` (optional bearer token)
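The defaulting and validation rules for the two numeric guardrail fields might look like the sketch below. It deliberately skips JSON deserialization to stay dependency-free. `RawConfig`, `ValidatedConfig`, and `validate` are hypothetical names, and clamping a `retry_threshold` of `0` up to `1` (rather than falling back to the default) is an assumption about how the minimum is enforced.

```rust
/// Hypothetical subset of the user-supplied config; `None` means the
/// field was absent from the JSON.
#[derive(Debug, Default)]
struct RawConfig {
    retry_threshold: Option<u32>,
    overload_status_code: Option<u32>,
}

/// Hypothetical validated form used at runtime.
#[derive(Debug, PartialEq, Eq)]
struct ValidatedConfig {
    retry_threshold: u32,
    overload_status_code: u32,
}

fn validate(raw: &RawConfig) -> ValidatedConfig {
    // Default 3; assumption: values below 1 are clamped up to 1.
    let retry_threshold = raw.retry_threshold.unwrap_or(3).max(1);
    // Default 429; anything outside 100..=599 falls back to 429.
    let overload_status_code = match raw.overload_status_code {
        Some(code) if (100..=599).contains(&code) => code,
        _ => 429,
    };
    ValidatedConfig {
        retry_threshold,
        overload_status_code,
    }
}
```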
Example:
{
"retry_threshold": 3,
"overload_status_code": 429,
"overload_body": "Focus Guard: please pause and retry later.",
"enable_tars": false
}

Config-scoped metrics (defined when available):
- `focus_guard.requests_total` (counter)
- `focus_guard.throttled_total` (counter)
- `focus_guard.retry_attempt` (histogram)
Namespace: focus_guard
- `retry_attempt` (number)
- `throttled` (bool)
- Request header set by filter: `x-focus-guard`
- Local response headers on throttle:
  - `x-focus-guard: throttled`
  - `x-focus-guard-tars: disabled|active`
  - `content-type: text/plain; charset=utf-8`
- Response headers:
  - `x-focus-guard`
  - `x-focus-guard-retry-attempt`
  - `x-focus-guard-tars`
- Rust toolchain (`rustc`, `cargo`)
- Built On Envoy CLI (`boe`)
cargo test

boe run --local .

Then exercise the route:

curl -v http://localhost:10000/status/200

boe run --local . --config '{
"retry_threshold": 3,
"overload_status_code": 429,
"overload_body": "Focus Guard: retry overload detected.",
"enable_tars": false
}'

Use `STORYBOARD.md` for a scene-by-scene production plan, including visuals, voiceover script, command captures, and a fast assembly workflow for YouTube upload.
Command sequence used in the storyboard:
boe run --local . --config '{
"retry_threshold": 3,
"overload_status_code": 429,
"overload_body": "Focus Guard: retry overload detected.",
"enable_tars": false
}'
curl -i -H "x-envoy-attempt-count: 1" http://localhost:10000/status/200
curl -i -H "x-envoy-attempt-count: 3" http://localhost:10000/status/200
cargo test

Important
Validation Checklist
- Config parsing and defaults are tested
- Invalid/fallback config paths are tested
- Deterministic pass/throttle decisions are tested
- Response signaling headers are tested

Unit tests currently cover:
- config defaults and custom parsing
- invalid config fallback behavior
- per-route config parsing failure cases
- deterministic pass/throttle request-header behavior
- async TARS callout start and completion behavior
- TARS decision parsing and response-header signaling behavior
The current scope provides deterministic safety by default and optional AI-assisted throttling through callback-safe asynchronous TARS callouts.
Apache-2.0