
perf(skill): slim incident-summary output + serialize same-host monit-agent probes #68

Open: ysyneu wants to merge 1 commit into feat/ai-sre from audit-fix/2026-06-26-skill-token-eff

ysyneu (Contributor) commented Jun 26, 2026
Two token-efficiency fixes surfaced by /audit-ai-sre-sessions (audit run audit-2026-06-26). Both touch files kept byte-identical with their embedded copies in fc-safari (logic/runtime/bootstrap/skills/flashduty/) — mirrored in linked PR flashcatcloud/fc-safari#337 (the two must land together).

P6 — slim incident-summary.sh (scripts/incident-summary.sh)

What: the script dumped ~80K chars of toon per incident (81042 / 79075 in the audited session), most of it empty boilerplate, exceeding the inline output cap and forcing 3-4 sequential read calls to page ~600 lines back in (steps 5,6,8,10,12,21,22,24 of sess_TRdrnq5oA345qF66GgbYrY).

Root cause: run() appended --output-format toon to every command. For these read verbs that selects the machine-readable branch, which marshals the full raw response object — every empty field on incident detail (account_locale, account_name, ai_summary: "", …) plus heavy blobs like a change's labels.steps deploy-JSON. Each verb's default (table/summary) renderer is instead a curated projection of exactly the summary-relevant fields (printIncidentFullDetail → id/severity/progress/channel/timestamps/ai_summary/root_cause/resolution/impact/description/labels/responders; change list → id/title/status/channel/time; etc.).

Fix: drop --output-format toon from run() so each command uses its lean default renderer — that default is the field projection a fault summary needs. Kept all six commands, set -uo pipefail, the no-errexit intent, and the read-only "print real output" purpose. Adjusted the header note (the lean detail shows the channel by name; to scope post-mortems to the incident's channel, read channel_id via incident info … --output-format toon | grep '^channel_id:').
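The shape of the fix can be sketched as follows. This is a minimal illustration, not the contents of scripts/incident-summary.sh: the `CLI` variable is a hypothetical placeholder for the real binary (it defaults to `echo` so the sketch runs anywhere), and the header format is an assumption. What it is meant to show is a `run()` helper that passes its arguments through untouched, with no `--output-format toon` appended, under `set -uo pipefail` and without errexit.

```shell
#!/usr/bin/env bash
# Sketch only: the real scripts/incident-summary.sh is not reproduced here.
set -uo pipefail   # deliberately no -e: one failing command must not abort the rest

# CLI is a hypothetical placeholder for the real binary name.
# It defaults to echo so this sketch is runnable without that binary.
CLI="${CLI:-echo}"

# run LABEL ARGS...: print a section header, then the command's real output.
# No --output-format flag is appended, so each verb uses its default renderer.
run() {
  local label="$1"
  shift
  printf '== %s ==\n' "$label"
  "$CLI" "$@" || printf '(command failed: exit %d)\n' "$?"
}

# Example call; INCIDENT_ID is a placeholder used when no argument is given.
run "incident detail" incident info "${1:-INCIDENT_ID}"
```

Keeping the flag out of `run()` rather than out of individual call sites means a caller that still needs the raw object can pass `--output-format toon` explicitly for that one command.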

Tradeoff: the lean change list table omits labels (one correlation signal) along with the noisy steps blob; the incident's own labels still print under detail, and a specific change can be drilled into via the change card's documented toon path. Net: a massive token cut for a fast overview.

P8 — serialize same-host monit-agent probes (reference/monit-agent.md)

What: in sess_QmoruPkRjrwA2tcHvEbNvT (steps 10,11,12) os.top_processes was rejected with code=overloaded (per-target concurrency limit reached for kind="host" locator="10.101.214.50") because multiple monit-agent probes were fired in parallel against the same host. The rejection forced an identical retry that re-sent the growing context.

Fix: added a Gotchas line steering fan-out to serialize per target (batch a host's tools into a single invoke, whose tools array already runs concurrently agent-side) and parallelize only across distinct targets. The new prose sits outside the GENERATED fence.
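The batching rule can be illustrated with a small sketch. Everything here is hypothetical except `os.top_processes` and the host address, which come from the audited session: `invoke_host` is a stand-in for a single monit-agent `invoke` carrying a tools array (it is not a real CLI), `os.example_probe` is an invented tool name, and `192.0.2.10` is a documentation-range address. The point is only the grouping: one call per target, parallel across targets.

```shell
#!/usr/bin/env bash
# Illustrative only: invoke_host is a hypothetical stand-in for one monit-agent
# `invoke` call with a tools array; it just prints what would be sent.
set -uo pipefail

invoke_host() {
  local host="$1"
  shift
  printf 'invoke target=%s tools=[%s]\n' "$host" "$*"
}

# plan_batches: read "host tool" lines on stdin and print one line per host,
# "host tool tool ...", keeping hosts in first-seen order.
plan_batches() {
  awk '
    !($1 in t) { order[++n] = $1 }
    { t[$1] = (t[$1] == "" ? $2 : t[$1] " " $2) }
    END { for (i = 1; i <= n; i++) print order[i], t[order[i]] }
  '
}

# Planned probes; the first two target the same host and must share one invoke.
probes=(
  "10.101.214.50 os.top_processes"
  "10.101.214.50 os.example_probe"
  "192.0.2.10 os.top_processes"
)

# One background job per distinct target; tools for a target travel together.
while read -r host tools; do
  # shellcheck disable=SC2086  # word splitting of $tools is intended
  invoke_host "$host" $tools &
done < <(printf '%s\n' "${probes[@]}" | plan_batches)
wait
```

With this plan the host 10.101.214.50 receives a single call, so the per-target concurrency limit is never contended by the caller itself; the order in which the two background jobs print is not deterministic.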

Verification

bash -n scripts/incident-summary.sh         # OK
shellcheck scripts/incident-summary.sh      # no findings
env -u GOROOT go build ./...                 # BUILD OK
go run ./internal/cmd/skilldoc check         # skilldoc: cards OK  (fence unchanged)

…it-agent probes

P6 — incident-summary.sh dumped ~80K chars of toon per incident (81042/79075
in the audited sessions), most of it empty boilerplate, overflowing the inline
output cap and forcing 3-4 sequential reads to page it back in. Root cause: the
script appended `--output-format toon` to every command, which takes each verb's
machine-readable branch and marshals the full raw response object (every empty
field on `incident detail`, plus heavy blobs like a change's `labels.steps`).
The DEFAULT (table/summary) renderer of each of these read verbs is a curated
projection of exactly the summary-relevant fields (id/severity/status/title/
channel/timestamps/ai_summary/root_cause/…). Fix: drop `--output-format toon`
from run() so each command uses its lean default renderer; that default IS the
field projection a fault summary needs. Kept all six commands, set -uo pipefail,
and the read-only "print real output" intent.

P8 — monit-agent.md: parallelizing multiple probes against the SAME host hit the
per-target concurrency limit and returned `code=overloaded`, forcing an identical
retry that re-sent the growing context. Added a Gotchas line steering fan-out to
serialize probes per target (batch a host's tools into one `invoke`) and
parallelize only across distinct targets.

Evidence: audit run audit-2026-06-26, sessions sess_TRdrnq5oA345qF66GgbYrY (P6)
and sess_QmoruPkRjrwA2tcHvEbNvT (P8). Surfaced by /audit-ai-sre-sessions.

These two files are kept byte-identical with their embedded copies in fc-safari
(logic/runtime/bootstrap/skills/flashduty/); mirrored there in a linked PR.