perf(auto-triage): scale to large backlogs via keyset paging + bulk close #948

Merged
scottbrumley merged 1 commit into main from fix/auto-triage-tunning
Jun 14, 2026
@scottbrumley

Contributor

No description provided.

…lose

The job timed out and couldn't drain large case backlogs: it paged the
incidents API with an ever-growing offset (each call slower than the last)
and closed one case per update_incident call.

- replace offset pagination with a creation_time keyset cursor (offset stays
  0 every call), so fetch cost is flat and a 32k+ backlog no longer times out
- SOC Close Cases_V3 now posts incident_id_list (bulk) instead of a single
  incident_id; SOCAutoTriageScoreFilter emits pre-serialized close_batches of
  <=100 ids, turning ~32k single closes into ~320 bulk calls
- JOB - Auto Triage V3 feeds close_batches to the bulk close loop and sources
  score_threshold, window_hours, max_batches, batch_size from
  SOCOptimizationConfig_V3 (removes hard-coded max_batches "5")
- add TriageMaxBatches/TriageBatchSize to SOCOptimizationConfig_V3
- batch_size tunable, clamped to the 100 per-request ceiling; max_batches
  default 200; both fall back safely if a list key is missing
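
The keyset paging described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `fetch_all_keyset`, `make_fake_backend`, and the in-memory backend are hypothetical names, and the `incident_id` tie-breaker is an assumption added so that cases sharing a `creation_time` are not skipped at a page boundary.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Incident = Dict[str, int]
Cursor = Tuple[int, int]  # (creation_time, incident_id); tie-breaker is an assumption


def make_fake_backend(rows: List[Incident]) -> Callable[[Cursor, int], List[Incident]]:
    """Hypothetical stand-in for the incidents API: returns rows after a cursor."""
    ordered = sorted(rows, key=lambda r: (r["creation_time"], r["incident_id"]))

    def fetch_page(after: Cursor, limit: int) -> List[Incident]:
        # A real backend would do this with an indexed range query, so the
        # cost does not depend on how many rows were already consumed.
        matches = [r for r in ordered if (r["creation_time"], r["incident_id"]) > after]
        return matches[:limit]

    return fetch_page


def fetch_all_keyset(
    fetch_page: Callable[[Cursor, int], List[Incident]],
    page_size: int,
    max_pages: int,
) -> Iterator[Incident]:
    """Walk the backlog by advancing a cursor instead of growing an offset."""
    cursor: Cursor = (-1, -1)  # assumes non-negative timestamps and ids
    for _ in range(max_pages):
        page = fetch_page(cursor, page_size)
        if not page:
            return
        yield from page
        last = page[-1]
        cursor = (last["creation_time"], last["incident_id"])
        if len(page) < page_size:
            return  # short page: nothing left to fetch


if __name__ == "__main__":
    rows = [{"incident_id": i, "creation_time": 1000 + i // 3} for i in range(250)]
    fetched = list(fetch_all_keyset(make_fake_backend(rows), page_size=100, max_pages=200))
    print(len(fetched))
```

Every request here asks for "the next `page_size` rows after the cursor", so the offset never grows, which is the property the bullet relies on.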

Deploy order: list -> script -> SOC Close Cases_V3 -> JOB.
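
For reference, the batching and config fallback from the bullets above might look something like the sketch below. It is an illustration under stated assumptions, not the contents of SOCAutoTriageScoreFilter: `read_int` and `build_close_batches` are hypothetical helpers, the JSON payload shape is a guess, and the default batch size of 100 is assumed (the commit message only states the 100 ceiling and the max_batches default of 200).

```python
import json
from typing import Any, List, Mapping, Optional

MAX_IDS_PER_REQUEST = 100   # per-request ceiling named in the commit message
DEFAULT_MAX_BATCHES = 200   # default named in the commit message
DEFAULT_BATCH_SIZE = 100    # assumption: not stated in the commit message


def read_int(config: Optional[Mapping[str, Any]], key: str, default: int) -> int:
    """Read a positive int from the config; fall back if missing or invalid."""
    try:
        value = int(config[key])  # type: ignore[index]
    except (KeyError, TypeError, ValueError):
        return default
    return value if value > 0 else default


def build_close_batches(
    incident_ids: List[int], config: Optional[Mapping[str, Any]]
) -> List[str]:
    """Split ids into pre-serialized bulk-close payloads of bounded size."""
    batch_size = min(
        read_int(config, "TriageBatchSize", DEFAULT_BATCH_SIZE),
        MAX_IDS_PER_REQUEST,  # clamp: a larger configured value is ignored
    )
    max_batches = read_int(config, "TriageMaxBatches", DEFAULT_MAX_BATCHES)

    batches: List[str] = []
    for start in range(0, len(incident_ids), batch_size):
        if len(batches) >= max_batches:
            break  # leave the remainder for a later run
        chunk = incident_ids[start:start + batch_size]
        # Payload shape is hypothetical; only the key name comes from the PR.
        batches.append(json.dumps({"incident_id_list": chunk}))
    return batches


if __name__ == "__main__":
    ids = list(range(32000))
    print(len(build_close_batches(ids, {"TriageMaxBatches": "500"})))
```

In this sketch, 32,000 ids at 100 per batch need 320 batches, so the default cap of 200 would cover at most 20,000 ids in one run; raising TriageMaxBatches lifts that limit.
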
@scottbrumley scottbrumley added the version:patch Bug fix or hotfix → x.x.N label Jun 14, 2026
@scottbrumley scottbrumley merged commit b25f5fb into main Jun 14, 2026
2 of 12 checks passed
2 participants