Recycle problem runner workers between problems. #8

Closed
rohangpta wants to merge 1 commit into SprocketLab:main from rohangpta:fix/stream-buffer-accumulation

@rohangpta commented May 27, 2026

Contributor

Why

Long SCBench runs execute many problems through a small process pool. Each problem builds a full
agent/evaluation/runtime stack: agent clients, Docker clients, subprocesses, temp dirs, logging state, progress
threads, and third-party library caches.

The code has explicit cleanup paths, but over a long horizon any missed cleanup, process-local cache growth, leaked
descriptor, or stale third-party state can accumulate inside a reused worker. When that happens, failures show up
late in the run and are hard to attribute to the problem that caused the leak.

Recycling the worker after each problem gives each problem an OS-level cleanup boundary. It does not replace normal
cleanup; it limits the blast radius when cleanup is imperfect.

What Changed

  • Set max_tasks_per_child=1 on the problem runner ProcessPoolExecutor. Note: this adds process spawn overhead between problems, which should be small relative to agent/problem execution time.
  • Added a focused unit test that verifies the runner configures worker recycling.
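
For illustration, here is a minimal sketch of the worker-recycling setting described above. It is not the code from this PR: `run_one_problem` and `run_problems` are hypothetical names, and the `executor_cls` parameter is an assumption added only so the configuration can be checked without starting real processes. The `max_tasks_per_child` argument itself is part of `concurrent.futures.ProcessPoolExecutor` in Python 3.11 and later.

```python
import concurrent.futures
import os


def run_one_problem(problem_id):
    """Hypothetical stand-in for one problem's agent/evaluation work."""
    # Returning the PID lets a caller see which worker process handled the task.
    return problem_id, os.getpid()


def run_problems(problem_ids, max_workers=2,
                 executor_cls=concurrent.futures.ProcessPoolExecutor):
    """Run each problem in a worker process that exits after a single task."""
    # max_tasks_per_child=1 (Python 3.11+) retires a worker after one task and
    # starts a fresh process for the next one. It cannot be combined with the
    # "fork" start method; when no mp_context is given, "spawn" is used.
    with executor_cls(max_workers=max_workers, max_tasks_per_child=1) as pool:
        return list(pool.map(run_one_problem, problem_ids))
```

Because the pool uses the "spawn" start method here, a real call such as `run_problems(["p1", "p2", "p3"])` needs to sit under an `if __name__ == "__main__":` guard in an importable script. Each returned tuple would normally carry a different PID, which is the per-problem cleanup boundary the description refers to.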

@rohangpta changed the title from "Avoid repeated stream buffer concatenation" to "Recycle problem runner workers between problems." on May 27, 2026
@gabeorlanski

Collaborator

I like this fix and thank you for adding it. But why does it need all of the CLI utils / other parts?

@gabeorlanski

Collaborator

Closing this in favor of the smaller one.

@rohangpta

Contributor Author

My apologies. #12 is the right separate PR for the patch I had in the cli_utils.py file (also a small optimization, which I hit live while running your benchmark on my custom harness).
