Skip to content

Recognize stopped bg tasks and resolve cross-session output paths#221

Open
SwordFaith wants to merge 2 commits into
PolyArch:devfrom
SwordFaith:fix/bg-task-stop-recognition
Open

Recognize stopped bg tasks and resolve cross-session output paths#221
SwordFaith wants to merge 2 commits into
PolyArch:devfrom
SwordFaith:fix/bg-task-stop-recognition

Conversation

@SwordFaith

Copy link
Copy Markdown

Depends on #215 (extract_bg_task_output_path_from_transcript); the
shared extract_* lines appear in this diff only until #215 lands.

Problem

 `"Successfully stopped task: <id>"`, but many builds do **not** also
 emit a `task_notification` system record. `list_pending_background_task_ids`
 only recognised the SDK `task_notification` and legacy queue-operation
 forms, so a TaskStop'd task stayed pending forever and the loop never
 reached Codex review.
  1. Dead task's .output unreachable across sessions. When the launch
    message carries no extractable output path, the session-derived path
    points at the current (new) session and may not exist; the real
    .output lives under the old session dir, so the dead task is never
    pruned. (The recorded-path case is handled by Fix background task liveness probe after session resume #215; this covers the
    no-path case.)

Fix

A — TaskStop completion recognition (list_pending_background_task_ids):
add a third completion source matching the .toolUseResult whose .message
contains "Successfully stopped task: <id>". The id is read from .task_id,
falling back to parsing it out of the message when that field is absent.
try ... catch empty keeps a degenerate message from raising and wiping the
whole completed set under pipefail.

B — cross-session liveness glob (is_bg_task_alive): when neither the
transcript-recorded path nor the session-derived path exists, search sibling
session dirs under the same project slug. This only expands where we
look — a genuinely alive task is never wrongly pruned (its file is found and
lsof still sees the open fd); a file missing everywhere stays unknown and
fails open as before.

Test

tests/test-stop-hook-bg-allow.sh, 50/50. Adds:

  • TaskStop with .task_id → completed (e2e + helper).
  • TaskStop with id only in the message → parse fallback (e2e + helper).
  • Cross-session glob prunes a dead task (no recorded path).
  • Cross-session glob keeps an alive task short-circuiting, guarded by a lsof
    spy asserting lsof ran on the old-session .output. Without the spy
    this case is a tautology — is_bg_task_alive fails open when the file is
    absent, so "alive → short-circuit" passes even with the glob removed.
    Mutation-checked: disabling the glob fails both glob tests.

Test names are descriptive (no AC-NN), per local convention.

When a Claude Code session is resumed, the current transcript's session
id differs from the session that launched a background task, so the
task's .output file lives under the old session dir while the stop hook
probed a path derived from the new transcript. Dead/orphaned tasks were
never pruned, blocking the loop from reaching Codex review.

Teach is_bg_task_alive to extract the real output path recorded in the
transcript launch message, piping an optional transcript_path arg
through prune_dead_bg_task_ids and list_pending_background_task_ids.
Falls back to the derived tasks_dir when the transcript is unreadable or
the message is missing.

Add regression tests AC-25 (session resume) and AC-25b (whitespace in
recorded output path).
Fix A - TaskStop completion recognition: list_pending_background_task_ids
now treats a TaskStop tool_result (.toolUseResult.message "Successfully
stopped task: <id>") as a terminal completion event, because many Claude
Code builds do not also emit a task_notification system record for a
TaskStop. The id is read from .task_id, falling back to parsing it out of
the message text when that field is absent; the fallback uses
try/catch empty so a degenerate message cannot raise and wipe the whole
completed set under pipefail.

Fix B - cross-session liveness glob: is_bg_task_alive now searches
sibling session dirs under the same project slug when neither the
transcript-recorded output path nor the session-derived path exists.
This only expands where we look, so a genuinely alive task is never
wrongly pruned; a file missing everywhere still fails open as before.

Builds on PolyArch#215 (extract_bg_task_output_path_from_transcript); the shared
lines drop out of the diff once PolyArch#215 lands.

Tests: 50/50 (descriptive names, local style).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f8f5546261

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +337 to +340
select(.toolUseResult != null)
| select((.toolUseResult.message // "")
| contains("Successfully stopped task:"))
| (.toolUseResult.task_id

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Guard TaskStop matching against non-string messages

When a transcript contains any unrelated toolUseResult whose message field is an object/array/number, this contains() call makes jq error; because the surrounding completion pipeline falls back to completed="", all SDK/legacy completions already found are discarded and completed background tasks are reported as pending again. This can pin the loop on ordinary transcripts that include a completed async task plus another tool result with a structured message; coerce/filter the field to strings before calling contains().

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant