
Fix Process.waitall/detach under URing selector when no children remain #201

Merged
samuel-williams-shopify merged 1 commit into main from fix-process-waitall-echild on Jun 30, 2026
@samuel-williams-shopify

Contributor

Summary

Fixes Process.waitall / Process.detach under the URing selector when there are no more children to reap.

The bug

The URing selector's process_wait hook (the io_uring waitid path) raised via rb_syserr_fail when waitid reported an error such as ECHILD:

if (result < 0) {
    rb_syserr_fail(-result, "...:io_uring_prep_waitid");
}

But Process.waitall (and Process.detach) rely on rb_waitpid returning -1 with errno == ECHILD so they can terminate their loops — they do not expect it to raise (see proc_waitall in Ruby's process.c). Under a fiber scheduler this path goes through the hook, so raising broke them: Process.waitall would propagate Errno::ECHILD instead of returning the statuses it had already collected.

Process.wait(-1) with no children still raises Errno::ECHILD — that's correct and unchanged (it matches non-scheduler behaviour).
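
The two expectations described above can be observed from plain Ruby. The following is a minimal sketch that runs without any fiber scheduler (so it exercises the reference behaviour, not the URing hook itself); the helper name `no_children_behaviour` is made up for illustration.

```ruby
# Hypothetical helper: shows what Ruby does when there are no children left.
def no_children_behaviour
  # Reap anything left over so we really start from "no children".
  Process.waitall

  # Process.waitall hits ECHILD internally, treats it as "done" and returns.
  waitall_result = Process.waitall

  # Process.wait(-1) surfaces the same condition as an exception.
  wait_error =
    begin
      Process.wait(-1)
      nil
    rescue Errno::ECHILD => error
      error
    end

  [waitall_result, wait_error]
end

result, error = no_children_behaviour
puts "Process.waitall returned: #{result.inspect}"
puts "Process.wait(-1) raised:  #{error.class}"
```

Under a fiber scheduler, both calls should behave the same way; the bug was that only the second behaviour was reachable on the URing waitid path.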

The threaded fallback (EPoll / KQueue / Select) was already correct, because it reaps via Process::Status.wait, which returns the proper (-1, errno) carrier rather than raising.

The fix

Route waitid errors through the same WNOHANG reap already used on the success path. rb_process_status_wait(pid, flags | WNOHANG) reproduces the failure as a Process::Status carrying (pid -1, errno), exactly like Process::Status.wait — so rb_waitpid reports it correctly and waitall/detach behave as expected. No new API required.
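
The "carrier" idea has a Ruby-level analogue in Process::Status.wait, which returns a status object instead of raising. The sketch below assumes a Ruby version that still provides Process::Status.wait and a platform with WNOHANG; the helper name `reap_without_blocking` is hypothetical, and the errno half of the carrier is not inspected here because this sketch only looks at the pid.

```ruby
# Hypothetical helper: a non-blocking reap that reports failure as a value.
def reap_without_blocking
  # Make sure no children remain, so the reap below has nothing to collect.
  Process.waitall

  # With no children, this returns an "empty" status rather than raising.
  Process::Status.wait(-1, Process::WNOHANG)
end

status = reap_without_blocking
puts "carrier pid: #{status.pid}"
```

Returning the failure as a value leaves the decision to the caller: a loop such as waitall can stop quietly, while Process.wait can still turn it into Errno::ECHILD.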

Tests

Adds integration tests (via TestScheduler, across all selectors):

  • Process.waitall collects all child statuses and terminates cleanly.
  • Process.wait(-1) with no children raises Errno::ECHILD (regression guard for the corrected behaviour).
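
As a rough illustration of what the first test checks, here is a sketch in plain Ruby. It does not use the project's TestScheduler and runs without a fiber scheduler, so it only shows the expected outcome; `spawn_and_collect` is a hypothetical helper, and it relies on `fork`, which is not available on every platform.

```ruby
# Hypothetical helper: start one child per exit code, then collect them all.
def spawn_and_collect(exit_codes)
  # exit! skips at_exit handlers inherited from the parent process.
  pids = exit_codes.map { |code| fork { exit!(code) } }

  # Returns an array of [pid, Process::Status] pairs once no children remain.
  statuses = Process.waitall

  [pids, statuses]
end

pids, statuses = spawn_and_collect([0, 1, 2])
statuses.each do |pid, status|
  puts "child #{pid} exited with #{status.exitstatus}"
end
puts "all reaped: #{statuses.map(&:first).sort == pids.sort}"
```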

Verification

  • Reproduced first: the new waitall test errors with Errno::ECHILD only on the URing waitid path (EPoll/Select pass).
  • With the fix: green on Linux (EPoll + URing-waitid + Select) — 150/150 — and macOS (KQueue + Select) — 112/112.

Background

This came out of the discussion on bug #21704 about exposing rb_process_status_new. Notably this fix needs no new Ruby API — rb_process_status_wait already provides the full-fidelity reap (including the error carrier). A constructor like rb_process_status_new would only be needed later to avoid the extra reap syscall on the success path.

The URing selector's process_wait hook (io_uring waitid path) raised via
rb_syserr_fail when waitid reported an error such as ECHILD. But Process.waitall
and Process.detach rely on rb_waitpid *returning* -1 with errno (so they can
terminate their loops on ECHILD), not raising. Under a fiber scheduler, that
goes through the hook, so raising broke them: Process.waitall would propagate
Errno::ECHILD instead of returning its collected results.

Route waitid errors through the same WNOHANG reap used on success, which
reproduces the error as a Process::Status carrier (pid -1, errno) exactly like
Process::Status.wait. Process.wait(-1) with no children still raises ECHILD, as
before (and as without a scheduler).

The threaded fallback (EPoll/KQueue/Select) was already correct since it goes
through Process::Status.wait.

Add integration tests (via TestScheduler) for Process.waitall across all
selectors, and for the no-children-raises-ECHILD behaviour.

Assisted-By: devx/98676a79-085f-4593-a4f4-7d09ea29b5ff
@samuel-williams-shopify samuel-williams-shopify merged commit 3c518a3 into main Jun 30, 2026
56 of 60 checks passed
@samuel-williams-shopify samuel-williams-shopify deleted the fix-process-waitall-echild branch June 30, 2026 12:05