Fix flaky integration test #379
Open
stepchowfun wants to merge 4 commits into
…process state

The previous `wait_for_docuum()` polled `/proc/$PID/stat` for the 'S' (interruptible sleep) state. This is unreliable: the process can be in state 'S' while blocked on `docker image rm` (inside `vacuum()`), causing the test to proceed before Docuum has finished processing the current event.

The new approach redirects Docuum's log output to a file and synchronizes on specific log messages that Docuum emits at known-safe points:

- "Listening for Docker events": emitted after the initial vacuum, before the event loop starts
- "Going back to sleep": emitted at the end of each event-loop iteration, after any vacuum has completed

A quiescence check (count stable for 2 seconds) ensures all events from a single container run (pull, create, start, die, destroy) are fully processed before the test proceeds.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
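For illustration, the quiescence check described in this commit message could look roughly like the sketch below. It is not the PR's actual code: the function names `count_sleeps` and `wait_for_quiescence`, the argument order, and the one-second polling interval are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a quiescence check; names and defaults are assumptions.

# Prints how many log lines contain "Going back to sleep"
# (prints 0 if the file does not exist yet).
count_sleeps() {
  local count
  count="$(grep -c 'Going back to sleep' "$1" 2>/dev/null)" || true
  echo "${count:-0}"
}

# Returns 0 once the count has stayed unchanged for quiet_seconds,
# or 1 if timeout seconds pass first.
wait_for_quiescence() {
  local log_file="$1" quiet_seconds="${2:-2}" timeout="${3:-60}"
  local deadline=$((SECONDS + timeout))
  local last current stable_since
  last="$(count_sleeps "$log_file")"
  stable_since=$SECONDS
  while (( SECONDS - stable_since < quiet_seconds )); do
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep 1
    current="$(count_sleeps "$log_file")"
    if [[ "$current" != "$last" ]]; then
      # A new message arrived, so restart the quiet period.
      last="$current"
      stable_since=$SECONDS
    fi
  done
  return 0
}
```

A check of this shape only proves that nothing was logged during the quiet window. It cannot distinguish "all events processed" from "the next event has not arrived yet", which is the kind of timing dependence that a fixed window introduces.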
18f2dc9 to 91e0261
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The quiescence approach (wait 2s with no new log messages) was still racy on loaded CI machines. Instead, `wait_for_docuum` now takes an image digest prefix as an argument, waits for that string to appear in the log (proving Docuum received the relevant event), then waits for "Going back to sleep" to appear after it (proving Docuum finished processing that event, including any vacuum).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
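The two-step wait described in this commit message can be sketched as follows. This is a hedged illustration rather than the code in the PR: the helper `event_processed`, the argument order, the default timeout, and the polling interval are assumptions, and the marker is assumed to be plain text such as a hex digest prefix.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real wait_for_docuum in the PR may differ.

# Succeeds if "Going back to sleep" appears on a line after the first
# line that contains the marker (for example, an image digest prefix).
event_processed() {
  local log_file="$1" marker="$2"
  awk -v marker="$marker" '
    !seen && index($0, marker) { seen = 1; next }
    seen && index($0, "Going back to sleep") { finished = 1; exit }
    END { exit(finished ? 0 : 1) }
  ' "$log_file"
}

# Polls the log until the marked event has been fully processed,
# or returns 1 after timeout seconds.
wait_for_docuum() {
  local log_file="$1" marker="$2" timeout="${3:-60}"
  local deadline=$((SECONDS + timeout))
  until event_processed "$log_file" "$marker"; do
    if (( SECONDS >= deadline )); then
      echo "Timed out waiting for Docuum to process: $marker" >&2
      return 1
    fi
    sleep 1
  done
}
```

Because the check is ordered (marker first, then the sleep message), a "Going back to sleep" line left over from an earlier event does not satisfy it, and no fixed quiet window is needed.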
Replace the unreliable process-state polling in `wait_for_docuum()` with log-based synchronization.

Root cause: The old approach polled `/proc/$PID/stat` for state 'S' (interruptible sleep). But Docuum enters state 'S' whenever it blocks on any syscall, including `wait()` while running `docker image rm` inside `vacuum()`. So the test could proceed while Docuum was still mid-vacuum, causing a race where the LRU ordering was wrong by the time the next image was used.

Fix: Redirect Docuum's log output to a temporary file and synchronize on two specific messages that Docuum emits at well-defined points:

- "Listening for Docker events": emitted after the initial vacuum, just before the event loop starts
- "Going back to sleep": emitted at the very end of each event-loop iteration, after any vacuum has completed

A quiescence check (count stable for 2 seconds) ensures all events from a single `docker container run` (pull, create, start, die, destroy; typically 5 events) are fully processed before the test proceeds to the next step.

Status: Ready
Fixes: N/A
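As a rough illustration of the startup half of the fix, the sketch below launches a command in the background with its output captured in a file, then blocks until a readiness message appears. The function name `start_and_wait_ready`, the `BACKGROUND_PID` variable, and the choice to capture both stdout and stderr are assumptions for the example, not details taken from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; names and redirection details are assumptions.

# Usage: start_and_wait_ready LOG_FILE READY_MESSAGE TIMEOUT COMMAND [ARGS...]
# Runs COMMAND in the background with stdout and stderr written to LOG_FILE,
# stores its PID in BACKGROUND_PID, and waits for READY_MESSAGE in the log.
start_and_wait_ready() {
  local log_file="$1" ready_message="$2" timeout="$3"
  shift 3
  : > "$log_file"                 # Create or truncate the log first.
  "$@" > "$log_file" 2>&1 &
  BACKGROUND_PID=$!
  local deadline=$((SECONDS + timeout))
  until grep -qF -- "$ready_message" "$log_file"; do
    if (( SECONDS >= deadline )); then
      echo "Timed out waiting for: $ready_message" >&2
      return 1
    fi
    sleep 1
  done
}
```

In a test script, the command would be the Docuum invocation and the readiness message would be "Listening for Docker events", so the first image operation only happens once the initial vacuum is done.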