WL-0MPFGDDZ3009J2P4: Fix corrupted lock cleanup grace window scaling with exponential backoff #2003
Merged
…with exponential backoff The grace window for unparseable/corrupted lock files was calculated as Math.max(currentDelay * 2, 500), where currentDelay grows with exponential backoff on each retry. This caused the grace window to keep pace with file age, preventing corrupted lock cleanup within the default 5000ms timeout. Fix: use a fixed 1000ms grace window instead of scaling with retry delay. This is safe because lock file writes (open, write, fsync, close) complete in <100ms, so 1000ms is more than adequate to guard against concurrent writers, while ensuring corrupted files are recovered deterministically within ~1 second.
Summary
The diagnostic regression test "should log stale lock cleanup reason when lock is corrupted" was timing out because the grace window for unparseable/corrupted lock files scaled with the exponential backoff delay, preventing cleanup within the default 5000ms timeout.
Root Cause
In src/file-lock.ts:340, the grace window was calculated as Math.max(currentDelay * 2, 500). Since currentDelay grows exponentially (100ms → 150ms → 225ms → ...), the grace window kept pace with the file age, making it nearly impossible for the corrupted file to ever become "old enough" to clean up before the acquisition timeout fired.
Fix
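Changed the grace window to a fixed 1000ms constant. This is safe because lock file writes (open, write, fsync, close) complete in <100ms, so 1000ms is more than adequate to guard against concurrent writers, and it ensures corrupted files are recovered deterministically within ~1 second.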
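The difference between the two grace windows can be checked with a small simulation. This is only a sketch under stated assumptions, not the code in src/file-lock.ts: the function and parameter names are hypothetical, the 1.5x multiplier is inferred from the 100ms → 150ms → 225ms sequence, there is no jitter or delay cap, the corrupted file is assumed to appear at the first attempt, and each check is assumed to use the delay about to be slept.

```typescript
// Hypothetical simulation of the retry loop; all names are illustrative only.
type GraceFn = (currentDelayMs: number) => number;

// Old behaviour: grace window scales with the current backoff delay.
const scaledGrace: GraceFn = (currentDelayMs) => Math.max(currentDelayMs * 2, 500);

// New behaviour: fixed 1000ms grace window.
const fixedGrace: GraceFn = () => 1000;

// Returns the file age (ms) at which cleanup first becomes possible,
// or null if the acquisition timeout is reached first.
function firstCleanupAgeMs(
  grace: GraceFn,
  timeoutMs = 5000,      // default acquisition timeout from the PR description
  initialDelayMs = 100,  // first backoff delay from the PR description
  multiplier = 1.5       // assumed from 100ms -> 150ms -> 225ms
): number | null {
  let ageMs = 0; // assumption: the corrupted file is created at the first attempt
  let currentDelay = initialDelayMs;
  while (ageMs < timeoutMs) {
    if (ageMs >= grace(currentDelay)) {
      return ageMs; // file is old enough to be treated as abandoned
    }
    ageMs += currentDelay;      // sleep for the current delay
    currentDelay *= multiplier; // exponential backoff
  }
  return null; // timed out before the file ever outlived the grace window
}

console.log(firstCleanupAgeMs(scaledGrace)); // null
console.log(firstCleanupAgeMs(fixedGrace));  // 1318.75
```

In this model the file age after n sleeps is 200 * (1.5^n - 1) ms, while the scaled window is 200 * 1.5^n ms once it exceeds the 500ms floor, so the age always trails the window by 200ms. With the fixed window, cleanup becomes possible on the first check after the age passes 1000ms.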
Verification
Focus for Review
The single-line change at src/file-lock.ts:340. Verify that the fixed 1000ms grace window still protects against concurrent writer races (a writer that crashes mid-write should have the lock file reclaimed after 1s, which is more than adequate).
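For reviewers reasoning about the race, the decision being protected can be sketched as a pure function. This is a hypothetical illustration, not the contents of src/file-lock.ts: the names, the snapshot shape, and the assumption that a valid lock file parses as JSON are all invented for the example.

```typescript
// Hypothetical sketch; names and the JSON lock format are assumptions.
const CORRUPTED_LOCK_GRACE_MS = 1000; // fixed grace window from this PR

type LockDecision = "held" | "wait" | "reclaim";

interface LockFileSnapshot {
  contents: string; // raw bytes read from the lock file
  mtimeMs: number;  // last modification time, in ms since epoch
}

function decideCorruptedLock(snapshot: LockFileSnapshot, nowMs: number): LockDecision {
  try {
    JSON.parse(snapshot.contents);
    return "held"; // parseable: not corrupted, other staleness rules apply
  } catch {
    // Unparseable: either a writer is mid-write or a writer crashed.
    const ageMs = nowMs - snapshot.mtimeMs;
    // A live write is expected to finish in <100ms, so a file that is
    // still unparseable after the grace window is treated as abandoned.
    return ageMs >= CORRUPTED_LOCK_GRACE_MS ? "reclaim" : "wait";
  }
}

// A half-written file observed 50ms after its last write is left alone.
console.log(decideCorruptedLock({ contents: '{"pid": 12', mtimeMs: 10000 }, 10050)); // "wait"
// The same file 1500ms later is treated as left behind by a crashed writer.
console.log(decideCorruptedLock({ contents: '{"pid": 12', mtimeMs: 10000 }, 11500)); // "reclaim"
```

Because the threshold no longer depends on the retry delay, the outcome for a given file age is the same on every attempt, which is what makes recovery deterministic.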