Skip to content

Core: Make InMemoryLockManager heartbeat map thread-safe#16847

Open
LuciferYang wants to merge 3 commits into
apache:mainfrom
LuciferYang:inmemory-lockmanager-threadsafe-heartbeats
Open

Core: Make InMemoryLockManager heartbeat map thread-safe#16847
LuciferYang wants to merge 3 commits into
apache:mainfrom
LuciferYang:inmemory-lockmanager-threadsafe-heartbeats

Conversation

@LuciferYang

@LuciferYang LuciferYang commented Jun 17, 2026

Copy link
Copy Markdown

InMemoryLockManager.HEARTBEATS is a static map shared by every lock manager in the JVM and is written whenever a lock is acquired or released. The sibling LOCKS map is already a ConcurrentMap, but HEARTBEATS was left as a plain HashMap, so threads acquiring or releasing locks on different entities mutate it without synchronization. That can corrupt the map and leak the scheduled heartbeat futures it tracks.

This makes HEARTBEATS a ConcurrentMap and replaces the non-atomic containsKey/remove cleanup in acquireOnce with a single atomic remove.

Tests:

  • distinct entities: many rounds of concurrent acquire/release across separate keys, asserting no heartbeat is left behind;
  • shared entity: 16 threads contend on one key, asserting at most one heartbeat while it is held and none after release;
  • takeover: a lock whose holder never releases is re-acquired by a new owner once it expires, asserting the stale heartbeat is cancelled and replaced. This is the only path that reaches the non-null cleanup branch the change touches.

InMemoryLockManager.HEARTBEATS is a static map shared by every lock
manager in the JVM and is written whenever a lock is acquired or
released. The sibling LOCKS map is already a ConcurrentMap, but
HEARTBEATS was left as a plain HashMap, so threads acquiring or releasing
locks on different entities mutate it without synchronization, which can
corrupt the map and leak the scheduled heartbeat futures it tracks.

Make HEARTBEATS a ConcurrentMap and replace the non-atomic
containsKey/remove cleanup in acquireOnce with a single atomic remove.
@github-actions github-actions Bot added the core label Jun 17, 2026
if (HEARTBEATS.containsKey(entityId)) {
HEARTBEATS.remove(entityId).cancel(false);
// cleanup old heartbeat, using an atomic remove to avoid a check-then-act race
ScheduledFuture<?> previousHeartbeat = HEARTBEATS.remove(entityId);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This atomic remove cancels a prior heartbeat only when one is still tracked for the entity - the orphaned/expired-takeover case where the holder never released. Both new tests acquire-then-release each iteration, so release clears the heartbeat before the next acquire and this remove returns null every time; no existing test hits the non-null path either. The branch this PR actually changes is therefore unexercised. Consider a focused test that leaves a heartbeat orphaned (lock expires or its heartbeat task dies without a release), then re-acquires and asserts the stale future is cancelled and replaced.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, you're right. release() always removes the heartbeat first, so both new tests only ever hit the null path and the branch this PR changes stayed uncovered.

I added testAcquireCancelsOrphanedHeartbeatOnTakeover: it acquires without releasing, lets the lock expire (long heartbeat interval so it isn't renewed), then re-acquires with a new owner. That reaches the non-null branch, and the test asserts the stale future is cancelled and replaced by a fresh one. Needed a small @VisibleForTesting heartbeat(entityId) accessor to check the future identity. Thanks for the review!

The existing tests acquire and release each iteration, so release() clears
the heartbeat before the next acquire and the cleanup in acquireOnce always
takes the null path. Add testAcquireCancelsOrphanedHeartbeatOnTakeover,
which holds a lock without releasing, lets it expire, then re-acquires with
a new owner so the non-null branch runs, and asserts the stale heartbeat is
cancelled and replaced. Expose a heartbeat(entityId) accessor for the check.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants