Add a metric to capture write enqueue latency#193

Draft
jasonk000 wants to merge 2 commits into jkoch/loop-cpu-utilization-metric from jkoch/loop-enqueue-to-write-latency-metric

@jasonk000

Member

This captures from when the caller tries to send a request until the request is finally sent on the wire, primarily so we can see when the enqueue time is increasing (a signal for pressure on the IO loop).
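To make the measured interval concrete, here is a minimal sketch of the idea, not the PR's actual code. `RecordingTimer`, `Operation`, and `recordEnqueueToWrite` are hypothetical names introduced for illustration; the guard mirrors the one visible in the diff hunk under review (skip the sample when the write timestamp is unset or earlier than the enqueue timestamp).

```java
import java.util.ArrayList;
import java.util.List;

class EnqueueToWriteLatencySketch {
    /** Hypothetical stand-in for a metrics timer; keeps samples in nanoseconds. */
    static final class RecordingTimer {
        final List<Long> samplesNs = new ArrayList<>();

        void record(long durationNs) {
            samplesNs.add(durationNs);
        }
    }

    /** Hypothetical operation that remembers when the caller enqueued it. */
    static final class Operation {
        final long attachedNs;

        Operation(long attachedNs) {
            this.attachedNs = attachedNs;
        }
    }

    /**
     * Records enqueue-to-write latency once the operation has been handed to the socket.
     * Samples with an unset (<= 0) or out-of-order write timestamp are dropped.
     */
    static void recordEnqueueToWrite(RecordingTimer timer, Operation op, long writeCompleteNs) {
        if (writeCompleteNs <= 0L || writeCompleteNs < op.attachedNs) {
            return;
        }
        timer.record(writeCompleteNs - op.attachedNs);
    }
}
```

In real code both timestamps would presumably come from the same monotonic clock (for example `System.nanoTime()`), taken once when the caller enqueues the operation and once when the IO loop writes it.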

@akhaku akhaku left a comment

technically not sent on the wire but passed to the non-blocking socket, right?

Generally LGTM

    if (wc <= 0L || wc < operationAttachedNs) return;
    loopEnqueueToWriteLatency.record(wc - operationAttachedNs, TimeUnit.NANOSECONDS);
} catch (Throwable t) {
    if (log.isDebugEnabled()) log.debug("recordLoopEnqueueToWriteLatency failed", t);

We don't really need the isDebugEnabled check here, right? The arguments passed to log.debug are cheap to evaluate.

@jasonk000 jasonk000 May 18, 2026

Member Author

Yes. Updated the comments to make that clearer, and dropped the catch there since the only thing we have is a record() call. -> a6d34e2
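On the isDebugEnabled question in this thread: a level guard only pays off when building the log arguments is itself costly. The sketch below illustrates the difference using `java.util.logging` as a stand-in (an assumption; the PR's `log` object is presumably from a different logging API). `DebugGuardSketch`, `expensiveDescription`, `logCheap`, and `logExpensive` are hypothetical names.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class DebugGuardSketch {
    /** Counts how often the costly argument was actually built. */
    static int expensiveCalls = 0;

    /** Hypothetical costly argument, e.g. dumping internal state to a string. */
    static String expensiveDescription() {
        expensiveCalls++;
        return "state@" + System.nanoTime();
    }

    /** Constant message plus an existing Throwable: nothing to save, so no guard. */
    static void logCheap(Logger log, Throwable t) {
        log.log(Level.FINE, "recordLoopEnqueueToWriteLatency failed", t);
    }

    /** The guard avoids building the message when FINE is disabled. */
    static void logExpensive(Logger log) {
        if (log.isLoggable(Level.FINE)) {
            log.log(Level.FINE, "loop state: " + expensiveDescription());
        }
    }
}
```

In `logCheap` the logger performs the same level check internally before doing any work, which is why an explicit guard around a constant message and an already-caught exception adds nothing.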

@jasonk000 jasonk000 marked this pull request as ready for review May 18, 2026 19:24
@jasonk000 jasonk000 force-pushed the jkoch/loop-enqueue-to-write-latency-metric branch from a6d34e2 to e6f6073 June 26, 2026 21:12
@jasonk000

jasonk000 commented Jun 26, 2026

Member Author

I think this needs a little work. Even though per-op latency is OK, when we multiply it by 100x for a 100-element bulk call the latency can stack up to tens of microseconds. Not a lot, but if we can reorder the PR I think we can do a bit better.

(TODO note for myself: instead of writing directly to metrics, this code should accumulate them in a TimerBatchUpdater.) This batching can likely be applied to other evcache batch query metrics too.
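The batching idea in the TODO can be sketched roughly as follows. This is a JDK-only illustration of the general technique, not the TimerBatchUpdater API: `SharedTimer` and `LocalBatch` are hypothetical classes, and a real timer would likely track more than a count and a total. Samples accumulate in plain local fields and touch the shared (atomic) state once per batch rather than once per operation.

```java
import java.util.concurrent.atomic.AtomicLong;

class BatchedTimerSketch {
    /** Hypothetical shared timer; every update touches contended atomic state. */
    static final class SharedTimer {
        final AtomicLong count = new AtomicLong();
        final AtomicLong totalNs = new AtomicLong();
        final AtomicLong updates = new AtomicLong(); // how often shared state was touched

        void recordBatch(long samples, long sumNs) {
            count.addAndGet(samples);
            totalNs.addAndGet(sumNs);
            updates.incrementAndGet();
        }
    }

    /** Single-threaded accumulator that flushes when full and on close. */
    static final class LocalBatch implements AutoCloseable {
        private final SharedTimer timer;
        private final int batchSize;
        private long pendingCount;
        private long pendingTotalNs;

        LocalBatch(SharedTimer timer, int batchSize) {
            this.timer = timer;
            this.batchSize = batchSize;
        }

        void record(long durationNs) {
            pendingCount++;
            pendingTotalNs += durationNs;
            if (pendingCount >= batchSize) {
                flush();
            }
        }

        void flush() {
            if (pendingCount == 0) {
                return;
            }
            timer.recordBatch(pendingCount, pendingTotalNs);
            pendingCount = 0;
            pendingTotalNs = 0;
        }

        @Override
        public void close() {
            flush();
        }
    }
}
```

For a 100-element bulk call, a `LocalBatch` with a batch size of 100 would update the shared timer once instead of 100 times; wrapping it in try-with-resources ensures a partially filled batch is still flushed.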

@jasonk000 jasonk000 marked this pull request as draft June 29, 2026 16:09
2 participants