perf: sort engines once and bound the MountResolver cache (LRU) (#108)#116

Merged
kurok merged 3 commits into master from perf/108-mount-resolver on Jul 2, 2026

@kurok kurok commented Jul 2, 2026

Contributor

Summary

Fixes #108 (priority: medium, performance). Two hot-path issues in MountResolver, both hit on every read/write/list/delete/update:

  1. __enginesOverrideLookup re-sorted Object.entries(engines) on each resolve() although the map is immutable after construction.
  2. The mount cache was unbounded and linearly scanned (startsWith over every cached mount) on each resolve, so memory use and scan time grew monotonically in long-lived services that touch many or dynamic mounts.

Measured (200k worst-case resolves, old vs new):

lookup                             before   after
engines override (50 entries)      1191ms   34ms
mount cache (500 cached mounts)    1019ms   149ms

Changes

  • Engines sorted once in the constructor — entries are normalized + sorted (longest raw prefix first, matching the previous per-call sort exactly) at construction. Observable consequence, covered by a test and flagged in the CHANGELOG: the map is now snapshotted, so mutating the object after VaultClient construction no longer affects resolution.
  • Bounded LRU cache — default cap 500 (opts.maxCacheSize), per the issue's "a few hundred". Hits refresh recency; inserts evict the oldest beyond the cap. Sizing rationale (the issue's open question): typical deployments talk to a handful of static mounts, so 500 is generous headroom while still bounding the multi-tenant/dynamic-mount worst case.
  • Segment-boundary probe instead of linear scan — the cache is now probed with the path's segment prefixes deepest-first as exact Map.get keys: O(path depth) instead of O(cache size). Vault forbids nested mounts, so at most one cached mount can prefix a given path, making deepest-first probing behavior-equivalent to the old insertion-order scan.
  • The recursive collision-fallback logic is untouched, per the issue's explicit guardrail — all its existing tests (in-flight dedup, shared-first-segment regression, retry-after-failure) pass unchanged.
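
For illustration, the bounded LRU and the segment-boundary probe described above can be sketched with a plain Map, which iterates in insertion order. This is a minimal sketch under assumptions, not the code in src/MountResolver.js: the class name MountCache, its methods, and the shape of the cached values are hypothetical, and the cap is assumed to be an already-validated positive integer.

```javascript
// Hypothetical stand-in for the resolver's mount cache; all names are illustrative.
class MountCache {
    constructor(maxSize = 500) {
        this.maxSize = maxSize; // assumed to be a validated positive integer
        // A Map iterates in insertion order, so the first key is the least recently used.
        this.entries = new Map();
    }

    // Segment-boundary prefixes of a path, deepest first:
    // 'a/b/c' gives ['a/b/c', 'a/b', 'a'].
    static prefixes(path) {
        const segments = path.split('/').filter(Boolean);
        const out = [];
        for (let depth = segments.length; depth > 0; depth--) {
            out.push(segments.slice(0, depth).join('/'));
        }
        return out;
    }

    // O(path depth): one exact Map.get per prefix instead of startsWith over every entry.
    lookup(path) {
        for (const mount of MountCache.prefixes(path)) {
            const info = this.entries.get(mount);
            if (info !== undefined) {
                // Refresh recency by re-inserting, which moves the key to the end.
                this.entries.delete(mount);
                this.entries.set(mount, info);
                return { mount, info };
            }
        }
        return null;
    }

    // Insert (or refresh) an entry, then evict the oldest entries beyond the cap.
    store(mount, info) {
        this.entries.delete(mount);
        this.entries.set(mount, info);
        while (this.entries.size > this.maxSize) {
            this.entries.delete(this.entries.keys().next().value);
        }
    }
}

const cache = new MountCache(2);
cache.store('secret', { version: 2 });
cache.store('team/kv', { version: 1 });
cache.lookup('secret/data/app');      // hit: 'secret' becomes the most recent entry
cache.store('other', { version: 2 }); // over the cap: evicts 'team/kv', the oldest
```

Because nested mounts cannot coexist, at most one of the probed prefixes can be a cached mount, so the probe order only affects how many Map.get calls are made, not which mount is returned.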

Tests (written red-first where behavior changed)

  • LRU eviction once the cap is exceeded (evicted mount re-detects; survivors don't)
  • Cache-hit recency refresh protects an entry from eviction
  • In-flight dedup still holds with a bounded cache
  • Engines-snapshot semantics (post-construction mutation has no effect)
  • The previously untested "Unexpected empty detection response" throw (empty and null response variants), closing the issue's coverage gap

MountResolver suite: 21 passing; full unit suite: 260 passing; lint clean.

Verification beyond unit tests

  • Real server: ran a Vault dev server (1.17.6 binary) and drove the refactored resolver through VaultClient end-to-end — autoDetect on a KV v2 mount, passthrough detection of a KV v1 mount on the same server, and an engines-override client. All round-trips correct.
  • aws4 checklist item: the lockfile pins aws4@1.8.0; the existing IAM unit tests assert the regional STS signing behavior from 2.0.0 (regional Host header + /eu-central-1/sts/aws4_request credential scope) and pass on exactly that version — no npm update aws4 needed.

Type of change

  • Bug fix
  • New feature
  • Refactor
  • Documentation
  • CI / tooling

Checklist

  • Tests added or updated
  • npm run lint && npm test passes locally (unit: 260 passing; e2e suites additionally verified against local dev servers in the prior PRs' setup)
  • User-facing changes recorded under # Unreleased in CHANGELOG.md (no version bump here, to avoid colliding with open PR #114, "ci: enforce coverage thresholds and lint test files (#105)", which claims 2.0.4; fold into whichever release ships next)
  • All commits have a Signed-off-by: trailer (git commit -s)

@kurok kurok requested review from m2broth and wRLSS as code owners July 2, 2026 11:05
Two hot-path issues hit on every read/write/list/delete/update:

- __enginesOverrideLookup re-sorted Object.entries(engines) on each
  resolve() although the map never changes after construction. The
  entries are now normalized and sorted once in the constructor (the
  map is snapshotted; post-construction mutation no longer affects
  resolution).
- The mount cache was unbounded and linearly scanned with startsWith
  per resolve. It is now a bounded LRU (default 500 entries,
  opts.maxCacheSize) probed by segment-boundary prefix as exact keys,
  O(path depth) per lookup. Vault forbids nested mounts, so at most
  one cached mount can prefix a path and deepest-first probing is
  behavior-equivalent to the old scan. Hits refresh recency; inserts
  evict the oldest entries beyond the cap.
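
The first point above (normalize and sort once, then treat the result as a snapshot) might look roughly like the following. This is a hedged sketch: the class name EnginesLookup, the slash-stripping normalization, the segment-boundary match, and the string engine values ('kv1', 'kv2') are assumptions for illustration, not the library's actual internals.

```javascript
// Hypothetical sketch of a sort-once engines lookup; names are illustrative.
class EnginesLookup {
    constructor(engines = {}) {
        // Snapshot: copy, normalize, and sort exactly once.
        // Later mutation of the caller's `engines` object is not observed.
        this.sorted = Object.entries(engines)
            .map(([rawPrefix, engine]) => ({
                rawLength: rawPrefix.length,
                prefix: rawPrefix.replace(/^\/+|\/+$/g, ''), // assumed normalization
                engine,
            }))
            .sort((a, b) => b.rawLength - a.rawLength); // longest raw prefix first
    }

    // No sorting here: each call only walks the pre-sorted array.
    find(path) {
        const clean = path.replace(/^\/+/, '');
        for (const { prefix, engine } of this.sorted) {
            if (clean === prefix || clean.startsWith(prefix + '/')) {
                return { mount: prefix, engine };
            }
        }
        return null;
    }
}

const engines = { team: 'kv2', 'team/legacy': 'kv1' };
const lookup = new EnginesLookup(engines);
engines.other = 'kv2';            // added after construction: ignored by the snapshot
lookup.find('team/legacy/app');   // matches 'team/legacy' (longest prefix wins)
lookup.find('other/app');         // null
```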

The recursive collision-fallback logic is untouched, per the issue.

Measured (200k worst-case resolves): engines lookup with 50 entries
1191ms -> 34ms; cache lookup with 500 cached mounts 1019ms -> 149ms.

Tests: LRU eviction past the cap, recency refresh on hit, in-flight
dedup with a bounded cache, engines-snapshot semantics, and the
previously untested empty/null detection-response VaultError. Also
verified end-to-end against a real Vault dev server (autoDetect v2,
v1 passthrough, engines override) and confirmed regional STS signing
works on the pinned aws4@1.8.0 (no update needed).

Closes #108

Signed-off-by: Yuriy R <22548029+kurok@users.noreply.github.com>
@kurok kurok force-pushed the perf/108-mount-resolver branch from 89812ae to fd6361e Compare July 2, 2026 11:09
Comment thread src/MountResolver.js Outdated
@kurok kurok requested a review from m2broth July 2, 2026 11:23
Review follow-up: a negative opts.maxCacheSize was truthy, so it
bypassed the || default and reached __cacheStore unchanged, where the
eviction loop (size > cap) can never terminate once the map is empty
— delete(undefined) cannot shrink it. Non-integer caps likewise made
the bound meaningless.

Reject anything but a positive integer at construction with
InvalidArgumentsError; omitted still falls back to the default (500).
Reproduced the hang (bounded repro: eviction loop ran 1000+ iterations
on maxCacheSize=-1) before the fix; covered negative/zero/fractional/
NaN/Infinity/string caps plus the default-fallback path with tests.
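
A minimal sketch of the validation and of the bounded repro described above. The helper name resolveMaxCacheSize is hypothetical, InvalidArgumentsError is redefined locally so the example is self-contained, and treating only undefined as "omitted" is an assumption.

```javascript
const DEFAULT_MAX_CACHE_SIZE = 500;

// Local stand-in for the library's InvalidArgumentsError.
class InvalidArgumentsError extends Error {}

// Hypothetical helper: accept only a positive integer, default when omitted.
function resolveMaxCacheSize(value) {
    if (value === undefined) {
        return DEFAULT_MAX_CACHE_SIZE;
    }
    // Number.isInteger rejects NaN, Infinity, fractions, and non-numbers such as '5'.
    if (!Number.isInteger(value) || value <= 0) {
        throw new InvalidArgumentsError(
            `maxCacheSize must be a positive integer, got ${String(value)}`
        );
    }
    return value;
}

// Why the `||` default was not enough: -1 is truthy, so it survives the fallback.
const unsafeCap = -1 || DEFAULT_MAX_CACHE_SIZE;

// Bounded repro of the hang: on an empty Map, size (0) > -1 is always true and
// delete(undefined) removes nothing, so only the iteration guard stops the loop.
const map = new Map();
let iterations = 0;
while (map.size > unsafeCap && iterations < 1000) {
    map.delete(map.keys().next().value);
    iterations++;
}
```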

Signed-off-by: Yuriy R <22548029+kurok@users.noreply.github.com>
@kurok kurok merged commit 8e5f43d into master Jul 2, 2026
18 checks passed

Development

Successfully merging this pull request may close these issues.

MountResolver: engines map re-sorted on every resolve; mount cache unbounded and linearly scanned

2 participants