fix(server): prevent RequestEvent retention in AsyncLocalStorage async resources after SSR #15770
Zelys-DFKH wants to merge 6 commits
…er SSR

Wrap the store passed to `als.run()` in a single-property container so that async resources created inside the render callback (Promise continuations, Svelte 4 component_subscribe callbacks, etc.) only retain a reference to the container rather than the full RequestStore. After the render promise settles we null out `container.current`, allowing the RequestStore and RequestEvent to be garbage-collected even when stale async resources from the render still linger in memory.

Without this fix, every `als.run(store, render)` call causes Node.js to associate `kResourceStore = store` with every async resource created during the render (nodejs/node#53408). Under load, hundreds of RequestEvent objects accumulate as those resources are not GC'd, leading to linear heap growth and eventual OOM.

Fixes sveltejs#15764

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
🦋 Changeset detected. Latest commit: 70538d3. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
… in event.spec Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Narrows the AsyncLocalStorage container-nulling fix to the SSR render call in render.js, which is the only context where async resources (Svelte 4 component_subscribe callbacks) can outlive the request and cause a memory leak.

Other with_request_store callers (command handlers, query batch, etc.) now receive gc_barrier=false (the default), so their async continuations (get_response await-0 deferral, batch setTimeout callbacks) can still read the store for the duration of the request.

Fixes the query.batch refresh-in-command single-flight regression by restoring store access for those async resources without reintroducing the original memory leak.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Hey, sorry for the delay -- this one kind of fell off the radar. One thing I'm a little concerned about with this is what happens when we (pretty soon) implement streaming SSR. Right now, …
Good catch on the streaming case. The Returning a

```js
export function with_request_store(store, fn) {
  try {
    sync_store = store;
    if (als) {
      const container = { current: store };
      const result = als.run(container, fn);
      return { result, dispose: () => { container.current = null; } };
    }
    return { result: fn(), dispose: () => {} };
  } finally {
    if (!IN_WEBCONTAINER) sync_store = null;
  }
}
```

Non-streaming SSR calls Two things need to land before this merges:

```js
const { result, dispose } = with_request_store(store, render_fn);
try {
  rendered = await result;
} finally {
  dispose();
}
```

All call sites updated in the same PR. Thanks for flagging this before streaming SSR lands.
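To make the disposal boundary concrete, here is a minimal, self-contained sketch of the `{ result, dispose }` shape quoted above. It is an illustration under assumptions, not the PR's code: `with_disposable_store` and `read_store` are hypothetical names, the `sync_store` and `IN_WEBCONTAINER` handling is left out, and `AsyncResource.bind` stands in for any callback created during render that is invoked later.

```javascript
const { AsyncLocalStorage, AsyncResource } = require('node:async_hooks');

const als = new AsyncLocalStorage();

// Hypothetical reader: unwraps the container held in the ALS context.
const read_store = () => als.getStore()?.current ?? null;

// Hypothetical simplification of the wrapper shown in the comment above.
function with_disposable_store(store, fn) {
  const container = { current: store };
  const result = als.run(container, fn);
  return { result, dispose: () => { container.current = null; } };
}

// Hypothetical stand-in for a per-request store.
const store = { event: { url: '/stream' } };

async function main() {
  const observed = [];
  let straggler;

  const { result, dispose } = with_disposable_store(store, async () => {
    // Bind a callback to the render's async context so it can run later.
    straggler = AsyncResource.bind(() => read_store());
    return '<p>ok</p>';
  });

  try {
    await result;
    // The render promise has settled, but dispose() has not run yet,
    // so late work can still reach the store.
    observed.push(straggler());
  } finally {
    dispose();
  }

  // After dispose(), the same callback only finds the emptied container.
  observed.push(straggler());
  return observed;
}

const finished = main();
finished.then(([before, after]) => {
  console.log(before === store); // true
  console.log(after === null); // true
});
```

Because the caller decides when `dispose()` runs, a streaming renderer could in principle delay it until the last chunk has been flushed. That timing is an assumption about future code, not something this thread specifies.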
Thanks for flagging the streaming-SSR case. That was the right thing to catch, and it made the fix better. You're right that nulling the container when the render promise settles would cut off the request context for any async work still running, which is exactly what streamed SSR will need. So I moved disposal under the caller's control rather than tying it to settlement.

One thing changed from what I described earlier. I'd said I would move every call site onto a
The render path awaits the result and calls
I also added a regression test for the boundary you raised: a continuation created during render that runs after the render promise resolves but before

If you'd prefer a different shape, I'm glad to rework it. Thanks again for the careful read.
closes #15764
Hey! I tracked down what's causing the heap to grow under SSR load with Svelte 4 apps.
What's going on
`with_request_store` wraps the SSR render in `als.run(store, renderFn)`. Node.js has a quirk in async context propagation (nodejs/node#53408): every async resource created inside `renderFn` (Promise continuations, callbacks, the works) gets `kResourceStore = store` stamped onto it internally. In Svelte 4 apps, `component_subscribe` creates subscription callbacks inside this context that can outlive the render. Those callbacks hold a strong reference to `store`, so the `RequestStore` and its `RequestEvent` never get garbage-collected. Under any real load, this accumulates fast.

Svelte 5 isn't affected because subscriptions are released per-component immediately. Svelte 4 batches them until the full component tree renders.
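The propagation described above can be observed from user code. The sketch below is only an illustration: `store` and `render` are hypothetical stand-ins, and a timer plays the role of a callback that outlives the render. It shows that the context follows the async resource. It does not measure heap retention or touch the internal `kResourceStore` slot.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

const als = new AsyncLocalStorage();

// Hypothetical stand-in for a per-request store; not SvelteKit's real shape.
const store = { event: { url: '/demo' } };

function render() {
  return new Promise((resolve) => {
    // This timer is an async resource created inside als.run(), so it
    // carries the store and can still read it when it fires later.
    setTimeout(() => resolve(als.getStore()), 10);
  });
}

const seen = als.run(store, render);

// Synchronously after run() returns, the caller is outside the context...
console.log(als.getStore() === undefined); // true

// ...but the callback created during render still observes the store.
seen.then((value) => console.log(value === store)); // true
```

Any resource like this that stays alive after the request keeps its context value reachable, which is why the value handed to `als.run()` matters for garbage collection.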
The fix
Instead of passing `store` directly to `als.run()`, I wrap it in a single-property container (`{ current: store }`) and pass that instead. Async resources created during rendering hold a reference to the lightweight container, not the store itself.

`with_request_store` gets a `gc_barrier` parameter (default `false`). Only the SSR render call in `render.js` passes `gc_barrier = true`; after the render promise settles, `container.current` is nulled. That severs the only path from lingering Svelte 4 subscription callbacks back to the `RequestStore`, and normal GC takes over.

Other callers (command handlers, query batches, form handlers) keep `gc_barrier = false`. Their async continuations (the `await 0` in `get_response`, the `setTimeout(0)` in batch processing) need store access for the duration of the request, so nulling would break them. Scoping the barrier to the render path leaves those flows untouched.

Changes
- `packages/kit/src/exports/internal/event.js`: wrap the ALS value in a container; add a `gc_barrier` parameter (default `false`) to null the container only when the caller opts in
- `packages/kit/src/runtime/server/page/render.js`: pass `gc_barrier = true` to the `with_request_store` call that wraps the component render (the only source of this leak)
- `packages/kit/src/exports/internal/event.spec.js`: new unit tests, including a regression test that spawns a long-lived async resource inside `als.run()` with `gc_barrier = true` and asserts it cannot reach the store after render completes (fails on pre-fix code)

All existing tests pass. Happy to adjust anything. Thanks for all the work you put into this.
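The container-and-barrier approach described under "The fix" can be sketched as follows. This is a simplified illustration under assumptions, not the code in `event.js`: the `_sketch` function names are hypothetical, and the real function also handles cases such as a missing `als`. It runs the same straggling timer with and without the barrier to show the difference.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

const als = new AsyncLocalStorage();

// Hypothetical reader: unwraps the container held in the ALS context.
function read_store_sketch() {
  const container = als.getStore();
  return container ? container.current : null;
}

// Hypothetical sketch of the wrapper; the real function may differ.
function with_request_store_sketch(store, fn, gc_barrier = false) {
  const container = { current: store };
  const result = als.run(container, fn);
  if (gc_barrier) {
    // Empty the container once the returned promise settles either way.
    const release = () => {
      container.current = null;
    };
    Promise.resolve(result).then(release, release);
  }
  return result;
}

// A "render" that leaves a timer behind after its promise has settled.
function render_with_straggler(report) {
  setTimeout(() => report(read_store_sketch()), 20);
  return Promise.resolve('<p>ok</p>');
}

// Hypothetical stand-in for a per-request store.
const store = { event: { url: '/demo' } };

const outcome = new Promise((resolve) => {
  const seen = {};
  const record = (key) => (value) => {
    seen[key] = value;
    if ('barrier' in seen && 'no_barrier' in seen) resolve(seen);
  };
  with_request_store_sketch(store, () => render_with_straggler(record('barrier')), true);
  with_request_store_sketch(store, () => render_with_straggler(record('no_barrier')));
});

outcome.then((seen) => {
  console.log(seen.barrier === null); // true
  console.log(seen.no_barrier === store); // true
});
```

With the barrier, the late timer still holds the container but finds it empty, so nothing keeps the store alive. Without the barrier, the late timer can still read the store, which is what the non-render callers rely on.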