fix(server): prevent RequestEvent retention in AsyncLocalStorage async resources after SSR#15770

Draft
Zelys-DFKH wants to merge 6 commits into
sveltejs:mainfrom
Zelys-DFKH:fix/ssr-als-memory-leak

Conversation

@Zelys-DFKH Zelys-DFKH commented Apr 28, 2026

Contributor

closes #15764

Hey! I tracked down what's causing the heap to grow under SSR load with Svelte 4 apps.

What's going on

with_request_store wraps the SSR render in als.run(store, renderFn). Node.js has a quirk in async context propagation (nodejs/node#53408): every async resource created inside renderFn (Promise continuations, callbacks, the works) gets kResourceStore = store stamped onto it internally. In Svelte 4 apps, component_subscribe creates subscription callbacks inside this context that can outlive the render. Those callbacks hold a strong reference to store, so the RequestStore and its RequestEvent never get garbage-collected. Under any real load, this accumulates fast.

Svelte 5 isn't affected because subscriptions are released per-component immediately. Svelte 4 batches them until the full component tree renders.
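
The retention described above can be reproduced in isolation. This is a minimal sketch, not SvelteKit code: `store` is a hypothetical stand-in for a RequestStore, and `AsyncResource.bind` stands in for any async resource (such as a Svelte 4 subscription callback) created during the render.

```javascript
const { AsyncLocalStorage, AsyncResource } = require('node:async_hooks');

const als = new AsyncLocalStorage();

// Hypothetical stand-in for a RequestStore (not the real SvelteKit shape).
const store = { event: { url: '/demo' } };

let lingering;
als.run(store, () => {
  // AsyncResource.bind captures the current async context, much like a
  // callback or Promise continuation created during the render would.
  lingering = AsyncResource.bind(() => als.getStore());
});

// Outside als.run() there is no active store...
const outside = als.getStore();
// ...but the resource created inside it can still reach the original object.
const reached = lingering();

console.log(outside === undefined, reached === store);
```

As long as `lingering` stays alive, `store` stays reachable through it, which is the shape of the leak.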

The fix

Instead of passing store directly to als.run(), I wrap it in a single-property container ({ current: store }) and pass that instead. Async resources created during rendering hold a reference to the lightweight container, not the store itself.

with_request_store gets a gc_barrier parameter (default false). Only the SSR render call in render.js passes gc_barrier = true; after the render promise settles, container.current is nulled. That severs the only path from lingering Svelte 4 subscription callbacks back to the RequestStore, and normal GC takes over.

Other callers (command handlers, query batches, form handlers) keep gc_barrier = false. Their async continuations (the await 0 in get_response, the setTimeout(0) in batch processing) need store access for the duration of the request, so nulling would break them. Scoping the barrier to the render path leaves those flows untouched.

// in with_request_store:
const container = { current: store };
const result = als.run(container, fn);
// gc_barrier = true (SSR render only): container.current = null after render settles

// try_get_request_store dereferences through .current:
return sync_store ?? als?.getStore()?.current ?? null;
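
To make the effect of the container concrete, here is a standalone sketch of the same idea. It assumes a simplified `try_get_request_store` (no `sync_store` fallback) and nulls the container by hand where the PR would do so after the render settles. `AsyncResource.bind` again stands in for a lingering async resource.

```javascript
const { AsyncLocalStorage, AsyncResource } = require('node:async_hooks');

const als = new AsyncLocalStorage();

// Dereference through `.current`, as try_get_request_store does in the PR
// (the sync_store fallback is omitted here for brevity).
function try_get_request_store() {
  return als.getStore()?.current ?? null;
}

const store = { event: { url: '/demo' } }; // hypothetical RequestStore
const container = { current: store };

let lingering;
als.run(container, () => {
  lingering = AsyncResource.bind(try_get_request_store);
});

const before_nulling = lingering(); // still resolves to the store
container.current = null;           // what the barrier does after render
const after_nulling = lingering();  // the container is reachable, the store is not
```

The lingering resource keeps only the one-property container alive, so the store itself becomes collectable once nothing else references it.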

Changes

  • packages/kit/src/exports/internal/event.js — wrap ALS value in a container; add gc_barrier parameter (default false) to null the container only when the caller opts in
  • packages/kit/src/runtime/server/page/render.js — pass gc_barrier = true to the with_request_store call that wraps the component render (the only source of this leak)
  • packages/kit/src/exports/internal/event.spec.js — new unit tests including a regression test that spawns a long-lived async resource inside als.run() with gc_barrier = true and asserts it cannot reach the store after render completes (fails on pre-fix code)
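
The regression test mentioned in the last item could look roughly like the sketch below. This is not the test from event.spec.js: the `with_request_store` here is a reduced re-implementation, and wiring the barrier with `Promise.resolve(result).finally(...)` is an assumption about how "after the render promise settles" might be done. `probe` is a hypothetical helper.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

const als = new AsyncLocalStorage();
const try_get_request_store = () => als.getStore()?.current ?? null;

// Reduced sketch of with_request_store with an opt-in barrier.
function with_request_store(store, fn, gc_barrier = false) {
  const container = { current: store };
  const result = als.run(container, fn);
  if (gc_barrier) {
    // Null the container once the render settles, whether it resolves or rejects.
    Promise.resolve(result)
      .finally(() => { container.current = null; })
      .catch(() => {}); // the caller handles the rejection on `result` itself
  }
  return result;
}

// Hypothetical probe: a timer created during "render" fires after it settles
// and reports whether it can still reach the store.
function probe(gc_barrier) {
  const store = { id: 'req-1' };
  return new Promise((resolve) => {
    with_request_store(store, async () => {
      setTimeout(() => resolve(try_get_request_store() === store), 10);
      return 'html';
    }, gc_barrier);
  });
}

const results = Promise.all([probe(false), probe(true)]);
results.then(([without_barrier, with_barrier]) => {
  console.log({ without_barrier, with_barrier });
});
```

Without the barrier the timer still reaches the store. With it, the timer only finds an emptied container.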

All existing tests pass. Happy to adjust anything — thanks for all the work you put into this.


Please don't delete this checklist! Before submitting the PR, please make sure you do the following:

  • It's really useful if your PR references an issue where it is discussed ahead of time. In many cases, features are absent for a reason. For large changes, please create an RFC: https://github.com/sveltejs/rfcs
  • This message body should clearly illustrate what problems it solves.
  • Ideally, include a test that fails without this PR but passes with it.

Tests

  • Run the tests with pnpm test and lint the project with pnpm lint and pnpm check

Changesets

  • If your PR makes a change that should be noted in one or more packages' changelogs, generate a changeset by running pnpm changeset and following the prompts. Changesets that add features should be minor and those that fix bugs should be patch. Please prefix changeset messages with feat:, fix:, or chore:.

Edits

  • Please ensure that 'Allow edits from maintainers' is checked. PRs without this option may be closed.

Zelys-DFKH and others added 2 commits April 28, 2026 14:24
…er SSR

Wrap the store passed to `als.run()` in a single-property container so
that async resources created inside the render callback (Promise
continuations, Svelte 4 component_subscribe callbacks, etc.) only retain
a reference to the container rather than the full RequestStore. After the
render promise settles we null out `container.current`, allowing the
RequestStore and RequestEvent to be garbage-collected even when stale
async resources from the render still linger in memory.

Without this fix, every `als.run(store, render)` call causes Node.js to
associate `kResourceStore = store` with every async resource created
during the render (nodejs/node#53408). Under load, hundreds of
RequestEvent objects accumulate as those resources are not GC'd, leading
to linear heap growth and eventual OOM.

Fixes sveltejs#15764

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@changeset-bot

changeset-bot Bot commented Apr 28, 2026


🦋 Changeset detected

Latest commit: 70538d3

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@sveltejs/kit Patch


Zelys-DFKH and others added 2 commits April 28, 2026 14:38
… in event.spec

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Narrows the AsyncLocalStorage container-nulling fix to the SSR render
call in render.js, which is the only context where async resources
(Svelte 4 component_subscribe callbacks) can outlive the request and
cause a memory leak.

Other with_request_store callers (command handlers, query batch, etc.)
now receive gc_barrier=false (the default), so their async continuations
(get_response await-0 deferral, batch setTimeout callbacks) can still
read the store for the duration of the request.

Fixes the query.batch refresh-in-command single-flight regression by
restoring store access for those async resources without reintroducing
the original memory leak.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@elliott-with-the-longest-name-on-github

Contributor

Hey, sorry for the delay -- this one kind of fell off the radar. One thing I'm a little concerned about with this is what happens when we (pretty soon) implement streaming SSR. Right now, render is basically just a promise that resolves to whatever you need to return to the client, but in the future, it'll be a promise that resolves to the initial content plus an async iterable of additional stuff that needs to be sent after the fact. It seems like with this solution the "after the fact" stuff that's still running would start seeing null instead of the correct context?
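
The concern can be sketched with a made-up streaming shape. Everything here is hypothetical: `render` only imitates "initial content plus an async iterable", and the settle-based nulling is written inline rather than through `with_request_store`.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

const als = new AsyncLocalStorage();
const get_store = () => als.getStore()?.current ?? null;

// Hypothetical streaming render: resolves to initial content plus an async
// iterable of content that is produced later.
async function render() {
  const initial = `initial:${get_store().id}`;
  // The deferred work is started inside als.run(), so its continuation
  // inherits the container as its async context.
  const late = new Promise((resolve) => setTimeout(resolve, 10)).then(
    () => `late:${get_store()?.id ?? 'no store'}`
  );
  async function* rest() {
    yield await late;
  }
  return { initial, rest: rest() };
}

async function settle_based() {
  const container = { current: { id: 'req-1' } };
  const rendered = await als.run(container, render);
  container.current = null; // barrier fires as soon as the promise settles
  const chunks = [rendered.initial];
  for await (const chunk of rendered.rest) chunks.push(chunk);
  return chunks;
}

const chunks_seen = settle_based();
chunks_seen.then((chunks) => console.log(chunks));
```

The initial chunk sees the request, while the late chunk runs after the container was emptied and finds nothing.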

@Zelys-DFKH

Contributor Author

Good catch on the streaming case. The gc_barrier approach does fall apart there: once the render promise resolves with an async iterable still live, nulling the container on settlement cuts off getRequestEvent() mid-stream.

Returning a dispose function handles it. with_request_store returns { result, dispose }, and the caller controls when to null:

export function with_request_store(store, fn) {
  try {
    sync_store = store;
    if (als) {
      const container = { current: store };
      const result = als.run(container, fn);
      return { result, dispose: () => { container.current = null; } };
    }
    return { result: fn(), dispose: () => {} };
  } finally {
    if (!IN_WEBCONTAINER) sync_store = null;
  }
}

Non-streaming SSR calls dispose() in a finally after await result. Streaming calls it after the iterable exhausts. Callers that hold the store for the full request duration can skip it.

Two things need to land before this merges:

  • Call dispose() in a finally block in render.js. Calling it only on the success path leaks the store on rejection:

const { result, dispose } = with_request_store(store, render_fn);
try {
  rendered = await result;
} finally {
  dispose();
}

  • Update all call sites in the same PR. endpoint.js, remote.js, sequence.js, utils.js, load_data.js, and actions.js all expect the raw return value today. A missed caller will silently propagate { result, dispose } as the handler response, so the refactor needs to cover everything at once.
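
Both points can be checked in a small standalone sketch. It is simplified from the snippet above (the `sync_store` handling is dropped), and `failing_render`, `render_page`, and the store objects are hypothetical names used only for illustration.

```javascript
const { AsyncLocalStorage, AsyncResource } = require('node:async_hooks');

const als = new AsyncLocalStorage();

// Simplified from the snippet above: the sync_store handling is omitted.
function with_request_store(store, fn) {
  const container = { current: store };
  const result = als.run(container, fn);
  return { result, dispose: () => { container.current = null; } };
}

let lingering;
async function failing_render() {
  // A resource created during render that outlives it.
  lingering = AsyncResource.bind(() => als.getStore()?.current ?? null);
  throw new Error('render failed');
}

async function render_page(store) {
  const { result, dispose } = with_request_store(store, failing_render);
  try {
    return await result;
  } finally {
    dispose(); // runs on rejection too
  }
}

const outcome = render_page({ id: 'req-2' }).then(
  () => ({ message: null, store_after: lingering() }),
  (error) => ({ message: error.message, store_after: lingering() })
);

// A caller that was not updated hands the tuple on instead of the value.
const missed = with_request_store({ id: 'req-3' }, () => 'response body');
const missed_is_tuple =
  typeof missed === 'object' && typeof missed.dispose === 'function';
```

The rejected render still ends with an emptied container, and the un-updated caller ends up holding `{ result, dispose }` where it expected `'response body'`.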

Thanks for flagging this before streaming SSR lands.

@Zelys-DFKH

Contributor Author

Thanks for flagging the streaming-SSR case. That was the right thing to catch, and it made the fix better.

You're right that nulling the container when the render promise settles would cut off the request context for any async work still running, which is exactly what streamed SSR will need. So I moved disposal under the caller's control rather than tying it to settlement.

One thing changed from what I described earlier. I'd said I would move every call site onto a { result, dispose } tuple. Once I started, it turned out only the SSR render path ever needed the barrier, and threading a tuple through the other ~24 fire-and-forget callers added churn and a real risk of someone forgetting to dispose. So I kept it smaller:

  • with_request_store(store, fn) keeps its current signature and behavior for all the existing callers.
  • with_request_store_disposable(store, fn) returns { result, dispose }, and only the render path uses it.
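
A rough sketch of how the two entry points might sit side by side follows. Building `with_request_store` on top of the disposable variant is an assumption made for brevity, not necessarily how the PR does it, and the `sync_store` handling is left out.

```javascript
const { AsyncLocalStorage, AsyncResource } = require('node:async_hooks');

const als = new AsyncLocalStorage();
const try_get_request_store = () => als.getStore()?.current ?? null;

// Render path only: the caller decides when the container is nulled.
function with_request_store_disposable(store, fn) {
  const container = { current: store };
  const result = als.run(container, fn);
  return { result, dispose: () => { container.current = null; } };
}

// Existing callers: same signature, raw return value, no disposal step.
// Building it on the disposable variant is an assumption of this sketch.
function with_request_store(store, fn) {
  return with_request_store_disposable(store, fn).result;
}

const store = { id: 'req-4' }; // hypothetical RequestStore

// An existing caller gets the callback's return value directly.
const plain = with_request_store(store, () => try_get_request_store());

// The render path gets the tuple and disposes explicitly.
let lingering;
const { result, dispose } = with_request_store_disposable(store, () => {
  lingering = AsyncResource.bind(try_get_request_store);
  return 'html';
});
dispose();
const after_dispose = lingering();
```

Existing callers never see the tuple, so there is nothing for them to forget to dispose.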

The render path awaits the result and calls dispose() afterward, so the container stays alive for the whole render, including anything streamed, and is only nulled once the output is consumed. When streaming SSR lands, the dispose() call can move to after the stream closes without touching anything else.

I also added a regression test for the boundary you raised: a continuation created during render that runs after the render promise resolves but before dispose() still sees the store. It fails on the settle-based version and passes here.
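
The boundary can be illustrated with a sketch along these lines. It is not the test from the PR: `gate` and `boundary_check` are hypothetical helpers, and `with_request_store_disposable` is re-implemented here in reduced form.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

const als = new AsyncLocalStorage();
const try_get_request_store = () => als.getStore()?.current ?? null;

function with_request_store_disposable(store, fn) {
  const container = { current: store };
  const result = als.run(container, fn);
  return { result, dispose: () => { container.current = null; } };
}

// Returns a promise plus the function that resolves it.
function gate() {
  let open;
  const promise = new Promise((resolve) => { open = resolve; });
  return { promise, open };
}

async function boundary_check() {
  const store = { id: 'req-5' };
  const first = gate();
  const second = gate();
  let early;
  let late;

  const { result, dispose } = with_request_store_disposable(store, async () => {
    // Both continuations are created during render and inherit the container.
    early = first.promise.then(try_get_request_store);
    late = second.promise.then(try_get_request_store);
    return 'html';
  });

  const html = await result;           // render promise has resolved
  first.open();
  const before_dispose = await early;  // runs after settle, before dispose()
  dispose();
  second.open();
  const after_dispose = await late;    // runs after dispose()

  return { html, before_is_store: before_dispose === store, after_dispose };
}

const boundary = boundary_check();
boundary.then((outcome) => console.log(outcome));
```

A settle-based barrier would already have emptied the container by the time `early` runs, which is the difference the regression test is meant to pin down.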

If you'd prefer a different shape, I'm glad to rework it. Thanks again for the careful read.

@Rich-Harris Rich-Harris marked this pull request as draft June 23, 2026 00:23

Development

Successfully merging this pull request may close these issues.

Memory leak in SSR with Svelte 4 since with_event was added to render.js in 2.27.0

3 participants