MemoryPacking: Optimize trampled data instead of giving up#8834

Merged: tlively merged 3 commits into WebAssembly:main from JPL11:fix-issue-3244 on Jun 12, 2026

@JPL11 JPL11 commented Jun 12, 2026

What does this PR do?

When active segments overlap, a later segment overwrites ("tramples") the data of an earlier one. #3222 made MemoryPacking check for trampling and give up optimizing on any overlap. Since active segments are applied in order during instantiation, before any code can run, only the final contents of memory are observable — so instead of giving up, we can zero out all trampled bytes and let the pass's normal optimization of zeros remove them. This leaves segment count, offsets, and order untouched, so segment referrers and trap behavior are unaffected.
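
The zeroing idea can be sketched in isolation as follows. This is a minimal illustration, not the code in the pass: `ActiveSegment`, `zeroTrampledBytes`, and `applySegments` are hypothetical names, offsets are assumed to be constants, and every segment is assumed to be in bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an active data segment with a constant offset.
struct ActiveSegment {
  uint32_t offset;
  std::string data;
};

// Zero every byte that a later segment overwrites. Segments are walked from
// last to first while tracking which addresses a later segment already covers.
void zeroTrampledBytes(std::vector<ActiveSegment>& segments, size_t memorySize) {
  std::vector<bool> covered(memorySize, false);
  for (size_t i = segments.size(); i-- > 0;) {
    ActiveSegment& seg = segments[i];
    assert(seg.offset + seg.data.size() <= memorySize &&
           "this sketch assumes in-bounds segments");
    for (size_t j = 0; j < seg.data.size(); ++j) {
      size_t addr = seg.offset + j;
      if (covered[addr]) {
        seg.data[j] = '\0'; // trampled: a later segment writes this address
      } else {
        covered[addr] = true;
      }
    }
  }
}

// Apply the segments in order onto zero-initialized memory, as instantiation
// would, and return the final memory contents.
std::string applySegments(const std::vector<ActiveSegment>& segments,
                          size_t memorySize) {
  std::string memory(memorySize, '\0');
  for (const ActiveSegment& seg : segments) {
    memory.replace(seg.offset, seg.data.size(), seg.data);
  }
  return memory;
}
```

Calling `applySegments` before and after `zeroTrampledBytes` should give the same final memory, which is the property the optimization relies on. Segment count, offsets, and order are not modified by the sketch.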

The one case we still skip (with the existing warning) is an imported memory: there, a later out-of-bounds segment traps mid-instantiation and the partially-written state remains visible in the importing module, which outlives the failed instantiation, so even trampled data matters. A possible follow-up could optimize imported memories too when all segments are provably within the declared minimum size.
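
To see why the imported-memory case differs, here is a small simulation, offered as a sketch under stated assumptions. `ActiveSegment` and `instantiateInto` are hypothetical names, and the "trap" is modeled as an early return that leaves earlier writes in place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an active data segment with a constant offset.
struct ActiveSegment {
  uint32_t offset;
  std::string data;
};

// Simulates applying active segments to a memory that already exists outside
// the module. Returns false on an out-of-bounds segment (modeling the trap),
// leaving everything written so far in place.
bool instantiateInto(std::string& importedMemory,
                     const std::vector<ActiveSegment>& segments) {
  for (const ActiveSegment& seg : segments) {
    if (uint64_t(seg.offset) + seg.data.size() > importedMemory.size()) {
      return false; // trap partway through instantiation
    }
    importedMemory.replace(seg.offset, seg.data.size(), seg.data);
  }
  return true;
}
```

With segments "x" at 0, an out-of-bounds segment, and then "\00" at 0, the owner of the memory sees "x" at address 0 after the failed instantiation. If the first segment had been zeroed, the owner would see a zero there instead, so the change would be observable.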

Why was this PR needed?

#3244 (filed after the conservative fix in #3222) left a TODO: optimize in the trampling case in canOptimize(). Before this change, a module like

(module
 (memory $0 1 1)
 (data (i32.const 1024) "x")
 (data (i32.const 1024) "\00")
)

was emitted entirely unchanged; now both segments are removed (the final memory contents are all zeros).

Testing

  • Updated the existing trampling tests in memory-packing_all-features.wast and added cases for: full trampling by a non-zero byte, partial trampling in the middle of a segment, chained trampling, one segment trampling multiple earlier ones, passive segments being unaffected, and the imported-memory case in memory-packing_zero-filled-memory.wast.
  • Full lit suite, check.py, and gtest unit tests pass.
  • Differential testing: wasm-opt --fuzz-exec --memory-packing on 300 randomly generated modules with 2–6 overlapping segments produces identical memory contents before and after the pass.

Fixes #3244

JPL11 added 2 commits June 11, 2026 16:43
When active segments overlap, a later segment overwrites ("tramples")
the data of an earlier one. Since active segments are applied in order
during instantiation, before any code can run, only the final contents
of memory are observable, so we can zero out all trampled bytes and let
the normal optimization of zeros remove them.

We only do this when the memory is defined in the module itself: with an
imported memory, an out-of-bounds segment traps mid-instantiation and
the partially-applied state remains visible in the importing module, so
there we keep the existing behavior of not optimizing.

Fixes WebAssembly#3244

Update the existing trampling tests, which asserted that we give up on
any overlap, to the new optimized output, and add coverage for: full
trampling by a non-zero byte, partial trampling in the middle of a
segment, chained trampling, one segment trampling multiple earlier
ones, passive segments being unaffected, and the imported-memory case
where we still do not optimize.
Copilot AI review requested due to automatic review settings June 12, 2026 00:24
@JPL11 JPL11 requested a review from a team as a code owner June 12, 2026 00:24
@JPL11 JPL11 requested review from tlively and removed request for a team June 12, 2026 00:24

JPL11 commented Jun 12, 2026

Hi @kripken — this is my first contribution to Binaryen, following up on the good first bug label and your comment in #3244. I went with the smallest change that resolves the TODO: zeroing out trampled bytes so the existing zero-dropping machinery removes them, keeping segment structure untouched. The main judgment call I'd appreciate your eyes on is the imported-memory gate (described in the PR text). Happy to make any changes — thanks for your time!

@tlively tlively left a comment

Very nice! Thanks for the contribution.

Comment on lines +259 to +264
// Some segments overlap, that is, a later segment tramples the data of
// an earlier one. If the memory is imported then we cannot optimize
// here: if a later segment is out of bounds then instantiation traps
// partway, leaving the data written so far visible in the imported
// memory (which outlives the failed instantiation), so even trampled
// data matters.

Maybe worth adding a TODO about optimizing anyway if we can check that all the segments after the trampled segment up to and including the trampling segment will be in-bounds for the imported memory.

@JPL11 JPL11 Jun 12, 2026

Done — added the TODO in 2738c38.
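
The check described in this thread might look roughly like the sketch below. It is not part of the PR: `SegmentRange`, `canIgnoreTrampledData`, and `kPageSize` are hypothetical names, offsets are assumed to be constants, and the memory is assumed to use 64 KiB pages, so that an imported memory is at least `minPages * kPageSize` bytes long.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical summary of an active segment: constant offset and byte length.
struct SegmentRange {
  uint64_t offset;
  uint64_t size;
};

constexpr uint64_t kPageSize = 65536; // assumed wasm page size

// True if every segment after the trampled one, up to and including the
// trampling one, provably fits in the declared minimum size of the memory.
// In that case none of them can trap, so the trampled bytes are never visible.
bool canIgnoreTrampledData(const std::vector<SegmentRange>& segments,
                           size_t trampled,
                           size_t trampling,
                           uint64_t minPages) {
  assert(trampled < trampling && trampling < segments.size());
  uint64_t minBytes = minPages * kPageSize;
  for (size_t i = trampled + 1; i <= trampling; ++i) {
    if (segments[i].offset + segments[i].size > minBytes) {
      return false; // might be out of bounds at runtime
    }
  }
  return true;
}
```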

;; CHECK: (data $0 (i32.const 1024) "x")

;; CHECK: (data $1 (i32.const 1024) "\00")
(module

Please leave a blank line between each module to help readability.

@JPL11 JPL11 Jun 12, 2026

Done — added blank lines between the modules here and in memory-packing_zero-filled-memory.wast (2738c38).

(data (i32.const 1024) "ab")
(data (i32.const 1026) "cd")
(data (i32.const 1023) "WXYZ")
)

Let's also add a test that depends on the merging of the covered regions. For example, with segments covering the following intervals:

A: [3, 8)
B: [1, 2)
C: [0, 5)

When we visit segment A (after visiting C and B), if we didn't correctly merge the covering information for B into the covering information for C, then we would fail to detect the overlap between C and A because the map lookup would find B instead.

@JPL11 JPL11 Jun 12, 2026

Good idea — added in 2738c38, with segments "abcde"@1027, "B"@1025, "fghij"@1024 matching your A: [3, 8), B: [1, 2), C: [0, 5) shape. Without the merge, the lookup when visiting "abcde" would find the "B" region and miss the overlap with "fghij". The output checks confirm the trampled "ab" is dropped ("cde" remains at 1029).
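
The merging behavior this test depends on can be illustrated with a small interval map. This is a sketch assuming segments are visited from last to first and that covered regions are kept as disjoint `[start, end)` entries in a `std::map`. `CoveredRegions` and `visit` are hypothetical names, not the data structure used in the pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <utility>
#include <vector>

using Interval = std::pair<uint64_t, uint64_t>; // [start, end)

class CoveredRegions {
  // Maps region start to region end. Regions are kept disjoint.
  std::map<uint64_t, uint64_t> regions;

public:
  // Returns the parts of [start, end) that are already covered (the trampled
  // parts), then records [start, end) as covered, merging it with every
  // region it overlaps.
  std::vector<Interval> visit(uint64_t start, uint64_t end) {
    std::vector<Interval> trampled;
    if (start >= end) {
      return trampled;
    }
    // The only region starting at or before `start` that can overlap is the
    // one just before upper_bound(start). This relies on regions being merged.
    auto it = regions.upper_bound(start);
    if (it != regions.begin() && std::prev(it)->second > start) {
      --it;
    }
    uint64_t mergedStart = start;
    uint64_t mergedEnd = end;
    while (it != regions.end() && it->first < end) {
      trampled.emplace_back(std::max(start, it->first),
                            std::min(end, it->second));
      mergedStart = std::min(mergedStart, it->first);
      mergedEnd = std::max(mergedEnd, it->second);
      it = regions.erase(it);
    }
    regions[mergedStart] = mergedEnd;
    return trampled;
  }
};
```

Visiting C `[0, 5)`, then B `[1, 2)`, then A `[3, 8)` reports `[3, 5)` as trampled for A, because B was absorbed into C's region. If B were stored as its own entry, the lookup for A would land on B, see that it ends before 3, and miss the overlap with C.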

JPL11 commented Jun 12, 2026

The earlier CI failure was an infrastructure flake, not related to the change: the macos-14 job built and installed successfully but timed out uploading the build artifact (Request timeout: .../ArtifactService/CreateArtifact, 5 retries), and the two Windows jobs were cancelled by fail-fast as a result.

I pushed an empty commit to retrigger CI, but the new run is waiting on workflow approval. Could a maintainer approve the workflow run when they get a chance? Thanks!

- Add a TODO about optimizing trampled segments on imported memories when
  the relevant segments can be proven in-bounds.
- Add blank lines between modules in the memory-packing lit tests.
- Add a test that depends on merging the covered regions: without merging,
  looking up the region covering a segment could find a small later segment
  and miss a larger overlapping one.

@tlively tlively left a comment

LGTM, thanks!

@tlively tlively enabled auto-merge (squash) June 12, 2026 17:21
@tlively tlively merged commit d0b9c03 into WebAssembly:main Jun 12, 2026
16 checks passed
@JPL11 JPL11 deleted the fix-issue-3244 branch June 13, 2026 00:10

Development

Successfully merging this pull request may close these issues.

Optimize trampled data in MemoryPacking

3 participants