JIT: fix opt-repeat divide-by-zero miscompile (#129386)#129533
Open
JulieLeeMSFT wants to merge 2 commits into
Open
JIT: fix opt-repeat divide-by-zero miscompile (#129386)#129533JulieLeeMSFT wants to merge 2 commits into
JulieLeeMSFT wants to merge 2 commits into
Conversation
Contributor
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR updates CoreCLR JIT exception modeling for integer div/mod so that divide-by-zero potential is derived from value-based information (instead of a flow-sensitive flag), preventing opt-repeat/value-numbering/CSE from treating a divide as non-throwing and hoisting it above its dominating non-zero guard. It also adds a targeted JitBlue regression test that runs under JitOptRepeat.
Changes:
- Update
GenTree::OperExceptionsto ignoreGTF_DIV_MOD_NO_BY_ZEROand rely onIsNeverZero()for divide-by-zero exception modeling. - Update several backend codegens (arm64/loongarch64/riscv64/wasm) to consult
GTF_DIV_MOD_NO_BY_ZEROdirectly to elide explicit runtime divide-by-zero checks. - Add a new JitBlue regression test project that forces opt-repeat for the repro method.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/jit/gentree.cpp | Makes divide-by-zero exception modeling value-based to keep VN/GTF_EXCEPT correct under opt-repeat and code motion. |
| src/coreclr/jit/codegenarm64.cpp | Preserves check-elision by masking out DivideByZeroException at codegen when GTF_DIV_MOD_NO_BY_ZERO is set. |
| src/coreclr/jit/codegenloongarch64.cpp | Same as above for LoongArch64 div/mod codegen paths with explicit checks. |
| src/coreclr/jit/codegenriscv64.cpp | Same as above for RISC-V div/mod codegen paths with explicit checks. |
| src/coreclr/jit/codegenwasm.cpp | Same as above for WASM div/mod codegen paths with explicit checks. |
| src/tests/JIT/Regression/JitBlue/Runtime_129386/Runtime_129386.csproj | New regression test project enabling opt-repeat via CLRTestEnvironmentVariable. |
| src/tests/JIT/Regression/JitBlue/Runtime_129386/Runtime_129386.cs | New repro-based regression test validating correct behavior under opt-repeat. |
GTF_DIV_MOD_NO_BY_ZERO and GTF_DIV_MOD_NO_OVERFLOW are flow-sensitive annotations that assertion propagation sets on a divide once a dominating check proves the divisor safe at that node's location. They are "set once" facts. Under JitOptRepeat, a later optimization iteration re-runs (flow-insensitive) value numbering while these stale flags are still present; VN then models the divide as non-throwing and clears GTF_EXCEPT, after which CSE/loop hoisting moves a loop-invariant `0 % p0` into the loop preheader, above the dominating `p0 == 0` guard, where it executes unconditionally and throws DivideByZeroException. Clear these flags in ResetOptAnnotations (which already discards other "set once" annotations such as VNs, assertions, and CSE numbers between opt-repeat iterations) so each iteration's assertion propagation re-derives them from scratch. Recompute statement side effects afterwards so GTF_EXCEPT is restored to a consistent state for the cleared divides. Adds a regression test that reproduces the miscompile under JitOptRepeat. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
3974e17 to
3eb376c
Compare
The header comment described the earlier (superseded) value-based exception modeling approach. Update it to reflect the current fix: clearing the stale flow-sensitive GTF_DIV_MOD_NO_BY_ZERO / GTF_DIV_MOD_NO_OVERFLOW flags between opt-repeat iterations in ResetOptAnnotations. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Note
This PR (code and description) was generated with AI assistance (GitHub Copilot CLI), reviewed by @JulieLeeMSFT.
Fixes #129386.
Summary
Under
JitOptRepeat/STRESS_OPT_REPEAT, the JIT could silently miscompile an integer divide, taking a wrong-direction branch into an unreachable% 0path and throwingDivideByZeroException(default FullOpts/MinOpts return the correct result).Root cause
GTF_DIV_MOD_NO_BY_ZERO(and the siblingGTF_DIV_MOD_NO_OVERFLOW) are flow-sensitive annotations thatoptAssertionProp_ModDivsets on a divide once a dominating check proves the divisor safe at that node's location. They are "set once" facts.Under opt-repeat, a later optimization iteration re-runs (flow-insensitive) value numbering while these stale flags are still present. VN then models the divide as non-throwing and clears
GTF_EXCEPT, after which CSE / loop hoisting moves a loop-invariant0 % p0into the loop preheader, above the dominatingp0 == 0guard, where it executes unconditionally and throws.Confirmed from
JitDump: the divide's value number carriesDivideByZeroExcon the first VN pass but not on the repeat pass, after which CSE places the def in the preheader and the guard is folded.Fix
Clear
GTF_DIV_MOD_NO_BY_ZERO/GTF_DIV_MOD_NO_OVERFLOWon div/mod nodes inCompiler::ResetOptAnnotations— the method that already discards other "set once" annotations (VNs, assertions, CSE numbers) between opt-repeat iterations — so each iteration's assertion propagation re-derives them from scratch and value numbering never observes stale flow-sensitive state. Statement side effects are recomputed afterward (gtUpdateStmtSideEffects) soGTF_EXCEPTis restored to a consistent state for the cleared divides.This keeps
OperExceptions, code generation, andStackLevelSettermutually consistent (no behavioral split), preserves the codegen divide-by-zero check elision, and covers both the divide-by-zero and overflow flags.Testing
src/tests/JIT/Regression/JitBlue/Runtime_129386(runs the repro underJitOptRepeat). It fails without the fix and passes with it.JitOptRepeatCount=2/4,JitStress=1/2,STRESS_OPT_REPEAT, and default; existing div/mod codegen, assertion-prop, and a JitBlue/CodeGenBringUp/opt sweep pass.History of earlier approaches (superseded)
Attempt 1 —
OperExceptionsvalue-based + codegen reads the flag (superseded).Made
GenTree::OperExceptionsmodel divide-by-zero from value-based info only (ignoringGTF_DIV_MOD_NO_BY_ZERO), and had the arm64/loongarch64/riscv64/wasm code generators consultGTF_DIV_MOD_NO_BY_ZEROdirectly to keep eliding the runtime check. This fixed the miscompile but was the wrong shape:StackLevelSetterstill usedOperExceptionsand would reserve anSCK_DIV_BY_ZEROthrow-helper block that codegen never jumps to (dead block). (Flagged by the automated review.)GTF_DIV_MOD_NO_OVERFLOWrisk for signed div/mod overflow. (Also flagged by the automated review.)The current approach (clearing the stale flags at the single, already-existing opt-repeat reset point) is smaller, keeps all consumers consistent, and handles both flags.