[cdac] x86: implement IGCInfoDecoder.EnumerateLiveSlots; unblock GCRoots stackref tests#129547
Draft
max-charlamb wants to merge 4 commits into
Draft
[cdac] x86: implement IGCInfoDecoder.EnumerateLiveSlots; unblock GCRoots stackref tests#129547max-charlamb wants to merge 4 commits into
max-charlamb wants to merge 4 commits into
Conversation
…ots stackref tests Builds on the partial x86 IGCInfo support added in dotnet#129456 by porting the remaining decoder pieces required for GC-root scanning on x86, so that `IStackWalk.WalkStackReferences` returns live frame slots on x86 cDAC. The x86 GC info uses the legacy bit-packed `InfoHdr` byte-stream encoding (`src/coreclr/vm/gc_unwind_x86.inl`, `src/coreclr/inc/gcdecoder.cpp`) instead of the modern `GcInfoDecoder` shared by other architectures, so the implementation lives entirely on the existing `X86GCInfo` decoder under `Contracts/GCInfo/X86/`. Changes ------- * `X86GCInfo`: add `UntrackedSlots` lazy property + `DecodeUntrackedSlots()` -- delta-decoded signed varints with the double-align-frame rebase from `gc_unwind_x86.inl:3467`. * `X86GCInfo`: add `VarPtrLifetimes` lazy property + `DecodeVarPtrLifetimes()` -- triplets of (varOffs, begOffs delta, endOffs delta) for EBP-frame tracked locals. * Two new public record types `UntrackedSlot` and `VarPtrLifetime` capture the decoded entries. * `IsCodeOffsetInProlog` / `IsCodeOffsetInEpilog` helpers (offset-parameterised, so EnumerateLiveSlots can answer for any instruction offset without re-constructing X86GCInfo). * `RegMaskToRegisterNumber` helper maps the single-bit `RegMask` flags-enum values to the x86 ModRM register numbers used by `X86Context.TryReadRegister` and `LiveSlot.RegisterNumber`. * Implement `IGCInfoDecoder.EnumerateLiveSlots(uint offset, options)`: early-return empty in prolog/epilog (or aborted+non-interruptible), emit untracked locals (suppressed for filter funclets), emit VarPtr lifetimes covering `offset`, walk `Transitions` up to `offset` accumulating live registers + pushed pointer args, and emit a partially-interruptible `GcTransitionCall` exactly at `offset`. * Flip `IGCInfoDecoder.GetSizeOfStackParameterArea` from `NotSupportedException` to `return 0` for x86 -- x86 has no separate outgoing-argument scratch area; per-offset transitions report pushed args directly, so the GcScanner scratch-area filter is a no-op (correct). * Remove the `[SkipOnArch("x86", "GCInfo decoder does not support x86")]` markers on `GCRoots_WalkStackReferences_FindsRefs` and `GCRoots_RefsPointToValidObjects`. * `DumpTests.targets`: add optional `DebuggeeFilter=<Name>` to restrict `GenerateAllDumps` to a single debuggee. Useful for iterative local x86 work where some other debuggee's publish may fail. * `docs/design/datacontracts/GCInfo.md`: enumerate which `IGCInfoDecoder` APIs are wired up on x86. Out of scope (deferred) ----------------------- * `GetInterruptibleRanges` for x86 -- the only consumer is the catch-handler PC override in `StackWalk_1`; no x86-relevant scenarios today. * "this"-pointer special-case reporting for synchronized methods (VarPtr 0x2 bit currently masked out). * IPtrMask interior-pointer bitmaps for pushed args (uses the simpler per-push `Iptr` flag). * Funclet handling beyond the existing `IsParentOfFuncletStackFrame` caller-side early-skip. * Finer `IsActiveFrame` register filter precision. Validation ---------- * All 2525 cDAC unit tests pass. * The two unblocked `GCRoots_*` tests pass against a freshly generated x86 GCRoots dump. * Broader `DumpTests` x86 sweep: 34 pass / 46 fail / 830 skip -- net +2 vs. before this change (the two GCRoots tests), zero regressions. The 46 pre-existing failures are all unrelated to GCInfo (`ThreadDumpTests` / `ComWrappersDumpTests` / `RuntimeInfoDumpTests` / `WorkstationGCDumpTests` and similar). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
|
Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag |
The intro paragraph was getting wordy with the per-API status; pull that content out into a new "x86 specifics" section at the end of the file with a table covering supported/not-implemented APIs and a deferred-edges list. Intro now just notes that x86 is partially supported and links to the section. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR extends the cDAC x86 GCInfo decoder so stack-walk GC root scanning can enumerate live frame slots on x86, and enables the previously x86-skipped GCRoots WalkStackReferences dump tests.
Changes:
- Implements
IGCInfoDecoder.EnumerateLiveSlotsfor x86, plus lazy decoding for untracked locals and VarPtr lifetimes. - Enables x86 execution for the two GCRoots
StackReferenceDumpTestsby removing the x86 skip. - Adds an MSBuild
DebuggeeFilteroption to limit dump generation to a single debuggee and updates GCInfo contract documentation.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/native/managed/cdac/tests/DumpTests/StackReferenceDumpTests.cs | Removes x86 skips so GCRoots stackref dump tests run on x86. |
| src/native/managed/cdac/tests/DumpTests/DumpTests.targets | Adds DebuggeeFilter support to limit debuggee csproj discovery/build. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/GCInfo/X86/GCInfo.cs | Adds lazy decoding for untracked/VarPtr tables and implements x86 live-slot enumeration. |
| docs/design/datacontracts/GCInfo.md | Updates the x86 support statement in the GCInfo contract doc. |
Comment on lines
+659
to
+663
| case Action.PUSH: | ||
| depthSlots++; | ||
| int spPush = -depthSlots * 4; | ||
| pushedPtrs[spPush] = ptrT.Iptr ? 0x1u : 0u; | ||
| break; |
Comment on lines
+624
to
+642
| case Action.PUSH: | ||
| // The register's value is pushed onto the stack as a callee argument. | ||
| // Each push advances depth; record the slot at the new top-of-stack. | ||
| for (int i = 0; i < regT.PushCountOrPopSize; i++) | ||
| { | ||
| depthSlots++; | ||
| int sp = -depthSlots * 4; // x86 grows down; offset relative to call-site SP | ||
| pushedPtrs[sp] = regT.Iptr ? 0x1u : 0u; | ||
| } | ||
| break; | ||
| case Action.POP: | ||
| // Pop unrolls the most recently pushed slots. | ||
| for (int i = 0; i < regT.PushCountOrPopSize && depthSlots > 0; i++) | ||
| { | ||
| int sp = -depthSlots * 4; | ||
| pushedPtrs.Remove(sp); | ||
| depthSlots--; | ||
| } | ||
| break; |
Comment on lines
+444
to
+448
| // For non-interruptible methods, an ExecutionAborted offset that isn't at a recorded | ||
| // safe point yields no reliable GC info; skip reporting as the native walker does | ||
| // (gc_unwind_x86.inl:3093). | ||
| if (options.IsExecutionAborted && !Header.Interruptible) | ||
| return Array.Empty<LiveSlot>(); |
Comment on lines
+72
to
+77
| /// <summary> | ||
| /// The untracked frame variable table, always-live GC frame slots. | ||
| /// Decoded lazily on first access. | ||
| /// </summary> | ||
| public ImmutableArray<UntrackedSlot> UntrackedSlots => _untrackedSlots.Value; | ||
| private readonly Lazy<ImmutableArray<UntrackedSlot>> _untrackedSlots; |
Comment on lines
+79
to
+84
| /// <summary> | ||
| /// The frame variable lifetime (VarPtr) table, per-offset-range tracked GC variables. | ||
| /// Decoded lazily on first access. Empty for non-EBP frames (only EBP frames track variables this way). | ||
| /// </summary> | ||
| public ImmutableArray<VarPtrLifetime> VarPtrLifetimes => _varPtrLifetimes.Value; | ||
| private readonly Lazy<ImmutableArray<VarPtrLifetime>> _varPtrLifetimes; |
Comment on lines
+697
to
+700
| /// <param name="StackOffset">Frame-relative byte offset of the slot.</param> | ||
| /// <param name="IsEbpRelative">True if <see cref="StackOffset"/> is EBP-relative; false if ESP-relative.</param> | ||
| /// <param name="LowBits">Raw flag bits from the encoded offset (0x1 = byref/interior, 0x2 = pinned).</param> | ||
| public readonly record struct UntrackedSlot(int StackOffset, bool IsEbpRelative, uint LowBits); |
Comment on lines
+711
to
+715
| /// <param name="LowBits"> | ||
| /// Raw flag bits from the encoded offset (0x1 = byref/interior, 0x2 = "this" pointer -- note that for | ||
| /// tracked locals the 0x2 bit means "this", not "pinned" as it does for untracked slots). | ||
| /// </param> | ||
| public readonly record struct VarPtrLifetime(uint BeginOffset, uint EndOffset, int StackOffset, uint LowBits); |
| This contract is for fetching information related to GCInfo associated with native code. | ||
|
|
||
| The GCInfo contract has platform specific implementations as GCInfo differs per architecture. With the exception of x86, all platforms have a common encoding scheme with different encoding lengths and normalization functions for data. x86 uses an entirely different scheme which is partially supported by this contract. | ||
| The GCInfo contract has platform specific implementations as GCInfo differs per architecture. With the exception of x86, all platforms have a common encoding scheme with different encoding lengths and normalization functions for data. x86 uses an entirely different scheme which is partially supported by this contract: x86 currently implements `GetCodeLength`, `GetStackBaseRegister`, `GetSizeOfStackParameterArea`, `GetCalleePoppedArgumentsSize`, and `EnumerateLiveSlots` (sufficient for SOS code-size lookups and for `WalkStackReferences` GC-root scanning). `GetInterruptibleRanges` is not yet implemented on x86 -- x86 does not encode explicit interruptible ranges; per-offset transitions are used instead, and the only consumer (catch-handler PC override in `StackWalk_1`) has no x86-relevant scenarios today. |
This was referenced Jun 18, 2026
Open
Context: x86 has used the funclet EH model since PR dotnet#115957 / dotnet#122872 (I had previously assumed otherwise and documented this API as intentionally not implemented). Catch funclet unwinding does call into `IGCInfo.GetInterruptibleRanges` via the parent-frame PC override path in `StackWalk_1.WalkStackReferences`, so throwing `NotSupportedException` is a real correctness gap (silently swallowed by the per-frame try/catch and producing missed parent-frame GC roots). Implementation -------------- Match the semantics of the native x86 walker (`EnumGcRefsX86` in `gc_unwind_x86.inl`): * Fully interruptible (`Header.Interruptible == true`): emit one range per gap between the prolog end and each epilog start, plus a final range from the last epilog end to `MethodSize`. This mirrors the native walker's prolog/epilog short-circuit return at line 3091. * Partially interruptible: walk `Transitions` and emit a single-byte `(offset, offset + 1)` range for each `GcTransitionCall`. Per the native partial-interrupt encoding doc (`gc_unwind_x86.inl:1066+`), call sites are the only GC-safe points in this mode -- the intervening `GcTransitionRegister` / `GcTransitionPointer` / `StackDepthTransition` / `IPtrMask` / `CalleeSavedRegister` events are all bookkeeping, not safe points. Docs ---- Update `docs/design/datacontracts/GCInfo.md` x86 specifics: * Move `GetInterruptibleRanges` from "Not implemented" to "Implemented" in the supported APIs table. * Remove the false claim that x86 has no funclets / no x86-relevant scenarios for this API. * Reference PR dotnet#115957 (Enable new exception handling on win-x86) for context on the EH model. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pure file rename (git mv) -- the class was renamed `GCInfo` -> `X86GCInfo` in dotnet#129456 to avoid collision with the empty IGCInfo fallback struct, but the file kept the old name. Bring the filename in line. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (7)
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/GCInfo/X86/GCInfo.cs:638
EnumerateTransitionLiveSlotsemits pushed pointer-arg slots using the raw offsets stored inpushedPtrs(SpOffset: pushed.Key). Those offsets are tracked in a pre-push coordinate system (negative, relative to the SP value before the outstanding pushes), butGcScannerinterpretsSpOffsetas an offset from the currentSP_RELbase. As-is, this will compute incorrect stack addresses for pushed args (and the comment above claims the offsets are positive). Translate the offsets to be relative to the current SP before emitting LiveSlots.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/GCInfo/X86/GCInfo.cs:602activeCallSiteis currently captured forGcTransitionCallentries even in fully-interruptible methods. That makesactiveCallSitenon-null at call offsets and can change downstream behavior (e.g., it can force register reporting even whenIsActiveFrameis false).GcTransitionCallshould only be used as the authoritative live-state source for partially-interruptible methods.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/GCInfo/X86/GCInfo.cs:631- Register reporting is currently guarded by
if (options.IsActiveFrame || activeCallSite is not null). Even after restrictingactiveCallSiteto partially-interruptible methods, this still reports the accumulatedliveRegsset at call sites, which can double-report registers alongside theactiveCallSite.CallRegistersemission and can re-enable reporting whenIsActiveFrameis false. Consider keeping accumulated register emission strictly tied toIsActiveFrameand skipping it when anactiveCallSiteis present.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/GCInfo/X86/GCInfo.cs:76 - This adds new
publicsurface area (UntrackedSlots) onX86GCInfo. Since the new data is only consumed internally within the x86 decoder, consider making thisinternal(orprivate) to avoid expanding the public API surface and the associated API-approval requirements for externally-consumable contracts assemblies.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/GCInfo/X86/GCInfo.cs:83 - This adds new
publicsurface area (VarPtrLifetimes) onX86GCInfo. If it’s not intended for external consumption, making itinternalhelps avoid locking in an API contract for implementation details (and avoids needing API approval for these additions).
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/GCInfo/X86/GCInfo.cs:750 UntrackedSlotis introduced as apublictype but is only used as an implementation detail withinX86GCInfo. Consider making itinternalto avoid committing to a public API shape for decoder internals (and to avoid needing API approval). If you do this, ensureUntrackedSlotsisn’t public either (it can’t return an internal type).
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/GCInfo/X86/GCInfo.cs:765VarPtrLifetimeis introduced as apublictype but is only used as an implementation detail withinX86GCInfo. Consider making itinternalto avoid expanding public API surface for decoder internals (and to avoid needing API approval). If you do this, ensureVarPtrLifetimesisn’t public either.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Note
This PR was authored with assistance from GitHub Copilot.
Summary
Builds on the partial x86 IGCInfo support added in #129456 by porting the remaining decoder pieces required for GC-root scanning on x86, so that
IStackWalk.WalkStackReferencesreturns live frame slots on x86 cDAC. Unblocks the two[SkipOnArch("x86", "GCInfo decoder does not support x86")]markers oncdac/tests/DumpTests/StackReferenceDumpTests.cs.The x86 GC info uses the legacy bit-packed
InfoHdrbyte-stream encoding (src/coreclr/vm/gc_unwind_x86.inl,src/coreclr/inc/gcdecoder.cpp) instead of the modernGcInfoDecodershared by other architectures, so the implementation lives entirely on the existingX86GCInfodecoder underContracts/GCInfo/X86/.Changes
X86GCInfo-- decode the untracked-locals and VarPtr-tracked-lifetimes tables (previously skipped) into two new lazy properties (UntrackedSlots,VarPtrLifetimes) backed by new public record types.IGCInfoDecoder.EnumerateLiveSlots-- replace theNotSupportedExceptionthrow with a real implementation that:EnumGcRefsX86.[BeginOffset, EndOffset)covers the queried instruction offset.Transitionsup to the instruction offset to accumulate live registers and pushed pointer args; handlesGcTransitionRegisterLIVE/DEAD/PUSH/POP/KILL,GcTransitionPointerPUSH/POP/KILL,StackDepthTransition.GcTransitionCallwhoseCodeOffsetmatches exactly.GetSizeOfStackParameterArea-- flip fromNotSupportedExceptiontoreturn 0on x86. x86 has no separate outgoing-argument scratch area; per-offset transitions report pushed args directly, so theGcScannerscratch-area filter becomes a no-op (correct).IsCodeOffsetInProlog/IsCodeOffsetInEpilog(offset-parameterised),RegMaskToRegisterNumber(single-bitRegMask-> x86 ModRM register number, matchesX86Context.TryReadRegister/LiveSlot.RegisterNumber).[SkipOnArch("x86", ...)]markers on the GCRootsStackReferenceDumpTests.DumpTests.targets-- add optional/p:DebuggeeFilter=<Name>to restrictGenerateAllDumpsto a single debuggee. Useful for iterative local x86 work where an unrelated debuggee's publish may fail.docs/design/datacontracts/GCInfo.md-- enumerate whichIGCInfoDecoderAPIs are wired up on x86.Out of scope (deferred follow-ups)
GetInterruptibleRangeson x86 -- the only consumer is the catch-handler PC override inStackWalk_1; no x86-relevant scenarios today.Iptrflag).IsParentOfFuncletStackFramecaller-side early-skip.IsActiveFrameregister filter precision.Validation
GCRoots_*tests pass against a freshly generated x86 GCRoots dump.DumpTestsx86 sweep: 34 pass / 46 fail / 830 skip -- net +2 vs. before this change (the two GCRoots tests), zero regressions. The 46 pre-existing failures are all unrelated to GCInfo (ThreadDumpTests/ComWrappersDumpTests/RuntimeInfoDumpTests/WorkstationGCDumpTestsand similar).References
GCInfocontract; implemented the offset-independent queries).EnumGcRefsX86insrc/coreclr/vm/gc_unwind_x86.inl(~640 lines), corresponding existing managed reference port atsrc/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/x86/GcSlotTable.cs(used as a reference for the untracked + VarPtr table decoders).