[VRT] Fix null ptr deref when a buddy superblock is full #122

Open
merkelmarrow wants to merge 1 commit into Xilinx:dev from merkelmarrow:fix/buddy-null-deref-pr

@merkelmarrow

BuddySuperblockBase::allocate returned nullptr when a superblock had no free buddy large enough for a request, but its callers in Allocator::allocate expect a std::bad_alloc in that case. Instead of moving on to the next superblock, the caller wrapped the null in a MediumBlock and returned it as a valid allocation. The null was then dereferenced host-side, producing a segfault.

To reproduce

A host program crashes immediately in the vrt::Buffer constructor when allocating three 32 MiB buffers on the MEM/HBM_VNOC path. The fault is host-side, so the actual vbin design doesn't matter much.

LargeBlockSuperblock is fixed at 64 MiB, so two 32 MiB buffers fill one superblock and the third needs a new one:

  • Buffer 1 (32 MiB): no superblocks yet, create superblock 1, split, return the lower 32 MiB.
  • Buffer 2 (32 MiB): superblock 1 matches, return the free upper 32 MiB. Superblock 1 is now full.
  • Buffer 3 (32 MiB): superblock 1 is full. BuddySuperblockBase::allocate falls through its search loop and returns nullptr.
    The caller catches bad_alloc to fall through, but a returned null is not an exception, so the catch never fires and the new-superblock fallback below the loop is skipped:
try {
    UntypedBuffer buffer = superblock->allocate(size); // returned null on full, no throw
    return std::make_unique<MediumBlock>(superblock.get(), buffer); // wrapped the null and returned
} catch (const std::bad_alloc&) {
    continue; // intended exhaustion path, never reached
}
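The control-flow problem can be isolated from the allocator entirely. The following is a minimal sketch, not libvrt code: `Outcome`, `allocate_returning_null`, `allocate_throwing`, and `try_superblock` are hypothetical names chosen for illustration. It shows that a caller shaped like the try/catch above only reaches its handler when the callee throws.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-ins for illustration only, not the libvrt types.
enum class Outcome { WrappedNull, WrappedValid, FellThrough };

// Mimics the pre-fix behaviour: "full" is signalled by returning null.
inline void* allocate_returning_null() { return nullptr; }

// Mimics the intended behaviour: "full" is signalled by throwing.
inline void* allocate_throwing() { throw std::bad_alloc(); }

// Caller shaped like the try/catch above: only an exception reaches the
// handler, so a null return is treated as a successful allocation.
template <typename AllocFn>
Outcome try_superblock(AllocFn alloc) {
    try {
        void* buffer = alloc();
        return buffer ? Outcome::WrappedValid : Outcome::WrappedNull;
    } catch (const std::bad_alloc&) {
        return Outcome::FellThrough;
    }
}

// Exercises both callee styles against the same caller.
inline void demo_control_flow() {
    assert(try_superblock(allocate_returning_null) == Outcome::WrappedNull);
    assert(try_superblock(allocate_throwing) == Outcome::FellThrough);
}
```

Calling `demo_control_flow()` checks both paths: the null-returning callee never triggers the handler, while the throwing callee does.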

Fix

Replace return nullptr with throw std::bad_alloc(). This matches the exception already thrown at the top of the same function for the oversize-index case.
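As a rough illustration of why throwing restores the fallback, here is a toy model under stated assumptions: `ToySuperblock`, `ToyAllocator`, and `ToyBlock` are hypothetical names, and a simple bump counter stands in for the real buddy search. Only the 64 MiB superblock size and the catch-and-continue loop shape come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Toy model only: these types are hypothetical, and a bump counter
// replaces the real buddy search.
constexpr std::size_t kMiB = 1024 * 1024;
constexpr std::size_t kSuperblockSize = 64 * kMiB;

struct ToyBlock {
    std::size_t superblock_index;  // which superblock served the request
    std::size_t offset;            // byte offset inside that superblock
};

class ToySuperblock {
public:
    // Returns the offset of the new block, or throws when nothing fits.
    std::size_t allocate(std::size_t size) {
        if (size > kSuperblockSize - used_) {
            throw std::bad_alloc();  // throw on full instead of returning null
        }
        const std::size_t offset = used_;
        used_ += size;
        return offset;
    }

private:
    std::size_t used_ = 0;
};

class ToyAllocator {
public:
    ToyBlock allocate(std::size_t size) {
        for (std::size_t i = 0; i < superblocks_.size(); ++i) {
            try {
                return ToyBlock{i, superblocks_[i].allocate(size)};
            } catch (const std::bad_alloc&) {
                continue;  // this superblock is full, try the next one
            }
        }
        // Fallback below the loop: every existing superblock was full.
        superblocks_.emplace_back();
        return ToyBlock{superblocks_.size() - 1,
                        superblocks_.back().allocate(size)};
    }

    std::size_t superblock_count() const { return superblocks_.size(); }

private:
    std::vector<ToySuperblock> superblocks_;
};

// Replays the three 32 MiB requests and returns how many superblocks exist.
inline std::size_t superblocks_after_three_32mib() {
    ToyAllocator allocator;
    for (int i = 0; i < 3; ++i) {
        const ToyBlock block = allocator.allocate(32 * kMiB);
        assert(block.offset < kSuperblockSize);
    }
    return allocator.superblock_count();
}
```

In this model the third 32 MiB request throws inside the first superblock, the loop continues, and the fallback creates a second superblock, so `superblocks_after_three_32mib()` yields 2.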

Testing

Reproduced and fixed on V80 with libvrt built from this branch.

  • Pre-fix: three 32 MiB allocations, segfault on third allocation, gdb backtrace shows last call vrtd::Buffer::getPhysAddr(this=0x0), called from Buffer::initAllocate via view->getPhysAddr(). Buffers 0 and 1 sit at 0x4000000000 and 0x4002000000, 32 MiB apart.
  • Post-fix: all three allocations succeed, H2D/D2H returns correct data.
  • 200 x 4 KiB (SmallBlock plus a new medium superblock)
  • 16 x 32 MiB (many new medium superblocks)
  • 1 x 96 MiB (LargeBlock)
  • 5 x 64 MiB (one superblock per buffer)
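The address spacing quoted in the pre-fix backtrace can be checked with a few lines of arithmetic. This is a small sketch in which the two addresses are copied from the report above, and `distance_mib` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kMiB = 1024ULL * 1024ULL;

// Physical addresses reported for buffers 0 and 1 in the pre-fix run.
constexpr std::uint64_t kBuffer0 = 0x4000000000ULL;
constexpr std::uint64_t kBuffer1 = 0x4002000000ULL;

// Hypothetical helper: distance between two addresses in whole MiB.
constexpr std::uint64_t distance_mib(std::uint64_t low, std::uint64_t high) {
    return (high - low) / kMiB;
}

// 0x4002000000 - 0x4000000000 = 0x2000000 bytes = 32 MiB, consistent with
// the two buffers being the lower and upper halves of one 64 MiB superblock.
static_assert(distance_mib(kBuffer0, kBuffer1) == 32,
              "buffers 0 and 1 should be 32 MiB apart");
```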

BuddySuperblockBase::allocate returned a null UntypedBuffer when no free
buddy of the requested size remained, but Allocator::allocate only catches
std::bad_alloc to advance to the next superblock or create a new one. The
null buffer was therefore wrapped in a MediumBlock/SmallBlock and returned,
and Buffer::initAllocate then dereferenced its null backing buffer in
view->getPhysAddr(), causing a segfault.

This surfaces when using MEM/HBM_VNOC with three 32 MiB buffers: the first
two fill one 64 MiB superblock, so the third allocation needs a new
superblock and instead crashes. Throwing bad_alloc on full matches the
existing throw at the top of the same function.

Signed-off-by: Marco Blackwell <mblackwe@amd.com>
@merkelmarrow merkelmarrow changed the title Fix null ptr deref when a buddy superblock is full [VRT] Fix null ptr deref when a buddy superblock is full Jun 15, 2026