Fix/issue 240 gc madvise page size #242
Merged
…riors

GC pass 5 madvises away the interior of large free blocks while they remain linked on the freelist, skipping the first page so the buddy links (fl_prev/fl_next at offset 0-15) stay resident. The skip was a hardcoded 4096 bytes.

On macOS arm64 the page size is 16384, and Darwin's madvise rounds an unaligned range OUTWARD (trunc_page(addr), round_page(addr+len)), so `blk + 4096` was truncated back to `blk` and MADV_FREE hit the header page of a block still linked on the freelist. Once the kernel reclaimed the page under memory pressure, fl_next read back as zero and the next GC freelist walk crashed dereferencing NULL+8: the EXC_BAD_ACCESS at address 0x8 inside ray_heap_gc reported in #240. The reclaim dependence is why the crash was machine/state-dependent and invisible on 4K-page Linux (aligned start, and Linux rejects unaligned madvise with EINVAL anyway).

Round the release range inward to whole pages of the real sysconf(_SC_PAGESIZE): start at the first page boundary past the header, end at the last boundary inside the block. The kernel's own rounding becomes a no-op on every platform, and sub-page blocks (8KB blocks on 16K pages) correctly release nothing.

Fixes #240.
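The inward rounding described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `release_range` and its parameters are hypothetical names, the 16-byte header size is taken from the fl_prev/fl_next offsets mentioned in the message, and the page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the PR): find the whole pages inside
 * [blk, blk + size) that lie entirely past the first `hdr` bytes.
 * Stores the aligned start in *out_start and returns the number of
 * bytes that are safe to release, or 0 if no whole page fits. */
static size_t release_range(uintptr_t blk, size_t size, size_t hdr,
                            size_t page, uintptr_t *out_start)
{
    assert(page != 0 && (page & (page - 1)) == 0); /* power of two */
    uintptr_t mask  = (uintptr_t)page - 1;
    uintptr_t start = (blk + hdr + mask) & ~mask;  /* round up past header */
    uintptr_t end   = (blk + size) & ~mask;        /* round down inside block */
    *out_start = start;
    return end > start ? (size_t)(end - start) : 0;
}
```

A caller would presumably pass `(size_t)sysconf(_SC_PAGESIZE)` as `page` and hand the result to `madvise` only when the returned length is nonzero. Because both ends are already page-aligned, any rounding the kernel applies leaves the range unchanged.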
ray_term_update_ghost only refreshed ghost_word_start/ghost_word_len on the path that actually rendered ghost text. Early returns (best match no longer than the typed word, empty word) left comp_count populated while the word span still described a previous (possibly longer) line. Tab then started a completion cycle from the stale span, and comp_cycle_insert computed a negative tail that the (size_t) cast turned into a multi-exabyte memmove: an instant crash plus an out-of-bounds write through the terminal buffer.

Repro: type a long line, Ctrl-C, type a short prefix whose best completion equals the prefix itself (e.g. `t`), press Tab. Found by pty-fuzzing the REPL while investigating #240.

Set the span right after computing the word on every update, reset comp_count when there is no word at the cursor, and drop the cycle in comp_cycle_insert if its span no longer fits the buffer.
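The "drop the cycle if the span no longer fits" guard can be illustrated with a small sketch. `replace_span` and its signature are hypothetical and do not reproduce comp_cycle_insert. The point is only that the span is validated against the current buffer length before the tail length is computed, so the subtraction cannot wrap around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: replace buf[start, start + span_len) with repl.
 * Returns false and leaves the buffer untouched when the span is stale
 * (does not fit the current text) or the result would exceed cap. */
static bool replace_span(char *buf, size_t *len, size_t cap,
                         size_t start, size_t span_len,
                         const char *repl, size_t repl_len)
{
    assert(buf != NULL && len != NULL && repl != NULL);
    /* Validate before subtracting, so tail can never wrap around. */
    if (*len > cap || start > *len || span_len > *len - start)
        return false;
    size_t tail = *len - start - span_len;  /* bytes after the span */
    if (repl_len > cap - start - tail)
        return false;                       /* result would not fit */
    memmove(buf + start + repl_len, buf + start + span_len, tail);
    memcpy(buf + start, repl, repl_len);
    *len = start + repl_len + tail;
    return true;
}
```

With a one-character buffer and a stale 12-byte span, the first check fails and the call returns false instead of reaching the memmove.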
…_DFD)

ASan cannot see use-after-free inside the mmap-backed pool allocator, which is why both the #240 crash and several latent double releases survived sanitized builds.

Add a debug-build shadow set of currently free block addresses (maintained at freelist insert/remove, slab push/pop, and foreign-list enqueue) and check it on every ray_free, ray_release, and ray_retain. A hit reports a backtrace and aborts before allocator state is corrupted. ray_heap_gc additionally validates every registered heap's freelists for NULL links and cycles at entry, catching corruption at the same point the release build crashes, but deterministically.

Disabled unless RAY_DFD=1 (debug builds only; compiled out of release entirely). RAY_DFD_NO_ABORT=1 reports without aborting, useful for collecting all hits across a test-suite run.

Already pays for itself: it flagged genuine stale releases in test_csr.c (sip_sel double ownership), the datalog dl_project/ray_const_table chain, and a recurring hand-rolled parted vector pattern in test_partition_exec.c/test_exec.c, all tracked as follow-up work.
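The shadow-set idea can be sketched as a small open-addressing table of free block addresses. Everything here is an assumption for illustration: the names (`shadow_mark_free`, `shadow_mark_live`, `shadow_is_free`), the fixed capacity, and the hash are not taken from the PR, and the real detector also handles the RAY_DFD switch, backtraces, and aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_CAP 1024u              /* hypothetical size, power of two */
#define SLOT_EMPTY ((uintptr_t)0)
#define SLOT_DEAD  ((uintptr_t)1)     /* tombstone left by a removal */

static uintptr_t shadow[SHADOW_CAP];

static size_t slot_of(uintptr_t a) { return (size_t)(a >> 4) & (SHADOW_CAP - 1); }

/* Return the slot holding address a, or SHADOW_CAP if it is absent. */
static size_t shadow_find(uintptr_t a)
{
    if (a <= SLOT_DEAD) return SHADOW_CAP;  /* sentinels are never members */
    size_t i = slot_of(a);
    for (size_t n = 0; n < SHADOW_CAP; n++) {
        if (shadow[i] == a) return i;
        if (shadow[i] == SLOT_EMPTY) break; /* end of the probe chain */
        i = (i + 1) & (SHADOW_CAP - 1);
    }
    return SHADOW_CAP;
}

/* True if p is currently recorded as free: touching it is a bug. */
static bool shadow_is_free(const void *p)
{
    return shadow_find((uintptr_t)p) != SHADOW_CAP;
}

/* Record p as free (e.g. at freelist insert or slab push). */
static void shadow_mark_free(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    assert(a > SLOT_DEAD);
    if (shadow_find(a) != SHADOW_CAP) return;   /* already recorded */
    size_t i = slot_of(a);
    for (size_t n = 0; n < SHADOW_CAP; n++) {
        if (shadow[i] == SLOT_EMPTY || shadow[i] == SLOT_DEAD) {
            shadow[i] = a;
            return;
        }
        i = (i + 1) & (SHADOW_CAP - 1);
    }
    /* Table full: this sketch drops the entry; a real tracker would grow. */
}

/* Forget p when it is handed out again (freelist remove, slab pop). */
static void shadow_mark_live(const void *p)
{
    size_t i = shadow_find((uintptr_t)p);
    if (i != SHADOW_CAP) shadow[i] = SLOT_DEAD;
}
```

In this sketch, a release or retain path would call `shadow_is_free(ptr)` first and report a hit when it returns true. A side table is used rather than poisoning the block because the allocator keeps its own links inside free blocks.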
No description provided.