Fix #240 GC crash on 16K-page systems + five use-after-free fixes found by new RAY_DFD detector#243
Merged
Merged
Conversation
dl_builtin_before's contract — like every DL_BUILTIN helper — is to return a NEW owned table: dl_compile_rule does `filtered = dl_builtin_before(accum); ray_release(accum); accum = filtered`. The empty-accum early return handed back tbl without retaining it, so the caller's release dropped the only reference while accum kept pointing at the freed block. The slab allocator recycled that block as dl_project's output header, and the subsequent release(accum) in dl_compile_rule then freed the projected table out from under ray_const_table — a stale retain flagged by RAY_DFD on datalog/builtin_before_empty. Retain before returning, matching the all-rows-pass path. The NULL and error pass-throughs stay bare: retain/release are no-ops for both. Found by the RAY_DFD pool-allocator use-after-free detector.
All three are ownership-contract violations that corrupt the pool allocator when hit; each was flagged as a stale retain/release by the RAY_DFD shadow-set detector and reproduces in isolation. eval.c — atomic_map_binary_op, atomic_map_unary, to_boxed_list: the boxed-list fallback sets result->len to the full length before the fill loop, but the mid-loop error path released the filled prefix manually AND released the list — whose free walks len children, re-releasing the prefix (double free) and "releasing" uninitialized garbage pointers from the unfilled tail. Trim len to the initialized prefix and let the list free do the cleanup, matching the len=0 idiom already used in collection.c and serde.c. Reachable from any error mid-way through an atomic map over a boxed list, e.g. (+ 1 (list 1 2 "s")). filter.c — exec_filter_head: the non-table/non-BOOL pass-through returned `input` without retaining, but the exec.c OP_HEAD call site releases its input ref and returns the value as the node result — leaving the result dangling. Reachable from a head-limited filter whose predicate doesn't evaluate to BOOL. Retain on pass-through (NULL/error stay bare: retain/release no-op on both). sort.c — topk_take_vec: the full-sort fallback passed the LAZY handle from asc/desc into ray_take_fn, which materializes lazy args at entry and thereby CONSUMES the handle (ray_lazy_materialize releases its input) — the subsequent ray_release(sorted) double-freed it, and the slab recycled the block into the result, cascading stale releases up through eval. Hit by (top v n)/(bot v n) on STR/GUID/LIST vectors. Materialize explicitly before take so take borrows a concrete vector.
The RAY_DFD detector flagged ~60 sites across the suite where tests hand-assemble containers (PARTED, MAPCOMMON, TABLE, DICT, LIST blocks via raw ray_alloc) and store children without retaining, then release both the container and the children in teardown. The container's free path (ray_release_owned_refs) releases its children, so the manual child releases were double frees — silent until the slab recycled the blocks into later allocations. - test_partition_exec.c: make_mapcommon/make_parted helpers now retain children (the old "segments are referenced (not retained)" contract was unimplementable — container free always releases children); same for the three inline constructions the helpers don't cover. - test_exec.c: exec_make_mapcommon retains; all inline parted/mapcommon segment assignments retain. - test_heap.c: alloc_copy table/dict/mapcommon blocks retain children (rc assertions are relative, unaffected). - test_csr.c: sip_expand retains sip_sel — the graph owns a ref (ray_graph_free releases it) and the test releases its own. - test_domain.c: drop a duplicate release — ray_vec_append's COW already consumed the extra retain via ray_cow. - test_store.c: list elements are owned by the list (drop manual releases); ray_table_get_col returns a borrowed pointer (drop the release of col_out). Full suite is detector-clean: RAY_DFD=1 runs report 0 hits.
db74191 to
9cbc047
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Started as the fix for #240 (segfault in
ray_heap_gcon Apple Silicon) and grew into a memory-safety sweep: the investigation produced an opt-in use-after-free detector for the pool allocator, which then surfaced five more genuine bugs (four core, one large family of test-code double frees). The full suite is now detector-clean.Fixes #240 —
a850a767GC pass 5 madvises the interior of freelist-linked blocks, skipping a hardcoded 4096-byte "first page" to protect the
fl_prev/fl_nextlinks. macOS arm64 has 16K pages and Darwin'smadviserounds unaligned ranges outward (trunc_page/round_pagein xnu), so the start fell back toblkandMADV_FREEhit the header page of a live freelist node. Once the kernel reclaimed the page,fl_nextread back as zero and the next GC walk crashed dereferencingNULL + 8— exactly the reporter'sEXC_BAD_ACCESS (address=0x8). Now the release range is rounded inward to whole pages of the realsysconf(_SC_PAGESIZE), making the kernel's rounding a no-op everywhere. Mechanism + fix verified with a harness emulating Darwin's rounding.RAY_DFD detector —
c6f4e7c5ASan can't see use-after-free inside the mmap-backed pool allocator (why #240 and everything below survived sanitized builds). Debug builds now have an opt-in (
RAY_DFD=1) shadow set of currently-free blocks, checked on everyray_free/ray_release/ray_retain; hits report a backtrace and abort before allocator state corrupts.RAY_DFD_NO_ABORT=1collects instead. GC entry also validates all freelists for NULL links/cycles.Core use-after-free fixes —
ab5ff350,7e0ed7e6All flagged by the detector, all reproduce in isolation:
dl_builtin_before: returned its input unretained on empty tables; the caller releases its input unconditionally (DL_BUILTINhelpers' contract is "return a NEW owned table").atomic_map_binary_op,atomic_map_unary,to_boxed_list): mid-loop error cleanup double-released the filled prefix and "released" uninitialized tail pointers, becauselenwas set to the full length before filling. Reachable from any error midway through an atomic map, e.g.(+ [[1 2] [3]] [[1 2] "x"]).exec_filter_head: non-table/non-BOOL pass-through returnedinputunretained while the exec OP_HEAD call site releases its input and returns the value — dangling node result for head-limited filters with a non-BOOL predicate.topk_take_vec: passed the lazy asc/desc handle intoray_take_fn, which materializes lazy args at entry and thereby consumes the handle; the subsequent release double-freed it. Hit by(top v n)/(bot v n)on STR/GUID/LIST vectors (rfl/sort coverage).Interactive crash fix —
6842ae04Found by pty-fuzzing the REPL during the #240 hunt: stale completion word span after an early return in
ray_term_update_ghostmade TAB compute a negative tail incomp_cycle_insert— a hugememmove. Repro: type a long line, Ctrl-C, typet, TAB.Test-code double frees —
db74191a~60 sites hand-assembled containers (PARTED/MAPCOMMON/TABLE/DICT/LIST via raw
ray_alloc) without retaining children, then released both container and children in teardown. Also:sip_selownership in test_csr, a duplicate release in test_domain, and releasingray_table_get_col's borrowed pointer in test_store.Validation
RAY_DFD=1full-suite runs: 364 hits before → 0 after, verified across consecutive runs (hit visibility varies with slab recycling/thread timing, so single clean runs aren't trusted).