fix: harden flox builds for reliable nixpkgs and manifest packaging#21
Conversation
Fix three related issues that caused `flox build t3` and `flox build nixpkgs-t3` to fail intermittently or consistently on Linux: 1. Pin midline-flush dependency on correct target (Makefile) The compiled `tests/midline-flush` binary must be declared as a dependency of `tests/midline-flush.sh` rather than `tests/midline-flush/run`, which is the target that invokes it. The indirection exposed a race condition in parallel Nix builds. 2. Fix midline-flush timing (tests/midline-flush.c) The 10µs sleep between flushing stderr and stdout was too short under Nix sandbox scheduling pressure, causing non-deterministic line ordering. Increased to 10ms, matching the inter-iteration shell sleep already in use. 3. Fix manifest build dependency and flag hygiene (Makefile, manifest) Introduce OPTFLAGS (default -g) so packaging builds can override to -O -static-libgcc without touching the dev default. The manifest build now passes OPTFLAGS="-O -static-libgcc" to avoid embedding debug-symbol references to build-time Nix store paths in the binary. Add glibc.out to runtime-packages as a workaround pending package-builder treating glibc as an implicit C runtime dependency. Also add `flox-build-test` to `make test`: when flox is in PATH, exercises both `flox build t3` and `flox build nixpkgs-t3` to catch Nix-sandbox-specific failures that don't surface in local builds. Silently skips when flox is absent (e.g. inside the Nix sandbox itself, or in CI environments without flox). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The manifest build command passed OPTFLAGS="-O -static-libgcc" on every
platform, but -static-libgcc is a GCC-only flag and the pure Nix sandbox on
Darwin compiles with clang, which rejects it:
clang: error: unsupported option '-static-libgcc'
This broke `flox build t3` on macOS, and since flox-build-test is part of
`make test`, it broke `make test` on macOS for anyone with flox in PATH.
Select the flag by platform in the build command: keep "-O -static-libgcc" on
Linux (where it avoids a runtime libgcc dependency, the original intent) and
use plain "-O" elsewhere, where -static-libgcc is both unsupported and
unnecessary. Verified on macOS: `flox build t3`, `flox build nixpkgs-t3`, and a
full `make test` all pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Reviewed and tested thoroughly on macOS. The non-flox changes are sound (default build, the One macOS regression found and fixed (pushed as
Since The fix selects the flag by platform in the build command — case "$(uname -s)" in
Linux) optflags="-O -static-libgcc" ;;
*) optflags="-O" ;;
esacRe-locked so Minor (non-blocking): the lockfile's |
PR Review: fix: harden flox builds for reliable nixpkgs and manifest packagingReviewed as a single-reviewer pass across Security & Correctness, Performance & Architecture, and Conventions & Tests. This is build/packaging infrastructure (Makefile, flox manifest/lock, a nixpkgs override expression, and a test-helper timing tweak) — no product C logic changes. Overall the change is sound and the stated goals (reproducible packaging builds, a CI-visible flox build gate, deterministic midline-flush ordering) are achieved. Findings below; none are blocking on their own, but per Forge review policy any Minor finding should be acknowledged before merge. Stage 1: Spec Compliance — PassNo slice/design doc applies (standalone utility, infra change). The PR body's stated objectives map cleanly onto the diff: OPTFLAGS hook, Linux-only Stage 2: Code QualityCritical (C) — Must Fix None. Important (I) — Should Fix I1:
Minor (M) — Nice to Have M1: Locked
M2: Duplicated entries in
M3:
Confirmed correct
SummaryMust fix before merge: None VerdictChanges requested — the change is functionally sound and well-documented; please weigh I1 (make the flox build gate opt-in/deterministic rather than PATH-triggered) and confirm M1 (glibc Via Forge (interactive) • 3ca4fd79 |
|
Thanks for the thorough pass — and for confirming the I1 — You're right, though, that the no-recursion guarantee was resting entirely on the sandbox type. With a M1 — glibc locked output: the lock is faithful; the divergence is a read of the wrong section. M2 — duplicated M3 — Net: I1's recursion edge is now closed ( |
flox-build-test runs only when flox is on PATH, which a "pure" sandbox does not provide, so the in-build `make test` skips it and there is no recursion today. But that safety relies entirely on the sandbox type: switching the build to a non-pure sandbox would put flox on PATH inside the build, and the build's `make test` would then re-enter `flox build`, looping indefinitely. Add a DISABLE_FLOX_BUILD_TEST make variable that skips the gate regardless of flox's presence, and have the manifest build pass it to the inner `make test`. This makes the no-recursion guarantee explicit and independent of the sandbox type. Verified on macOS: `flox build t3` and a full `make test` pass, with the in-build `make test` now skipping via DISABLE_FLOX_BUILD_TEST rather than by flox happening to be absent. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Verified the two new commits and the dispositions against the tree — all resolved, and the M1 pushback is correct. I1 — recursion edge closed ( M1 — withdrawn; you're right, our reviewer read the wrong section. Confirmed M2 — agreed, upstream. The byte-for-byte reproducible re-lock is conclusive that it's catalog-server data faithfully recorded, not anything this repo can scrub. flox/floxhub#729 is the right home. M3 — confirmed harmless, no need to act. No remaining review blockers from my side. Via Forge (interactive) • 3ca4fd79 |
Summary
tests/midline-flush.shnow declares its dependency on the compiledtests/midline-flushbinary directly, closing a race condition exposed by parallel Nix builds.tests/midline-flush.cfrom 10µs to 10ms — 10µs was too tight under Nix sandbox scheduling pressure, causing non-deterministic line ordering and intermittent test failures.OPTFLAGS ?= -gin the Makefile so packaging builds can override to-O -static-libgccwithout changing the dev default; the manifest build now uses this to avoid embedding debug-symbol references to build-time Nix store paths.flox-build-testtomake test: whenfloxis in PATH,make testnow exercises bothflox build t3andflox build nixpkgs-t3to catch Nix-sandbox-specific failures that don't surface in local builds; silently skips when flox is absent.Workaround note
glibc.outhas been added to the manifest'sruntime-packagesto satisfy the package-builder's dependency checker. This is a workaround: the package-builder should treat glibc as an implicit runtime dependency of compiled C binaries without requiring it to be declared explicitly. A follow-up issue should be filed againstflox/floxto track this.Test plan
make testpasses locally (20/20 runs of midline-flush test)flox build nixpkgs-t3passesflox build t3passes (withresult-t3-buildCachecleared)make flox-build-testruns both flox builds end-to-endflox-build-testcorrectly skips with "flox not in PATH"🤖 Generated with Claude Code