WEB-4808: supply discovery daemon credentials via persisted config (fix scheduled idle)#141
Draft
vigneshsubbiah16 wants to merge 2 commits into
Draft
WEB-4808: supply discovery daemon credentials via persisted config (fix scheduled idle)#141vigneshsubbiah16 wants to merge 2 commits into
vigneshsubbiah16 wants to merge 2 commits into
Conversation
The recurring unbound-discovery LaunchDaemon runs `unbound-discovery scan`
as root, with no --api-key/--domain, no EnvironmentVariables, and no shell
env. Every scheduled run found no credentials and idled fail-open (exit 0),
so fleet discovery happened only once at onboard time via the inline scan.
Fix (two coordinated parts, plist and onboard.sh left untouched):
1. setup_cmd.py: _run_discovery now persists the discovery key + backend
domain to /opt/unbound/etc/discovery.json (0600 root:wheel, atomic
write via mkstemp+os.replace) before the one-shot scan. The dir is the
pkg-provisioned config dir (postinstall, 0750 root:wheel). Best-effort:
a persist failure is logged, never aborts onboarding.
2. unbound_discovery_entry.py: credential resolution now falls back to that
config file when --api-key/UNBOUND_API_KEY and --domain are absent, so
the scheduled root daemon can find its creds. File-sourced creds are fed
through the existing channels (key via UNBOUND_API_KEY env to keep it out
of `ps`; domain via --domain argv). Explicit flags/env still win.
Also strips a leading `scan` token the same way `mcp-scan` is shifted off
upstream's strict argparse: with no creds the lenient parse tolerated it
(masking the bug), but once creds resolve an unstripped `scan` would
SystemExit(2). Stripping it keeps the world-readable plist secret-free
AND lets the credentialed sweep actually run.
Fail-open preserved: absent/unreadable/malformed config returns {} and the
daemon idles-and-exits-0 exactly as before. The config loader never raises.
Tests:
- packaging/test_discovery_entry.py (new, 12 cases): config file resolves
creds for the daemon's `scan` invocation; flags/env win; partial/missing/
malformed config still idles exit 0; loader never raises; --version
short-circuits.
- packaging/test-discovery-daemon.sh: variant 3 reproduces production
exactly (plist with `scan` token, NO plist creds, creds only in the
config file) and asserts the daemon scans + POSTs; backs up/restores any
pre-existing host config so it stays non-destructive.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…o CI Convergent reviewer should-fix items on the root-on-fleet discovery credential path (PR #141). Surgical, fail-open preserved. 1. (M1 / dir-creation mode) _write_discovery_config now creates the leaf /opt/unbound/etc dir as 0750 instead of relying on the umask-default 0755 (world-traversable), and when running as root enforces 0750 + root:root ownership even on a pre-existing dir. This reinforces the directory-level symlink/TOCTOU barrier the 0600 credential file relies on. It no longer auto-creates ancestors above `etc`: a missing /opt/unbound (pkg-provisioned) returns a reason instead of building the tree world-readable. The atomic 0600 root:wheel file write (mkstemp+fchmod+fchown+os.replace) is unchanged, and the whole thing stays best-effort/fail-open — a persist failure is reported, never aborts onboarding. 2. The 12 hermetic fail-open tests (packaging/test_discovery_entry.py) were not run by CI — only the macOS release build runs pytest, which would let a fail-open regression merge. static-checks.yml (every PR) now runs `pytest packaging/`, gating the SACRED idle-and-exit-0 path. 3. New hermetic writer test (packaging/test_discovery_writer.py): asserts the persisted file is 0600 and round-trips api_key+domain, the etc dir is not world-traversable, and a missing-grandparent / non-writable parent returns a reason string rather than raising. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem (WEB-4808, High)
The recurring
ai.getunbound.discoveryLaunchDaemon runsunbound-discovery scanas root, every 43200s, with no--api-key, no--domain, noEnvironmentVariables. A root daemon sees no user shell env, so the entry point finds no credentials and idles fail-open (exit 0) on every scheduled run. Fleet discovery therefore happened only once, at onboard time, via the one-shot inline scan — AI tools installed after onboarding were never re-inventoried.Linear: https://linear.app/unboundsec/issue/WEB-4808
Fix
Two coordinated parts. The world-readable plist and
onboard.sh.tmplare left untouched (the binary self-loads config instead of baking secrets into the plist).1. Persist creds at setup time —
binary/src/unbound_hook/setup_cmd.py_run_discoverynow calls a new_write_discovery_config(opts)that writes/opt/unbound/etc/discovery.json(api_key+domain) atomically, 0600 root:wheel (mkstemp +os.replace), into the pkg-provisioned config dir (postinstallcreates it 0750 root:wheel). Best-effort: a persist failure is logged but never aborts onboarding.2. Read config as a fallback —
packaging/unbound_discovery_entry.pyCredential resolution now falls back to that file when
--api-key/UNBOUND_API_KEYand--domainare absent. File-sourced creds are fed through the existing channels (key viaUNBOUND_API_KEYenv to keep it out ofps; domain via--domainargv). Explicit flags/env still win.It also strips a leading
scantoken the waymcp-scanis shifted off upstream's strict argparse. With no creds the lenient parse toleratedscan(which masked the bug); the moment creds resolve, an unstrippedscanwouldSystemExit(2)— so credential delivery alone would have turned the idle into an exit-2 instead of a working scan. Stripping it keeps the plist secret-free and lets the credentialed sweep run.Security
ProgramArgumentsstaysscan, noEnvironmentVariables).0600 root:wheel, written atomically so a reader never sees a partial/wrong-permission file.Fail-open preserved (sacred)
If the config file is absent, unreadable, malformed, empty, or wrong-typed, the loader returns
{}(it never raises) and the daemon idles-and-exits-0 exactly as today. The fix only adds a credential source; it can never crash, crash-loop, or block dev work.Tests
packaging/test_discovery_entry.py(new, 12 cases, green locally): config file resolves creds for the daemon'sscaninvocation and the sweep runs; explicit flags/env win over the file; partial / missing / malformed / empty config all still idle exit 0; loader never raises;--versionshort-circuits before config.packaging/test-discovery-daemon.sh: added variant 3 — the exact production shape (plist withscantoken, no plist creds, creds only in the config file) — asserting the daemon scans + POSTs and emits neither "No configuration provided" nor "unrecognized arguments". Backs up/restores any pre-existing host config so it stays non-destructive.(The frozen-binary daemon variant 3 and the CLI suite run in CI against the built bundle;
coding_discovery_toolsis only fetched at build time, so the in-process pytest stubs the upstream sweep to assert the credentialed path is reached.)Coordination
_run_discovery+ a new helper insetup_cmd.py(hunks at L516+ / L575+) — clear of the concurrently-edited_write_claude_managed_settings(~L255).onboard.sh.tmpluntouched (PR WEB-4826: fix install-report POST path + non-silent failure (onboard.sh) #138 in flight).Draft — do not merge.
🤖 Generated with Claude Code