Skip to content

internal bridge APIs + dylint bans (total sweep, no grandfathering) #844

Description

@zackees

Goal

Generalize the "internal bridge API + dylint ban on the third-party primitive" pattern that fbuild already uses for subprocess, paths, serial, and deploy. Per user directive: no grandfathering, no backwards compatibility concerns, no extant allowlists carried forward. Beta software — sweep everything.

A 5-agent audit on 2026-06-29 surveyed the workspace for every place that bypasses an existing bridge or would benefit from a new one. All work is rolled into this one issue (was split across #844 + #845-#848 briefly).

Existing lints (do not re-do)

Internal API Dylint Banned primitive Origin
fbuild_core::subprocess::run_command ban_raw_subprocess std::process::Command::{spawn,output,status} + tokio variant #264
fbuild_core::path::NormalizedPath ban_std_pathbuf std::path::PathBuf #826
fbuild_serial::SharedSerialManager ban_direct_serialport use serialport:: outside fbuild-serial #605
fbuild_deploy::Deployer trait ban_deploy_tool_direct_invocation Command::new("esptool"|"avrdude"|...) #694
(no API) ban_file_based_locks OpenOptions::create_new(true), fs2::FileExt, flock( #826
(no API) ban_unrooted_tempdir tempfile::TempDir::new(), tempfile::tempdir() #826
(no API) ban_std_sync_mutex_in_async std::sync::Mutex in daemon/serial scope #826
(no API) ban_process_exit_outside_main std::process::exit outside main.rs #826
(no API) ban_unwrap_in_daemon_handlers .unwrap() inside crates/fbuild-daemon/src/handlers/ #826
(no API) cli_no_build_deploy_direct_use imports of fbuild_build::* / fbuild_deploy::* from crates/fbuild-cli/src/ #826
(no API) require_multi_thread_flavor_when_spawning #[tokio::test] w/o flavor = "multi_thread" on tests that spawn #826

Closing checklist

Every box below must be checked before this issue closes.

Bridge pair 1 — HTTP client

  • Hoist fbuild_packages::http::client() to fbuild_core::http::client(), with client_with_timeout(Duration) and blocking_client(Duration) (latter for the port_scan OS-thread case).
  • Migrate 10 reqwest::Client::new() direct-construct sites:
    • crates/fbuild-python/src/daemon.rs:150,184,238,300,314
    • crates/fbuild-python/src/async_serial_monitor.rs:348
    • crates/fbuild-python/src/outcome.rs:151
    • crates/fbuild-cli/src/daemon_client.rs:209-212
    • crates/fbuild-cli/src/cli/port_scan.rs:129
    • crates/fbuild-daemon/tests/{test_emu_endpoint.rs:75, build_streaming.rs:59}
  • Ship ban_bare_reqwest lint with zero allowlist (forbids reqwest::get, reqwest::blocking::get, reqwest::Client::new, reqwest::ClientBuilder outside the bridge).

Bridge pair 2 — File I/O in async

  • Add fbuild_core::fs::* curated re-export of tokio::fs::*.
  • Migrate all std::fs::* calls reachable from async fn (audit found 18 in 7 files as direct tokio::fs imports; per "no grandfathering" every std::fs in async context migrates too).
  • Ship ban_std_fs_in_async lint (detects std::fs::* inside async fn or tokio::spawn closures; exempts spawn_blocking closures). Zero allowlist.
  • Ship ban_tokio_fs_direct_import lint forcing imports through fbuild_core::fs. Zero allowlist.

Bridge pair 3 — Sleep

  • Add fbuild_core::time::sleep (re-export of tokio::time::sleep).
  • Migrate 67 std::thread::sleep sites to fbuild_core::time::sleep or tokio::time::sleep.
  • Ship ban_std_thread_sleep lint with zero allowlist (the user directive overrides the prior thought of exempting test/sync helpers).

Bridge pair 4 — Channels

  • Add fbuild_core::channel::{bounded, unbounded} wrapping tokio::sync::mpsc.
  • Migrate 54 std::sync::mpsc sites + 23 direct tokio::sync::mpsc imports.
  • Ship ban_std_mpsc_in_async_reachable lint and ban_tokio_mpsc_direct_import lint. Zero allowlist.

Bridge pair 5 — Path canonicalize

  • Add fbuild_core::path::canonicalize_existing(path) that calls std::fs::canonicalize + strips Windows \\?\ UNC prefix + returns NormalizedPath.
  • Migrate 23 std::fs::canonicalize sites (highest concentration: crates/fbuild-header-scan/src/walker.rs with 7).
  • Ship ban_std_fs_canonicalize lint. Zero allowlist.

Bridge pair 6 — Atomic state-file writes

  • Add fbuild_core::fs::write_atomic(path, content) — writes to <path>.tmp.<pid>.<nonce>, File::sync_all(), then atomic rename (MoveFileExW with MOVEFILE_REPLACE_EXISTING | MOVEFILE_WRITE_THROUGH on Windows).
  • Migrate 5+ HIGH-risk state-file writers (corruption blocks rebuilds):
    • crates/fbuild-build/src/build_info.rs:291 (build fingerprint JSON)
    • crates/fbuild-packages/src/toolchain/esp32_metadata.rs:236,247,273,283 (framework discovery cache)
    • crates/fbuild-cli/src/cli/symbols_cmd.rs:180, clang_tools.rs:206, clangd_config.rs:87
  • Refactor crates/fbuild-daemon/src/status_manager.rs:210-225 (already does the pattern manually) to call the shared helper.
  • Code review checklist: any new state-file writer must use write_atomic. (Not lint-feasible — false-positive heavy.)

Bridge pair 7 — Named Duration constants

  • Add fbuild_core::time::{SHORT_HTTP_TIMEOUT, MEDIUM_HTTP_TIMEOUT, LONG_HTTP_TIMEOUT, POST_DEPLOY_RECOVERY_DEADLINE, DAEMON_LONG_OP_TIMEOUT, POLL_50MS, POLL_100MS, POLL_200MS, REAL_BUILD_TIMEOUT} etc.
  • Audit found 335 distinct Duration::from_* literals; the named-constant set covers the top 80%. Migrate the top-10 most-used patterns.

Lint 8 — Runtime::new() outside entry points

  • Ship ban_runtime_new_outside_main — only allowed in main.rs, src/bin/*.rs, #[cfg(test)] modules.
  • Migrate 8 sites:
    • crates/fbuild-packages/src/library/library_manager.rs:314
    • crates/fbuild-packages/src/lnk/resolver.rs:120
    • crates/fbuild-packages/src/toolchain/avr.rs:351 (currently .unwrap() — fix the panic too)
    • crates/fbuild-packages/src/toolchain/esp32_metadata.rs:87
    • crates/fbuild-python/src/lib.rs (3 sites)
    • crates/fbuild-python/src/serial_monitor.rs:199

Lint 9 — Poison-panic ban

  • Ship ban_poison_panic lint flagging std::sync::Mutex::lock().unwrap()/expect(_) and std::sync::RwLock::{read,write}().unwrap()/expect(_) workspace-wide.
  • Migrate poison-panic sites (10+ identified in daemon, ~47 std::sync::Mutex workspace-wide).
    • crates/fbuild-daemon/src/broker/{backend.rs:149, service.rs:401}
    • crates/fbuild-daemon/src/context.rs:615
    • crates/fbuild-daemon/src/device_manager.rs — multiple sites
  • Zero allowlist.

Lint 10 — Tempfile root migration (drive existing allowlist to zero)

  • Add fbuild_paths::dev_or_prod_temp_root() returning ~/.fbuild/{dev|prod}/tmp/<subdir>/.
  • Migrate 92 sites currently in dylints/ban_unrooted_tempdir/src/allowlist.txt:
    • crates/fbuild-packages/src/extractor.rs (HIGH: per-package-install)
    • crates/fbuild-packages/src/library/esp32_framework/libs.rs (HIGH: per-env install)
    • crates/fbuild-packages/src/disk_cache/{gc,mod}.rs (HIGH: cache cleanup)
    • crates/fbuild-build/src/linker.rs (HIGH: per-build)
    • crates/fbuild-build/src/framework_core_cache.rs (HIGH: framework hydration)
    • …87 more
  • Delete dylints/ban_unrooted_tempdir/src/allowlist.txt (or leave header-only) after all sites migrate.

Lint 11 — Extended unwrap discipline

  • Extend ban_unwrap_in_daemon_handlers scope from crates/fbuild-daemon/src/handlers/** to:
    • All non-test code under crates/fbuild-daemon/src/**.
    • CLI subcommand entry points under crates/fbuild-cli/src/cli/**.
  • Migrate 2,238 production .unwrap() calls across the workspace. Per-crate sweep order:
    1. fbuild-paths / fbuild-core / fbuild-config (smallest leaves)
    2. fbuild-packages/disk_cache/
    3. fbuild-packages/library/ (5 high-count files: 46, 43, 42, 38, 36)
    4. fbuild-build/build_info.rs (50), build_fingerprint/ (101), framework_libs.rs (76), pipeline/library.rs (39)
    5. fbuild-daemon/models.rs (51), device_manager.rs, broker/
    6. fbuild-cli/src/cli/
  • Zero allowlist on the extended lint.
  • Top-10 highest-unwrap-count files for prioritization: see unwrap discipline sweep (#844) #847 / migration table above.

Lint 12 — CLI output discipline

  • Add crates/fbuild-cli/src/output.rs curated API: progress(), result(), warn(), error(), debug()tracing-backed.
  • Migrate 248 println! / eprintln! calls in crates/fbuild-cli/src/**.
  • Migrate 12+ debug println!/eprintln! calls in crates/fbuild-build/src/**/*linker*.rs and per-platform orchestrators to tracing::debug!(target = "fbuild_build::linker", ...).
  • Ship ban_print_in_production lint scoped to crates/fbuild-cli/src/** and crates/fbuild-build/src/**, allowlist = [crates/fbuild-cli/src/output.rs] (the bridge module itself).
  • --color={auto,always,never}, --quiet, --verbose flags consistently routed through tracing's level filter.

Smaller tightenings

  • run_command(None) audit — 8 sites currently pass literal None for the timeout argument:

    • crates/fbuild-daemon/src/handlers/emulator/avr8js_npm.rs:14 (node --version probe)
    • crates/fbuild-daemon/src/handlers/emulator/runners.rs:250 (simavr --help probe)
    • crates/fbuild-packages/src/library/library_compiler.rs:373, 552
    • 4 unit/test sites in crates/fbuild-core/src/subprocess.rs

    Migrate each to either explicit Some(Duration::from_secs(N)) or run_command_no_timeout(...). Then add a lint to reject the literal None.

Audit findings — NOT worth a lint (confirmed clean or no policy concern)

These were surveyed but require no action:

  • Socket usageTcpStream / TcpListener / UnixStream / interprocess::local_socket — all daemon-axum, broker IPC, or test.
  • DNS resolution — no direct ToSocketAddrs / trust_dns / hickory use.
  • Retry / backoff logic — 4 different retry patterns, each domain-tuned. No consolidation justified.
  • crossbeam::channel — not used.
  • std::thread::JoinHandle::join() — zero blocking joins.
  • std::thread::spawn direct use — 3 sites, all intentional (broker IPC backend, port_scan reqwest isolation, test mock).
  • walkdir / ignore / globwalk — minimal use (8 sites), no policy needs.
  • File handle scope — clean.
  • TLS configuration — handled via shared reqwest client.
  • espflash direct use — correct (library API in spawn_blocking from Deployer).
  • Python via uv — clean in production.
  • cargo / rustc / git direct invocation — zero sites in production.
  • Helper crates (duct, xshell, cmd_lib, os_pipe) — none in deps.
  • panic! / todo! / unimplemented! in production — clean (1 panic! in test).
  • block_on inside async fn — none found.
  • tokio::spawn(...).await anti-pattern — none found.
  • unsafe blocks — 30 sites, all with // SAFETY: comments.
  • Box::leak / mem::transmute — 7 sites, all intentional (containment harness).
  • std::env::set_var/remove_var in production — 7 sites, all for child-process environ setup.
  • tokio::sync::broadcast / tokio::sync::watch — per-feature, correct.
  • Atomic Ordering choice — 70 sites, all correct.

Operating rules

  1. No allowlists this round. Per user directive, every new lint ships with zero allowlist entries — code that doesn't comply gets fixed first.
  2. Bridge has strictly less surface than the primitive. If you find yourself re-exporting every flag, kill the wrapper.
  3. Lint + API land in the same PR. Don't ship a ban without a sanctioned replacement.

Cost ceiling

fbuild is at 11 lints today. After this checklist completes, it will be at ~20-22. That's the practical ceiling for contributor cognitive load. Anything beyond this should be code review / CodeRabbit rules, not dylint.

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status
    Triage

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions