Flint 0.5#9
Open
irzhywau wants to merge 436 commits into
Record the direction this branch is a foundation for, so the intent behind the substrate isn't lost. Frames the work as completing one control loop — reflect → preview → approve → act → audit — then putting selectable shells (including an intent-led AI shell with a contained agent capsule) on top. Contents: where we are (the built substrate); ordered roadmap (approval loop next; dispatch merge-gated on DDRM; shell-manager + selectable shells; intent-led AI shell; pluggable local/cloud intelligence; a Morphic/Godot living-object canvas — presentation only, core stays the authority); the experience we're building toward (authority made legible: trust as material, gates as visible circuits, approval as a deliberate ceremony, audit as a timeline); business model (shell tiers, DRM-self-enforced access, agent-safe enterprise wedge); and the trust/security framing (build-time vs run-time boundaries; open code != open authority; the real risks are the signing trust root, automation bias, and TCB creep — not forking). Direction, not a commitment. Honest real-vs-vision split; gaps stay tracked via the KNOWN_GAPS ratchet pattern. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_016ZKy5Cca9RzwDuLb1szdeq
…sible DCT-QIM)
Protected pixel-lock/HTML-lock pages now egress with TWO forensic marks carrying the same identity, so a leaked frame stays attributable even if one is removed:
- Visible: faint tiled stamp (opacity 0.07) of the FULL owner EVM address + short content id + UTC open-minute, rendered in the anti-aliased DejaVu face (replaces the old elided address and blocky 8x8 bitmap).
- Invisible (render/invisible.rs): a blind DCT-domain QIM mark in luminance under perceptual masking (flat white margins left pristine). Carries a COMPACT 20-byte wallet (232-bit codeword) so it recovers from CONTENT-SPARSE pages (short code/config snippets), validated end-to-end against real rendered text AND code pages, not just synthetic images. QIM bounds the per-block nudge so a high-contrast block can never carry the wrong bit (the fixed-margin scheme's flaw). Survives q85 + recompression + brightness/contrast + same-res screenshot + vertical offset; rescale/rotation/width-crop out of scope (documented).
- Fail closed: pixel-lock (watermark::finalize) and HTML-lock (EPUB) both REFUSE to emit a protected page without a non-empty forensic stamp.
- Offline forensics: `decrypt-provider --extract-watermark <image>` prints the recovered 0x wallet.
- Helper stamps the full wallet + content id + UTC minute (quorum.rs).
Gates: fmt + clippy clean; decrypt-provider 76/76, media-authority 10/10.
Co-authored-by: Cursor <cursoragent@cursor.com>
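To make the "QIM bounds the per-block nudge" point concrete, here is a minimal scalar sketch of quantization index modulation in Python. It is not the code in render/invisible.rs: the function names and the `step` parameter are hypothetical, and the real mark works on DCT luminance coefficients under perceptual masking, which this sketch leaves out.

```python
def qim_embed(coeff: float, bit: int, step: float) -> float:
    """Snap coeff to the nearest point of the lattice that encodes `bit`."""
    # bit 0 uses multiples of `step`; bit 1 uses the same lattice shifted by step/2.
    offset = bit * step / 2.0
    return round((coeff - offset) / step) * step + offset


def qim_extract(coeff: float, step: float) -> int:
    """Blind decode: report which of the two lattices is closer to coeff."""
    d0 = abs(coeff - qim_embed(coeff, 0, step))
    d1 = abs(coeff - qim_embed(coeff, 1, step))
    return 0 if d0 <= d1 else 1
```

The embed never moves a coefficient by more than `step / 2`, whatever its starting value, and the decode tolerates noise below `step / 4`. That is the property the message contrasts with a fixed-margin scheme.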
…edge an open
The parallel 2-of-3 recover spawned a thread per node but the collector join()ed ALL three before checking the threshold (the join was for the cheater cross-check). A single dead/wedged node therefore held the whole release hostage for its full per-node carrier timeout (~20s), past the caller's open deadline, even when two healthy shares were already in hand. That's the "stuck on Verifying access & recovering keys" / 502 an open hit whenever one geo node was flaky.
Now the recover threads are DETACHED and feed an mpsc channel, and collection RETURNS the instant the 2-of-3 threshold is met plus a short 1.5s grace to still admit a promptly-arriving third share for the cheater cross-check, realizing the long-documented "slowest of the two fastest" intent. A straggler simply stops being waited on; its thread finishes and cleans up its own pooled connection.
Invariants preserved: still FAIL-CLOSED (< 2 served shares is refused); a recover panic is caught and counted a fault (catch_unwind), never a share, so one bad node can't abort the single-threaded warm-daemon loop; a spawn failure is a fault too. Non-reporting nodes are recorded as timeout faults so a fail-closed message names every node. 16 MiB recover stack retained (PQ-hybrid unseal is a stack hog).
The collection logic is factored into a pure `collect_quorum_shares` and unit-tested over an mpsc channel: returns at threshold without waiting for a dead node, admits a prompt third within grace, never counts a fault toward the threshold.
Gates: fmt + clippy clean (no new warnings); key-provider 52/52 (dev-modes).
Co-authored-by: Cursor <cursoragent@cursor.com>
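The collection rule described above can be sketched as follows. This is a Python analogue over `queue.Queue`, not the Rust `collect_quorum_shares`: the `Outcome` type, the parameter names and the fault labels are hypothetical, and only the threshold, the 1.5s grace and the fail-closed behaviour come from the commit message.

```python
import queue
import time
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Outcome:
    node: str
    share: Optional[bytes]  # None means the node faulted (error, panic, spawn failure)


def collect_quorum_shares(
    rx: "queue.Queue[Outcome]",
    nodes: List[str],
    threshold: int = 2,
    deadline_s: float = 20.0,
    grace_s: float = 1.5,
) -> Tuple[Dict[str, bytes], Dict[str, str]]:
    shares: Dict[str, bytes] = {}
    faults: Dict[str, str] = {}
    end = time.monotonic() + deadline_s
    while len(shares) + len(faults) < len(nodes):
        remaining = end - time.monotonic()
        if remaining <= 0:
            break
        try:
            out = rx.get(timeout=remaining)
        except queue.Empty:
            break
        if out.share is None:
            faults[out.node] = "fault"  # a fault never counts toward the threshold
            continue
        shares[out.node] = out.share
        if len(shares) == threshold:
            # Threshold met: stop waiting on stragglers after a short grace window.
            end = min(end, time.monotonic() + grace_s)
    for node in nodes:
        if node not in shares and node not in faults:
            faults[node] = "timeout"  # name every non-reporting node
    if len(shares) < threshold:
        raise PermissionError(f"quorum not met: {sorted(faults.items())}")
    return shares, faults
```

With two healthy shares queued and a third node silent, the call returns after roughly `grace_s` instead of the full per-node deadline, and the silent node shows up as a `timeout` fault.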
Rasterizable types ship a two-layer per-buyer mark, but audio/video are key-protected, NOT yet fingerprinted (the browser-MSE ceiling without EME). Document the honest status and the forensic upgrade plan instead of overclaiming:
- New docs/AV_WATERMARKING.md: threat model; why the in-boundary image path can't transfer to streaming (per-segment decode→mark→re-encode breaks CENC/AAD); the chosen approach: A/B forensic variant watermarking (video) and spread-spectrum/echo-hiding (audio), produced once at mint, selected per buyer from their SIGNED grant at serve time (CEK boundary + one canonical path intact, fail-closed when variants are absent); offline buyer-recovery extractor; phased plan where each chunk has a one-sentence pass/fail check; principles-conformance self-review.
- PROTECTED_CONTENT.md: AV section now states "key-protected, not yet fingerprinted" and links the design.
- ROADMAP.md: AV watermarking added under stronger-attestation as a mint-time transcode-pipeline track (roadmap item, not a patch).
alignment-check: OK.
Co-authored-by: Cursor <cursoragent@cursor.com>
A creator controls the source file, so a tiny crafted input could declare enormous dimensions and force a multi-GB allocation in the decrypt boundary (the most security-sensitive process): a "pixel bomb" OOM. Bound BEFORE allocating, three chunks through one shared chokepoint, all fail-closed:
- Chunk 1 (raster decode): new render::decode_bounded routes the single-image and CBZ-page decoders through image::Limits (max dims MAX_DIM=10k, max alloc MAX_DECODE_BYTES=256MiB) so an oversized image is refused at header time, before the pixel buffer is built. Replaces the unbounded image::load_from_memory at image_page.rs and cbz.rs. Inner decode_with_limits is factored out so the rejection path is unit-tested with a tiny limit (CI never builds a real bomb).
- Chunk 2 (PDF): bounded_scale clamps the render scale on BOTH axes and by area (MAX_PIXELS=48MP), not width alone, so an extreme aspect (tiny w x huge h) no longer rasterises to a gigapixel pixmap, plus a final predicted-size guard that fails closed. Pure fn, unit-tested over adversarial native sizes.
- Chunk 3 (CBZ): per-page (MAX_ENTRY_BYTES=64MiB) and aggregate (MAX_TOTAL_BYTES=512MiB) uncompressed caps via pure account_entry, enforced with an early declared-size reject + a take()-capped read so a lying header can't slip an under-declared bomb. Also bounds warm session memory (pages held for the open).
API notes vs. the spec: image::Limits is #[non_exhaustive] (built by mutating a default(), not a struct literal) and reader.limits() is &mut self -> () (not chainable); adjusted accordingly. SVG keeps its own local MAX_DIM; per-format decode *time* remains a wall-clock/watchdog follow-up (documented, not assumed).
Gate: decrypt-provider 138/138 (rail-stream,rail-mint,pdf-render); no new clippy warnings in the changed files; alignment-check OK.
Co-authored-by: Cursor <cursoragent@cursor.com>
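A rough sketch of the chunk 2 clamp, in Python rather than the Rust `bounded_scale`. The signature and return shape are hypothetical. MAX_PIXELS=48MP is from the message; reusing MAX_DIM=10k as the per-axis cap for PDF is an assumption, since the message gives that value only for raster decode.

```python
import math
from typing import Tuple

MAX_DIM = 10_000          # per-axis cap (assumed here to apply to PDF as well)
MAX_PIXELS = 48_000_000   # area cap, 48 MP


def bounded_scale(native_w: float, native_h: float, requested: float) -> Tuple[float, int, int]:
    """Clamp a render scale on both axes and by area, then fail closed on the prediction."""
    if native_w <= 0 or native_h <= 0 or requested <= 0:
        raise ValueError("non-positive page size or scale")
    scale = min(
        requested,
        MAX_DIM / native_w,                              # width clamp
        MAX_DIM / native_h,                              # height clamp
        math.sqrt(MAX_PIXELS / (native_w * native_h)),   # area clamp
    )
    out_w = max(1, int(native_w * scale))
    out_h = max(1, int(native_h * scale))
    # Final predicted-size guard: refuse rather than allocate if anything slipped through.
    if out_w > MAX_DIM or out_h > MAX_DIM or out_w * out_h > MAX_PIXELS:
        raise ValueError("predicted pixmap exceeds bounds")
    return scale, out_w, out_h
```

A letter-size page at 2x passes through unchanged, while a 10 x 1,000,000 page is scaled down until its height fits the axis cap instead of rasterising at the requested scale.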
… & EPUB hardening)
Two boundary holes on the object egress path, both fail-closed:
- (3) Serve-time content sniff. The raw `/bytes` egress trusted the mint-time `pixel_locked` flag, which trusts the creator-declared mime, so a renderable/scriptable document mislabeled with a non-pixel-lock mime could egress as raw plaintext. `viewer_object_bytes` now sniffs the DECRYPTED bytes (after authority.object(), before octet_stream) via a pure magic-byte `sniffs_as_lockable` (PDF / ZIP / raster image / SVG-XML) and returns 403 if a "raw" asset's content looks pixel-lockable. The one exception is an explicitly-declared `application/zip` (generic archive download). Buyer-safe: the declared mime lives in the signed descriptor, not buyer-controlled. Verified a PDF mislabeled as a 3D model (which shares this decrypt-passthrough handler) is now caught.
- (5) HTML-lock CSP/nosniff. EPUB chapters served as sanitised HTML now carry an enforced HTTP `Content-Security-Policy` with a `sandbox` directive (`default-src 'none'; img-src data:; style-src 'unsafe-inline'; font-src data:; base-uri 'none'; form-action 'none'; frame-ancestors 'self'; sandbox`) plus `X-Content-Type-Options: nosniff` and `Referrer-Policy: no-referrer`, so the document is sandboxed at the RESOURCE level by the browser even if loaded directly or framed without the attribute; the hand-rolled sanitiser is no longer the sole barrier. JPEG pages get `nosniff` only.
Out of scope (tracked follow-ups, not silently skipped): the media/stream egress (`viewer_media`/MSE) is a second egress door with no sniff yet; text/code mislabel needs heuristics (no reliable magic byte). The render direction is already fail-closed via the parsers.
Gate: elastos-server viewer_object 7/7; clippy -D warnings clean.
Co-authored-by: Cursor <cursoragent@cursor.com>
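A simplified Python sketch of the serve-time sniff in item (3). The magic-byte list and the `raw_egress_status` helper are illustrative only, and the real `sniffs_as_lockable` may recognise more or fewer signatures. Reading the `application/zip` exception as "declared zip AND content is a zip" is an assumption.

```python
from typing import Optional


def sniff_kind(data: bytes) -> Optional[str]:
    """Classify decrypted bytes by leading magic bytes; None if nothing lockable matches."""
    head = data[:512]
    if head.startswith(b"%PDF-"):
        return "pdf"
    if head.startswith((b"PK\x03\x04", b"PK\x05\x06")):
        return "zip"
    if head.startswith((b"\x89PNG\r\n\x1a\n", b"\xff\xd8\xff", b"GIF87a", b"GIF89a")):
        return "image"
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "image"
    text = head.lstrip(b"\xef\xbb\xbf \t\r\n").lower()  # skip BOM and whitespace
    if text.startswith((b"<?xml", b"<svg")):
        return "svg-xml"
    return None


def sniffs_as_lockable(data: bytes) -> bool:
    return sniff_kind(data) is not None


def raw_egress_status(declared_mime: str, decrypted: bytes) -> int:
    """HTTP status for a 'raw' (non-pixel-locked) asset, decided on the decrypted bytes."""
    kind = sniff_kind(decrypted)
    if kind is None:
        return 200
    if kind == "zip" and declared_mime == "application/zip":
        return 200  # generic archive download, explicitly declared
    return 403      # content looks pixel-lockable but was labeled raw
```

For instance, `raw_egress_status("model/gltf-binary", b"%PDF-1.7 ...")` yields 403, which mirrors the mislabeled-PDF case the message says was verified.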
…l-lock CSP Make the protected-content docs state the watermark's true strength and the new boundary defenses exactly (Principle 12 — docs/code/threat-model agree): - Watermark forensic scope & privacy (THREAT_MODEL §3 row + §6.6; PROTECTED_CONTENT "Forensic strength & privacy"): the mark is UNKEYED and CRC-protected (not signed), so it is forgeable and repudiable — a deterrent/tracer, NOT court-grade evidence; the authenticated record is the §4 signed custody log. It is also NOT anonymous: both layers embed the full opening wallet (visible layer human-readable), so anyone who sees a rendered page de-anonymizes the buyer — the deliberate leak-attribution trade. Names the roadmap upgrade (authenticate the payload: MAC/opaque token). - Pixel-bomb resource bounds (PROTECTED_CONTENT): documents decode_bounded (image::Limits), the PDF both-axes+area scale clamp, and the CBZ per-page/total caps. - HTML-lock CSP (PROTECTED_CONTENT): documents the enforced HTTP CSP `sandbox` + nosniff containment order (HTTP CSP true layer ▸ meta/iframe belt ▸ sanitiser DiD). Docs-only; alignment-check OK. Co-authored-by: Cursor <cursoragent@cursor.com>
…d grant
Tier C (1), chunks 1-4: upgrade the invisible pixel-lock watermark from an
unkeyed CRC-only mark (forgeable + repudiable) to one ANCHORED IN THE BUYER'S
OWN WALLET SIGNATURE — so a leaked frame is non-repudiable and forgery rises
from "anyone can plant any wallet" to "only a party holding the victim's signed
grant can." Code and docs land together (Principle 12).
- Shared digest (ddrm-envelope): `grant_watermark_digest16(delegation_sig_hex)`
= SHA-256(normalized EIP-191 delegation signature)[..16]. Lives in the crate
BOTH the embedder and the verifier link, so they cannot drift. No new deps
(sha2 already present).
- Payload codec (decrypt-provider/render/invisible.rs): new TAG_GRANT_DIGEST
carrying `[wallet_prefix(4) | grant_digest(16)]` = 21 B <= the 24 B CAP, so
the 232-bit PERIOD (and sparse-page recovery) is unchanged. `embed` takes the
digest; `extract` refactored into `extract_raw` + `parse_grant_mark` so the
verifier reads the raw anchor. No-grant/local-dev opens fall back to the
compact wallet (back-compat).
- Wire (watermark.rs + media-authority quorum.rs): the authority appends an
invisible-only `\u{1F}gd:<hex>` token to the stamp; `finalize` splits it back
off so the VISIBLE mark stays the clean human `wallet . content . time` and
only the INVISIBLE layer carries the authenticated digest.
- Verifier (main.rs): `--extract-watermark <img> [--verify-grant <grant.json>]`
prints the wallet prefix + digest and reports MATCH/NO MATCH by recomputing
via the shared fn. Gated on pq-envelope (always in the shipped render binary).
- Docs: THREAT_MODEL §3 row / §6.6 refreshed to the authenticated state and §4
records the chunk-5 retention decision (option C: fold the digest into the
existing tamper-evident audit record, TTL + access-controlled; status pending
wiring). PROTECTED_CONTENT forensic-strength block + the invisible-layer
description match. Honest bound kept explicit: the delegation signature is not
a hard secret, so this is non-repudiation + raised-forgery, NOT full
anti-framing; a server-key MAC / opaque custody token remains the north star.
Gates (capsules are not -D warnings gated by `just`; verified directly):
decrypt-provider compiles clean + render tests 59/59; media-authority 12/12
(incl. cross-crate digest agreement); ddrm-envelope digest test + 60 existing;
alignment-check OK.
Co-authored-by: Cursor <cursoragent@cursor.com>
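The shared digest from this commit, `SHA-256(normalized EIP-191 delegation signature)[..16]`, might look like this in Python. The trim+lowercase normalization follows the wording of the later custody-chain commit. Hashing the normalized hex text (rather than the decoded signature bytes) is an assumption, so this sketch is not guaranteed to reproduce the repo's golden vectors.

```python
import hashlib


def grant_watermark_digest16(delegation_sig_hex: str) -> bytes:
    """16-byte commitment to a wallet-signed delegation (sketch; input encoding assumed)."""
    normalized = delegation_sig_hex.strip().lower()      # trim + lowercase
    return hashlib.sha256(normalized.encode("ascii")).digest()[:16]


def verify_grant(recovered_digest: bytes, candidate_sig_hex: str) -> str:
    """Recompute via the shared function and report the verifier's verdict."""
    return "MATCH" if recovered_digest == grant_watermark_digest16(candidate_sig_hex) else "NO MATCH"
```

Because embedder and verifier call the same function, differences in casing or surrounding whitespace in the stored grant cannot produce a false NO MATCH.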
…stody chain Wire Tier C-1 chunk 5: fold the 16-byte authenticated grant digest (a non-reversible commitment to the buyer's signed delegation — the same value the invisible pixel-lock watermark embeds) into the existing append-only content_open custody record, so a leaked frame is verifiable against an audit row WITHOUT a second who-opened log or any raw wallet/grant retention (option C). - audit.rs: optional grant_digest on AuditEvent::ContentOpen, serde-skipped when absent so prior records hash-verify unchanged; content_open() takes it; test proves backward-compat + chain verification with and without the anchor. - viewer_open.rs: resolve the wallet-signed grant (fresh AND cached paths) ABOVE the custody write and derive grant_digest from the EXACT signature forwarded to the quorum, so the §4 record carries the anchor; malformed fresh grant still fails before any "opened" record is written. Media/no-grant opens -> None. - elastos-server cannot link the PQ ddrm-envelope crate, so it carries a no-shared-dep twin (grant_watermark_digest16_hex) guarded by a golden vector cross-checked against ddrm_envelope::grant_watermark_digest16 in BOTH crates, pinning the trim+lowercase normalization so the two sides cannot drift. - THREAT_MODEL.md §4: retention entry updated to "option C, wired" — minimization-via-non-reversibility, not TTL (the chain is intentionally permanent); records a TTL-prunable index as explicitly rejected (Principle 10). Gates: ddrm-envelope golden, elastos clippy -D warnings (workspace), runtime audit chain test, elastos-server golden, decrypt-provider + media-authority tests, alignment-check — all green. Co-authored-by: Cursor <cursoragent@cursor.com>
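The backward-compatibility claim above (an optional field that is skipped when absent, so prior records hash-verify unchanged) can be illustrated with a toy hash chain. This is not audit.rs: the record layout, JSON canonicalisation and `GENESIS` value are hypothetical stand-ins for the real serde encoding.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical chain anchor


def record_hash(prev_hash: str, event: dict) -> str:
    # Absent (None) fields are dropped before hashing, like a serde skip-if-none.
    body = {k: v for k, v in event.items() if v is not None}
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append(chain: list, event: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "hash": record_hash(prev, event)})


def verify(chain: list) -> bool:
    prev = GENESIS
    for rec in chain:
        if record_hash(prev, rec["event"]) != rec["hash"]:
            return False
        prev = rec["hash"]
    return True
```

A record written before the field existed hashes identically to one written with `grant_digest=None`, so old and new records can coexist in one verifiable chain, while changing a stored digest breaks verification.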
…closed-by-construction Close the last two audit loose ends. (1) Lowercase-address normalize on compare. The invisible mark recovers the EVM wallet LOWERCASED (the 20 raw bytes carry no EIP-55 checksum casing), so any attribution compare against a stored/expected address must normalize both sides or a checksummed address would false-mismatch. - render/invisible.rs: add normalize_evm_hex() (trim, strip 0x, lowercase) + a one-line test proving checksum casing compares equal. - main.rs --verify-grant: advisory wallet cross-check — when the candidate grant JSON declares owner_address, confirm it matches the recovered 4-byte wallet prefix (both normalized). Fail-safe: advisory only, never overrides the digest verdict; pq-envelope-absent still returns 2 (no silent pass). (2) HiDPI/Retina screenshot doc nuance (invisible.rs header + PROTECTED_CONTENT.md): "same-resolution screenshot" means a 1:1 pixel-grid capture; a HiDPI/Retina screenshot resamples (~2x) = rescaling = the already-unsupported case, so most real-world HiDPI screenshots will not recover. Don't over-rely on it. (3) THREAT_MODEL.md: reclassify the media/stream egress as CLOSED BY CONSTRUCTION, not an open guard gap. The media tier serves only fMP4 from the ffmpeg transcode+fragment ingest (media-provider prod, ddrm-media-authority dev): a non-media file fails transcoding (no asset), and the pipeline re-encodes (AV1/AAC) rather than -c copy, so source bytes never survive into served segments even for a polyglot. With documents confined to the object tier (content-sniff guarded), no media-tier sniff guard is needed. Re-open only for a bring-your-own pre-segmented ingest or an ffmpeg -c copy/remux fast-path (would warrant a segment-0 mdat sniff). Gates: decrypt-provider invisible tests (pdf-render,pq-envelope) 13 pass incl new; rustfmt --check clean on both touched files; clippy introduces no new warnings; alignment-check OK. Co-authored-by: Cursor <cursoragent@cursor.com>
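The normalize-on-compare fix in item (1) is small enough to show directly. This Python version mirrors the described behaviour of `normalize_evm_hex` (trim, strip 0x, lowercase); the `wallet_prefix_matches` helper is a hypothetical stand-in for the advisory cross-check in `--verify-grant`.

```python
def normalize_evm_hex(addr: str) -> str:
    """Trim, strip a leading 0x/0X, and lowercase an EVM address or prefix."""
    s = addr.strip()
    if s[:2].lower() == "0x":
        s = s[2:]
    return s.lower()


def wallet_prefix_matches(owner_address: str, recovered_prefix_hex: str) -> bool:
    """Advisory check: does the grant's owner start with the recovered 4-byte prefix?"""
    prefix = normalize_evm_hex(recovered_prefix_hex)
    return len(prefix) == 8 and normalize_evm_hex(owner_address)[:8] == prefix
```

An EIP-55 checksummed address and its lowercase form compare equal after normalization, which is the false-mismatch case the commit closes.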
…n Linux CI The canonical gate and CI both scoped to `cd elastos && cargo --workspace`, which does NOT reach the crates this branch's protected-content work lives in (capsules/decrypt-provider, capsules/ddrm-envelope, scripts/dev/ddrm-media-authority). Their 217 tests — watermark codec, grant-digest envelope, media-authority — had ZERO automated coverage; they were gated by hand each commit. - justfile: add `verify-capsules` (build+test the capsule crates under their CANONICAL feature sets, matching scripts/dev/run-creator-gateway.sh: decrypt-provider = rail-stream,rail-mint,pdf-render,pq-envelope; ddrm-envelope = access-grant; media-authority = default) and fold it into `verify`, so the repo's "definition of green" finally covers the whole surface (Principle 12: the gate must match reality). clippy -D warnings is deliberately held back for the capsules (pre-existing lint debt); build+test is the real regression gate. Verified: all three are rustc -D-warnings-clean under these features, so the workflow's global RUSTFLAGS does not break them. - ci.yml: add a `verify` job (ubuntu, installs `just`, runs the full `just verify` incl. the Linux-only carrier smoke the macOS dev box can't run) and an isolated `capsules` job (`just verify-capsules`) so a heavy/flaky smoke run can never mask a capsule regression. Add `workflow_dispatch` so this feature branch can be put through the full Linux gate on demand before merge. This is the last gate between the branch and truly-done: turns "manually covered" into "full green on Linux". Co-authored-by: Cursor <cursoragent@cursor.com>
Add the feature branch to the push trigger so the full Linux gate (verify + capsules) runs on our own work in isolation, without a PR to main. This entry lives only on the branch and does not affect main or other branches until merge. Co-authored-by: Cursor <cursoragent@cursor.com>
First Linux CI run surfaced two real issues the macOS box could not (just verify aborts at the Linux-only smoke before reaching fmt): - viewer_object.rs (landed in the Tier B-3/D-5 commit) was not rustfmt-clean — 6 long-line/comment violations. cargo fmt -p elastos-server fixes only that file. - the `verify` job failed at its first step (just alignment-check) because the GitHub runner has no ripgrep, which check-wci-alignment.sh requires. Install it before `just verify`. (The capsules job needs no rg and already passed green.) Co-authored-by: Cursor <cursoragent@cursor.com>
…ider binary Linux CI surfaced this: chain_mode_without_wallet_fails_closed expected the "wallet not linked" fail-closed error but instead hit "rights-provider not found" because decide_owned_access resolved/checked the capsule binary BEFORE validating the subject wallet. On a clean runner (no pre-built capsule) the binary check fired first, the test panicked, and its panic poisoned ENV_LOCK — cascading into release_build_defaults_to_chain_and_refuses_dev_rights_modes. Reorder so subject/wallet validation runs first: a chain-mode request with no linked wallet is invalid on its face and must fail closed before we resolve or spawn any external binary. This is both more correct (don't spawn a subprocess for an obviously-invalid request) and makes the unit test hermetic (it is not an #[ignore]'d integration test, so it must not depend on a built capsule). Verified with ELASTOS_RIGHTS_PROVIDER_BIN=/nonexistent: both tests pass. Co-authored-by: Cursor <cursoragent@cursor.com>
The verify job got through alignment-check + ripgrep but failed in local-carrier-setup-smoke with `error[E0463]: can't find crate for std`: the smoke builds the Home capsules (capsules/home-cli and friends) to wasm32-wasip1, and the runner's stable toolchain ships only the host target. Add `targets: wasm32-wasip1` so the smoke's wasm build has std. The other four jobs are host-only and unaffected. Co-authored-by: Cursor <cursoragent@cursor.com>
…hain has its std The verify smoke still failed with E0463 after adding the target to the dtolnay @stable step: rust-toolchain.toml pins channel 1.89.0, so every cargo invocation uses 1.89.0 — not stable — and the wasm target had been added to the wrong toolchain. Declare `targets = ["wasm32-wasip1"]` in rust-toolchain.toml so rustup auto-installs the wasm std for the pinned toolchain everywhere (CI and local), and drop the now-redundant `targets:` from the workflow step. Verified locally: the home-cli wasm build compiles clean. Co-authored-by: Cursor <cursoragent@cursor.com>
… for GitHub Actions The full `just verify` cannot complete on a stock GitHub runner: its `local-carrier-setup-smoke` step fetches the net-provider artifact over Elastos Carrier, which a clean runner can't reach (proven on CI: it builds + runs the entire ~18-min gate and fails only there). Everything else a clean runner CAN verify. - justfile: add `verify-ci` = the full gate MINUS the carrier smoke, with a hidden `_verify-tail` shared by both `verify` and `verify-ci` so they can't drift. alignment-check stays first in both. `just verify` (with the carrier smoke) is unchanged for a Carrier-capable Linux box / self-hosted runner. - ci.yml: the Linux job now runs `just verify-ci` (renamed "Verify (Linux CI gate)") and documents that the carrier smoke is covered separately. This lands the branch's surface — incl. the 217-test capsule gate and the full elastos workspace fmt/clippy/test — under an enforceable green GitHub Actions gate. Co-authored-by: Cursor <cursoragent@cursor.com>
Fold the off-tree AV-watermarking feasibility study (verdict: GO) into the roadmap doc, with the audit caveats baked in rather than the harness's headline claims: - New Phase 0 (top of §5): video survival matrix, audio matrix, registration result, and the grant-anchored Tardos collusion chain. - FP correction: the harness's single-seed empirical threshold (mean+3.5sigma) is flagged as ~1.25% false-accusation (400-trial Monte-Carlo); a certified bound now requires the analytic Tardos threshold + an MC FP/FN sweep, and the per-asset bound is recomputed at the FP-controlled threshold (duration minimums move up). - New §3.4 Channel coding (required): the leak channel is bursty (whole-segment loss) -> timeline interleaving + an erasure-aware code; wired into chunks 2/6. - Audio re-validation made concrete (chunk 6): psychoacoustic masking model + PEAQ/ODG + human A/B/X on real music/speech/silence, and time-stretch/pitch. - Multi-strategy collusion (random/minority/all-ones/interleaving) mandated before any certified bound. - Registration -> Phase 5 gating DSP item (deterministic template/pilot or log-polar/Fourier-Mellin; brute search proven insufficient). - Full-variant-set AAD weld in §3.1/§4 (CEK binds the complete variant set; per-buyer selection is post-unwrap routing). - §8 resolved (ECC->Tardos, q-ary density lever, published per-asset bound at the FP-controlled threshold, channel-coding requirement); §7 honest-limits expanded; VMAF 96.7 demoted from gate to synthetic relative signal. Doc-only; no shipped behaviour. alignment-check green. Fix Widevine typo in §7. Co-authored-by: Cursor <cursoragent@cursor.com>
…review The "approve" step of the control loop (reflect → preview → APPROVE → act), parallel-safe and read-only. - elastos-runtime::approval (new, pure): `decide(mode, approver)` is fail-closed — the only path to Approved without an explicit yes is an affordance declared as needing no approval; User/RuntimePolicy default to PendingApproval; an explicit no always wins. `required_approval(actions)` scales the requirement with action strength (anything beyond read/message needs a human). 3 tests. - inspect/intent (new provider op, read-only): given a capsule + operation, derives the gate (via plan), the approval it requires, and the fail-closed default decision. Records nothing, dispatches nothing. - Gated consistently: `intent` added to the canonical op→action contract (Read) and the System-only browser allow-list. - Decisions: `revoke` and recorded approve/deny stay on the runtime/dispatch (mutation) path — the product InspectProvider remains a read-only projection. Recording pairs with dispatch (merge-gated). fmt --check PASS; targeted tests green (approval 3, inspect incl. intent 31 +2 ratchets ignored, provider_resource contract 1). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_016ZKy5Cca9RzwDuLb1szdeq
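The fail-closed decision rule reads roughly like this in Python. The enum names and the action strings are hypothetical; the behaviour follows the message (explicit no always wins, only a no-approval affordance is approved without an explicit yes, anything beyond read/message needs a human). Treating an explicit yes as sufficient in every mode is an assumption.

```python
from enum import Enum
from typing import Iterable, Optional


class Mode(Enum):
    NONE = "none"                    # affordance declared as needing no approval
    USER = "user"
    RUNTIME_POLICY = "runtime_policy"


class Decision(Enum):
    APPROVED = "approved"
    DENIED = "denied"
    PENDING_APPROVAL = "pending_approval"


def decide(mode: Mode, approver: Optional[bool]) -> Decision:
    """approver: True = explicit yes, False = explicit no, None = nobody has answered."""
    if approver is False:
        return Decision.DENIED            # an explicit no always wins
    if approver is True:
        return Decision.APPROVED
    if mode is Mode.NONE:
        return Decision.APPROVED          # the only path to Approved without a yes
    return Decision.PENDING_APPROVAL      # fail-closed default


def required_approval(actions: Iterable[str]) -> Mode:
    """Scale the requirement with action strength."""
    return Mode.NONE if set(actions) <= {"read", "message"} else Mode.USER
```

So `decide(required_approval(["read", "write"]), None)` stays pending until a human answers, while a read-only preview needs no approval.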
- CAPSULE_INSPECTOR.md: add the inspect/intent wire contract (approval-intent preview); add a "path note" clarifying revoke + self are served on the embedded RequestHandler (shell) path while the product InspectProvider is a read-only projection (capsules/capsule/plan/intent) — closes the contract-honesty gap. - KNOWN_GAPS.md: G4 decision core DONE (approval + intent, fail-closed, tested); remaining = recording a signed approve/deny, which pairs with dispatch (G3). (An orchestrator CLAUDE.md was written locally but is .gitignored by repo policy, so it stays a local contract and is not committed.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_016ZKy5Cca9RzwDuLb1szdeq
Pre-mainnet hardening from the deep audit (none block the branch; ① is the
item to put in front of the external auditor):
① Document the dKMS re-seal AAD invariant — the node re-seals the recovered
CEK under the caller-supplied aad_b64, which is NOT bound into the recover
possession-proof; safe only because the decrypt boundary rebuilds the
segment-bound AAD and fails closed. Loud comment at the seal_bound call +
THREAT_MODEL §7 note. Binding it into the proof is scoped with the auditor.
② Lock the release-build invariant — a compile_error! rejects a release build
(no debug_assertions) of dkms-authority with dev-modes/legacy-receipt-authz,
and a new CI job (dkms-release-invariant) asserts both directions. Adds
docs/DEPLOY_CHECKLIST.md (incl. the node-set-id authorize-time guard, which
is release-only and not unit-testable under cfg(test)).
③ Redact key-provider Debug — manual Debug on Request/ReleaseSessionContext
prints only the op name, so no CEK/escrow bytes can leak via {:?}.
④ viewer_open — log_fp(&object_cid) for the fresh-grant line (was the raw cid).
⑤ VENDORING.md — three.js r160 pin + periodic-refresh/upstream-watch plan.
Plus a fail-closed dKMS-open testing checklist in DKMS_OVER_CARRIER.md
(rights-mode + Carrier-rail must match how the asset was minted) so the
"foreign escrow" 502 diagnosis doesn't recur.
Gates: elastos-server fmt+clippy; dkms-authority build (debug+release) +
24/24 tests + guard verified; key-provider build + 52/52 tests; alignment-check.
Co-authored-by: Cursor <cursoragent@cursor.com>
…word + offline extractor AV forensic-variant layer, tractable + pipeline-free pieces built on the proven Phase-0/5 research. Feature-gated OFF by default (`av-variants`), so it cannot destabilize the default build; chunks 3/4/5 (mint transcode DSP, full-variant-set AAD weld, serve-time selector) are deferred to the live CENC/DASH/quorum pipeline. Chunk 1 — variant manifest schema (`elastos.ddrm.av-variants/v1`) in capsules/ddrm-envelope/src/av.rs: marked subset, q-ary variant refs (+ segment digest for the chunk-4 weld), codeword scheme (length/interleave/erasure τ/bias commitment). serde round-trips; validate() fails closed; single_encode() is the honest `fingerprinted:false` default. Chunk 2 — canonical, RNG-free codeword: asset_bias_vector / buyer_codeword (from grant_watermark_digest16, no per-buyer storage) / interleave_map / tardos_score. A domain-separated SHA-256 stream over integers (NOT any language's RNG), so the Rust serve selector and the Python extractor derive identical codewords. Replaces the Phase-0 numpy-RNG derivation. Chunk 6 — offline forensic extractor as the proven Python reference under tools/av-forensics/ (offline, operator-run, no key material, not in the boundary), re-anchored to the chunk-2 canonical construction. The load-bearing FM fix is preserved: register() resolves the Fourier-Mellin scale/rotation ambiguity on the VALID (non-border) region. The Rust --extract-av-fingerprint CLI is deferred until the scheme is frozen/certified. Cross-language anti-drift weld: tools/av-forensics/test_canonical.py asserts the same golden vectors as av::tests::canonical_golden_vectors — change either side and both fail. Wired into `just verify-capsules` (now also tests ddrm-envelope with av-variants), so CI covers the new module + the weld. Pure stdlib (no numpy/ffmpeg). 
Still uncertified (carried honestly in docs/AV_WATERMARKING.md): analytic Tardos threshold + Monte-Carlo FP/FN sweep (argmax is not proof), rotation estimator (out of envelope), audio on real content. AV remains key-protected, not fingerprinted, until chunks 3/4/5 ship and the certification gates pass. Gates: ddrm-envelope 51 tests (av-variants, incl. golden vectors); default build unaffected (module gated off); av.rs clippy-clean; cross-language weld PASS; ported extractor validated end-to-end (FM-reg → bitERR 0, leaker ranked top; no-reg fails closed); just verify-capsules PASS; just alignment-check OK. Co-authored-by: Cursor <cursoragent@cursor.com>
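To illustrate what "a domain-separated SHA-256 stream over integers (NOT any language's RNG)" buys, here is a reduced Python sketch. The domain string, counter layout and the unbiased-bit codeword are all hypothetical simplifications: the real chunk-2 construction uses a per-asset bias vector and its own canonical encoding.

```python
import hashlib
from typing import List


def hash_stream(domain: bytes, seed: bytes, n: int) -> List[int]:
    """Deterministic stream of n 32-bit integers from SHA-256 in counter mode."""
    out: List[int] = []
    counter = 0
    while len(out) < n:
        block = hashlib.sha256(domain + b"\x00" + seed + counter.to_bytes(8, "big")).digest()
        out.extend(int.from_bytes(block[i:i + 4], "big") for i in range(0, 32, 4))
        counter += 1
    return out[:n]


def buyer_codeword(asset_secret: bytes, grant_digest16: bytes, length: int) -> List[int]:
    """Per-buyer bit string derived from the grant digest; nothing is stored per buyer."""
    words = hash_stream(b"example/av-codeword/v1", asset_secret + grant_digest16, length)
    return [w & 1 for w in words]
```

Any implementation that follows the same byte layout derives the same integers, which is why a Rust selector and a Python extractor can be pinned to shared golden vectors.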
Replace the Phase-0 empirical mean+kσ accusation threshold (Monte-Carlo showed ~1.25% FP, not certifiable) with the analytic, FP-controlled threshold Z = √m·Φ⁻¹(1−ε/N): the innocent symmetric-Tardos score is exactly mean-0, variance-1 per kept position ⇒ N(0,m).
- canonical.py: tardos_threshold + _inv_norm_cdf (Acklam, pure stdlib, extractor-side only; not a cross-language weld surface).
- extractor.py: accuse only above the analytic Z (erasure-aware m), not an ad-hoc gap.
- montecarlo.py: multi-strategy FP/FN sweep (random/majority/minority/all-ones/all-zeros/interleave). 2000 trials, m=2332 N=500 c=3 ε=1e-3 BER=0.13 ⇒ FP ≤ ε with 100% detection across all six; old empirical threshold runs 2–10× over ε (majority ≈1.05%).
- test_canonical.py: stdlib threshold sanity (Φ⁻¹(0.975)≈1.96, monotonicity, Z(2332,500,1e-3)=222.69); runs in the CI weld.
Code-level accusation statistics only; media-survival certification (real content/screen-record/CMAF lengths) remains open. Docs updated.
Co-authored-by: Cursor <cursoragent@cursor.com>
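The threshold formula can be checked in a few lines of Python. This sketch swaps the commit's Acklam approximation for the standard library's `statistics.NormalDist().inv_cdf`, so it is an independent cross-check rather than the code in canonical.py; the function signature is hypothetical.

```python
import math
from statistics import NormalDist


def tardos_threshold(m: int, n_users: int, epsilon: float) -> float:
    """Z = sqrt(m) * Phi^-1(1 - epsilon / N).

    m: number of kept (non-erased) codeword positions
    n_users: number of buyers N the false-accusation budget is split over
    epsilon: total false-accusation probability
    """
    if m <= 0 or n_users <= 0 or not (0.0 < epsilon < 1.0):
        raise ValueError("invalid threshold parameters")
    return math.sqrt(m) * NormalDist().inv_cdf(1.0 - epsilon / n_users)
```

With the parameters from the message, `tardos_threshold(2332, 500, 1e-3)` comes out at about 222.69, matching the value the CI sanity test pins.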
Leads with the one deliberately-open invariant — the re-seal AAD is the caller-supplied aad_b64 and is NOT bound into the recover possession-proof (dkms-authority recover → seal_bound, src/main.rs:1028). Safe today only because the single consumer (decrypt boundary) rebuilds the segment-bound AAD and fails closed. Packages the SECURITY INVARIANT comment, THREAT_MODEL §7, and the DEPLOY_CHECKLIST open item into one hand-off with the trust boundary, crypto roots, CI-enforced release invariants, repro gates, and a reviewer checklist (incl. the landing test: tampered aad_b64 fails the possession-proof closed at the node). Co-authored-by: Cursor <cursoragent@cursor.com>
…pre-mainnet invariant) The dKMS node re-seals a recovered CEK under the caller-supplied `aad_b64`, which was NOT bound into the recover possession-proof. A MITM that tampered `aad_b64` in transit could make the node seal under an AAD of its choosing; it was safe only because the decrypt boundary independently rebuilt the AAD and failed closed (a compensating control, not a fix). Now the canonical possession-proof preimage binds `sha256(reseal_aad)` (`ddrm_envelope::recover_proof_message`, domain bumped v1 -> v2). The client signs over the exact AAD it sends (key-provider), and the node verifies the proof over the byte-identical `args.aad_b64` in `verify_session` BEFORE any CEK is recovered or re-sealed. The AAD (DecryptTranscriptV1) already carries `node_set_id` + `segment_digests`, so all three are bound transitively; the 32-byte digest keeps the preimage bounded for long presentations. A MITM cannot re-sign the proof (it lacks the token-bound caller key), so a tampered `aad_b64` now fails closed at the node (`session_invalid`). The decrypt boundary's rebuild remains as defense-in-depth. - ddrm-envelope: recover_proof_message/sign/verify take `reseal_aad`; bind sha256; bump DKMS_RECOVER_DOMAIN to /v2; unit test asserts tampered-AAD -> verify=false. - dkms-authority: verify_session verifies over decode(args.aad_b64) before recover; SECURITY INVARIANT comment rewritten to CLOSED; landing test recover_fails_closed_on_a_tampered_aad (35 legacy / 25 default tests green). - key-provider: recover_proof_b64 + both delegate paths sign over the request's aad_b64. - dev harnesses (ddrm-runtime-open, dkms-live-recover): each direct node recover signs over its request AAD. - docs: THREAT_MODEL §7 + DEPLOY_CHECKLIST + AUDITOR_PACKET §1 flipped open -> closed, with the landing test referenced. Gates: ddrm-envelope + dkms-authority tests, key-provider/dev-script builds, verify-capsules, alignment-check all green. Co-authored-by: Cursor <cursoragent@cursor.com>
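The binding described above can be sketched with a keyed MAC standing in for the real signature. Everything here is illustrative: the domain string, the preimage layout and the use of HMAC are assumptions, whereas the actual `recover_proof_message` is signed with the token-bound caller key. The point shown is only that the proof covers `sha256(reseal_aad)`.

```python
import hashlib
import hmac

DOMAIN = b"example/dkms-recover/v2"  # hypothetical domain-separation tag


def recover_proof_message(session_id: bytes, reseal_aad: bytes) -> bytes:
    """Bounded preimage: the AAD enters as a 32-byte digest, however long it is."""
    return (
        DOMAIN + b"\x00"
        + len(session_id).to_bytes(4, "big") + session_id
        + hashlib.sha256(reseal_aad).digest()
    )


def sign_proof(caller_key: bytes, session_id: bytes, reseal_aad: bytes) -> bytes:
    return hmac.new(caller_key, recover_proof_message(session_id, reseal_aad), hashlib.sha256).digest()


def verify_proof(caller_key: bytes, session_id: bytes, received_aad: bytes, proof: bytes) -> bool:
    """Node side: verify over the AAD actually received, before any recover or re-seal."""
    expected = sign_proof(caller_key, session_id, received_aad)
    return hmac.compare_digest(expected, proof)
```

If the AAD is altered in transit, the node recomputes a different preimage and the proof check fails, so no CEK is re-sealed under an attacker-chosen AAD.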
…3+4 core) ddrm-envelope::av gains the pure, fail-closed serve-time selector (select_symbols) and the full-variant-set commitment (variant_set_commitment) that chunk 4 welds into the decrypt transcript. The selector binds the per-asset bias commitment (wrong secret -> refuse), supports arity-2 A/B (direct codeword->segment mapping, matching the proven tools/av-forensics extractor), and returns an empty selection for an honest single-encode. DecryptTranscriptV1 gains to_aad_with_all_bindings, a strictly-extending encoder that appends the variant-set commitment AFTER the rights binding, so a non-fingerprinted open stays byte-identical to to_aad_with_bindings (all committed goldens replay unchanged) while a fingerprinted open is bound to the exact published variant set (manifest swap / out-of-set variant fails the CEK unwrap closed). Pure functions, fully unit-tested; no pipeline wiring yet. Co-authored-by: Cursor <cursoragent@cursor.com>
asset_secret_from_master derives the per-asset watermark secret from a node-held master + the content hash, so the mint embed and the serve selector agree on the bias/codebook without ever publishing or per-asset-storing it (the manifest carries only the bias commitment; rotating the master re-keys every asset). build_manifest assembles + validates a fingerprinted VariantManifestV1 from produced variants (canonical interleave + bias commitment), or returns the honest single-encode for an empty marked set. A round-trip test closes the mint->serve loop: build_manifest keyed by the derived secret produces a manifest that select_symbols (same secret) accepts, and distinct buyers select distinct variant sets. Pure functions, tested. Co-authored-by: Cursor <cursoragent@cursor.com>
… open
Mark the pure core of chunks 3/4/5 as landed (selector, variant-set AAD weld encoder, manifest builder, per-asset secret KDF; all in ddrm-envelope::av/lib.rs, fail-closed + unit-tested) and spell out precisely what remains: the pipeline WIRING (ddrm-media-authority serve selection, decrypt-provider AAD rebuild, mint emit) plus the real perceptual DSP (bounded-placeholder seam now; certified embed swaps in post media-survival cert). Adds a "remaining wiring" section with exact files and the one thing needed to validate end-to-end (a gateway bring-up with a synthetic asset; real media only for the perceptual cert). Notes the interleave-application follow-up as tracked, not dropped.
Co-authored-by: Cursor <cursoragent@cursor.com>
The local 2-of-3 stand-in nodes need the dev-modes legacy-receipt path to
authorize an offline recover (the live quorum uses wallet-signed grants); the
gateway dev script already builds the node this way. Without it the smoke fails
closed ("legacy receipt authorization is disabled") even on an unmodified tree.
With it the helper recovers a minted asset byte-identically (3/3 served).
Co-authored-by: Cursor <cursoragent@cursor.com>
…eam)
embed_placeholder_variant appends an ignorable ISO-BMFF `free` box carrying the variant symbol AFTER the mdat, so the fragment stays valid/playable but byte-distinct per symbol; encrypt_fragment (CENC) and strip_senc (decrypt) both carry it through verbatim, so the selected variant is byte-distinct end-to-end and the symbol survives back to the clean fragment. read_placeholder_variant recovers it (the placeholder stand-in for the offline extractor). Explicitly NOT a watermark (no perceptual signal, no transcode survival): it makes mint->serve->select->weld real and testable; the certified DSP embed swaps in behind the same interface post-cert. Tested end-to-end through the CENC rail on the real ffmpeg fixture.
Co-authored-by: Cursor <cursoragent@cursor.com>
…istrations Chunk D of #2: the gateway inspector (server_infra) was already handing `with_registry` a real `Arc::downgrade(&provider_registry)` — the merge's passthrough was the only thing dropping it, so Chunk A already made gateway dispatch live. This wires the remaining production serve paths (serve_cmd, 3 sites) the same way, so `dispatch_approved` is live wherever the inspector is registered. Test and mcp-serve registrations stay unwired (fail-closed by construction). Completes flint-0.5 enforcement degradation #2. All three tracked degradations are now RESTORED; flint-0.5 is a true superset of flint — 0.5's features plus fully intact KEEP enforcement, no regression. Gate: cargo test --workspace --lib green (all crates), esp npm test 89, fmt clean, 0 warnings. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
The 0.5 merge dropped the dDRM producer-spine names (encrypt/publish/media) from the sub-provider allowlist while server_infra.rs still registers all three at boot. register_sub_provider rejects non-reserved names and the error is swallowed as a warn!, so on a real boot the Create-portal mint/publish/media paths silently go dark (elastos://encrypt|publish|media/* -> no provider). No unit test caught it because provider wiring only runs at boot. Restores flint's capability (strict superset) and adds a regression guard test so a future merge can't re-drop them.
Co-authored-by: Cursor <cursoragent@cursor.com>
The 0.5 auth path resolves the trusted-auth data dir from a process-global env var (home_launch_auth_data_dir / room_transport_identity_data_dir). In production that var is set per-child-process at spawn and never mutated mid-process, but a few unit tests mutate the shared test process to exercise the override path. Under parallel `cargo test` other auth-gated tests would transiently read the half-set value and correctly fail closed (403) -- a nondeterministic red that only shows under load (it slipped past the cloud's scheduling; serial runs were always green). Serialize test-side access at the two read funnels: mutating tests hold a process-wide write guard for their duration; the funnels take a brief read guard so no *other* thread observes the in-flight mutation. The mutating thread is tracked so its own funnel reads skip the (non-reentrant) read lock. Zero production behavior change (the guards compile only under cfg(test)). Co-authored-by: Cursor <cursoragent@cursor.com>
- setup.rs: package identity requires a valid IPFS CID before skipping materialization
- auth.rs: atomic_write uses unique temp filenames so parallel writers cannot collide
- provider/bridge.rs: label bridges for latency tracing on the serial provider path
- viewer_media.rs: bound public cover fetches with a timeout so unresolvable CIDs cannot pin a gateway worker
- ipfs-provider: Cat timeout_ms, macOS path canonicalization, kubo usability probe
- run-creator-gateway.sh: ELASTOS_BUILD_PROFILE=release support, dev-modes only where the capsule declares them
- docs: release-profile bootstrap notes and the macOS IPFS data-dir migration
Co-authored-by: Cursor <cursoragent@cursor.com>
… in the 0.5 merge Restores the shell-windows owned-open subsystem, loading-window styles, and the ddrm-viewer/elacity-player entries in SHELL_MESSAGE_OPEN_TARGET_SOURCES so Library and viewers open content again. Bumps home asset cache-busting to home-20260701c. Co-authored-by: Cursor <cursoragent@cursor.com>
….5 merge Storage authority on the localhost provider path must come from the carrier/provider envelope token; a caller-supplied body token is stripped before dispatch. Restores the redaction plus both regression tests, and updates the admin-locked discover test for the fail-closed inspect_resource path. Co-authored-by: Cursor <cursoragent@cursor.com>
…fail-closed scope rules
- plan emits elastos.inspect.gate-preview/v1 (capabilities, audit events, execution policy, dispatch:false) so inbox gate summaries show real authority again
- revoke is an explicit unsupported_operation, not a silent fallthrough
- provider_resource gains inspect_resource(op) so unknown inspect ops fail closed
- restores the four inbox-approval gateway tests (fresh passkey, principal scoping, deny-without-dispatch) and provider authority/redaction tests
- docs: Act path and runtime scope-rule expectations, corrected inspect/self routing
Co-authored-by: Cursor <cursoragent@cursor.com>
…rces
- dkms-authority: deny_unknown_fields on the Request enum so hidden authority
fields fail closed; lockfiles pick up elastos-common 0.5.0
- creator/ddrm-viewer: reword raw chain/backend references so app capsules stop
claiming provider authority they route through the runtime
- library: replace platform-branded "Finder" wording with file-manager phrasing
- marketplace: classify providers via name.endsWith("-provider")
Co-authored-by: Cursor <cursoragent@cursor.com>
… post-merge truth - home-entropy-check: current home asset version, expanded library open allowlist, post-merge inspector routing, act-emitter README in the Users/self allowlist - check-wci-alignment: justified exclusions for chain-native crates, backend-scheme elacity pattern instead of the bare word - command-smoke/installed-command-audit: hermetic HOME on macOS and a portable timeout (timeout/gtimeout/perl alarm) so the gates run off-Linux - state.md: restore the canonical journey proof records lost in the merge - docs: unlink gitignored CLAUDE.md, point DDRM rail table at per-capsule wasm-smoke scripts Co-authored-by: Cursor <cursoragent@cursor.com>
filter/map instead of bool::then in filter_map for browser session listing, tail expression instead of return in the cfg-split supports_hibernation, and indented doc-comment link definitions in elastos-vz. Co-authored-by: Cursor <cursoragent@cursor.com>
…ricks `elastos home`
Root cause of the local-carrier-setup-smoke failure ("Capability request still
pending after 3s"): the G-ID flip fail-closes every identity gate for sessions
with no capsule identity, and /api/auth/attach created exactly such sessions
(vm_id: None). The managed-home flow then dead-ended three ways: capability
intake recorded no requester identity, the consent-broker's grant POST 403'd
fail-closed ("no requester capsule identity") in an infinite retry loop, and
even a minted token would have been unredeemable ("session has no capsule
identity"). Fail-closed did its job; the flow lost its identity plumbing.
Predates the 0.5 merge — the smoke was never re-run on Linux after G-ID landed
(the Mac cannot run it), so it slipped every gate until now.
Fix at the root seam: attach-authenticated sessions record an HONEST host
identity ("host-client" / "host-shell") — the attach secret is owner-only
(chmod 600), so the caller IS the host user; this is truthful identity, not
fabrication. Intake, grant mint, and token redemption now agree end-to-end.
No authority widening: grants still require consent-broker approval; tokens
still bind to the recorded identity; audit records it.
Proven live: `just local-carrier-setup-smoke` now passes on Linux (was the one
red step in `just verify`); replayed the failing grant against the live runtime
before/after (403 "no requester capsule identity" -> granted). Regression test
pins the identity on both scopes.
Gate: cargo test -p elastos-server --lib green (1044), clippy clean, fmt clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…t-scan invariant test The 0.5 merge left three first-party capsule providers declaring `provides: elastos://<name>/*` for names NOT in RESERVED_SUB_NAMES: `market` (content-market storefront — no boot fallback, route never exists), `object` (Library object authority) and `operator-drive-adapter` (both also register a boot main-provider but lose their VM sub-route). At capsule launch the supervisor's register_provider_route fails closed and the failure is warn-swallowed, so the provider silently goes dark — the same live-only class the dDRM-spine fix repaired, still open for these three. - Reserve the three names (strict superset; no capability removed). - Add pub is_reserved_sub_name() as the single-source-of-truth predicate. - Add test_all_capsule_provided_sub_schemes_are_reserved: scans every shipped capsule.json `provides` sub-scheme and asserts it is reserved — no boot needed. This is the general invariant the hardcoded dDRM-spine test only covered for three names; it would have caught all of this and reds on the next provider capsule that forgets to reserve its scheme. Gate: cargo test -p elastos-runtime --lib green (384), fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…ore 6 inbox tests
Intake bug (Ravi P16/P11, KNOWN_GAPS G3): create_inspect_action_request only
checked plan.status=="ok", but the inspector's plan returns
{status:"ok", data:{valid:false, error:"unknown_operation"}} for an operation
the target authority never declared. That created a PENDING inbox approval with
an EMPTY gate preview — prompting a human to approve an act whose authority is
invisible. Consent requires visibility.
- Reject at intake when plan.data.valid != true, BEFORE persisting: no record,
no notification, no approvable row, no dispatch_approved reachability.
- Restore the 6 inbox-approval regression tests dropped in the 0.5 merge,
grafted from origin/review/0.5.0 against the existing merged harness — inbox
suite 4 -> 10.
- Add inspect_action_rejects_undeclared_operation_before_inbox: asserts the
undeclared op is rejected AND leaves zero approvable Inbox rows (structural
fail-closed, not a hidden display string).
Gate: cargo test -p elastos-server --lib inbox suite green (11 incl. new guard).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
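The intake rule in this commit can be sketched as follows. This is a minimal illustration, not the project's code: `PlanResponse` and `admit_inspect_action` are hypothetical names, and the struct flattens the `{status, data:{valid, error}}` shape quoted above. The point it shows is that `status == "ok"` alone is not enough; anything other than an explicit `valid: true` is refused before a record could be persisted.

```rust
/// Hypothetical, flattened view of the inspector's plan response
/// (`{status, data: {valid, error}}` in the commit message).
#[derive(Debug, Clone, PartialEq)]
pub struct PlanResponse {
    pub status: String,
    /// `None` models a response that omits `data.valid` entirely.
    pub valid: Option<bool>,
    pub error: Option<String>,
}

/// Returns Ok only when the plan is both transport-ok AND explicitly valid.
/// A caller would persist the inbox record only after this returns Ok.
pub fn admit_inspect_action(plan: &PlanResponse) -> Result<(), String> {
    if plan.status != "ok" {
        return Err(format!("plan failed: status={}", plan.status));
    }
    // Fail closed: a missing or false `valid` is treated the same way.
    if plan.valid != Some(true) {
        let reason = plan.error.as_deref().unwrap_or("plan_invalid");
        return Err(format!("rejected before inbox: {reason}"));
    }
    Ok(())
}
```

Treating a missing `valid` field like `false` is the fail-closed choice: the approval row is created only when the gate preview is known to be populated.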
register_sub_provider was last-write-wins, so a launched capsule whose manifest declares `provides: elastos://encrypt/*` (or key/decrypt/wallet/…) could seize the CEK-escrow / key / signing route from the trusted boot provider: ambient authority via registration order (Principle 3) and a break of the mediated key/decrypt plane (Principle 15).
- Pin the escrow+keys+signing+mint spine (encrypt, publish, media, key, decrypt, drm, rights, wallet, chain): once bound at boot, a later registration of the same still-live name is refused structurally (Err), checked under the write lock (race-free). Non-pinned reserved names keep last-write-wins for hot-reload / test double-registration.
- unregister_sub_provider frees the slot, so a genuine teardown→restart of the same provider re-mounts cleanly; only overwrite of a live pinned slot fails.
- register_sub_provider now routes its reserved-name check through the new is_reserved_sub_name() predicate (single source of truth; also clears the dead-code warning).
Validated empirically: `just local-carrier-setup-smoke` (full Linux boot + `elastos home`) passes with the guard live. Boot registers each pinned name exactly once, so nothing legitimate is refused. Test proves refuse-overwrite, original-stays-bound, and restart-after-unregister.
Gate: cargo test -p elastos-runtime --lib green; clippy -p elastos-runtime 0; smoke green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
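A minimal sketch of the first-writer-wins pin described above, under stated assumptions: `SubProviderRegistry`, its `String` provider handle, and the error text are hypothetical, and the reserved-name check is left out. Only the pinned names are taken from the commit message. The check and the insert happen under one write lock, which is what makes the refusal race-free.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Pinned names as listed in the commit message.
const PINNED: &[&str] = &[
    "encrypt", "publish", "media", "key", "decrypt", "drm", "rights", "wallet", "chain",
];

/// Hypothetical registry: name -> provider id (a String stands in for the real handle).
#[derive(Default)]
pub struct SubProviderRegistry {
    slots: RwLock<HashMap<String, String>>,
}

impl SubProviderRegistry {
    /// Pinned names are first-writer-wins while live; other names stay last-write-wins.
    pub fn register(&self, name: &str, provider: &str) -> Result<(), String> {
        let mut slots = self
            .slots
            .write()
            .map_err(|_| "registry lock poisoned".to_string())?;
        // Checked under the same write lock as the insert.
        if PINNED.contains(&name) && slots.contains_key(name) {
            return Err(format!("sub-provider '{name}' is pinned and already bound"));
        }
        slots.insert(name.to_string(), provider.to_string());
        Ok(())
    }

    /// Frees the slot so a genuine teardown -> restart can re-register.
    pub fn unregister(&self, name: &str) -> Option<String> {
        self.slots.write().ok()?.remove(name)
    }

    /// Returns the provider currently bound to `name`, if any.
    pub fn bound(&self, name: &str) -> Option<String> {
        self.slots.read().ok()?.get(name).cloned()
    }
}
```

Registering `encrypt` twice returns `Err` on the second call and leaves the first provider bound; after `unregister("encrypt")` the same name can be registered again.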
…-auth funnel Two same-class hygiene fixes surfaced by the audit (both "one canonical path", Principle 10): 1. DDRM test env-lock: mint/buy/rights/owned_ledger each held their OWN `static ENV_LOCK`, so a lock only serialized a module against itself while the mutated `ELASTOS_DDRM_*` vars are process-global — a reader in one module could observe another module's mid-test mutation and fail closed (the exact nondeterministic class the trusted-auth-env guard fixed). Replace the four disjoint statics with one shared `api::ddrm_env_lock()` so all DDRM env mutation serializes on a single lock instance. 2. Trusted-auth funnel: `room_transport_identity_data_dir` was a byte-identical copy of `home_launch_auth_data_dir` (env read + test guard). Delegate to the canonical one so the two can't drift; the entropy-check-pinned `home_launch_auth_data_dir` symbol is unchanged. Gate: cargo test -p elastos-server --lib green (1051), fmt clean, 0 warnings. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…f-tier
Audit surfaced a three-way contradiction: docs said "/api/provider/inspect/self
is System-only", but the code routes self to the app/browser tier
("self" => &[BROWSER_CAPSULE_ID]) AND the entropy-checker simultaneously pinned
BOTH the BROWSER-self code and the stale "System-only" doc line.
Decision (owner): keep the self-tier — a legitimate KEEP transparency capability,
fail-closed by construction (gateway injects the authenticated principal_id,
client-supplied id ignored, authorize_view enforces caller == target under
InspectScope::SelfOnly), already covered by
inspect_self_returns_own_record_and_ignores_client_id and
inspect_self_token_cannot_reach_system_capsule_op.
- docs/CAPSULE_INSPECTOR.md + docs/INSPECTOR_TESTING.md: self is a live,
caller-bound, fail-closed route (not System-only).
- home-entropy-check.mjs: pin the new fail-closed self-tier language instead of
the stale "System-only" phrase, so code, docs, checker, and tests all agree
(Principle 12). No code/behavior change.
Gate: home-entropy-check PASS.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…ants
serde's container `deny_unknown_fields` does NOT apply to UNIT variants of an
internally-tagged enum, so the quorum authority's Request::Status / ::Shutdown
silently accepted `{"op":"status","smuggled":true}` — a small fail-open seam on
an untrusted protocol surface (Principle 11). The authority-carrying variants
(Hello/Recover/RotateShare/…) are struct variants and already fail closed; only
the two empty ones leaked.
- Convert Status/Shutdown to empty STRUCT variants so deny_unknown_fields covers
them; update the four match sites.
- Add empty_variants_reject_unknown_fields (clean parse; hidden field rejected).
Scoped to only the logical change (no whole-file reformat, per the shared-tree
lesson). Gate: cargo test -p dkms-authority green (25); no new clippy warnings.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…in KNOWN_GAPS Turn the remaining audit finding into a build-visible, tracked contract rather than prose (LESSONS.md: audit → gap registry). server_infra warn-swallows a register_sub_provider Err at boot for ~22 providers; the capability still fails closed at route time (not fail-open), but a spawned-but-unregisterable boot-critical provider goes silently dark with only a warn. Row records the anchor, the distinction (absent-binary=warn ok vs spawned-but-rejected=loud), the close criteria, and a pending ratchet (needs a boot failure-injection seam). The other remaining finding — carrier-service launch skipping the author- signature gate — is already tracked as AUD-1 RESIDUAL (b); not duplicated. Docs-only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…is session's fixes Registry-truth sweep (LESSONS.md: audits feed resolutions back — a doc that rots is a liability). Reconcile every row whose truth changed under this session's commits: - G-ID residual: drop `attach.rs:63` from the "None-vm_id follow-ups" list — attach host sessions now carry an honest host-shell/host-client identity (`279dac1`), closing the live-only managed-home dead-end the smoke caught. - PRINCIPLES_CONFORMANCE §A RESERVED_SUB_NAMES: mark it DESIGN-gap-only now — the acute risks are build-guarded (manifest-scan invariant `1fc2a14`; first-writer-wins pin `8b688fc`); drop the stale `:448-476` line ref. - Enforced invariants (+3): every provider `provides` sub-scheme is reserved (no silent-dark); boot-critical sub-providers pinned first-writer-wins; request_act intake fails closed on an undeclared op. inspect/self tier was already reconciled in `e51be7b`; DDRM env-lock is test-infra (no row). Docs-only. Gates: home-entropy + wci-alignment PASS. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…ratchet AUD-6 seam + first fix. Boot-critical sub-provider registration was warn-swallowed at ~19 server_infra sites: a spawned-but-unregisterable provider (an invariant violation → a dark mint/keys/signing path) left the runtime up with only a warn. - `encrypt` (CEK escrow — the crown jewel) now PROPAGATES its register_sub_provider failure (`?`, boot fails loud) instead of warn-swallow. Only the registration-rejected branch changes; absent-binary stays the outer warn (genuinely optional). Smoke-validated: real boot registers encrypt once, no Err, boot proceeds — `just local-carrier-setup-smoke` green. - `#[ignore]`d ratchet `aud6_boot_critical_sub_provider_registration_fails_loud` scans for the warn-swallow line per boot-critical scheme; run with --ignored it FAILS today, listing publish/media/key/decrypt/drm/rights/wallet/chain (encrypt absent = fixed). Flips green — delete #[ignore] — when the rest are classified critical-vs-optional and rewired. Non-blocking in normal CI (ignored). - KNOWN_GAPS AUD-6 updated: PARTIAL (encrypt), ratchet named. Gate: cargo test -p elastos-server --bin green (96 pass, 1 ignored); smoke green; server_infra.rs rustfmt-clean (scoped). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…t response paths (DoS)
Audit swarm finding (Priya, HIGH): the primary Carrier request path used unbounded `read_line` on remote-controlled streams. `handle_file_stream` accepts every inbound CARRIER_ALPN connection with no peer auth and then read a whole line into memory, so a remote peer could OOM the node pre-auth with a newline-less flood. The same class was already fixed for the WASM/microVM bridges (BUG-6, bounded `read_bounded_line`, 1 MB cap) but never applied here. The client-side response readers (release_head, provider_invoke, gossip push/pull, operator send_request) had the same gap against a malicious source we dialed.
Fix (fail-closed, no protocol change): expose the existing bounded reader `pub(crate)` and funnel every Carrier newline-delimited control read through one shared `read_bounded_carrier_line` helper (1 MB cap; oversized/truncated = error, not a giant alloc). Carrier bulk bytes ride the separate length-prefixed path (already capped at 200 MB), so the 1 MB bound only ever constrains small JSON control lines.
Sites: carrier.rs handle_file_stream (inbound, HIGH) + 4 client response readers; operator_control.rs inbound handler + peer response.
Gate: cargo build -p elastos-server green; clippy -p elastos-server --lib clean; 2 new regression tests (oversized flood refused, normal line round-trips) pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
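The bounded-read idea can be sketched with the standard library alone. This is an illustrative synchronous version, not the project's `read_bounded_carrier_line` (whose signature is not shown in the source and may well be async); the function name and error strings here are assumptions. It reads at most `cap + 1` bytes, so a newline-less flood is refused after a bounded allocation instead of growing the buffer without limit.

```rust
use std::io::{self, BufRead, Read};

/// 1 MB cap, matching the bound named in the commit message.
pub const MAX_CONTROL_LINE: u64 = 1024 * 1024;

/// Reads one newline-terminated line of at most `cap` content bytes.
/// Oversized or truncated input is an error, never a large allocation.
pub fn read_bounded_line<R: BufRead>(reader: &mut R, cap: u64) -> io::Result<String> {
    let mut buf = Vec::new();
    // `take(cap + 1)` leaves room for the terminating newline of a max-length line.
    let n = reader.by_ref().take(cap + 1).read_until(b'\n', &mut buf)?;
    if n == 0 {
        return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "empty stream"));
    }
    if buf.last() != Some(&b'\n') {
        // Either the limit was hit before a newline, or the peer closed mid-line.
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "control line oversized or truncated",
        ));
    }
    buf.pop(); // drop the newline
    String::from_utf8(buf).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

On the error path the unread remainder of the flood is still in the stream, so a caller would be expected to drop the connection rather than try to resynchronize.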
…only ops (T1)
Audit swarm finding (Sol, CONFIRMED): `handle_file_connection` accepts every
inbound CARRIER_ALPN connection with NO peer authentication, and
`validate_carrier_provider_invocation` is self-referential (it checks
caller-supplied envelope fields against each other, not against a
runtime-issued capability). So any anonymous remote peer could invoke the
whole provider_invoke matrix — confirmed harm: `content:publish`/`import_exact`
pin arbitrary bytes into the node's store under a caller-supplied
`principal_id` (unauthorized write + quota-attribution abuse); critical
caveat: the `key`/`decrypt`/`drm` targets were reachable too.
Fix (fail-closed, default-DENY): `carrier_provider_plane_allows_unauthenticated`
is a strict allowlist — only `content:{fetch,status,admission}` (non-mutating
reads: fetch bytes, read status, quota *decision*) pass. Every write
(publish/import_exact/import_object/ensure/unpublish/repair) and every
key/decrypt/drm/rights/availability op is refused with
`unauthorized_provider_operation` BEFORE `send_raw` ever runs.
Trade-off (user-approved "lock read-only now"): authenticated push-replication
and cross-node key/rights flows over the plane are disabled until real Carrier
peer authentication lands — tracked as G-CARRIER-PEER in KNOWN_GAPS. Widening
the allowlist without peer auth reopens T1.
Gate: cargo clippy -p elastos-server --lib clean; full carrier test module
57/57 pass; 2 new refusal tests (write op refused, key/decrypt/drm refused) +
existing content:fetch dispatch test still green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
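The default-deny shape of this fix is easy to show in isolation. The sketch below assumes a `(target, operation)` string pair as the input; the real `carrier_provider_plane_allows_unauthenticated` signature is not given in the source, and `gate_unauthenticated` is a hypothetical wrapper. The allowlisted operations and the refusal code are taken from the commit message.

```rust
/// Strict allowlist: only non-mutating content reads pass without peer auth.
/// Anything not matched here is denied, including targets added in the future.
pub fn plane_allows_unauthenticated(target: &str, operation: &str) -> bool {
    matches!(
        (target, operation),
        ("content", "fetch" | "status" | "admission")
    )
}

/// Hypothetical gate, intended to run before any dispatch to the provider.
pub fn gate_unauthenticated(target: &str, operation: &str) -> Result<(), &'static str> {
    if plane_allows_unauthenticated(target, operation) {
        Ok(())
    } else {
        Err("unauthorized_provider_operation")
    }
}
```

Because the match enumerates what is allowed rather than what is forbidden, a newly added write operation is refused until someone deliberately widens the list.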
… (T3)
Audit swarm finding (Nadia, HIGH, confirmed end-to-end): `validate_public_ip` checked only the native IPv6 predicates (loopback/unspecified/unique-local/link-local), so IPv4-mapped IPv6 literals evaded every guard: `::ffff:169.254.169.254`, `::ffff:127.0.0.1`, `::ffff:192.168.1.1` all returned "public". The `url` crate preserves the mapped form through the host allowlist, DNS resolver, and connect; on a dual-stack host the kernel routes `::ffff:a.b.c.d` to the bare IPv4, so a capsule with a permissive `http_fetch` backend could read `http://[::ffff:169.254.169.254]/latest/meta-data/...` (cloud metadata / loopback services).
Fix: in the V6 arm, normalize `to_ipv4_mapped()` (and the deprecated IPv4-compatible `::a.b.c.d` via `to_ipv4()`) FIRST and recurse into the full v4 private/loopback/link-local guard. Ordered so `::1`/`::` are still caught by the native predicates before the v4 fallback. Applied identically to exit-provider and net-provider (the two SSRF egress mediators).
Gate: cargo test + clippy on both standalone capsule crates green; new regression test `validate_public_ip_blocks_ipv4_mapped_private_targets` (mapped metadata/loopback/RFC1918 refused; public v6 + public mapped v4 pass).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
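A standalone sketch of the normalization order described above, using only `std::net`. The names `is_public_ip`, `v4_is_public`, and `v6_is_public` are hypothetical and the exact set of IPv4 predicates in the real `validate_public_ip` is not shown in the source, so treat the v4 guard here as an assumption. The unique-local and link-local checks use segment masks rather than the newer `Ipv6Addr` helper methods so the sketch builds on older toolchains.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Assumed v4 guard: refuse loopback, RFC1918, link-local, unspecified, broadcast.
fn v4_is_public(ip: Ipv4Addr) -> bool {
    !(ip.is_loopback()
        || ip.is_private()
        || ip.is_link_local()
        || ip.is_unspecified()
        || ip.is_broadcast())
}

fn v6_is_public(ip: Ipv6Addr) -> bool {
    // Native predicates first, so `::1` and `::` never reach the v4 fallback.
    if ip.is_loopback() || ip.is_unspecified() {
        return false;
    }
    let first = ip.segments()[0];
    if (first & 0xfe00) == 0xfc00 {
        return false; // unique-local fc00::/7
    }
    if (first & 0xffc0) == 0xfe80 {
        return false; // link-local fe80::/10
    }
    // IPv4-mapped `::ffff:a.b.c.d`: judge the embedded v4 address.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return v4_is_public(v4);
    }
    // IPv4-compatible `::a.b.c.d` (deprecated address form): same treatment.
    if let Some(v4) = ip.to_ipv4() {
        return v4_is_public(v4);
    }
    true
}

/// Hypothetical entry point: true only for addresses safe to egress to.
pub fn is_public_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => v4_is_public(v4),
        IpAddr::V6(v6) => v6_is_public(v6),
    }
}
```

With this ordering, `::ffff:169.254.169.254` is judged as `169.254.169.254` and refused, while a mapped public address such as `::ffff:8.8.8.8` still passes.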
Audit swarm finding (Vera+Dmitri, HIGH, confirmed): the audit-chain signature was strippable via an unauthenticated `alg` downgrade. `compute_record_hash` hashes only `domain ‖ seq ‖ prev_hash ‖ event_json` — `alg` and `sig` are NOT in the preimage — and `verify_chain` ran the ed25519 check only `if rec.alg == "ed25519"`. So an offline editor with NO signing key could rewrite the entire event history, recompute every (public) record_hash, relink the chain, set `alg="none"`, drop `sig`, and pass: `verify_chain` returned Ok, `chain_attestation` reported verified=true, still advertising the real signer. This defeated the module's own tamper-evidence guarantee — the EU AI Act durable-custody claim. Fix (no on-disk format change): make the decision to check the signature independent of the forgeable `alg`. When a verifying key is supplied (custody / tamper-evidence mode — both production callers, with_file_verified and chain_attestation, derive the key from self.signer, present iff the log is signed), EVERY record MUST be ed25519-signed and verify; a non-ed25519 alg in a signed chain is a downgrade and is refused fail-closed. The keyless (memory/unsigned) path is unchanged and still refuses to report a signed record as verified without its key. Gate: cargo clippy -p elastos-runtime --lib clean; all 19 audit tests pass, incl. new `signature_downgrade_forgery_is_refused` (full forgery: event edited, record_hash recomputed + relinked, sig stripped → refused; hash-chain is internally consistent so ONLY the mandatory-signature rule catches it). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
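The mandatory-signature rule can be sketched without any crypto dependency. In the illustration below, `Record`, `verify_signatures`, and `toy_verify` are hypothetical: a plain function pointer stands in for real ed25519 verification, and hash-chain linking is omitted. What it shows is the decision structure: the presence of a verifying key, not the record's own forgeable `alg` field, decides whether a signature is required.

```rust
/// Hypothetical audit record; only the fields relevant to the rule are modelled.
#[derive(Debug, Clone)]
pub struct Record {
    pub alg: String,
    pub sig: Option<Vec<u8>>,
    pub record_hash: Vec<u8>,
}

/// Stand-in for ed25519 verification. NOT real cryptography: it only
/// compares bytes so the control flow can be exercised.
pub fn toy_verify(record_hash: &[u8], sig: &[u8]) -> bool {
    record_hash == sig
}

/// `verifier` is Some in custody mode (a key is available) and None otherwise.
pub fn verify_signatures(
    records: &[Record],
    verifier: Option<fn(&[u8], &[u8]) -> bool>,
) -> Result<(), String> {
    for (seq, rec) in records.iter().enumerate() {
        match verifier {
            // Key supplied: EVERY record must be signed and must verify.
            Some(verify) => {
                if rec.alg != "ed25519" {
                    return Err(format!("record {seq}: alg downgrade to '{}'", rec.alg));
                }
                let sig = rec
                    .sig
                    .as_deref()
                    .ok_or_else(|| format!("record {seq}: missing signature"))?;
                if !verify(&rec.record_hash, sig) {
                    return Err(format!("record {seq}: signature does not verify"));
                }
            }
            // No key: a signed record cannot be reported as verified.
            None => {
                if rec.alg == "ed25519" {
                    return Err(format!("record {seq}: signed record but no key"));
                }
            }
        }
    }
    Ok(())
}
```

A chain whose records were rewritten with `alg = "none"` and no `sig` passes the keyless branch but is refused as soon as a key is supplied, which is the downgrade the fix closes.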
… charset guard (T6)
Two MEDIUM audit-swarm findings (Nadia):
T5 — exit-provider `http_fetch` auto-followed ureq's default 5 redirects. The
private agent has no IP-validating resolver on redirect hops, and the backend
host allowlist is only checked against the INITIAL URL, so an allowlisted host
could `302` the fetch to cloud metadata / any non-allowlisted host. Fix:
`.redirects(0)` on both agents — the mediator returns the 3xx to the caller
instead of following; the capsule re-issues `http_fetch` for the new URL, which
re-runs the full URL + host + allowlist + resolver validation per hop (each
egress individually capability-checked). All 29 exit-provider tests still pass.
T6 — the carrier `operation` was only checked non-empty, then interpolated into
`/api/provider/{scheme}/{operation}`; `Url::join` normalizes `..`, so
`x/../../capability/request` escaped the provider gate and reached arbitrary
local-API endpoints as the capsule's own token. Fix: restrict `operation` to a
single `[A-Za-z0-9_-]` segment in `carrier_invoke_dispatch`, rejecting
`/`/`.`/`%` etc. before it reaches the URL.
Gate: clippy clean on both crates; 8/8 carrier dispatch tests pass incl. new
`carrier_invoke_dispatch_rejects_path_traversal_operation` (traversal/dot/pct
refused, normal underscore op still parses); exit-provider 29/29 green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
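The T6 charset guard amounts to a single-segment check before the value is interpolated into a URL path. A minimal sketch, assuming a free function named `validate_operation` (hypothetical; in the source the check lives inside `carrier_invoke_dispatch`) and an invented error string:

```rust
/// Accepts only a non-empty single segment of `[A-Za-z0-9_-]`.
/// `/`, `.`, `%` and every other byte are refused, so `..` traversal and
/// percent-encoded separators cannot reach the URL join.
pub fn validate_operation(operation: &str) -> Result<&str, String> {
    let allowed = !operation.is_empty()
        && operation
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'_' || b == b'-');
    if allowed {
        Ok(operation)
    } else {
        Err(format!("invalid carrier operation: {operation:?}"))
    }
}
```

An allowlist of characters is used instead of a denylist of `..` and `/` because URL normalization has several spellings of the same traversal, and the allowlist rejects all of them at once.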
`just verify`'s `cargo fmt --check` step flagged four non-canonical lines in the test code added by the audit-fix chunks (assert! wrap, .replacen args, Cursor::new arg, for-loop array). Formatting only — no logic change. Applied by hand (scoped to the exact lines) to respect shared-tree discipline; scoped `cargo fmt -p elastos-runtime -p elastos-server --check` now clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
Doc-truth reconcile: add the audit-swarm callout to the KNOWN_GAPS opening so the registry reflects the six confirmed reachable defects fixed this pass (T1 carrier plane lock, T2 bounded reads, T3 SSRF, T4 audit downgrade, T5 redirects, T6 operation traversal), the cleared-as-sound surfaces, and the deferred roadmap (T7 crypto migration, perf ceilings, quality cleanups). The open residual (T1 peer-auth) is already the G-CARRIER-PEER row. Gate: home + browser entropy checks, WCI alignment, and git diff-check all pass on the doc change; full `just verify` was green on the code at HEAD. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…yte copy Both VM-launch overlay sites (rootfs.rs get_or_create_overlay and the inline copy in supervisor.rs) did a full tokio::fs::copy of the ~335 MB rootfs.ext4 on every launch. Replace both with a shared reflink_or_copy helper: a copy-on-write clone via `cp --reflink=always` — an O(1) metadata op on CoW filesystems (btrfs/xfs/zfs/bcachefs) — that transparently falls back to the exact same pure-Rust full copy on any failure (non-CoW FS, cross-device, or `cp` absent). Correctness is identical on both paths: the result is an independent writable file with identical contents (a reflink gives copy semantics, not a shared mutable file). Only the cost changes. New unit test asserts independence — writing the clone leaves the source untouched — so it holds whichever path the host filesystem takes. Audit-swarm finding (Berger, HIGH, safe, free): the standout no-measurement-gate latency win — a full image copy on the launch hot path with a free O(1) replacement. mkfs.ext4 is already shelled out from this crate, so external-tool use here matches the established pattern. Gate: full `just verify` green (fmt/clippy -D warnings/test/carrier smoke). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
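A synchronous sketch of the reflink-then-fallback pattern described in this commit. The real helper is presumably async (the commit mentions `tokio::fs::copy`), and its exact signature is not shown, so the function below is an assumption that only borrows the name. It shells out to `cp --reflink=always` and, on any failure, falls back to a plain `std::fs::copy`.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::{Command, Stdio};

/// Try an O(1) copy-on-write clone first; fall back to a full byte copy.
/// Either way the result is an independent writable file with the same contents.
pub fn reflink_or_copy(src: &Path, dst: &Path) -> io::Result<()> {
    let cloned = Command::new("cp")
        .arg("--reflink=always")
        .arg("--") // end of options, in case a path starts with '-'
        .arg(src)
        .arg(dst)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        // `cp` missing or not spawnable counts as "reflink unavailable".
        .unwrap_or(false);
    if cloned {
        return Ok(());
    }
    // Non-CoW filesystem, cross-device, or no `cp`: plain copy.
    // fs::copy overwrites any partial destination left by the failed attempt.
    fs::copy(src, dst).map(|_| ())
}
```

Calling it with a source image and a fresh overlay path yields the same observable result on btrfs and on ext4; only the cost differs.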
The GAP-8/AUD-2 custody write on the dDRM open path called audit.content_open(...) synchronously inside the async handler; content_open -> emit does a full fsync, so every open parked a tokio worker thread on disk I/O. Wrap it in spawn_blocking with owned clones of the record fields (the Arc<AuditLog> handle is cloned in). The fail-closed contract is preserved exactly: the open proceeds ONLY on Ok(Ok(())); an emit error (Ok(Err)) refuses it as before, and a join failure (Err) is now also treated as a write failure and refuses the open — content whose open cannot be durably, tamper-evidently recorded still does not happen. The fsync itself is unchanged (custody durability is not weakened); it just no longer blocks an async runtime thread. Audit-swarm finding (Vyukov, HIGH, safe): custody fsync on the async worker on the open hot path. Gate: full `just verify` green (fmt/clippy -D warnings/test/carrier smoke). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…tors
content.rs and carrier.rs each carried byte-for-byte copies of three
security-invariant validators: the SSRF egress URL guard (reject inline creds,
allow only https or loopback http), the HTTP-header CRLF-injection guard, and
the content path-traversal guard. Duplicated security logic drifts silently —
tightening one copy leaves the other on the weaker rule (the same class that let
an SSRF gap exist in two places).
Extract the logic into one `net_validation` module (with unit tests) and reduce
the six local functions to trivial label-passing delegators. Zero call-site
churn (~28 callers unchanged) and byte-identical error messages — the label
parameter reproduces each surface's exact prefix ("operator alert" /
"carrier external endpoint" / "carrier authorization header"). Behavior is
unchanged; the security rule now lives in exactly one place per invariant.
Audit-swarm finding (matklad, MED): security-validator duplication / drift.
Gate: full `just verify` green (fmt/clippy -D warnings/test/carrier smoke);
3 new net_validation unit tests pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
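A rough sketch of the single-module, label-passing layout this commit describes follows. The validation rules are simplified guesses and not the project's actual checks. Every function name, the error wording, and the hand-rolled URL splitting are assumptions; production code would more likely lean on a real URL parser. Only the three label strings come from the commit message.

```rust
mod net_validation {
    use std::net::Ipv4Addr;
    use std::path::{Component, Path};

    /// Extract the host from an authority such as `host:port` or `[::1]:port`.
    fn host_of(authority: &str) -> &str {
        if let Some(rest) = authority.strip_prefix('[') {
            match rest.find(']') {
                Some(i) => &authority[..i + 2], // keep the brackets
                None => authority,
            }
        } else {
            authority.split(':').next().unwrap_or(authority)
        }
    }

    fn is_loopback(host: &str) -> bool {
        host.eq_ignore_ascii_case("localhost")
            || host == "[::1]"
            || host.parse::<Ipv4Addr>().map(|ip| ip.is_loopback()).unwrap_or(false)
    }

    /// Egress guard: no inline credentials; only https, or http to loopback.
    pub fn validate_egress_url(label: &str, url: &str) -> Result<(), String> {
        let (scheme, rest) = url
            .split_once("://")
            .ok_or_else(|| format!("{label}: URL has no scheme"))?;
        let authority = rest
            .split(|c: char| matches!(c, '/' | '?' | '#'))
            .next()
            .unwrap_or("");
        if authority.contains('@') {
            return Err(format!("{label}: inline credentials are not allowed"));
        }
        let host = host_of(authority);
        if host.is_empty() {
            return Err(format!("{label}: URL has no host"));
        }
        match scheme.to_ascii_lowercase().as_str() {
            "https" => Ok(()),
            "http" if is_loopback(host) => Ok(()),
            _ => Err(format!("{label}: only https or loopback http is allowed")),
        }
    }

    /// Header guard: reject CR or LF so a value cannot inject extra headers.
    pub fn validate_header_value(label: &str, value: &str) -> Result<(), String> {
        if value.chars().any(|c| c == '\r' || c == '\n') {
            return Err(format!("{label}: header value contains CR or LF"));
        }
        Ok(())
    }

    /// Path guard: accept only non-empty paths made of plain relative
    /// components (no `..`, no root, no leading `.`).
    pub fn validate_content_path(label: &str, path: &str) -> Result<(), String> {
        let plain = Path::new(path)
            .components()
            .all(|c| matches!(c, Component::Normal(_)));
        if path.is_empty() || !plain {
            return Err(format!("{label}: path must be relative with no traversal"));
        }
        Ok(())
    }
}

// Call sites keep their local function names; each one only supplies its label.
fn validate_alert_url(url: &str) -> Result<(), String> {
    net_validation::validate_egress_url("operator alert", url)
}

fn validate_carrier_endpoint(url: &str) -> Result<(), String> {
    net_validation::validate_egress_url("carrier external endpoint", url)
}

fn validate_carrier_auth_header(value: &str) -> Result<(), String> {
    net_validation::validate_header_value("carrier authorization header", value)
}
```

Because each delegator passes only a label, tightening a rule inside `net_validation` changes every surface at once while each surface keeps its own error prefix.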