Flint 0.5 #9

Open
irzhywau wants to merge 436 commits into upstream/0.6-dev from flint-0.5

irzhywau commented Jul 2, 2026

No description provided.

claude and others added 30 commits June 18, 2026 13:30

Record the direction this branch is a foundation for, so the intent behind the
substrate isn't lost. Frames the work as completing one control loop —
reflect → preview → approve → act → audit — then putting selectable shells
(including an intent-led AI shell with a contained agent capsule) on top.

Contents: where we are (the built substrate); ordered roadmap (approval loop
next; dispatch merge-gated on DDRM; shell-manager + selectable shells;
intent-led AI shell; pluggable local/cloud intelligence; a Morphic/Godot
living-object canvas — presentation only, core stays the authority); the
experience we're building toward (authority made legible: trust as material,
gates as visible circuits, approval as a deliberate ceremony, audit as a
timeline); business model (shell tiers, DRM-self-enforced access, agent-safe
enterprise wedge); and the trust/security framing (build-time vs run-time
boundaries; open code != open authority; the real risks are the signing trust
root, automation bias, and TCB creep — not forking).

Direction, not a commitment. Honest real-vs-vision split; gaps stay tracked via
the KNOWN_GAPS ratchet pattern.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016ZKy5Cca9RzwDuLb1szdeq

…sible DCT-QIM)

Protected pixel-lock/HTML-lock pages now egress with TWO forensic marks carrying
the same identity, so a leaked frame stays attributable even if one is removed:

- Visible: faint tiled stamp (opacity 0.07) of the FULL owner EVM address +
  short content id + UTC open-minute, rendered in the anti-aliased DejaVu face
  (replaces the old elided address and blocky 8x8 bitmap).
- Invisible (render/invisible.rs): a blind DCT-domain QIM mark in luminance under
  perceptual masking (flat white margins left pristine). Carries a COMPACT 20-byte
  wallet (232-bit codeword) so it recovers from CONTENT-SPARSE pages (short
  code/config snippets), validated end-to-end against real rendered text AND code
  pages — not just synthetic images. QIM bounds the per-block nudge so a
  high-contrast block can never carry the wrong bit (the fixed-margin scheme's
  flaw). Survives q85 + recompression + brightness/contrast + same-res screenshot
  + vertical offset; rescale/rotation/width-crop out of scope (documented).

- Fail closed: pixel-lock (watermark::finalize) and HTML-lock (EPUB) both REFUSE
  to emit a protected page without a non-empty forensic stamp.
- Offline forensics: `decrypt-provider --extract-watermark <image>` prints the
  recovered 0x wallet.
- Helper stamps the full wallet + content id + UTC minute (quorum.rs).
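
The bounded nudge claimed for the invisible layer can be illustrated with a minimal scalar QIM sketch. It is an illustration only: `qim_embed`, `qim_extract` and the step `delta` are hypothetical names, and the real embedder in render/invisible.rs works on DCT luminance coefficients under perceptual masking, which is not modelled here.

```rust
/// Snap `coeff` onto the lattice that encodes `bit`.
/// Bit 0 uses multiples of `delta`; bit 1 uses the same lattice shifted by delta/2.
/// The coefficient therefore moves by at most delta/2, whatever its magnitude.
fn qim_embed(coeff: f64, bit: bool, delta: f64) -> f64 {
    let offset = if bit { delta / 2.0 } else { 0.0 };
    ((coeff - offset) / delta).round() * delta + offset
}

/// Blind extraction: decide which of the two lattices the received value is closer to.
/// The decision stays correct while the channel moves the value by less than delta/4.
fn qim_extract(received: f64, delta: f64) -> bool {
    let dist_to_zero = (received - qim_embed(received, false, delta)).abs();
    let dist_to_one = (received - qim_embed(received, true, delta)).abs();
    dist_to_one < dist_to_zero
}
```

Because the embedded value always lands exactly on the lattice of its bit, a large (high-contrast) coefficient carries the bit as reliably as a small one, which is the property the fixed-margin scheme lacked.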

Gates: fmt + clippy clean; decrypt-provider 76/76, media-authority 10/10.
Co-authored-by: Cursor <cursoragent@cursor.com>

…edge an open

The parallel 2-of-3 recover spawned a thread per node but the collector join()ed
ALL three before checking the threshold (the join was for the cheater cross-check).
A single dead/wedged node therefore held the whole release hostage for its full
per-node carrier timeout (~20s) — past the caller's open deadline — even when two
healthy shares were already in hand. That is the cause of the "stuck on Verifying
access & recovering keys" hang and the 502 that an open hit whenever one geo node was flaky.

Now the recover threads are DETACHED and feed an mpsc channel, and collection
RETURNS the instant the 2-of-3 threshold is met plus a short 1.5s grace to still
admit a promptly-arriving third share for the cheater cross-check — realizing the
long-documented "slowest of the two fastest" intent. A straggler simply stops
being waited on; its thread finishes and cleans up its own pooled connection.

Invariants preserved: still FAIL-CLOSED (< 2 served shares is refused); a recover
panic is caught and counted a fault (catch_unwind), never a share, so one bad node
can't abort the single-threaded warm-daemon loop; a spawn failure is a fault too.
Non-reporting nodes are recorded as timeout faults so a fail-closed message names
every node. 16 MiB recover stack retained (PQ-hybrid unseal is a stack hog).

The collection logic is factored into a pure `collect_quorum_shares` and unit-tested
over an mpsc channel: returns at threshold without waiting for a dead node, admits a
prompt third within grace, never counts a fault toward the threshold.
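
A rough sketch of what a threshold-plus-grace collector over an mpsc channel could look like. The `NodeReport` and `Collected` types, the parameter list and the "timeout" fault label are assumptions made for illustration; only the name `collect_quorum_shares` and the behaviour (return at threshold, short grace for a third share, faults never count, fail closed below threshold) come from the description above.

```rust
use std::sync::mpsc::Receiver;
use std::time::{Duration, Instant};

/// What one detached recover thread reports back (hypothetical shape).
#[derive(Debug)]
enum NodeReport {
    Share { node: usize, share: Vec<u8> },
    Fault { node: usize, reason: String },
}

#[derive(Debug, Default)]
struct Collected {
    shares: Vec<(usize, Vec<u8>)>,
    faults: Vec<(usize, String)>,
}

/// Ok once at least `threshold` shares arrived, Err otherwise (fail closed).
fn collect_quorum_shares(
    rx: &Receiver<NodeReport>,
    nodes: usize,
    threshold: usize,
    deadline: Duration,
    grace: Duration,
) -> Result<Collected, Collected> {
    let mut out = Collected::default();
    let mut reported = vec![false; nodes];
    let mut stop_at = Instant::now() + deadline;

    while out.shares.len() + out.faults.len() < nodes {
        let remaining = stop_at.saturating_duration_since(Instant::now());
        let report = match rx.recv_timeout(remaining) {
            Ok(report) => report,
            Err(_) => break, // timed out, or every sender is gone
        };
        let node = match &report {
            NodeReport::Share { node, .. } | NodeReport::Fault { node, .. } => *node,
        };
        // Ignore unknown nodes and duplicates so one node can never count twice.
        if node >= nodes || reported[node] {
            continue;
        }
        reported[node] = true;
        match report {
            NodeReport::Share { share, .. } => {
                out.shares.push((node, share));
                if out.shares.len() == threshold {
                    // Threshold met: stop waiting for stragglers after a short grace.
                    stop_at = stop_at.min(Instant::now() + grace);
                }
            }
            NodeReport::Fault { reason, .. } => out.faults.push((node, reason)),
        }
    }
    // Name every silent node so a fail-closed message lists all of them.
    for (node, seen) in reported.iter().enumerate() {
        if !*seen {
            out.faults.push((node, "timeout".to_string()));
        }
    }
    if out.shares.len() >= threshold { Ok(out) } else { Err(out) }
}
```

With two shares already queued and a third node that never reports, the call returns after roughly `grace` rather than after `deadline`; the real function's signature and fault bookkeeping may differ.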

Gates: fmt + clippy clean (no new warnings); key-provider 52/52 (dev-modes).
Co-authored-by: Cursor <cursoragent@cursor.com>

Rasterizable types ship a two-layer per-buyer mark, but audio/video are
key-protected, NOT yet fingerprinted (the browser-MSE ceiling without EME).
Document the honest status and the forensic upgrade plan instead of overclaiming:

- New docs/AV_WATERMARKING.md: threat model; why the in-boundary image path can't
  transfer to streaming (per-segment decode→mark→re-encode breaks CENC/AAD); the
  chosen approach — A/B forensic variant watermarking (video) and spread-spectrum/
  echo-hiding (audio), produced once at mint, selected per buyer from their SIGNED
  grant at serve time (CEK boundary + one canonical path intact, fail-closed when
  variants are absent); offline buyer-recovery extractor; phased plan where each
  chunk has a one-sentence pass/fail check; principles-conformance self-review.
- PROTECTED_CONTENT.md: AV section now states "key-protected, not yet fingerprinted"
  and links the design.
- ROADMAP.md: AV watermarking added under stronger-attestation as a mint-time
  transcode-pipeline track (roadmap item, not a patch).

alignment-check: OK.
Co-authored-by: Cursor <cursoragent@cursor.com>

A creator controls the source file, so a tiny crafted input could declare enormous
dimensions and force a multi-GB allocation in the decrypt boundary (the most
security-sensitive process) — a "pixel bomb" OOM. Bound BEFORE allocating, three
chunks through one shared chokepoint, all fail-closed:

- Chunk 1 (raster decode): new render::decode_bounded routes the single-image and
  CBZ-page decoders through image::Limits (max dims MAX_DIM=10k, max alloc
  MAX_DECODE_BYTES=256MiB) so an oversized image is refused at header time, before
  the pixel buffer is built. Replaces the unbounded image::load_from_memory at
  image_page.rs and cbz.rs. Inner decode_with_limits is factored out so the
  rejection path is unit-tested with a tiny limit (CI never builds a real bomb).
- Chunk 2 (PDF): bounded_scale clamps the render scale on BOTH axes and by area
  (MAX_PIXELS=48MP), not width alone — an extreme aspect (tiny w x huge h) no
  longer rasterises to a gigapixel pixmap — plus a final predicted-size guard that
  fails closed. Pure fn, unit-tested over adversarial native sizes.
- Chunk 3 (CBZ): per-page (MAX_ENTRY_BYTES=64MiB) and aggregate
  (MAX_TOTAL_BYTES=512MiB) uncompressed caps via pure account_entry, enforced with
  an early declared-size reject + a take()-capped read so a lying header can't slip
  an under-declared bomb. Also bounds warm session memory (pages held for the open).
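
The pure guards in chunks 2 and 3 could take roughly the following shape. This is a sketch under assumptions: the constant values are the ones quoted above, but the signatures, the `Option` return convention and the helper `read_capped` are invented for illustration and are not the code in the PDF or CBZ modules.

```rust
use std::io::Read;

const MAX_DIM: f64 = 10_000.0;
const MAX_PIXELS: f64 = 48_000_000.0;
const MAX_ENTRY_BYTES: u64 = 64 * 1024 * 1024;
const MAX_TOTAL_BYTES: u64 = 512 * 1024 * 1024;

/// Clamp a requested render scale on both axes and by area; None means refuse.
fn bounded_scale(native_w: f64, native_h: f64, requested: f64) -> Option<f64> {
    let sane = |v: f64| v.is_finite() && v > 0.0;
    if !(sane(native_w) && sane(native_h) && sane(requested)) {
        return None;
    }
    let scale = requested
        .min(MAX_DIM / native_w)
        .min(MAX_DIM / native_h)
        .min((MAX_PIXELS / (native_w * native_h)).sqrt());
    if !sane(scale) {
        return None;
    }
    // Final predicted-size guard: fail closed if the clamp somehow did not hold.
    let w = (native_w * scale).floor().max(1.0);
    let h = (native_h * scale).floor().max(1.0);
    if w > MAX_DIM || h > MAX_DIM || w * h > MAX_PIXELS {
        return None;
    }
    Some(scale)
}

/// Account one archive entry against the per-page and aggregate caps.
/// Returns the new running total, or None if either cap would be exceeded.
fn account_entry(total_so_far: u64, entry_bytes: u64) -> Option<u64> {
    if entry_bytes > MAX_ENTRY_BYTES {
        return None;
    }
    let total = total_so_far.checked_add(entry_bytes)?;
    if total > MAX_TOTAL_BYTES { None } else { Some(total) }
}

/// Read at most `cap` bytes. Asking for one extra byte exposes a stream that is
/// longer than its header declared, so a lying header yields None, not data.
fn read_capped<R: Read>(reader: R, cap: u64) -> std::io::Result<Option<Vec<u8>>> {
    let mut buf = Vec::new();
    reader.take(cap.saturating_add(1)).read_to_end(&mut buf)?;
    Ok(if buf.len() as u64 > cap { None } else { Some(buf) })
}
```

A US-letter page (612 x 792 pt) at scale 2.0 passes unchanged, while a 10 x 1,000,000 page is clamped so its long side stays within MAX_DIM.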

API notes vs. the spec: image::Limits is #[non_exhaustive] (built by mutating a
default(), not a struct literal) and reader.limits() is &mut self -> () (not
chainable) — adjusted accordingly. SVG keeps its own local MAX_DIM; per-format
decode *time* remains a wall-clock/watchdog follow-up (documented, not assumed).

Gate: decrypt-provider 138/138 (rail-stream,rail-mint,pdf-render); no new clippy
warnings in the changed files; alignment-check OK.

Co-authored-by: Cursor <cursoragent@cursor.com>

… & EPUB hardening)

Two boundary holes on the object egress path, both fail-closed:

- (3) Serve-time content sniff. The raw `/bytes` egress trusted the mint-time
  `pixel_locked` flag, which trusts the creator-declared mime — so a renderable/
  scriptable document mislabeled with a non-pixel-lock mime could egress as raw
  plaintext. `viewer_object_bytes` now sniffs the DECRYPTED bytes (after
  authority.object(), before octet_stream) via a pure magic-byte `sniffs_as_lockable`
  (PDF / ZIP / raster image / SVG-XML) and returns 403 if a "raw" asset's content
  looks pixel-lockable. The one exception is an explicitly-declared `application/zip`
  (generic archive download). Buyer-safe: the declared mime lives in the signed
  descriptor, not buyer-controlled. Verified a PDF mislabeled as a 3D model (which
  shares this decrypt-passthrough handler) is now caught.

- (5) HTML-lock CSP/nosniff. EPUB chapters served as sanitised HTML now carry an
  enforced HTTP `Content-Security-Policy` with a `sandbox` directive
  (`default-src 'none'; img-src data:; style-src 'unsafe-inline'; font-src data:;
  base-uri 'none'; form-action 'none'; frame-ancestors 'self'; sandbox`) plus
  `X-Content-Type-Options: nosniff` and `Referrer-Policy: no-referrer`, so the
  document is sandboxed at the RESOURCE level by the browser even if loaded directly
  or framed without the attribute — the hand-rolled sanitiser is no longer the sole
  barrier. JPEG pages get `nosniff` only.
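
For item (3), a magic-byte sniff of the decrypted bytes might look like the sketch below. The set of signatures, the SVG/XML heuristic and the helper `raw_egress_allowed` are assumptions for illustration; the actual `sniffs_as_lockable` may recognise a different list.

```rust
/// True when the bytes look like a format that should have been pixel-locked.
fn sniffs_as_lockable(bytes: &[u8]) -> bool {
    const MAGICS: &[&[u8]] = &[
        b"%PDF-",             // PDF
        b"PK\x03\x04",        // ZIP container (CBZ, EPUB, ...)
        b"\x89PNG\r\n\x1a\n", // PNG
        b"\xFF\xD8\xFF",      // JPEG
        b"GIF87a",
        b"GIF89a",
    ];
    if MAGICS.iter().any(|m| bytes.starts_with(m)) {
        return true;
    }
    // WebP is a RIFF container with "WEBP" at offset 8.
    if bytes.len() >= 12 && &bytes[..4] == b"RIFF" && &bytes[8..12] == b"WEBP" {
        return true;
    }
    // SVG/XML: skip a UTF-8 BOM and leading whitespace, then look for markup.
    let text = if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) { &bytes[3..] } else { bytes };
    let start = text
        .iter()
        .position(|b| !b.is_ascii_whitespace())
        .unwrap_or(text.len());
    let head = &text[start..];
    head.starts_with(b"<?xml") || head.starts_with(b"<svg")
}

/// Raw egress decision: an explicitly declared generic archive is the one exception.
/// A `false` result corresponds to the 403 described above.
fn raw_egress_allowed(declared_mime: &str, decrypted: &[u8]) -> bool {
    declared_mime == "application/zip" || !sniffs_as_lockable(decrypted)
}
```

The check runs on the decrypted bytes rather than on the declared mime, so a PDF labelled as a 3D model is refused even though its label says "raw".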

Out of scope (tracked follow-ups, not silently skipped): the media/stream egress
(`viewer_media`/MSE) is a second egress door with no sniff yet; text/code mislabel
needs heuristics (no reliable magic byte). The render direction is already
fail-closed via the parsers.

Gate: elastos-server viewer_object 7/7; clippy -D warnings clean.
Co-authored-by: Cursor <cursoragent@cursor.com>

…l-lock CSP

Make the protected-content docs state the watermark's true strength and the new
boundary defenses exactly (Principle 12 — docs/code/threat-model agree):

- Watermark forensic scope & privacy (THREAT_MODEL §3 row + §6.6; PROTECTED_CONTENT
  "Forensic strength & privacy"): the mark is UNKEYED and CRC-protected (not signed),
  so it is forgeable and repudiable — a deterrent/tracer, NOT court-grade evidence;
  the authenticated record is the §4 signed custody log. It is also NOT anonymous:
  both layers embed the full opening wallet (visible layer human-readable), so anyone
  who sees a rendered page de-anonymizes the buyer — the deliberate leak-attribution
  trade. Names the roadmap upgrade (authenticate the payload: MAC/opaque token).
- Pixel-bomb resource bounds (PROTECTED_CONTENT): documents decode_bounded
  (image::Limits), the PDF both-axes+area scale clamp, and the CBZ per-page/total caps.
- HTML-lock CSP (PROTECTED_CONTENT): documents the enforced HTTP CSP `sandbox` +
  nosniff containment order (HTTP CSP true layer ▸ meta/iframe belt ▸ sanitiser DiD).

Docs-only; alignment-check OK.

Co-authored-by: Cursor <cursoragent@cursor.com>

…d grant

Tier C (1), chunks 1-4: upgrade the invisible pixel-lock watermark from an
unkeyed CRC-only mark (forgeable + repudiable) to one ANCHORED IN THE BUYER'S
OWN WALLET SIGNATURE — so a leaked frame is non-repudiable and forgery rises
from "anyone can plant any wallet" to "only a party holding the victim's signed
grant can." Code and docs land together (Principle 12).

- Shared digest (ddrm-envelope): `grant_watermark_digest16(delegation_sig_hex)`
  = SHA-256(normalized EIP-191 delegation signature)[..16]. Lives in the crate
  BOTH the embedder and the verifier link, so they cannot drift. No new deps
  (sha2 already present).
- Payload codec (decrypt-provider/render/invisible.rs): new TAG_GRANT_DIGEST
  carrying `[wallet_prefix(4) | grant_digest(16)]` = 21 B <= the 24 B CAP, so
  the 232-bit PERIOD (and sparse-page recovery) is unchanged. `embed` takes the
  digest; `extract` refactored into `extract_raw` + `parse_grant_mark` so the
  verifier reads the raw anchor. No-grant/local-dev opens fall back to the
  compact wallet (back-compat).
- Wire (watermark.rs + media-authority quorum.rs): the authority appends an
  invisible-only `\u{1F}gd:<hex>` token to the stamp; `finalize` splits it back
  off so the VISIBLE mark stays the clean human `wallet . content . time` and
  only the INVISIBLE layer carries the authenticated digest.
- Verifier (main.rs): `--extract-watermark <img> [--verify-grant <grant.json>]`
  prints the wallet prefix + digest and reports MATCH/NO MATCH by recomputing
  via the shared fn. Gated on pq-envelope (always in the shipped render binary).
- Docs: THREAT_MODEL §3 row / §6.6 refreshed to the authenticated state and §4
  records the chunk-5 retention decision (option C: fold the digest into the
  existing tamper-evident audit record, TTL + access-controlled; status pending
  wiring). PROTECTED_CONTENT forensic-strength block + the invisible-layer
  description match. Honest bound kept explicit: the delegation signature is not
  a hard secret, so this is non-repudiation + raised-forgery, NOT full
  anti-framing; a server-key MAC / opaque custody token remains the north star.
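
The wire split and the payload layout from the bullets above can be sketched as follows. Several details are assumptions: the tag value `0x02`, the one-byte tag that brings 4 + 16 bytes to 21, the error convention for a malformed token, and the function names `split_stamp` and `pack_grant_mark`. The SHA-256 digest itself is taken as an input, since it is computed by the shared ddrm-envelope function.

```rust
const GD_TOKEN: &str = "\u{1F}gd:";
const CAP: usize = 24;
const TAG_GRANT_DIGEST: u8 = 0x02; // hypothetical tag value

/// Decode exactly 32 hex characters into 16 bytes.
fn decode_hex16(hex: &str) -> Option<[u8; 16]> {
    if hex.len() != 32 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 16];
    for (i, slot) in out.iter_mut().enumerate() {
        *slot = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}

/// Split the authority's stamp into the visible text and the invisible-only digest.
/// No token means a no-grant open (compact-wallet fallback); a malformed token is an error.
fn split_stamp(stamp: &str) -> Result<(&str, Option<[u8; 16]>), &'static str> {
    match stamp.rsplit_once(GD_TOKEN) {
        None => Ok((stamp, None)),
        Some((visible, hex)) => match decode_hex16(hex) {
            Some(digest) => Ok((visible, Some(digest))),
            None => Err("malformed grant-digest token"),
        },
    }
}

/// Build the invisible payload: tag, first 4 wallet bytes, 16-byte grant digest.
fn pack_grant_mark(wallet: &[u8; 20], digest: &[u8; 16]) -> Vec<u8> {
    let mut payload = Vec::with_capacity(21);
    payload.push(TAG_GRANT_DIGEST);
    payload.extend_from_slice(&wallet[..4]);
    payload.extend_from_slice(digest);
    debug_assert!(payload.len() <= CAP);
    payload
}
```

Keeping the token behind a control character (U+001F) means the visible stamp never shows the digest, while the invisible layer receives the authenticated anchor.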

Gates (capsules are not -D warnings gated by `just`; verified directly):
decrypt-provider compiles clean + render tests 59/59; media-authority 12/12
(incl. cross-crate digest agreement); ddrm-envelope digest test + 60 existing;
alignment-check OK.

Co-authored-by: Cursor <cursoragent@cursor.com>

…stody chain

Wire Tier C-1 chunk 5: fold the 16-byte authenticated grant digest (a
non-reversible commitment to the buyer's signed delegation — the same value the
invisible pixel-lock watermark embeds) into the existing append-only content_open
custody record, so a leaked frame is verifiable against an audit row WITHOUT a
second who-opened log or any raw wallet/grant retention (option C).

- audit.rs: optional grant_digest on AuditEvent::ContentOpen, serde-skipped when
  absent so prior records hash-verify unchanged; content_open() takes it; test
  proves backward-compat + chain verification with and without the anchor.
- viewer_open.rs: resolve the wallet-signed grant (fresh AND cached paths) ABOVE
  the custody write and derive grant_digest from the EXACT signature forwarded to
  the quorum, so the §4 record carries the anchor; malformed fresh grant still
  fails before any "opened" record is written. Media/no-grant opens -> None.
- elastos-server cannot link the PQ ddrm-envelope crate, so it carries a
  no-shared-dep twin (grant_watermark_digest16_hex) guarded by a golden vector
  cross-checked against ddrm_envelope::grant_watermark_digest16 in BOTH crates,
  pinning the trim+lowercase normalization so the two sides cannot drift.
- THREAT_MODEL.md §4: retention entry updated to "option C, wired" —
  minimization-via-non-reversibility, not TTL (the chain is intentionally
  permanent); records a TTL-prunable index as explicitly rejected (Principle 10).

Gates: ddrm-envelope golden, elastos clippy -D warnings (workspace), runtime
audit chain test, elastos-server golden, decrypt-provider + media-authority
tests, alignment-check — all green.

Co-authored-by: Cursor <cursoragent@cursor.com>

…closed-by-construction

Close the last three audit loose ends.

(1) Lowercase-address normalize on compare. The invisible mark recovers the EVM
wallet LOWERCASED (the 20 raw bytes carry no EIP-55 checksum casing), so any
attribution compare against a stored/expected address must normalize both sides
or a checksummed address would false-mismatch.
- render/invisible.rs: add normalize_evm_hex() (trim, strip 0x, lowercase) + a
  one-line test proving checksum casing compares equal.
- main.rs --verify-grant: advisory wallet cross-check — when the candidate grant
  JSON declares owner_address, confirm it matches the recovered 4-byte wallet
  prefix (both normalized). Fail-safe: advisory only, never overrides the digest
  verdict; pq-envelope-absent still returns 2 (no silent pass).

(2) HiDPI/Retina screenshot doc nuance (invisible.rs header + PROTECTED_CONTENT.md):
"same-resolution screenshot" means a 1:1 pixel-grid capture; a HiDPI/Retina
screenshot resamples (~2x) = rescaling = the already-unsupported case, so most
real-world HiDPI screenshots will not recover. Don't over-rely on it.

(3) THREAT_MODEL.md: reclassify the media/stream egress as CLOSED BY CONSTRUCTION,
not an open guard gap. The media tier serves only fMP4 from the ffmpeg
transcode+fragment ingest (media-provider prod, ddrm-media-authority dev): a
non-media file fails transcoding (no asset), and the pipeline re-encodes (AV1/AAC)
rather than -c copy, so source bytes never survive into served segments even for a
polyglot. With documents confined to the object tier (content-sniff guarded), no
media-tier sniff guard is needed. Re-open only for a bring-your-own pre-segmented
ingest or an ffmpeg -c copy/remux fast-path (would warrant a segment-0 mdat sniff).
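
For item (1), the normalize-then-compare step might be sketched like this. `normalize_evm_hex` follows the description (trim, strip 0x, lowercase); `wallet_prefix_matches` and its 40-character length check are hypothetical additions for illustration.

```rust
/// Canonical form for comparing EVM addresses: trimmed, no 0x prefix, lowercase.
fn normalize_evm_hex(addr: &str) -> String {
    let t = addr.trim();
    let t = t
        .strip_prefix("0x")
        .or_else(|| t.strip_prefix("0X"))
        .unwrap_or(t);
    t.to_ascii_lowercase()
}

/// Advisory cross-check: does the grant's owner address start with the
/// 4-byte wallet prefix recovered from the invisible mark?
fn wallet_prefix_matches(owner_address: &str, recovered_prefix: &[u8; 4]) -> bool {
    let expected = normalize_evm_hex(owner_address);
    let recovered: String = recovered_prefix
        .iter()
        .map(|b| format!("{:02x}", b))
        .collect();
    expected.len() == 40 && expected.starts_with(&recovered)
}
```

An EIP-55 checksummed address and its all-lowercase form normalize to the same string, so the compare no longer false-mismatches on casing.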

Gates: decrypt-provider invisible tests (pdf-render,pq-envelope) 13 pass incl new;
rustfmt --check clean on both touched files; clippy introduces no new warnings;
alignment-check OK.

Co-authored-by: Cursor <cursoragent@cursor.com>

…n Linux CI

The canonical gate and CI both scoped to `cd elastos && cargo --workspace`, which
does NOT reach the crates this branch's protected-content work lives in
(capsules/decrypt-provider, capsules/ddrm-envelope, scripts/dev/ddrm-media-authority).
Their 217 tests — watermark codec, grant-digest envelope, media-authority — had
ZERO automated coverage; they were gated by hand each commit.

- justfile: add `verify-capsules` (build+test the capsule crates under their
  CANONICAL feature sets, matching scripts/dev/run-creator-gateway.sh:
  decrypt-provider = rail-stream,rail-mint,pdf-render,pq-envelope;
  ddrm-envelope = access-grant; media-authority = default) and fold it into
  `verify`, so the repo's "definition of green" finally covers the whole surface
  (Principle 12: the gate must match reality). clippy -D warnings is deliberately
  held back for the capsules (pre-existing lint debt); build+test is the real
  regression gate. Verified: all three are rustc -D-warnings-clean under these
  features, so the workflow's global RUSTFLAGS does not break them.

- ci.yml: add a `verify` job (ubuntu, installs `just`, runs the full `just verify`
  incl. the Linux-only carrier smoke the macOS dev box can't run) and an isolated
  `capsules` job (`just verify-capsules`) so a heavy/flaky smoke run can never mask
  a capsule regression. Add `workflow_dispatch` so this feature branch can be put
  through the full Linux gate on demand before merge.

This is the last gate between the branch and truly-done: turns "manually covered"
into "full green on Linux".

Co-authored-by: Cursor <cursoragent@cursor.com>

Add the feature branch to the push trigger so the full Linux gate (verify +
capsules) runs on our own work in isolation, without a PR to main. This entry
lives only on the branch and does not affect main or other branches until merge.

Co-authored-by: Cursor <cursoragent@cursor.com>

First Linux CI run surfaced two real issues the macOS box could not (`just verify`
aborts at the Linux-only smoke before reaching fmt):

- viewer_object.rs (landed in the Tier B-3/D-5 commit) was not rustfmt-clean — 6
  long-line/comment violations. cargo fmt -p elastos-server fixes only that file.
- the `verify` job failed at its first step (just alignment-check) because the
  GitHub runner has no ripgrep, which check-wci-alignment.sh requires. Install it
  before `just verify`. (The capsules job needs no rg and already passed green.)

Co-authored-by: Cursor <cursoragent@cursor.com>

…ider binary

Linux CI surfaced this: chain_mode_without_wallet_fails_closed expected the
"wallet not linked" fail-closed error but instead hit "rights-provider not found"
because decide_owned_access resolved/checked the capsule binary BEFORE validating
the subject wallet. On a clean runner (no pre-built capsule) the binary check
fired first, the test panicked, and its panic poisoned ENV_LOCK — cascading into
release_build_defaults_to_chain_and_refuses_dev_rights_modes.

Reorder so subject/wallet validation runs first: a chain-mode request with no
linked wallet is invalid on its face and must fail closed before we resolve or
spawn any external binary. This is both more correct (don't spawn a subprocess for
an obviously-invalid request) and makes the unit test hermetic (it is not an
#[ignore]'d integration test, so it must not depend on a built capsule). Verified
with ELASTOS_RIGHTS_PROVIDER_BIN=/nonexistent: both tests pass.
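
The reordering amounts to validating the request before touching the filesystem. A reduced sketch, with a hypothetical `Subject` type, error enum and signature (the real `decide_owned_access` takes more context and goes on to spawn the capsule):

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq)]
enum AccessError {
    WalletNotLinked,
    RightsProviderNotFound(PathBuf),
}

struct Subject {
    wallet: Option<String>,
}

fn decide_owned_access(subject: &Subject, provider_bin: &Path) -> Result<(), AccessError> {
    // 1. Validate the request on its face: chain mode needs a linked wallet.
    let _wallet = subject
        .wallet
        .as_deref()
        .filter(|w| !w.trim().is_empty())
        .ok_or(AccessError::WalletNotLinked)?;

    // 2. Only now resolve the external binary.
    if !provider_bin.is_file() {
        return Err(AccessError::RightsProviderNotFound(provider_bin.to_path_buf()));
    }

    // 3. Spawning the capsule and asking it for a decision is omitted here.
    Ok(())
}
```

With this order, a test that passes no wallet gets the wallet error whether or not a capsule binary has been built, which is what makes it hermetic.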

Co-authored-by: Cursor <cursoragent@cursor.com>

The verify job got through alignment-check + ripgrep but failed in
local-carrier-setup-smoke with `error[E0463]: can't find crate for std`: the smoke
builds the Home capsules (capsules/home-cli and friends) to wasm32-wasip1, and the
runner's stable toolchain ships only the host target. Add `targets: wasm32-wasip1`
so the smoke's wasm build has std. The other four jobs are host-only and unaffected.

Co-authored-by: Cursor <cursoragent@cursor.com>

…hain has its std

The verify smoke still failed with E0463 after adding the target to the dtolnay
@stable step: rust-toolchain.toml pins channel 1.89.0, so every cargo invocation
uses 1.89.0 — not stable — and the wasm target had been added to the wrong
toolchain. Declare `targets = ["wasm32-wasip1"]` in rust-toolchain.toml so rustup
auto-installs the wasm std for the pinned toolchain everywhere (CI and local), and
drop the now-redundant `targets:` from the workflow step. Verified locally: the
home-cli wasm build compiles clean.

Co-authored-by: Cursor <cursoragent@cursor.com>

… for GitHub Actions

The full `just verify` cannot complete on a stock GitHub runner: its
`local-carrier-setup-smoke` step fetches the net-provider artifact over Elastos
Carrier, which a clean runner can't reach (proven on CI: it builds + runs the
entire ~18-min gate and fails only there). Everything else a clean runner CAN
verify.

- justfile: add `verify-ci` = the full gate MINUS the carrier smoke, with a hidden
  `_verify-tail` shared by both `verify` and `verify-ci` so they can't drift.
  alignment-check stays first in both. `just verify` (with the carrier smoke) is
  unchanged for a Carrier-capable Linux box / self-hosted runner.
- ci.yml: the Linux job now runs `just verify-ci` (renamed "Verify (Linux CI
  gate)") and documents that the carrier smoke is covered separately.

This lands the branch's surface — incl. the 217-test capsule gate and the full
elastos workspace fmt/clippy/test — under an enforceable green GitHub Actions gate.

Co-authored-by: Cursor <cursoragent@cursor.com>

Fold the off-tree AV-watermarking feasibility study (verdict: GO) into the
roadmap doc, with the audit caveats baked in rather than the harness's headline
claims:

- New Phase 0 (top of §5): video survival matrix, audio matrix, registration
  result, and the grant-anchored Tardos collusion chain.
- FP correction: the harness's single-seed empirical threshold (mean+3.5sigma)
  is flagged as ~1.25% false-accusation (400-trial Monte-Carlo); a certified
  bound now requires the analytic Tardos threshold + an MC FP/FN sweep, and the
  per-asset bound is recomputed at the FP-controlled threshold (duration
  minimums move up).
- New §3.4 Channel coding (required): the leak channel is bursty (whole-segment
  loss) -> timeline interleaving + an erasure-aware code; wired into chunks 2/6.
- Audio re-validation made concrete (chunk 6): psychoacoustic masking model +
  PEAQ/ODG + human A/B/X on real music/speech/silence, and time-stretch/pitch.
- Multi-strategy collusion (random/minority/all-ones/interleaving) mandated
  before any certified bound.
- Registration -> Phase 5 gating DSP item (deterministic template/pilot or
  log-polar/Fourier-Mellin; brute search proven insufficient).
- Full-variant-set AAD weld in §3.1/§4 (CEK binds the complete variant set;
  per-buyer selection is post-unwrap routing).
- §8 resolved (ECC->Tardos, q-ary density lever, published per-asset bound at
  the FP-controlled threshold, channel-coding requirement); §7 honest-limits
  expanded; VMAF 96.7 demoted from gate to synthetic relative signal.

Doc-only; no shipped behaviour. alignment-check green. Fix Widevine typo in §7.

Co-authored-by: Cursor <cursoragent@cursor.com>

…review

The "approve" step of the control loop (reflect → preview → APPROVE → act),
parallel-safe and read-only.

- elastos-runtime::approval (new, pure): `decide(mode, approver)` is fail-closed
  — the only path to Approved without an explicit yes is an affordance declared
  as needing no approval; User/RuntimePolicy default to PendingApproval; an
  explicit no always wins. `required_approval(actions)` scales the requirement
  with action strength (anything beyond read/message needs a human). 3 tests.
- inspect/intent (new provider op, read-only): given a capsule + operation,
  derives the gate (via plan), the approval it requires, and the fail-closed
  default decision. Records nothing, dispatches nothing.
- Gated consistently: `intent` added to the canonical op→action contract (Read)
  and the System-only browser allow-list.
- Decisions: `revoke` and recorded approve/deny stay on the runtime/dispatch
  (mutation) path — the product InspectProvider remains a read-only projection.
  Recording pairs with dispatch (merge-gated).
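
The fail-closed decision table described for `decide` can be sketched as below. The enum variants, the use of `Option<bool>` for the approver's answer, and the choice to treat an empty action list as needing a human are all assumptions; only the function names and the stated rules come from the description above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ApprovalMode {
    NoneRequired,
    User,
    RuntimePolicy,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Decision {
    Approved,
    Denied,
    PendingApproval,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    Read,
    Message,
    Write,
    Execute,
}

/// `approver` is None when nobody has answered yet.
fn decide(mode: ApprovalMode, approver: Option<bool>) -> Decision {
    match (mode, approver) {
        (_, Some(false)) => Decision::Denied, // an explicit no always wins
        (_, Some(true)) => Decision::Approved,
        (ApprovalMode::NoneRequired, None) => Decision::Approved,
        (ApprovalMode::User, None) | (ApprovalMode::RuntimePolicy, None) => {
            Decision::PendingApproval
        }
    }
}

/// Anything beyond read/message needs a human; an empty list is treated the same way.
fn required_approval(actions: &[Action]) -> ApprovalMode {
    let harmless = !actions.is_empty()
        && actions
            .iter()
            .all(|a| matches!(a, Action::Read | Action::Message));
    if harmless { ApprovalMode::NoneRequired } else { ApprovalMode::User }
}
```

The only row that reaches Approved without an explicit yes is the one where the affordance declares that no approval is needed.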

fmt --check PASS; targeted tests green (approval 3; inspect incl. intent 31, plus 2
ignored ratchets; provider_resource contract 1).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016ZKy5Cca9RzwDuLb1szdeq

- CAPSULE_INSPECTOR.md: add the inspect/intent wire contract (approval-intent
  preview); add a "path note" clarifying revoke + self are served on the embedded
  RequestHandler (shell) path while the product InspectProvider is a read-only
  projection (capsules/capsule/plan/intent) — closes the contract-honesty gap.
- KNOWN_GAPS.md: G4 decision core DONE (approval + intent, fail-closed, tested);
  remaining = recording a signed approve/deny, which pairs with dispatch (G3).

(An orchestrator CLAUDE.md was written locally but is .gitignored by repo policy,
so it stays a local contract and is not committed.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016ZKy5Cca9RzwDuLb1szdeq

Pre-mainnet hardening from the deep audit (none block the branch; ① is the
item to put in front of the external auditor):

① Document the dKMS re-seal AAD invariant — the node re-seals the recovered
   CEK under the caller-supplied aad_b64, which is NOT bound into the recover
   possession-proof; safe only because the decrypt boundary rebuilds the
   segment-bound AAD and fails closed. Loud comment at the seal_bound call +
   THREAT_MODEL §7 note. Binding it into the proof is scoped with the auditor.
② Lock the release-build invariant — a compile_error! rejects a release build
   (no debug_assertions) of dkms-authority with dev-modes/legacy-receipt-authz,
   and a new CI job (dkms-release-invariant) asserts both directions. Adds
   docs/DEPLOY_CHECKLIST.md (incl. the node-set-id authorize-time guard, which
   is release-only and not unit-testable under cfg(test)).
③ Redact key-provider Debug — manual Debug on Request/ReleaseSessionContext
   prints only the op name, so no CEK/escrow bytes can leak via {:?}.
④ viewer_open — log_fp(&object_cid) for the fresh-grant line (was the raw cid).
⑤ VENDORING.md — three.js r160 pin + periodic-refresh/upstream-watch plan.
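
Items ② and ③ follow two common Rust patterns, sketched here. The `Request` variants and their fields are hypothetical stand-ins for the key-provider types, and the guard message is invented; the feature names are the ones listed in ②.

```rust
use std::fmt;

// ② A release build (no debug_assertions) with a dev-only feature refuses to compile.
#[cfg(all(
    not(debug_assertions),
    any(feature = "dev-modes", feature = "legacy-receipt-authz")
))]
compile_error!("dev-modes / legacy-receipt-authz must not be enabled in a release build");

// ③ Secret-bearing request type with a hand-written, redacting Debug.
#[allow(dead_code)]
enum Request {
    Release { cek: Vec<u8> },
    Recover { escrow: Vec<u8>, aad_b64: String },
}

impl Request {
    fn op_name(&self) -> &'static str {
        match self {
            Request::Release { .. } => "release",
            Request::Recover { .. } => "recover",
        }
    }
}

impl fmt::Debug for Request {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print the operation name only; key and escrow bytes never reach the formatter.
        f.debug_struct("Request")
            .field("op", &self.op_name())
            .finish_non_exhaustive()
    }
}
```

Because Debug is implemented by hand instead of derived, adding a new secret field later cannot silently start printing it through `{:?}`.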

Plus a fail-closed dKMS-open testing checklist in DKMS_OVER_CARRIER.md
(rights-mode + Carrier-rail must match how the asset was minted) so the
"foreign escrow" 502 diagnosis doesn't recur.

Gates: elastos-server fmt+clippy; dkms-authority build (debug+release) +
24/24 tests + guard verified; key-provider build + 52/52 tests; alignment-check.

Co-authored-by: Cursor <cursoragent@cursor.com>

…word + offline extractor

AV forensic-variant layer, tractable + pipeline-free pieces built on the proven
Phase-0/5 research. Feature-gated OFF by default (`av-variants`), so it cannot
destabilize the default build; chunks 3/4/5 (mint transcode DSP, full-variant-set
AAD weld, serve-time selector) are deferred to the live CENC/DASH/quorum pipeline.

Chunk 1 — variant manifest schema (`elastos.ddrm.av-variants/v1`) in
  capsules/ddrm-envelope/src/av.rs: marked subset, q-ary variant refs (+ segment
  digest for the chunk-4 weld), codeword scheme (length/interleave/erasure τ/bias
  commitment). serde round-trips; validate() fails closed; single_encode() is the
  honest `fingerprinted:false` default.
Chunk 2 — canonical, RNG-free codeword: asset_bias_vector / buyer_codeword (from
  grant_watermark_digest16, no per-buyer storage) / interleave_map / tardos_score.
  A domain-separated SHA-256 stream over integers (NOT any language's RNG), so the
  Rust serve selector and the Python extractor derive identical codewords. Replaces
  the Phase-0 numpy-RNG derivation.
Chunk 6 — offline forensic extractor as the proven Python reference under
  tools/av-forensics/ (offline, operator-run, no key material, not in the boundary),
  re-anchored to the chunk-2 canonical construction. The load-bearing FM fix is
  preserved: register() resolves the Fourier-Mellin scale/rotation ambiguity on the
  VALID (non-border) region. The Rust --extract-av-fingerprint CLI is deferred until
  the scheme is frozen/certified.
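
The accusation score named in chunk 2 is, in its standard symmetric-Tardos form, a per-position sum. The sketch below shows that textbook form only: the slice-based signature and the use of `None` for an erased position are assumptions, and the real `tardos_score` derives its inputs from the canonical SHA-256 stream, which is not reproduced here.

```rust
/// Symmetric Tardos score of one suspect codeword against the recovered symbols.
/// `bias[i]` is the probability (strictly between 0 and 1) that a buyer's bit i is 1.
/// `observed[i]` is None where the position was erased and is skipped.
/// Returns the score and the number of kept positions (the erasure-aware m).
fn tardos_score(bias: &[f64], suspect: &[bool], observed: &[Option<bool>]) -> (f64, usize) {
    debug_assert!(bias.len() == suspect.len() && bias.len() == observed.len());
    let mut score = 0.0;
    let mut kept = 0;
    for ((&p, &x), &y) in bias.iter().zip(suspect).zip(observed) {
        let y = match y {
            Some(y) => y,
            None => continue, // erased position: no evidence either way
        };
        let rare_one = ((1.0 - p) / p).sqrt(); // large when a 1 is unlikely
        let rare_zero = (p / (1.0 - p)).sqrt(); // large when a 0 is unlikely
        score += match (y, x) {
            (true, true) => rare_one,
            (true, false) => -rare_zero,
            (false, false) => rare_zero,
            (false, true) => -rare_one,
        };
        kept += 1;
    }
    (score, kept)
}
```

For an innocent buyer whose bit is 1 with probability p, each kept position contributes mean 0 and variance 1: with an observed 1 the mean is p·√((1−p)/p) − (1−p)·√(p/(1−p)) = 0, and the second moment is (1−p) + p = 1.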

Cross-language anti-drift weld: tools/av-forensics/test_canonical.py asserts the same
golden vectors as av::tests::canonical_golden_vectors — change either side and both
fail. Wired into `just verify-capsules` (now also tests ddrm-envelope with
av-variants), so CI covers the new module + the weld. Pure stdlib (no numpy/ffmpeg).

Still uncertified (carried honestly in docs/AV_WATERMARKING.md): analytic Tardos
threshold + Monte-Carlo FP/FN sweep (argmax is not proof), rotation estimator
(out of envelope), audio on real content. AV remains key-protected, not fingerprinted,
until chunks 3/4/5 ship and the certification gates pass.

Gates: ddrm-envelope 51 tests (av-variants, incl. golden vectors); default build
unaffected (module gated off); av.rs clippy-clean; cross-language weld PASS; ported
extractor validated end-to-end (FM-reg → bitERR 0, leaker ranked top; no-reg fails
closed); just verify-capsules PASS; just alignment-check OK.

Co-authored-by: Cursor <cursoragent@cursor.com>

Replace the Phase-0 empirical mean+kσ accusation threshold (Monte-Carlo
showed ~1.25% FP, so not certifiable) with the analytic, FP-controlled
threshold Z = √m·Φ⁻¹(1−ε/N), where N is the number of candidate buyers and ε
the total false-accusation budget: the innocent symmetric-Tardos score is
exactly mean-0, variance-1 per kept position, so over m kept positions it is
approximately Normal(0, m).

- canonical.py: tardos_threshold + _inv_norm_cdf (Acklam, pure stdlib,
  extractor-side only — not a cross-language weld surface).
- extractor.py: accuse only above the analytic Z (erasure-aware m), not
  an ad-hoc gap.
- montecarlo.py: multi-strategy FP/FN sweep (random/majority/minority/
  all-ones/all-zeros/interleave). 2000 trials, m=2332 N=500 c=3 ε=1e-3
  BER=0.13 ⇒ FP ≤ ε with 100% detection across all six; old empirical
  threshold runs 2–10× over ε (majority ≈1.05%).
- test_canonical.py: stdlib threshold sanity (Φ⁻¹(0.975)≈1.96,
  monotonicity, Z(2332,500,1e-3)=222.69) — runs in the CI weld.
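
One way to read the ε/N term is as a union bound over the N candidate buyers under the Normal(0, m) approximation of the innocent score. The worked numbers below use the parameters quoted above; the intermediate values of √m and Φ⁻¹ are rounded.

```latex
\begin{align*}
\Pr\bigl[\text{some innocent among } N \text{ scores above } Z\bigr]
  &\le N\,\Bigl(1 - \Phi\bigl(Z/\sqrt{m}\bigr)\Bigr) = \varepsilon \\
\Longrightarrow\quad Z &= \sqrt{m}\;\Phi^{-1}\!\bigl(1 - \varepsilon/N\bigr) \\[4pt]
m = 2332,\ N = 500,\ \varepsilon = 10^{-3}:\quad
Z &= \sqrt{2332}\;\Phi^{-1}\!\bigl(1 - 2\times 10^{-6}\bigr)
   \approx 48.291 \times 4.611 \approx 222.7
\end{align*}
```

This agrees with the Z(2332,500,1e-3)=222.69 sanity value to the precision of the rounded factors.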

Code-level accusation statistics only; media-survival certification
(real content/screen-record/CMAF lengths) remains open. Docs updated.

Co-authored-by: Cursor <cursoragent@cursor.com>

Leads with the one deliberately-open invariant: the re-seal AAD is the
caller-supplied aad_b64 and is NOT bound into the recover possession-proof
(dkms-authority recover → seal_bound, src/main.rs:1028). Safe today only
because the single consumer (decrypt boundary) rebuilds the segment-bound
AAD and fails closed. Packages the SECURITY INVARIANT comment, THREAT_MODEL
§7, and the DEPLOY_CHECKLIST open item into one hand-off with the trust
boundary, crypto roots, CI-enforced release invariants, repro gates, and a
reviewer checklist (incl. the landing test: tampered aad_b64 fails the
possession-proof closed at the node).

Co-authored-by: Cursor <cursoragent@cursor.com>

…pre-mainnet invariant)

The dKMS node re-seals a recovered CEK under the caller-supplied `aad_b64`,
which was NOT bound into the recover possession-proof. A MITM that tampered
`aad_b64` in transit could make the node seal under an AAD of its choosing;
it was safe only because the decrypt boundary independently rebuilt the AAD
and failed closed (a compensating control, not a fix).

Now the canonical possession-proof preimage binds `sha256(reseal_aad)`
(`ddrm_envelope::recover_proof_message`, domain bumped v1 -> v2). The client
signs over the exact AAD it sends (key-provider), and the node verifies the
proof over the byte-identical `args.aad_b64` in `verify_session` BEFORE any
CEK is recovered or re-sealed. The AAD (DecryptTranscriptV1) already carries
`node_set_id` + `segment_digests`, so all three are bound transitively; the
32-byte digest keeps the preimage bounded for long presentations.

A MITM cannot re-sign the proof (it lacks the token-bound caller key), so a
tampered `aad_b64` now fails closed at the node (`session_invalid`). The
decrypt boundary's rebuild remains as defense-in-depth.

- ddrm-envelope: recover_proof_message/sign/verify take `reseal_aad`; bind
  sha256; bump DKMS_RECOVER_DOMAIN to /v2; unit test asserts tampered-AAD ->
  verify=false.
- dkms-authority: verify_session verifies over decode(args.aad_b64) before
  recover; SECURITY INVARIANT comment rewritten to CLOSED; landing test
  recover_fails_closed_on_a_tampered_aad (35 legacy / 25 default tests green).
- key-provider: recover_proof_b64 + both delegate paths sign over the
  request's aad_b64.
- dev harnesses (ddrm-runtime-open, dkms-live-recover): each direct node
  recover signs over its request AAD.
- docs: THREAT_MODEL §7 + DEPLOY_CHECKLIST + AUDITOR_PACKET §1 flipped
  open -> closed, with the landing test referenced.

Gates: ddrm-envelope + dkms-authority tests, key-provider/dev-script builds,
verify-capsules, alignment-check all green.

Co-authored-by: Cursor <cursoragent@cursor.com>
…3+4 core)

ddrm-envelope::av gains the pure, fail-closed serve-time selector
(select_symbols) and the full-variant-set commitment (variant_set_commitment)
that chunk 4 welds into the decrypt transcript. The selector binds the
per-asset bias commitment (wrong secret -> refuse), supports arity-2 A/B
(direct codeword->segment mapping, matching the proven tools/av-forensics
extractor), and returns an empty selection for an honest single-encode.

DecryptTranscriptV1 gains to_aad_with_all_bindings, a strictly-extending
encoder that appends the variant-set commitment AFTER the rights binding, so
a non-fingerprinted open stays byte-identical to to_aad_with_bindings (all
committed goldens replay unchanged) while a fingerprinted open is bound to the
exact published variant set (manifest swap / out-of-set variant fails the CEK
unwrap closed). Pure functions, fully unit-tested; no pipeline wiring yet.

Co-authored-by: Cursor <cursoragent@cursor.com>
asset_secret_from_master derives the per-asset watermark secret from a
node-held master + the content hash, so the mint embed and the serve selector
agree on the bias/codebook without ever publishing or per-asset-storing it
(the manifest carries only the bias commitment; rotating the master re-keys
every asset). build_manifest assembles + validates a fingerprinted
VariantManifestV1 from produced variants (canonical interleave + bias
commitment), or returns the honest single-encode for an empty marked set.

A round-trip test closes the mint->serve loop: build_manifest keyed by the
derived secret produces a manifest that select_symbols (same secret) accepts,
and distinct buyers select distinct variant sets. Pure functions, tested.

Co-authored-by: Cursor <cursoragent@cursor.com>
… open

Mark the pure core of chunks 3/4/5 as landed (selector, variant-set AAD weld
encoder, manifest builder, per-asset secret KDF — all in ddrm-envelope::av/
lib.rs, fail-closed + unit-tested) and spell out precisely what remains: the
pipeline WIRING (ddrm-media-authority serve selection, decrypt-provider AAD
rebuild, mint emit) plus the real perceptual DSP (bounded-placeholder seam now;
certified embed swaps in post media-survival cert). Adds a "remaining wiring"
section with exact files and the one thing needed to validate end-to-end (a
gateway bring-up with a synthetic asset; real media only for the perceptual
cert). Notes the interleave-application follow-up as tracked, not dropped.

Co-authored-by: Cursor <cursoragent@cursor.com>
The local 2-of-3 stand-in nodes need the dev-modes legacy-receipt path to
authorize an offline recover (the live quorum uses wallet-signed grants); the
gateway dev script already builds the node this way. Without it the smoke fails
closed ("legacy receipt authorization is disabled") even on an unmodified tree.
With it the helper recovers a minted asset byte-identically (3/3 served).

Co-authored-by: Cursor <cursoragent@cursor.com>
…eam)

embed_placeholder_variant appends an ignorable ISO-BMFF `free` box carrying the
variant symbol AFTER the mdat, so the fragment stays valid/playable but byte-
distinct per symbol; encrypt_fragment (CENC) and strip_senc (decrypt) both carry
it through verbatim, so the selected variant is byte-distinct end-to-end and the
symbol survives back to the clean fragment. read_placeholder_variant recovers it
(the placeholder stand-in for the offline extractor). Explicitly NOT a watermark
(no perceptual signal, no transcode survival) — it makes mint->serve->select->weld
real and testable; the certified DSP embed swaps in behind the same interface
post-cert. Tested end-to-end through the CENC rail on the real ffmpeg fixture.
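
A rough sketch of the placeholder seam, assuming only the standard ISO-BMFF box layout (32-bit big-endian size, then a four-character type). The payload layout (a `VSYM` tag followed by one symbol byte) and the exact signatures are hypothetical; the commit does not show the real encoding.

```rust
// Hypothetical payload tag so unrelated `free` boxes are ignored.
const MARKER: &[u8; 4] = b"VSYM";
// 8-byte box header + 4-byte tag + 1 symbol byte.
const BOX_LEN: usize = 13;

/// Append an ignorable `free` box carrying `symbol` after the existing boxes.
pub fn embed_placeholder_variant(fragment: &[u8], symbol: u8) -> Vec<u8> {
    let mut out = fragment.to_vec();
    out.extend_from_slice(&(BOX_LEN as u32).to_be_bytes());
    out.extend_from_slice(b"free");
    out.extend_from_slice(MARKER);
    out.push(symbol);
    out
}

/// Walk the top-level boxes and return the symbol of the last tagged `free` box.
/// Boxes with 64-bit or to-end-of-file sizes are not handled in this sketch.
pub fn read_placeholder_variant(fragment: &[u8]) -> Option<u8> {
    let mut pos = 0usize;
    let mut found = None;
    while pos + 8 <= fragment.len() {
        let size = u32::from_be_bytes([
            fragment[pos],
            fragment[pos + 1],
            fragment[pos + 2],
            fragment[pos + 3],
        ]) as usize;
        if size < 8 || pos + size > fragment.len() {
            return None; // malformed or unsupported box size
        }
        let kind = &fragment[pos + 4..pos + 8];
        if kind == b"free" && size == BOX_LEN && &fragment[pos + 8..pos + 12] == MARKER {
            found = Some(fragment[pos + 12]);
        }
        pos += size;
    }
    found
}
```

Because the box is appended rather than spliced in, the original bytes stay a strict prefix of the marked fragment, which is what lets a player skip the box while the variants stay byte-distinct.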

Co-authored-by: Cursor <cursoragent@cursor.com>
claude and others added 30 commits July 1, 2026 14:56
…istrations

Chunk D of #2: the gateway inspector (server_infra) was already handing
`with_registry` a real `Arc::downgrade(&provider_registry)` — the merge's
passthrough was the only thing dropping it, so Chunk A already made gateway
dispatch live. This wires the remaining production serve paths (serve_cmd, 3
sites) the same way, so `dispatch_approved` is live wherever the inspector is
registered. Test and mcp-serve registrations stay unwired (fail-closed by
construction).

Completes flint-0.5 enforcement degradation #2. All three tracked degradations
are now RESTORED; flint-0.5 is a true superset of flint — 0.5's features plus
fully intact KEEP enforcement, no regression.

Gate: cargo test --workspace --lib green (all crates), esp npm test 89, fmt
clean, 0 warnings.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
The 0.5 merge dropped the dDRM producer-spine names (encrypt/publish/media)
from the sub-provider allowlist while server_infra.rs still registers all
three at boot. register_sub_provider rejects non-reserved names and the
error is swallowed as a warn!, so on a real boot the Create-portal mint/
publish/media paths silently go dark (elastos://encrypt|publish|media/* ->
no provider). No unit test caught it because provider wiring only runs at
boot. Restores flint's capability (strict superset) and adds a regression
guard test so a future merge can't re-drop them.

Co-authored-by: Cursor <cursoragent@cursor.com>
The 0.5 auth path resolves the trusted-auth data dir from a process-global
env var (home_launch_auth_data_dir / room_transport_identity_data_dir). In
production that var is set per-child-process at spawn and never mutated
mid-process, but a few unit tests mutate the shared test process to exercise
the override path. Under parallel `cargo test` other auth-gated tests would
transiently read the half-set value and correctly fail closed (403) -- a
nondeterministic red that only shows under load (it slipped past the cloud's
scheduling; serial runs were always green).

Serialize test-side access at the two read funnels: mutating tests hold a
process-wide write guard for their duration; the funnels take a brief read
guard so no *other* thread observes the in-flight mutation. The mutating
thread is tracked so its own funnel reads skip the (non-reentrant) read lock.
Zero production behavior change (the guards compile only under cfg(test)).
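
The guard pattern described above might look roughly like the following. All names are hypothetical and the real code is gated behind `cfg(test)`; this sketch only shows the write-guard, read-funnel, and own-thread bypass working together.

```rust
use std::sync::{Mutex, RwLock, RwLockWriteGuard};
use std::thread::{self, ThreadId};

// Hypothetical statics; the commit does not show the real names.
static ENV_GUARD: RwLock<()> = RwLock::new(());
static MUTATING_THREAD: Mutex<Option<ThreadId>> = Mutex::new(None);

/// Held by a test for as long as it mutates the process-global env var.
pub struct EnvMutationGuard {
    _write: RwLockWriteGuard<'static, ()>,
}

impl EnvMutationGuard {
    pub fn acquire() -> Self {
        let write = ENV_GUARD.write().unwrap_or_else(|e| e.into_inner());
        // Record the mutating thread so its own funnel reads can skip the read lock.
        *MUTATING_THREAD.lock().unwrap_or_else(|e| e.into_inner()) =
            Some(thread::current().id());
        EnvMutationGuard { _write: write }
    }
}

impl Drop for EnvMutationGuard {
    fn drop(&mut self) {
        // Runs before the write-guard field is released.
        *MUTATING_THREAD.lock().unwrap_or_else(|e| e.into_inner()) = None;
    }
}

/// Read funnel: other threads wait for any in-flight mutation to finish;
/// the mutating thread bypasses the non-reentrant read lock.
pub fn read_through_funnel<T>(read: impl FnOnce() -> T) -> T {
    let me = thread::current().id();
    let is_mutator =
        *MUTATING_THREAD.lock().unwrap_or_else(|e| e.into_inner()) == Some(me);
    if is_mutator {
        return read();
    }
    let _shared = ENV_GUARD.read().unwrap_or_else(|e| e.into_inner());
    read()
}
```

A mutating test would hold `EnvMutationGuard::acquire()` for its whole body, while the production-facing read helper wraps its `std::env::var` call in `read_through_funnel`.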

Co-authored-by: Cursor <cursoragent@cursor.com>
- setup.rs: package identity requires a valid IPFS CID before skipping materialization
- auth.rs: atomic_write uses unique temp filenames so parallel writers cannot collide
- provider/bridge.rs: label bridges for latency tracing on the serial provider path
- viewer_media.rs: bound public cover fetches with a timeout so unresolvable CIDs
  cannot pin a gateway worker
- ipfs-provider: Cat timeout_ms, macOS path canonicalization, kubo usability probe
- run-creator-gateway.sh: ELASTOS_BUILD_PROFILE=release support, dev-modes only
  where the capsule declares them
- docs: release-profile bootstrap notes and the macOS IPFS data-dir migration

Co-authored-by: Cursor <cursoragent@cursor.com>
… in the 0.5 merge

Restores the shell-windows owned-open subsystem, loading-window styles, and the
ddrm-viewer/elacity-player entries in SHELL_MESSAGE_OPEN_TARGET_SOURCES so Library
and viewers open content again. Bumps home asset cache-busting to home-20260701c.

Co-authored-by: Cursor <cursoragent@cursor.com>
….5 merge

Storage authority on the localhost provider path must come from the carrier/provider
envelope token; a caller-supplied body token is stripped before dispatch. Restores the
redaction plus both regression tests, and updates the admin-locked discover test for
the fail-closed inspect_resource path.

Co-authored-by: Cursor <cursoragent@cursor.com>
…fail-closed scope rules

- plan emits elastos.inspect.gate-preview/v1 (capabilities, audit events, execution
  policy, dispatch:false) so inbox gate summaries show real authority again
- revoke is an explicit unsupported_operation, not a silent fallthrough
- provider_resource gains inspect_resource(op) so unknown inspect ops fail closed
- restores the four inbox-approval gateway tests (fresh passkey, principal scoping,
  deny-without-dispatch) and provider authority/redaction tests
- docs: Act path and runtime scope-rule expectations, corrected inspect/self routing

Co-authored-by: Cursor <cursoragent@cursor.com>
…rces

- dkms-authority: deny_unknown_fields on the Request enum so hidden authority
  fields fail closed; lockfiles pick up elastos-common 0.5.0
- creator/ddrm-viewer: reword raw chain/backend references so app capsules stop
  claiming provider authority they route through the runtime
- library: replace platform-branded "Finder" wording with file-manager phrasing
- marketplace: classify providers via name.endsWith("-provider")

Co-authored-by: Cursor <cursoragent@cursor.com>
… post-merge truth

- home-entropy-check: current home asset version, expanded library open allowlist,
  post-merge inspector routing, act-emitter README in the Users/self allowlist
- check-wci-alignment: justified exclusions for chain-native crates, backend-scheme
  elacity pattern instead of the bare word
- command-smoke/installed-command-audit: hermetic HOME on macOS and a portable
  timeout (timeout/gtimeout/perl alarm) so the gates run off-Linux
- state.md: restore the canonical journey proof records lost in the merge
- docs: unlink gitignored CLAUDE.md, point DDRM rail table at per-capsule
  wasm-smoke scripts

Co-authored-by: Cursor <cursoragent@cursor.com>
Use filter/map instead of bool::then in filter_map for browser session listing,
a tail expression instead of return in the cfg-split supports_hibernation, and
indented doc-comment link definitions in elastos-vz.

Co-authored-by: Cursor <cursoragent@cursor.com>
…ricks `elastos home`

Root cause of the local-carrier-setup-smoke failure ("Capability request still
pending after 3s"): the G-ID flip fail-closes every identity gate for sessions
with no capsule identity, and /api/auth/attach created exactly such sessions
(vm_id: None). The managed-home flow then dead-ended three ways: capability
intake recorded no requester identity, the consent-broker's grant POST 403'd
fail-closed ("no requester capsule identity") in an infinite retry loop, and
even a minted token would have been unredeemable ("session has no capsule
identity"). Fail-closed did its job; the flow lost its identity plumbing.
Predates the 0.5 merge — the smoke was never re-run on Linux after G-ID landed
(the Mac cannot run it), so it slipped every gate until now.

Fix at the root seam: attach-authenticated sessions record an HONEST host
identity ("host-client" / "host-shell") — the attach secret is owner-only
(chmod 600), so the caller IS the host user; this is truthful identity, not
fabrication. Intake, grant mint, and token redemption now agree end-to-end.
No authority widening: grants still require consent-broker approval; tokens
still bind to the recorded identity; audit records it.

Proven live: `just local-carrier-setup-smoke` now passes on Linux (was the one
red step in `just verify`); replayed the failing grant against the live runtime
before/after (403 "no requester capsule identity" -> granted). Regression test
pins the identity on both scopes.

Gate: cargo test -p elastos-server --lib green (1044), clippy clean, fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…t-scan invariant test

The 0.5 merge left three first-party capsule providers declaring
`provides: elastos://<name>/*` for names NOT in RESERVED_SUB_NAMES: `market`
(content-market storefront — no boot fallback, route never exists), `object`
(Library object authority) and `operator-drive-adapter` (both also register a
boot main-provider but lose their VM sub-route). At capsule launch the
supervisor's register_provider_route fails closed and the failure is
warn-swallowed, so the provider silently goes dark — the same live-only class
the dDRM-spine fix repaired, still open for these three.

- Reserve the three names (strict superset; no capability removed).
- Add pub is_reserved_sub_name() as the single-source-of-truth predicate.
- Add test_all_capsule_provided_sub_schemes_are_reserved: scans every shipped
  capsule.json `provides` sub-scheme and asserts it is reserved — no boot
  needed. This is the general invariant the hardcoded dDRM-spine test only
  covered for three names; it would have caught all of this and goes red on the
  next provider capsule that forgets to reserve its scheme.

Gate: cargo test -p elastos-runtime --lib green (384), fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…ore 6 inbox tests

Intake bug (Ravi P16/P11, KNOWN_GAPS G3): create_inspect_action_request only
checked plan.status=="ok", but the inspector's plan returns
{status:"ok", data:{valid:false, error:"unknown_operation"}} for an operation
the target authority never declared. That created a PENDING inbox approval with
an EMPTY gate preview — prompting a human to approve an act whose authority is
invisible. Consent requires visibility.

- Reject at intake when plan.data.valid != true, BEFORE persisting: no record,
  no notification, no approvable row, no dispatch_approved reachability.
- Restore the 6 inbox-approval regression tests dropped in the 0.5 merge,
  grafted from origin/review/0.5.0 against the existing merged harness — inbox
  suite 4 -> 10.
- Add inspect_action_rejects_undeclared_operation_before_inbox: asserts the
  undeclared op is rejected AND leaves zero approvable Inbox rows (structural
  fail-closed, not a hidden display string).

Gate: cargo test -p elastos-server --lib inbox suite green (11 incl. new guard).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
register_sub_provider was last-write-wins, so a launched capsule whose manifest
declares `provides: elastos://encrypt/*` (or key/decrypt/wallet/…) could seize
the CEK-escrow / key / signing route from the trusted boot provider — ambient
authority via registration order (Principle 3) and a break of the mediated
key/decrypt plane (Principle 15).

- Pin the escrow+keys+signing+mint spine (encrypt, publish, media, key, decrypt,
  drm, rights, wallet, chain): once bound at boot, a later registration of the
  same still-live name is refused structurally (Err), checked under the write
  lock (race-free). Non-pinned reserved names keep last-write-wins for
  hot-reload / test double-registration.
- unregister_sub_provider frees the slot, so a genuine teardown→restart of the
  same provider re-mounts cleanly; only overwrite of a live pinned slot fails.
- register_sub_provider now routes its reserved-name check through the new
  is_reserved_sub_name() predicate (single source of truth; also clears the
  dead-code warning).

Validated empirically: `just local-carrier-setup-smoke` (full Linux boot +
`elastos home`) passes with the guard live — boot registers each pinned name
exactly once, so nothing legitimate is refused. Test proves refuse-overwrite,
original-stays-bound, and restart-after-unregister.
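
The pinning rule can be illustrated with a small stand-alone registry. This is a sketch under assumptions: the struct, the `String` provider ids, and `bound_provider` are invented for the example, and the reserved-name check is omitted; only the pinned-name list and the first-writer-wins behavior come from the description above.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// The pinned spine named in the commit message.
const PINNED_SUB_NAMES: &[&str] = &[
    "encrypt", "publish", "media", "key", "decrypt", "drm", "rights", "wallet", "chain",
];

#[derive(Default)]
pub struct SubProviderRegistry {
    // name -> provider id (a plain String stands in for the real provider handle)
    routes: RwLock<HashMap<String, String>>,
}

impl SubProviderRegistry {
    pub fn register_sub_provider(&self, name: &str, provider_id: &str) -> Result<(), String> {
        // Check and insert under one write lock so two racing registrations
        // cannot both observe an empty slot.
        let mut routes = self
            .routes
            .write()
            .map_err(|_| "registry lock poisoned".to_string())?;
        if PINNED_SUB_NAMES.contains(&name) && routes.contains_key(name) {
            return Err(format!("sub-provider '{name}' is pinned and already bound"));
        }
        // Non-pinned names keep last-write-wins.
        routes.insert(name.to_string(), provider_id.to_string());
        Ok(())
    }

    /// Frees the slot so a teardown followed by a restart can re-register.
    pub fn unregister_sub_provider(&self, name: &str) -> bool {
        self.routes
            .write()
            .map(|mut routes| routes.remove(name).is_some())
            .unwrap_or(false)
    }

    pub fn bound_provider(&self, name: &str) -> Option<String> {
        self.routes.read().ok()?.get(name).cloned()
    }
}
```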

Gate: cargo test -p elastos-runtime --lib green; clippy -p elastos-runtime 0;
smoke green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…-auth funnel

Two same-class hygiene fixes surfaced by the audit (both "one canonical path",
Principle 10):

1. DDRM test env-lock: mint/buy/rights/owned_ledger each held their OWN
   `static ENV_LOCK`, so a lock only serialized a module against itself while
   the mutated `ELASTOS_DDRM_*` vars are process-global — a reader in one module
   could observe another module's mid-test mutation and fail closed (the exact
   nondeterministic class the trusted-auth-env guard fixed). Replace the four
   disjoint statics with one shared `api::ddrm_env_lock()` so all DDRM env
   mutation serializes on a single lock instance.

2. Trusted-auth funnel: `room_transport_identity_data_dir` was a byte-identical
   copy of `home_launch_auth_data_dir` (env read + test guard). Delegate to the
   canonical one so the two can't drift; the entropy-check-pinned
   `home_launch_auth_data_dir` symbol is unchanged.

Gate: cargo test -p elastos-server --lib green (1051), fmt clean, 0 warnings.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…f-tier

Audit surfaced a three-way contradiction: docs said "/api/provider/inspect/self
is System-only", but the code routes self to the app/browser tier
("self" => &[BROWSER_CAPSULE_ID]) AND the entropy-checker simultaneously pinned
BOTH the BROWSER-self code and the stale "System-only" doc line.

Decision (owner): keep the self-tier — a legitimate KEEP transparency capability,
fail-closed by construction (gateway injects the authenticated principal_id,
client-supplied id ignored, authorize_view enforces caller == target under
InspectScope::SelfOnly), already covered by
inspect_self_returns_own_record_and_ignores_client_id and
inspect_self_token_cannot_reach_system_capsule_op.

- docs/CAPSULE_INSPECTOR.md + docs/INSPECTOR_TESTING.md: self is a live,
  caller-bound, fail-closed route (not System-only).
- home-entropy-check.mjs: pin the new fail-closed self-tier language instead of
  the stale "System-only" phrase, so code, docs, checker, and tests all agree
  (Principle 12). No code/behavior change.

Gate: home-entropy-check PASS.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…ants

serde's container `deny_unknown_fields` does NOT apply to UNIT variants of an
internally-tagged enum, so the quorum authority's Request::Status / ::Shutdown
silently accepted `{"op":"status","smuggled":true}` — a small fail-open seam on
an untrusted protocol surface (Principle 11). The authority-carrying variants
(Hello/Recover/RotateShare/…) are struct variants and already fail closed; only
the two empty ones leaked.

- Convert Status/Shutdown to empty STRUCT variants so deny_unknown_fields covers
  them; update the four match sites.
- Add empty_variants_reject_unknown_fields (clean parse; hidden field rejected).

Scoped to only the logical change (no whole-file reformat, per the shared-tree
lesson). Gate: cargo test -p dkms-authority green (25); no new clippy warnings.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…in KNOWN_GAPS

Turn the remaining audit finding into a build-visible, tracked contract rather
than prose (LESSONS.md: audit → gap registry). server_infra warn-swallows a
register_sub_provider Err at boot for ~22 providers; the capability still fails
closed at route time (not fail-open), but a spawned-but-unregisterable
boot-critical provider goes silently dark with only a warn. Row records the
anchor, the distinction (absent-binary=warn ok vs spawned-but-rejected=loud),
the close criteria, and a pending ratchet (needs a boot failure-injection seam).

The other remaining finding — carrier-service launch skipping the author-
signature gate — is already tracked as AUD-1 RESIDUAL (b); not duplicated.

Docs-only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…is session's fixes

Registry-truth sweep (LESSONS.md: audits feed resolutions back — a doc that rots is a
liability). Reconcile every row whose truth changed under this session's commits:

- G-ID residual: drop `attach.rs:63` from the "None-vm_id follow-ups" list — attach host
  sessions now carry an honest host-shell/host-client identity (`279dac1`), closing the
  live-only managed-home dead-end the smoke caught.
- PRINCIPLES_CONFORMANCE §A RESERVED_SUB_NAMES: mark it DESIGN-gap-only now — the acute
  risks are build-guarded (manifest-scan invariant `1fc2a14`; first-writer-wins pin
  `8b688fc`); drop the stale `:448-476` line ref.
- Enforced invariants (+3): every provider `provides` sub-scheme is reserved (no silent-dark);
  boot-critical sub-providers pinned first-writer-wins; request_act intake fails closed on an
  undeclared op.

inspect/self tier was already reconciled in `e51be7b`; DDRM env-lock is test-infra (no row).
Docs-only. Gates: home-entropy + wci-alignment PASS.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…ratchet

AUD-6 seam + first fix. Boot-critical sub-provider registration was warn-swallowed
at ~19 server_infra sites: a spawned-but-unregisterable provider (an invariant
violation → a dark mint/keys/signing path) left the runtime up with only a warn.

- `encrypt` (CEK escrow — the crown jewel) now PROPAGATES its register_sub_provider
  failure (`?`, boot fails loud) instead of warn-swallow. Only the
  registration-rejected branch changes; absent-binary stays the outer warn
  (genuinely optional). Smoke-validated: real boot registers encrypt once, no Err,
  boot proceeds — `just local-carrier-setup-smoke` green.
- `#[ignore]`d ratchet `aud6_boot_critical_sub_provider_registration_fails_loud`
  scans for the warn-swallow line per boot-critical scheme; run with --ignored it
  FAILS today, listing publish/media/key/decrypt/drm/rights/wallet/chain (encrypt
  absent = fixed). Flips green — delete #[ignore] — when the rest are classified
  critical-vs-optional and rewired. Non-blocking in normal CI (ignored).
- KNOWN_GAPS AUD-6 updated: PARTIAL (encrypt), ratchet named.

Gate: cargo test -p elastos-server --bin green (96 pass, 1 ignored); smoke green;
server_infra.rs rustfmt-clean (scoped).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…t response paths (DoS)

Audit swarm finding (Priya, HIGH): the primary Carrier request path used
unbounded `read_line` on remote-controlled streams. `handle_file_stream`
accepted every inbound CARRIER_ALPN connection with no peer auth and then
read a whole line into memory, so a remote peer could OOM the node pre-auth
with a newline-less flood. The same class was already fixed for the
WASM/microVM bridges (BUG-6, bounded `read_bounded_line`, 1 MB cap) but
never applied here. The client-side response readers (release_head,
provider_invoke, gossip push/pull, operator send_request) had the same gap
against a malicious source we dialed.

Fix (fail-closed, no protocol change): expose the existing bounded reader
`pub(crate)` and funnel every Carrier newline-delimited control read through
one shared `read_bounded_carrier_line` helper (1 MB cap; oversized/truncated
= error, not a giant alloc). Carrier bulk bytes ride the separate
length-prefixed path (already capped at 200 MB), so the 1 MB bound only ever
constrains small JSON control lines.
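
A synchronous stand-in for the bounded reader, to show the shape of the fix. The real helper is not shown in the commit and presumably works on async streams; the function name, signature, and error kinds below are assumptions.

```rust
use std::io::{self, BufRead, Read};

/// 1 MB cap for newline-delimited control lines, as described above.
pub const MAX_CONTROL_LINE_BYTES: u64 = 1024 * 1024;

/// Read one newline-terminated line, buffering at most `max_bytes + 1` bytes.
pub fn read_bounded_line<R: BufRead>(reader: &mut R, max_bytes: u64) -> io::Result<String> {
    let mut buf = Vec::new();
    // `take` caps how much a newline-less flood can make us buffer.
    let read = reader
        .by_ref()
        .take(max_bytes.saturating_add(1))
        .read_until(b'\n', &mut buf)?;
    if read == 0 {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "stream closed before any data",
        ));
    }
    if buf.last() != Some(&b'\n') {
        // Either the cap was hit or the peer hung up mid-line: fail closed.
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "control line oversized or truncated",
        ));
    }
    buf.pop(); // drop the trailing newline
    String::from_utf8(buf).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

The key property is that the allocation is bounded before any parsing happens, so the check holds even for unauthenticated peers.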

Sites: carrier.rs handle_file_stream (inbound, HIGH) + 4 client response
readers; operator_control.rs inbound handler + peer response.

Gate: cargo build -p elastos-server green; clippy -p elastos-server --lib
clean; 2 new regression tests (oversized flood refused, normal line
round-trips) pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…only ops (T1)

Audit swarm finding (Sol, CONFIRMED): `handle_file_connection` accepts every
inbound CARRIER_ALPN connection with NO peer authentication, and
`validate_carrier_provider_invocation` is self-referential (it checks
caller-supplied envelope fields against each other, not against a
runtime-issued capability). So any anonymous remote peer could invoke the
whole provider_invoke matrix — confirmed harm: `content:publish`/`import_exact`
pin arbitrary bytes into the node's store under a caller-supplied
`principal_id` (unauthorized write + quota-attribution abuse); critical
caveat: the `key`/`decrypt`/`drm` targets were reachable too.

Fix (fail-closed, default-DENY): `carrier_provider_plane_allows_unauthenticated`
is a strict allowlist — only `content:{fetch,status,admission}` (non-mutating
reads: fetch bytes, read status, quota *decision*) pass. Every write
(publish/import_exact/import_object/ensure/unpublish/repair) and every
key/decrypt/drm/rights/availability op is refused with
`unauthorized_provider_operation` BEFORE `send_raw` ever runs.
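
As a sketch, a default-deny allowlist of this kind reduces to a single pattern match. The function name comes from the commit message; the `(target, operation)` string signature is an assumption made for the example.

```rust
/// Strict allowlist: only non-mutating `content` reads pass without peer auth.
/// Anything not listed, including unknown targets and ops, is denied.
pub fn carrier_provider_plane_allows_unauthenticated(target: &str, operation: &str) -> bool {
    matches!(
        (target, operation),
        ("content", "fetch" | "status" | "admission")
    )
}
```

Writing the rule as an allowlist rather than a denylist means a newly added provider operation is refused until someone deliberately lists it.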

Trade-off (user-approved "lock read-only now"): authenticated push-replication
and cross-node key/rights flows over the plane are disabled until real Carrier
peer authentication lands — tracked as G-CARRIER-PEER in KNOWN_GAPS. Widening
the allowlist without peer auth reopens T1.

Gate: cargo clippy -p elastos-server --lib clean; full carrier test module
57/57 pass; 2 new refusal tests (write op refused, key/decrypt/drm refused) +
existing content:fetch dispatch test still green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
… (T3)

Audit swarm finding (Nadia, HIGH, confirmed end-to-end): `validate_public_ip`
checked only the native IPv6 predicates (loopback/unspecified/unique-local/
link-local), so IPv4-mapped IPv6 literals evaded every guard —
`::ffff:169.254.169.254`, `::ffff:127.0.0.1`, `::ffff:192.168.1.1` all returned
"public". The `url` crate preserves the mapped form through the host allowlist,
DNS resolver, and connect; on a dual-stack host the kernel routes
`::ffff:a.b.c.d` to the bare IPv4, so a capsule with a permissive `http_fetch`
backend could read `http://[::ffff:169.254.169.254]/latest/meta-data/...`
(cloud metadata / loopback services).

Fix: in the V6 arm, normalize `to_ipv4_mapped()` (and the deprecated
IPv4-compatible `::a.b.c.d` via `to_ipv4()`) FIRST and recurse into the full v4
private/loopback/link-local guard. Ordered so `::1`/`::` are still caught by
the native predicates before the v4 fallback. Applied identically to
exit-provider and net-provider (the two SSRF egress mediators).
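
A minimal sketch of the normalization order, using only `std::net`. The function names are hypothetical and the v4 guard is reduced to a handful of predicates; the providers' real `validate_public_ip` may check more ranges and returns its own error type.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

fn v4_is_public(ip: Ipv4Addr) -> bool {
    !(ip.is_loopback()
        || ip.is_private()
        || ip.is_link_local()
        || ip.is_unspecified()
        || ip.is_broadcast())
}

fn v6_is_public(ip: Ipv6Addr) -> bool {
    // Native predicates first, so `::1` and `::` never reach the v4 fallback.
    if ip.is_loopback() || ip.is_unspecified() {
        return false;
    }
    // IPv4-mapped `::ffff:a.b.c.d`: judge the embedded v4 address.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return v4_is_public(v4);
    }
    let first = ip.segments()[0];
    if (first & 0xfe00) == 0xfc00 || (first & 0xffc0) == 0xfe80 {
        return false; // unique-local fc00::/7 or link-local fe80::/10
    }
    // Deprecated IPv4-compatible `::a.b.c.d`.
    if let Some(v4) = ip.to_ipv4() {
        return v4_is_public(v4);
    }
    true
}

/// Hypothetical stand-in for the providers' `validate_public_ip`.
pub fn is_public_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => v4_is_public(v4),
        IpAddr::V6(v6) => v6_is_public(v6),
    }
}
```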

Gate: cargo test + clippy on both standalone capsule crates green; new
regression test `validate_public_ip_blocks_ipv4_mapped_private_targets`
(mapped metadata/loopback/RFC1918 refused; public v6 + public mapped v4 pass).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
Audit swarm finding (Vera+Dmitri, HIGH, confirmed): the audit-chain signature
was strippable via an unauthenticated `alg` downgrade. `compute_record_hash`
hashes only `domain ‖ seq ‖ prev_hash ‖ event_json` — `alg` and `sig` are NOT
in the preimage — and `verify_chain` ran the ed25519 check only
`if rec.alg == "ed25519"`. So an offline editor with NO signing key could
rewrite the entire event history, recompute every (public) record_hash, relink
the chain, set `alg="none"`, drop `sig`, and pass: `verify_chain` returned Ok,
`chain_attestation` reported verified=true, still advertising the real signer.
This defeated the module's own tamper-evidence guarantee — the EU AI Act
durable-custody claim.

Fix (no on-disk format change): make the decision to check the signature
independent of the forgeable `alg`. When a verifying key is supplied (custody /
tamper-evidence mode — both production callers, with_file_verified and
chain_attestation, derive the key from self.signer, present iff the log is
signed), EVERY record MUST be ed25519-signed and verify; a non-ed25519 alg in a
signed chain is a downgrade and is refused fail-closed. The keyless
(memory/unsigned) path is unchanged and still refuses to report a signed record
as verified without its key.
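
The mandatory-signature rule can be sketched independently of the crypto. Here the ed25519 check is abstracted behind a caller-supplied `verify` function, and the record struct is a simplified stand-in; neither matches the module's real types.

```rust
/// Simplified stand-in for an audit record; only the fields the rule needs.
pub struct AuditRecord {
    pub alg: String,
    pub sig: Option<Vec<u8>>,
    pub record_hash: Vec<u8>,
}

/// `verify(record_hash, sig)` stands in for the real ed25519 verification.
/// `Some(verify)` means a verifying key was supplied (custody mode).
pub fn check_signatures(
    records: &[AuditRecord],
    verify: Option<&dyn Fn(&[u8], &[u8]) -> bool>,
) -> Result<(), String> {
    match verify {
        Some(verify) => {
            for (seq, rec) in records.iter().enumerate() {
                // The decision to check does not depend on the forgeable `alg`:
                // anything other than ed25519 in a signed chain is a downgrade.
                if rec.alg != "ed25519" {
                    return Err(format!("record {seq}: alg downgrade to {:?}", rec.alg));
                }
                let sig = rec
                    .sig
                    .as_deref()
                    .ok_or_else(|| format!("record {seq}: missing signature"))?;
                if !verify(&rec.record_hash, sig) {
                    return Err(format!("record {seq}: bad signature"));
                }
            }
            Ok(())
        }
        None => {
            // Keyless path: never report a signed record as verified without its key.
            if records.iter().any(|r| r.alg == "ed25519" || r.sig.is_some()) {
                return Err("signed record present but no verifying key supplied".into());
            }
            Ok(())
        }
    }
}
```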

Gate: cargo clippy -p elastos-runtime --lib clean; all 19 audit tests pass,
incl. new `signature_downgrade_forgery_is_refused` (full forgery: event edited,
record_hash recomputed + relinked, sig stripped → refused; hash-chain is
internally consistent so ONLY the mandatory-signature rule catches it).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
… charset guard (T6)

Two MEDIUM audit-swarm findings (Nadia):

T5 — exit-provider `http_fetch` auto-followed ureq's default 5 redirects. The
private agent has no IP-validating resolver on redirect hops, and the backend
host allowlist is only checked against the INITIAL URL, so an allowlisted host
could `302` the fetch to cloud metadata / any non-allowlisted host. Fix:
`.redirects(0)` on both agents — the mediator returns the 3xx to the caller
instead of following; the capsule re-issues `http_fetch` for the new URL, which
re-runs the full URL + host + allowlist + resolver validation per hop (each
egress individually capability-checked). All 29 exit-provider tests still pass.

T6 — the carrier `operation` was only checked non-empty, then interpolated into
`/api/provider/{scheme}/{operation}`; `Url::join` normalizes `..`, so
`x/../../capability/request` escaped the provider gate and reached arbitrary
local-API endpoints as the capsule's own token. Fix: restrict `operation` to a
single `[A-Za-z0-9_-]` segment in `carrier_invoke_dispatch`, rejecting
`/`/`.`/`%` etc. before it reaches the URL.
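
The T6 guard amounts to a small character-class check. A sketch, with a hypothetical helper name; the real check lives inside `carrier_invoke_dispatch`.

```rust
/// Accept only a single non-empty `[A-Za-z0-9_-]` segment, so `/`, `.`, `%`
/// and similar characters can never reach the URL join.
pub fn is_valid_carrier_operation(operation: &str) -> bool {
    !operation.is_empty()
        && operation
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'_' || b == b'-')
}
```

Validating the raw segment before URL construction avoids having to reason about how the URL library normalizes dot segments or percent-encoding.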

Gate: clippy clean on both crates; 8/8 carrier dispatch tests pass incl. new
`carrier_invoke_dispatch_rejects_path_traversal_operation` (traversal/dot/pct
refused, normal underscore op still parses); exit-provider 29/29 green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
`just verify`'s `cargo fmt --check` step flagged four non-canonical lines in
the test code added by the audit-fix chunks (assert! wrap, .replacen args,
Cursor::new arg, for-loop array). Formatting only — no logic change. Applied
by hand (scoped to the exact lines) to respect shared-tree discipline; scoped
`cargo fmt -p elastos-runtime -p elastos-server --check` now clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
Doc-truth reconcile: add the audit-swarm callout to the KNOWN_GAPS opening so
the registry reflects the six confirmed reachable defects fixed this pass
(T1 carrier plane lock, T2 bounded reads, T3 SSRF, T4 audit downgrade, T5
redirects, T6 operation traversal), the cleared-as-sound surfaces, and the
deferred roadmap (T7 crypto migration, perf ceilings, quality cleanups). The
open residual (T1 peer-auth) is already the G-CARRIER-PEER row.

Gate: home + browser entropy checks, WCI alignment, and git diff-check all
pass on the doc change; full `just verify` was green on the code at HEAD.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…yte copy

Both VM-launch overlay sites (rootfs.rs get_or_create_overlay and the inline
copy in supervisor.rs) did a full tokio::fs::copy of the ~335 MB rootfs.ext4 on
every launch. Replace both with a shared reflink_or_copy helper: a copy-on-write
clone via `cp --reflink=always` — an O(1) metadata op on CoW filesystems
(btrfs/xfs/zfs/bcachefs) — that transparently falls back to the exact same
pure-Rust full copy on any failure (non-CoW FS, cross-device, or `cp` absent).

Correctness is identical on both paths: the result is an independent writable
file with identical contents (a reflink gives copy semantics, not a shared
mutable file). Only the cost changes. A new unit test asserts independence
(writing to the clone leaves the source untouched), so the guarantee holds
whichever path the host filesystem takes.
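
A minimal synchronous sketch of the reflink-then-fallback idea, assuming
`std::fs::copy` as the fallback where the commit uses `tokio::fs::copy`. The
function name matches the commit, but the signature and body here are
assumptions rather than the crate's actual helper:

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::{Command, Stdio};

/// Tries a copy-on-write clone via `cp --reflink=always`; on any failure
/// (non-CoW filesystem, cross-device, `cp` missing) falls back to a full copy.
fn reflink_or_copy(src: &Path, dst: &Path) -> io::Result<()> {
    let cloned = Command::new("cp")
        .arg("--reflink=always")
        .arg("--")
        .arg(src)
        .arg(dst)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false); // spawn error (e.g. `cp` absent) counts as failure

    if cloned {
        return Ok(());
    }
    // A failed `cp` may leave a partial destination; `fs::copy` overwrites it.
    fs::copy(src, dst).map(|_| ())
}
```

Either path yields an independent file, so a caller can write to the
destination without affecting the source image.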

Audit-swarm finding (Berger, HIGH, safe, free): the standout no-measurement-gate
latency win — a full image copy on the launch hot path with a free O(1)
replacement. mkfs.ext4 is already shelled out from this crate, so external-tool
use here matches the established pattern.

Gate: full `just verify` green (fmt/clippy -D warnings/test/carrier smoke).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
The GAP-8/AUD-2 custody write on the dDRM open path called
audit.content_open(...) synchronously inside the async handler; content_open ->
emit does a full fsync, so every open parked a tokio worker thread on disk I/O.
Wrap it in spawn_blocking with owned clones of the record fields (the Arc<AuditLog>
handle is cloned in).

The fail-closed contract is preserved exactly: the open proceeds ONLY on
Ok(Ok(())); an emit error (Ok(Err)) refuses it as before, and a join failure
(Err) is now also treated as a write failure and refuses the open — content
whose open cannot be durably, tamper-evidently recorded still does not happen.
The fsync itself is unchanged (custody durability is not weakened); it just no
longer blocks an async runtime thread.
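
The three-way fail-closed match can be sketched without an async runtime.
This illustration uses `std::thread::spawn(...).join()` as a stand-in for
awaiting `tokio::task::spawn_blocking`, since both yield a nested result (join
outcome wrapping the write outcome). `AuditLog`, `Mode`, and `open_content`
here are simplified hypothetical types, not the project's definitions:

```rust
use std::sync::Arc;
use std::thread;

/// Test knob for the sketch: how the simulated audit write behaves.
enum Mode {
    Healthy,
    WriteError,
    Panic,
}

struct AuditLog {
    mode: Mode,
}

impl AuditLog {
    /// Stand-in for the durable custody write (the real one fsyncs).
    fn content_open(&self, content_id: &str, owner: &str) -> Result<(), String> {
        match self.mode {
            Mode::Healthy => Ok(()),
            Mode::WriteError => Err(format!("fsync failed for {content_id} ({owner})")),
            Mode::Panic => panic!("simulated crash in audit writer"),
        }
    }
}

/// Proceeds only when the write both completed and succeeded.
fn open_content(
    audit: &Arc<AuditLog>,
    content_id: &str,
    owner: &str,
) -> Result<&'static str, String> {
    // Owned clones move into the worker so it borrows nothing from the caller.
    let audit = Arc::clone(audit);
    let (cid, own) = (content_id.to_owned(), owner.to_owned());
    let joined = thread::spawn(move || audit.content_open(&cid, &own)).join();

    match joined {
        Ok(Ok(())) => Ok("opened"),
        Ok(Err(e)) => Err(format!("audit write failed: {e}")),
        Err(_) => Err("audit task did not complete".to_string()),
    }
}
```

Treating the join failure as a refusal keeps the contract closed: there is no
branch in which content opens without a confirmed audit record.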

Audit-swarm finding (Vyukov, HIGH, safe): custody fsync on the async worker on
the open hot path.

Gate: full `just verify` green (fmt/clippy -D warnings/test/carrier smoke).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…tors

content.rs and carrier.rs each carried byte-for-byte copies of three
security-invariant validators: the SSRF egress URL guard (reject inline creds,
allow only https or loopback http), the HTTP-header CRLF-injection guard, and
the content path-traversal guard. Duplicated security logic drifts silently:
tightening one copy leaves the other on the weaker rule (the same class of
defect that let an SSRF gap exist in two places).

Extract the logic into one `net_validation` module (with unit tests) and reduce
the six local functions to trivial label-passing delegators. Zero call-site
churn (~28 callers unchanged) and byte-identical error messages — the label
parameter reproduces each surface's exact prefix ("operator alert" /
"carrier external endpoint" / "carrier authorization header"). Behavior is
unchanged; the security rule now lives in exactly one place per invariant.
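
The label-passing delegator shape might look roughly like this. The module
name and the three labels come from the commit; the function names, error
wording, and the simplified loopback check (only `localhost` and `127.0.0.1`)
are assumptions made for illustration:

```rust
mod net_validation {
    /// Rejects CR/LF in a header value; `label` names the calling surface.
    pub fn validate_header_value(label: &str, value: &str) -> Result<(), String> {
        if value.contains('\r') || value.contains('\n') {
            return Err(format!("{label} must not contain CR or LF"));
        }
        Ok(())
    }

    /// Rejects inline credentials; allows only https, or http to loopback.
    pub fn validate_egress_url(label: &str, url: &str) -> Result<(), String> {
        let (scheme, rest) = url
            .split_once("://")
            .ok_or_else(|| format!("{label} URL is missing a scheme"))?;
        let authority = rest
            .split(|c: char| matches!(c, '/' | '?' | '#'))
            .next()
            .unwrap_or("");
        if authority.contains('@') {
            return Err(format!("{label} URL must not contain credentials"));
        }
        // Simplified: strips a trailing `:port`; IPv6 literals are not handled.
        let host = authority.rsplit_once(':').map_or(authority, |(h, _)| h);
        match scheme {
            "https" if !host.is_empty() => Ok(()),
            "http" if host == "localhost" || host == "127.0.0.1" => Ok(()),
            _ => Err(format!("{label} URL must be https or loopback http")),
        }
    }
}

// Call sites keep their old local names; each one only supplies its label.
fn validate_operator_alert_url(url: &str) -> Result<(), String> {
    net_validation::validate_egress_url("operator alert", url)
}

fn validate_carrier_endpoint(url: &str) -> Result<(), String> {
    net_validation::validate_egress_url("carrier external endpoint", url)
}

fn validate_carrier_authorization_header(value: &str) -> Result<(), String> {
    net_validation::validate_header_value("carrier authorization header", value)
}
```

Because each delegator passes only a label, tightening a rule in
`net_validation` applies to every surface at once while the per-surface error
prefixes stay distinct.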

Audit-swarm finding (matklad, MED): security-validator duplication / drift.

Gate: full `just verify` green (fmt/clippy -D warnings/test/carrier smoke);
3 new net_validation unit tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z