Fix: MPEG-DASH compliance and dkms fix #11
Open
irzhywau wants to merge 441 commits into
… & EPUB hardening)
Two boundary holes on the object egress path, both fail-closed:
- (3) Serve-time content sniff. The raw `/bytes` egress trusted the mint-time `pixel_locked` flag, which trusts the creator-declared mime, so a renderable/scriptable document mislabeled with a non-pixel-lock mime could egress as raw plaintext. `viewer_object_bytes` now sniffs the DECRYPTED bytes (after authority.object(), before octet_stream) via a pure magic-byte `sniffs_as_lockable` (PDF / ZIP / raster image / SVG-XML) and returns 403 if a "raw" asset's content looks pixel-lockable. The one exception is an explicitly-declared `application/zip` (generic archive download). Buyer-safe: the declared mime lives in the signed descriptor, not buyer-controlled. Verified a PDF mislabeled as a 3D model (which shares this decrypt-passthrough handler) is now caught.
- (5) HTML-lock CSP/nosniff. EPUB chapters served as sanitised HTML now carry an enforced HTTP `Content-Security-Policy` with a `sandbox` directive (`default-src 'none'; img-src data:; style-src 'unsafe-inline'; font-src data:; base-uri 'none'; form-action 'none'; frame-ancestors 'self'; sandbox`) plus `X-Content-Type-Options: nosniff` and `Referrer-Policy: no-referrer`, so the document is sandboxed at the RESOURCE level by the browser even if loaded directly or framed without the attribute; the hand-rolled sanitiser is no longer the sole barrier. JPEG pages get `nosniff` only.
Out of scope (tracked follow-ups, not silently skipped): the media/stream egress (`viewer_media`/MSE) is a second egress door with no sniff yet; text/code mislabel needs heuristics (no reliable magic byte). The render direction is already fail-closed via the parsers.
Gate: elastos-server viewer_object 7/7; clippy -D warnings clean.
Co-authored-by: Cursor <cursoragent@cursor.com>
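A minimal sketch of the serve-time sniff described in item (3), written in Python for brevity rather than the Rust of `viewer_object_bytes`. The magic-byte table, the `raw_egress_status` helper, and the reading that the `application/zip` exception applies only when the bytes really are a ZIP are assumptions for illustration; only the name `sniffs_as_lockable` and the 403 outcome come from the commit message.

```python
# Illustrative only: the real check lives in Rust and may use a different table.
BOM = b"\xef\xbb\xbf"

# Leading bytes of formats the commit lists as pixel-lockable.
MAGIC_PREFIXES = (
    b"%PDF-",              # PDF
    b"PK\x03\x04",         # ZIP local file header (also EPUB / CBZ containers)
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"\xff\xd8\xff",       # JPEG
    b"GIF87a",
    b"GIF89a",
)


def sniffs_as_lockable(data: bytes) -> bool:
    """Pure magic-byte test on the decrypted bytes; no parsing, no I/O."""
    if data.startswith(MAGIC_PREFIXES):
        return True
    # SVG / XML: skip an optional UTF-8 BOM and leading whitespace.
    head = data[:512]
    if head.startswith(BOM):
        head = head[len(BOM):]
    head = head.lstrip().lower()
    return head.startswith((b"<?xml", b"<svg"))


def raw_egress_status(declared_mime: str, plaintext: bytes) -> int:
    """HTTP status for a 'raw' (non-pixel-locked) asset; hypothetical helper."""
    is_zip = plaintext.startswith(b"PK\x03\x04")
    # Assumed reading of the exception: a declared generic archive that is a ZIP.
    if declared_mime == "application/zip" and is_zip:
        return 200
    if sniffs_as_lockable(plaintext):
        return 403  # fail closed: content looks renderable despite the label
    return 200
```

Keeping the sniff a pure function of the decrypted bytes means it can be unit-tested without the authority or the HTTP layer.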
…l-lock CSP Make the protected-content docs state the watermark's true strength and the new boundary defenses exactly (Principle 12 — docs/code/threat-model agree): - Watermark forensic scope & privacy (THREAT_MODEL §3 row + §6.6; PROTECTED_CONTENT "Forensic strength & privacy"): the mark is UNKEYED and CRC-protected (not signed), so it is forgeable and repudiable — a deterrent/tracer, NOT court-grade evidence; the authenticated record is the §4 signed custody log. It is also NOT anonymous: both layers embed the full opening wallet (visible layer human-readable), so anyone who sees a rendered page de-anonymizes the buyer — the deliberate leak-attribution trade. Names the roadmap upgrade (authenticate the payload: MAC/opaque token). - Pixel-bomb resource bounds (PROTECTED_CONTENT): documents decode_bounded (image::Limits), the PDF both-axes+area scale clamp, and the CBZ per-page/total caps. - HTML-lock CSP (PROTECTED_CONTENT): documents the enforced HTTP CSP `sandbox` + nosniff containment order (HTTP CSP true layer ▸ meta/iframe belt ▸ sanitiser DiD). Docs-only; alignment-check OK. Co-authored-by: Cursor <cursoragent@cursor.com>
…d grant
Tier C (1), chunks 1-4: upgrade the invisible pixel-lock watermark from an
unkeyed CRC-only mark (forgeable + repudiable) to one ANCHORED IN THE BUYER'S
OWN WALLET SIGNATURE — so a leaked frame is non-repudiable and forgery rises
from "anyone can plant any wallet" to "only a party holding the victim's signed
grant can." Code and docs land together (Principle 12).
- Shared digest (ddrm-envelope): `grant_watermark_digest16(delegation_sig_hex)`
= SHA-256(normalized EIP-191 delegation signature)[..16]. Lives in the crate
BOTH the embedder and the verifier link, so they cannot drift. No new deps
(sha2 already present).
- Payload codec (decrypt-provider/render/invisible.rs): new TAG_GRANT_DIGEST
carrying `[wallet_prefix(4) | grant_digest(16)]` = 21 B <= the 24 B CAP, so
the 232-bit PERIOD (and sparse-page recovery) is unchanged. `embed` takes the
digest; `extract` refactored into `extract_raw` + `parse_grant_mark` so the
verifier reads the raw anchor. No-grant/local-dev opens fall back to the
compact wallet (back-compat).
- Wire (watermark.rs + media-authority quorum.rs): the authority appends an
invisible-only `\u{1F}gd:<hex>` token to the stamp; `finalize` splits it back
off so the VISIBLE mark stays the clean human `wallet . content . time` and
only the INVISIBLE layer carries the authenticated digest.
- Verifier (main.rs): `--extract-watermark <img> [--verify-grant <grant.json>]`
prints the wallet prefix + digest and reports MATCH/NO MATCH by recomputing
via the shared fn. Gated on pq-envelope (always in the shipped render binary).
- Docs: THREAT_MODEL S3 row / S6.6 refreshed to the authenticated state and S4
records the chunk-5 retention decision (option C: fold the digest into the
existing tamper-evident audit record, TTL + access-controlled; status pending
wiring). PROTECTED_CONTENT forensic-strength block + the invisible-layer
description match. Honest bound kept explicit: the delegation signature is not
a hard secret, so this is non-repudiation + raised-forgery, NOT full
anti-framing; a server-key MAC / opaque custody token remains the north star.
Gates (capsules are not -D warnings gated by `just`; verified directly):
decrypt-provider compiles clean + render tests 59/59; media-authority 12/12
(incl. cross-crate digest agreement); ddrm-envelope digest test + 60 existing;
alignment-check OK.
Co-authored-by: Cursor <cursoragent@cursor.com>
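The shared digest and the invisible-only stamp token from this commit can be sketched as follows. This is an illustration in Python, not the `ddrm-envelope` code: it assumes the digest hashes the ASCII text of the trimmed, lowercased signature hex (the commit only says "normalized"), and the helper names `append_grant_token` and `split_grant_token` are hypothetical stand-ins for what the authority and `finalize` do.

```python
import hashlib

GD_SEP = "\x1f"  # U+001F unit separator, as in the `\u{1F}gd:<hex>` token


def grant_watermark_digest16(delegation_sig_hex: str) -> bytes:
    """First 16 bytes of SHA-256 over the normalized signature.

    Assumption: normalization is trim + lowercase and the hash input is the
    hex text itself; the real function may decode or strip a 0x prefix.
    """
    normalized = delegation_sig_hex.strip().lower()
    return hashlib.sha256(normalized.encode("ascii")).digest()[:16]


def append_grant_token(stamp: str, digest: bytes) -> str:
    """Authority side: attach the invisible-only token to the stamp."""
    return f"{stamp}{GD_SEP}gd:{digest.hex()}"


def split_grant_token(stamp: str):
    """Render side: return (visible_text, digest_or_None).

    The visible text never contains the token; a tail that is not a
    `gd:` token is dropped rather than rendered.
    """
    visible, sep, tail = stamp.partition(GD_SEP)
    if sep and tail.startswith("gd:"):
        return visible, bytes.fromhex(tail[3:])
    return visible, None
```

Because both sides call the same digest function, a verifier can recompute the value from a candidate grant and compare it with what the invisible layer yields.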
…stody chain Wire Tier C-1 chunk 5: fold the 16-byte authenticated grant digest (a non-reversible commitment to the buyer's signed delegation — the same value the invisible pixel-lock watermark embeds) into the existing append-only content_open custody record, so a leaked frame is verifiable against an audit row WITHOUT a second who-opened log or any raw wallet/grant retention (option C). - audit.rs: optional grant_digest on AuditEvent::ContentOpen, serde-skipped when absent so prior records hash-verify unchanged; content_open() takes it; test proves backward-compat + chain verification with and without the anchor. - viewer_open.rs: resolve the wallet-signed grant (fresh AND cached paths) ABOVE the custody write and derive grant_digest from the EXACT signature forwarded to the quorum, so the §4 record carries the anchor; malformed fresh grant still fails before any "opened" record is written. Media/no-grant opens -> None. - elastos-server cannot link the PQ ddrm-envelope crate, so it carries a no-shared-dep twin (grant_watermark_digest16_hex) guarded by a golden vector cross-checked against ddrm_envelope::grant_watermark_digest16 in BOTH crates, pinning the trim+lowercase normalization so the two sides cannot drift. - THREAT_MODEL.md §4: retention entry updated to "option C, wired" — minimization-via-non-reversibility, not TTL (the chain is intentionally permanent); records a TTL-prunable index as explicitly rejected (Principle 10). Gates: ddrm-envelope golden, elastos clippy -D warnings (workspace), runtime audit chain test, elastos-server golden, decrypt-provider + media-authority tests, alignment-check — all green. Co-authored-by: Cursor <cursoragent@cursor.com>
…closed-by-construction Close the last two audit loose ends. (1) Lowercase-address normalize on compare. The invisible mark recovers the EVM wallet LOWERCASED (the 20 raw bytes carry no EIP-55 checksum casing), so any attribution compare against a stored/expected address must normalize both sides or a checksummed address would false-mismatch. - render/invisible.rs: add normalize_evm_hex() (trim, strip 0x, lowercase) + a one-line test proving checksum casing compares equal. - main.rs --verify-grant: advisory wallet cross-check — when the candidate grant JSON declares owner_address, confirm it matches the recovered 4-byte wallet prefix (both normalized). Fail-safe: advisory only, never overrides the digest verdict; pq-envelope-absent still returns 2 (no silent pass). (2) HiDPI/Retina screenshot doc nuance (invisible.rs header + PROTECTED_CONTENT.md): "same-resolution screenshot" means a 1:1 pixel-grid capture; a HiDPI/Retina screenshot resamples (~2x) = rescaling = the already-unsupported case, so most real-world HiDPI screenshots will not recover. Don't over-rely on it. (3) THREAT_MODEL.md: reclassify the media/stream egress as CLOSED BY CONSTRUCTION, not an open guard gap. The media tier serves only fMP4 from the ffmpeg transcode+fragment ingest (media-provider prod, ddrm-media-authority dev): a non-media file fails transcoding (no asset), and the pipeline re-encodes (AV1/AAC) rather than -c copy, so source bytes never survive into served segments even for a polyglot. With documents confined to the object tier (content-sniff guarded), no media-tier sniff guard is needed. Re-open only for a bring-your-own pre-segmented ingest or an ffmpeg -c copy/remux fast-path (would warrant a segment-0 mdat sniff). Gates: decrypt-provider invisible tests (pdf-render,pq-envelope) 13 pass incl new; rustfmt --check clean on both touched files; clippy introduces no new warnings; alignment-check OK. Co-authored-by: Cursor <cursoragent@cursor.com>
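For item (1), a small Python sketch of the normalize-then-compare step. `normalize_evm_hex` mirrors the behaviour the commit names (trim, strip 0x, lowercase); `prefix_matches` is a hypothetical helper standing in for the advisory wallet cross-check in `--verify-grant`.

```python
def normalize_evm_hex(addr: str) -> str:
    """Trim, strip a leading 0x/0X, and lowercase an EVM hex string."""
    s = addr.strip()
    if s[:2].lower() == "0x":
        s = s[2:]
    return s.lower()


def prefix_matches(owner_address: str, recovered_prefix: bytes) -> bool:
    """Advisory check: does the declared owner start with the recovered bytes?

    The recovered prefix carries no EIP-55 casing, so the declared address
    must be normalized before comparing.
    """
    return normalize_evm_hex(owner_address).startswith(recovered_prefix.hex())
```

Comparing normalized forms on both sides is what keeps a checksummed address from producing a false mismatch.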
…n Linux CI The canonical gate and CI both scoped to `cd elastos && cargo --workspace`, which does NOT reach the crates this branch's protected-content work lives in (capsules/decrypt-provider, capsules/ddrm-envelope, scripts/dev/ddrm-media-authority). Their 217 tests — watermark codec, grant-digest envelope, media-authority — had ZERO automated coverage; they were gated by hand each commit. - justfile: add `verify-capsules` (build+test the capsule crates under their CANONICAL feature sets, matching scripts/dev/run-creator-gateway.sh: decrypt-provider = rail-stream,rail-mint,pdf-render,pq-envelope; ddrm-envelope = access-grant; media-authority = default) and fold it into `verify`, so the repo's "definition of green" finally covers the whole surface (Principle 12: the gate must match reality). clippy -D warnings is deliberately held back for the capsules (pre-existing lint debt); build+test is the real regression gate. Verified: all three are rustc -D-warnings-clean under these features, so the workflow's global RUSTFLAGS does not break them. - ci.yml: add a `verify` job (ubuntu, installs `just`, runs the full `just verify` incl. the Linux-only carrier smoke the macOS dev box can't run) and an isolated `capsules` job (`just verify-capsules`) so a heavy/flaky smoke run can never mask a capsule regression. Add `workflow_dispatch` so this feature branch can be put through the full Linux gate on demand before merge. This is the last gate between the branch and truly-done: turns "manually covered" into "full green on Linux". Co-authored-by: Cursor <cursoragent@cursor.com>
Add the feature branch to the push trigger so the full Linux gate (verify + capsules) runs on our own work in isolation, without a PR to main. This entry lives only on the branch and does not affect main or other branches until merge. Co-authored-by: Cursor <cursoragent@cursor.com>
First Linux CI run surfaced two real issues the macOS box could not (just verify aborts at the Linux-only smoke before reaching fmt): - viewer_object.rs (landed in the Tier B-3/D-5 commit) was not rustfmt-clean — 6 long-line/comment violations. cargo fmt -p elastos-server fixes only that file. - the `verify` job failed at its first step (just alignment-check) because the GitHub runner has no ripgrep, which check-wci-alignment.sh requires. Install it before `just verify`. (The capsules job needs no rg and already passed green.) Co-authored-by: Cursor <cursoragent@cursor.com>
…ider binary Linux CI surfaced this: chain_mode_without_wallet_fails_closed expected the "wallet not linked" fail-closed error but instead hit "rights-provider not found" because decide_owned_access resolved/checked the capsule binary BEFORE validating the subject wallet. On a clean runner (no pre-built capsule) the binary check fired first, the test panicked, and its panic poisoned ENV_LOCK — cascading into release_build_defaults_to_chain_and_refuses_dev_rights_modes. Reorder so subject/wallet validation runs first: a chain-mode request with no linked wallet is invalid on its face and must fail closed before we resolve or spawn any external binary. This is both more correct (don't spawn a subprocess for an obviously-invalid request) and makes the unit test hermetic (it is not an #[ignore]'d integration test, so it must not depend on a built capsule). Verified with ELASTOS_RIGHTS_PROVIDER_BIN=/nonexistent: both tests pass. Co-authored-by: Cursor <cursoragent@cursor.com>
The verify job got through alignment-check + ripgrep but failed in local-carrier-setup-smoke with `error[E0463]: can't find crate for std`: the smoke builds the Home capsules (capsules/home-cli and friends) to wasm32-wasip1, and the runner's stable toolchain ships only the host target. Add `targets: wasm32-wasip1` so the smoke's wasm build has std. The other four jobs are host-only and unaffected. Co-authored-by: Cursor <cursoragent@cursor.com>
…hain has its std The verify smoke still failed with E0463 after adding the target to the dtolnay @stable step: rust-toolchain.toml pins channel 1.89.0, so every cargo invocation uses 1.89.0 — not stable — and the wasm target had been added to the wrong toolchain. Declare `targets = ["wasm32-wasip1"]` in rust-toolchain.toml so rustup auto-installs the wasm std for the pinned toolchain everywhere (CI and local), and drop the now-redundant `targets:` from the workflow step. Verified locally: the home-cli wasm build compiles clean. Co-authored-by: Cursor <cursoragent@cursor.com>
… for GitHub Actions The full `just verify` cannot complete on a stock GitHub runner: its `local-carrier-setup-smoke` step fetches the net-provider artifact over Elastos Carrier, which a clean runner can't reach (proven on CI: it builds + runs the entire ~18-min gate and fails only there). Everything else a clean runner CAN verify. - justfile: add `verify-ci` = the full gate MINUS the carrier smoke, with a hidden `_verify-tail` shared by both `verify` and `verify-ci` so they can't drift. alignment-check stays first in both. `just verify` (with the carrier smoke) is unchanged for a Carrier-capable Linux box / self-hosted runner. - ci.yml: the Linux job now runs `just verify-ci` (renamed "Verify (Linux CI gate)") and documents that the carrier smoke is covered separately. This lands the branch's surface — incl. the 217-test capsule gate and the full elastos workspace fmt/clippy/test — under an enforceable green GitHub Actions gate. Co-authored-by: Cursor <cursoragent@cursor.com>
Fold the off-tree AV-watermarking feasibility study (verdict: GO) into the roadmap doc, with the audit caveats baked in rather than the harness's headline claims: - New Phase 0 (top of §5): video survival matrix, audio matrix, registration result, and the grant-anchored Tardos collusion chain. - FP correction: the harness's single-seed empirical threshold (mean+3.5sigma) is flagged as ~1.25% false-accusation (400-trial Monte-Carlo); a certified bound now requires the analytic Tardos threshold + an MC FP/FN sweep, and the per-asset bound is recomputed at the FP-controlled threshold (duration minimums move up). - New §3.4 Channel coding (required): the leak channel is bursty (whole-segment loss) -> timeline interleaving + an erasure-aware code; wired into chunks 2/6. - Audio re-validation made concrete (chunk 6): psychoacoustic masking model + PEAQ/ODG + human A/B/X on real music/speech/silence, and time-stretch/pitch. - Multi-strategy collusion (random/minority/all-ones/interleaving) mandated before any certified bound. - Registration -> Phase 5 gating DSP item (deterministic template/pilot or log-polar/Fourier-Mellin; brute search proven insufficient). - Full-variant-set AAD weld in §3.1/§4 (CEK binds the complete variant set; per-buyer selection is post-unwrap routing). - §8 resolved (ECC->Tardos, q-ary density lever, published per-asset bound at the FP-controlled threshold, channel-coding requirement); §7 honest-limits expanded; VMAF 96.7 demoted from gate to synthetic relative signal. Doc-only; no shipped behaviour. alignment-check green. Fix Widevine typo in §7. Co-authored-by: Cursor <cursoragent@cursor.com>
…review The "approve" step of the control loop (reflect → preview → APPROVE → act), parallel-safe and read-only. - elastos-runtime::approval (new, pure): `decide(mode, approver)` is fail-closed — the only path to Approved without an explicit yes is an affordance declared as needing no approval; User/RuntimePolicy default to PendingApproval; an explicit no always wins. `required_approval(actions)` scales the requirement with action strength (anything beyond read/message needs a human). 3 tests. - inspect/intent (new provider op, read-only): given a capsule + operation, derives the gate (via plan), the approval it requires, and the fail-closed default decision. Records nothing, dispatches nothing. - Gated consistently: `intent` added to the canonical op→action contract (Read) and the System-only browser allow-list. - Decisions: `revoke` and recorded approve/deny stay on the runtime/dispatch (mutation) path — the product InspectProvider remains a read-only projection. Recording pairs with dispatch (merge-gated). fmt --check PASS; targeted tests green (approval 3, inspect incl. intent 31 +2 ratchets ignored, provider_resource contract 1). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_016ZKy5Cca9RzwDuLb1szdeq
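The fail-closed rule for `decide` and `required_approval` can be illustrated with a short Python sketch. The enum names, the use of `None` for "no approver response yet", and the action strings are assumptions; the real module is Rust (`elastos-runtime::approval`) and its types may differ.

```python
from enum import Enum
from typing import Iterable, Optional


class Mode(Enum):
    NONE = "none"                # affordance declared as needing no approval
    USER = "user"
    RUNTIME_POLICY = "runtime_policy"


class Decision(Enum):
    APPROVED = "approved"
    DENIED = "denied"
    PENDING_APPROVAL = "pending_approval"


def decide(mode: Mode, approver: Optional[bool]) -> Decision:
    """Fail-closed decision: approver is True (yes), False (no), or None."""
    if approver is False:
        return Decision.DENIED            # an explicit no always wins
    if approver is True:
        return Decision.APPROVED
    if mode is Mode.NONE:
        return Decision.APPROVED          # the only implicit path to Approved
    return Decision.PENDING_APPROVAL      # User / RuntimePolicy default


def required_approval(actions: Iterable[str]) -> Mode:
    """Anything beyond read/message needs a human (hypothetical action names)."""
    return Mode.NONE if set(actions) <= {"read", "message"} else Mode.USER
```

Checking the explicit "no" first is what makes the denial win even for an affordance that would otherwise need no approval.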
- CAPSULE_INSPECTOR.md: add the inspect/intent wire contract (approval-intent preview); add a "path note" clarifying revoke + self are served on the embedded RequestHandler (shell) path while the product InspectProvider is a read-only projection (capsules/capsule/plan/intent) — closes the contract-honesty gap. - KNOWN_GAPS.md: G4 decision core DONE (approval + intent, fail-closed, tested); remaining = recording a signed approve/deny, which pairs with dispatch (G3). (An orchestrator CLAUDE.md was written locally but is .gitignored by repo policy, so it stays a local contract and is not committed.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_016ZKy5Cca9RzwDuLb1szdeq
Pre-mainnet hardening from the deep audit (none block the branch; ① is the
item to put in front of the external auditor):
① Document the dKMS re-seal AAD invariant — the node re-seals the recovered
CEK under the caller-supplied aad_b64, which is NOT bound into the recover
possession-proof; safe only because the decrypt boundary rebuilds the
segment-bound AAD and fails closed. Loud comment at the seal_bound call +
THREAT_MODEL §7 note. Binding it into the proof is scoped with the auditor.
② Lock the release-build invariant — a compile_error! rejects a release build
(no debug_assertions) of dkms-authority with dev-modes/legacy-receipt-authz,
and a new CI job (dkms-release-invariant) asserts both directions. Adds
docs/DEPLOY_CHECKLIST.md (incl. the node-set-id authorize-time guard, which
is release-only and not unit-testable under cfg(test)).
③ Redact key-provider Debug — manual Debug on Request/ReleaseSessionContext
prints only the op name, so no CEK/escrow bytes can leak via {:?}.
④ viewer_open — log_fp(&object_cid) for the fresh-grant line (was the raw cid).
⑤ VENDORING.md — three.js r160 pin + periodic-refresh/upstream-watch plan.
Plus a fail-closed dKMS-open testing checklist in DKMS_OVER_CARRIER.md
(rights-mode + Carrier-rail must match how the asset was minted) so the
"foreign escrow" 502 diagnosis doesn't recur.
Gates: elastos-server fmt+clippy; dkms-authority build (debug+release) +
24/24 tests + guard verified; key-provider build + 52/52 tests; alignment-check.
Co-authored-by: Cursor <cursoragent@cursor.com>
…word + offline extractor AV forensic-variant layer, tractable + pipeline-free pieces built on the proven Phase-0/5 research. Feature-gated OFF by default (`av-variants`), so it cannot destabilize the default build; chunks 3/4/5 (mint transcode DSP, full-variant-set AAD weld, serve-time selector) are deferred to the live CENC/DASH/quorum pipeline. Chunk 1 — variant manifest schema (`elastos.ddrm.av-variants/v1`) in capsules/ddrm-envelope/src/av.rs: marked subset, q-ary variant refs (+ segment digest for the chunk-4 weld), codeword scheme (length/interleave/erasure τ/bias commitment). serde round-trips; validate() fails closed; single_encode() is the honest `fingerprinted:false` default. Chunk 2 — canonical, RNG-free codeword: asset_bias_vector / buyer_codeword (from grant_watermark_digest16, no per-buyer storage) / interleave_map / tardos_score. A domain-separated SHA-256 stream over integers (NOT any language's RNG), so the Rust serve selector and the Python extractor derive identical codewords. Replaces the Phase-0 numpy-RNG derivation. Chunk 6 — offline forensic extractor as the proven Python reference under tools/av-forensics/ (offline, operator-run, no key material, not in the boundary), re-anchored to the chunk-2 canonical construction. The load-bearing FM fix is preserved: register() resolves the Fourier-Mellin scale/rotation ambiguity on the VALID (non-border) region. The Rust --extract-av-fingerprint CLI is deferred until the scheme is frozen/certified. Cross-language anti-drift weld: tools/av-forensics/test_canonical.py asserts the same golden vectors as av::tests::canonical_golden_vectors — change either side and both fail. Wired into `just verify-capsules` (now also tests ddrm-envelope with av-variants), so CI covers the new module + the weld. Pure stdlib (no numpy/ffmpeg). 
Still uncertified (carried honestly in docs/AV_WATERMARKING.md): analytic Tardos threshold + Monte-Carlo FP/FN sweep (argmax is not proof), rotation estimator (out of envelope), audio on real content. AV remains key-protected, not fingerprinted, until chunks 3/4/5 ship and the certification gates pass. Gates: ddrm-envelope 51 tests (av-variants, incl. golden vectors); default build unaffected (module gated off); av.rs clippy-clean; cross-language weld PASS; ported extractor validated end-to-end (FM-reg → bitERR 0, leaker ranked top; no-reg fails closed); just verify-capsules PASS; just alignment-check OK. Co-authored-by: Cursor <cursoragent@cursor.com>
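To show why an RNG-free construction keeps two languages in lockstep, here is a hedged Python sketch of a domain-separated SHA-256 stream over integers. It is not the canonical chunk-2 construction: the domain strings, the length-prefixed framing, and the uniform (rather than Tardos-distributed) bias thresholds are all assumptions made for illustration.

```python
import hashlib


def stream_u64(domain: bytes, seed: bytes, index: int) -> int:
    """Deterministic 64-bit integer for (domain, seed, index).

    Length prefixes keep (domain, seed) pairs from colliding; any language
    with SHA-256 and big-endian integers reproduces this exactly.
    """
    h = hashlib.sha256()
    for part in (domain, seed):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    h.update(index.to_bytes(8, "big"))
    return int.from_bytes(h.digest()[:8], "big")


def asset_bias_vector(asset_secret: bytes, length: int):
    """Per-position thresholds in [0, 2**64); simplified uniform bias."""
    return [stream_u64(b"example/av/bias", asset_secret, i) for i in range(length)]


def buyer_codeword(grant_digest16: bytes, bias):
    """Bit i is 1 when the buyer's stream value falls below the bias threshold.

    Integer comparison only, so no floating-point rounding can differ
    between implementations.
    """
    return [
        1 if stream_u64(b"example/av/codeword", grant_digest16, i) < t else 0
        for i, t in enumerate(bias)
    ]
```

Since the codeword is recomputed from the grant digest on demand, nothing per-buyer needs to be stored, which matches the "no per-buyer storage" note above.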
Replace the Phase-0 empirical mean+kσ accusation threshold (Monte-Carlo showed ~1.25% FP, not certifiable) with the analytic, FP-controlled threshold Z = √m·Φ⁻¹(1−ε/N): the innocent symmetric-Tardos score is exactly mean-0, variance-1 per kept position ⇒ N(0,m).
- canonical.py: tardos_threshold + _inv_norm_cdf (Acklam, pure stdlib, extractor-side only; not a cross-language weld surface).
- extractor.py: accuse only above the analytic Z (erasure-aware m), not an ad-hoc gap.
- montecarlo.py: multi-strategy FP/FN sweep (random/majority/minority/all-ones/all-zeros/interleave). 2000 trials, m=2332 N=500 c=3 ε=1e-3 BER=0.13 ⇒ FP ≤ ε with 100% detection across all six; old empirical threshold runs 2–10× over ε (majority ≈1.05%).
- test_canonical.py: stdlib threshold sanity (Φ⁻¹(0.975)≈1.96, monotonicity, Z(2332,500,1e-3)=222.69); runs in the CI weld.
Code-level accusation statistics only; media-survival certification (real content/screen-record/CMAF lengths) remains open. Docs updated.
Co-authored-by: Cursor <cursoragent@cursor.com>
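The analytic threshold Z = √m·Φ⁻¹(1−ε/N) can be reproduced with a few lines of Python. This sketch swaps in the standard library's `statistics.NormalDist().inv_cdf` for the Acklam approximation the commit mentions, so it is a cross-check of the formula rather than the `canonical.py` code; the function signature is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def tardos_threshold(m: int, n_users: int, eps: float) -> float:
    """Accusation threshold for an innocent score distributed as N(0, m).

    m        kept (non-erased) codeword positions
    n_users  number of buyers N sharing the false-accusation budget
    eps      total false-accusation probability across all N buyers
    """
    # Each innocent buyer gets a tail budget of eps / N.
    return sqrt(m) * NormalDist().inv_cdf(1.0 - eps / n_users)
```

With the sweep parameters above (m=2332, N=500, ε=1e-3) this evaluates to roughly the 222.69 quoted in the commit, and tightening ε raises the threshold, which is the monotonicity the stdlib sanity test checks.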
Leads with the one deliberately-open invariant — the re-seal AAD is the caller-supplied aad_b64 and is NOT bound into the recover possession-proof (dkms-authority recover → seal_bound, src/main.rs:1028). Safe today only because the single consumer (decrypt boundary) rebuilds the segment-bound AAD and fails closed. Packages the SECURITY INVARIANT comment, THREAT_MODEL §7, and the DEPLOY_CHECKLIST open item into one hand-off with the trust boundary, crypto roots, CI-enforced release invariants, repro gates, and a reviewer checklist (incl. the landing test: tampered aad_b64 fails the possession-proof closed at the node). Co-authored-by: Cursor <cursoragent@cursor.com>
…pre-mainnet invariant) The dKMS node re-seals a recovered CEK under the caller-supplied `aad_b64`, which was NOT bound into the recover possession-proof. A MITM that tampered `aad_b64` in transit could make the node seal under an AAD of its choosing; it was safe only because the decrypt boundary independently rebuilt the AAD and failed closed (a compensating control, not a fix). Now the canonical possession-proof preimage binds `sha256(reseal_aad)` (`ddrm_envelope::recover_proof_message`, domain bumped v1 -> v2). The client signs over the exact AAD it sends (key-provider), and the node verifies the proof over the byte-identical `args.aad_b64` in `verify_session` BEFORE any CEK is recovered or re-sealed. The AAD (DecryptTranscriptV1) already carries `node_set_id` + `segment_digests`, so all three are bound transitively; the 32-byte digest keeps the preimage bounded for long presentations. A MITM cannot re-sign the proof (it lacks the token-bound caller key), so a tampered `aad_b64` now fails closed at the node (`session_invalid`). The decrypt boundary's rebuild remains as defense-in-depth. - ddrm-envelope: recover_proof_message/sign/verify take `reseal_aad`; bind sha256; bump DKMS_RECOVER_DOMAIN to /v2; unit test asserts tampered-AAD -> verify=false. - dkms-authority: verify_session verifies over decode(args.aad_b64) before recover; SECURITY INVARIANT comment rewritten to CLOSED; landing test recover_fails_closed_on_a_tampered_aad (35 legacy / 25 default tests green). - key-provider: recover_proof_b64 + both delegate paths sign over the request's aad_b64. - dev harnesses (ddrm-runtime-open, dkms-live-recover): each direct node recover signs over its request AAD. - docs: THREAT_MODEL §7 + DEPLOY_CHECKLIST + AUDITOR_PACKET §1 flipped open -> closed, with the landing test referenced. Gates: ddrm-envelope + dkms-authority tests, key-provider/dev-script builds, verify-capsules, alignment-check all green. Co-authored-by: Cursor <cursoragent@cursor.com>
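A hedged sketch of why binding `sha256(reseal_aad)` into the proof preimage closes the gap. HMAC-SHA256 stands in for the real token-bound caller signature, and the domain string, session field, and function signatures are hypothetical; only the idea of hashing the exact AAD into the signed message comes from the commit.

```python
import base64
import binascii
import hashlib
import hmac

DOMAIN_V2 = b"example/dkms-recover/v2"  # hypothetical domain separator


def recover_proof_message(session_id: bytes, reseal_aad: bytes) -> bytes:
    """Preimage: domain, length-prefixed session id, 32-byte AAD digest."""
    return (
        DOMAIN_V2
        + b"\x00"
        + len(session_id).to_bytes(2, "big")
        + session_id
        + hashlib.sha256(reseal_aad).digest()  # bounded even for long AADs
    )


def sign_recover_proof(caller_key: bytes, session_id: bytes, reseal_aad: bytes) -> bytes:
    """Client side: sign over the exact AAD that will be sent (HMAC stand-in)."""
    return hmac.new(caller_key, recover_proof_message(session_id, reseal_aad), hashlib.sha256).digest()


def verify_recover_proof(caller_key: bytes, session_id: bytes, aad_b64: str, proof: bytes) -> bool:
    """Node side: verify over the received aad_b64 before touching any key."""
    try:
        aad = base64.b64decode(aad_b64, validate=True)
    except (binascii.Error, ValueError):
        return False  # fail closed on malformed input
    expected = sign_recover_proof(caller_key, session_id, aad)
    return hmac.compare_digest(expected, proof)
```

An in-transit change to `aad_b64` alters the digest inside the preimage, so the proof no longer verifies and the request is rejected before any recover or re-seal.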
…3+4 core) ddrm-envelope::av gains the pure, fail-closed serve-time selector (select_symbols) and the full-variant-set commitment (variant_set_commitment) that chunk 4 welds into the decrypt transcript. The selector binds the per-asset bias commitment (wrong secret -> refuse), supports arity-2 A/B (direct codeword->segment mapping, matching the proven tools/av-forensics extractor), and returns an empty selection for an honest single-encode. DecryptTranscriptV1 gains to_aad_with_all_bindings, a strictly-extending encoder that appends the variant-set commitment AFTER the rights binding, so a non-fingerprinted open stays byte-identical to to_aad_with_bindings (all committed goldens replay unchanged) while a fingerprinted open is bound to the exact published variant set (manifest swap / out-of-set variant fails the CEK unwrap closed). Pure functions, fully unit-tested; no pipeline wiring yet. Co-authored-by: Cursor <cursoragent@cursor.com>
asset_secret_from_master derives the per-asset watermark secret from a node-held master + the content hash, so the mint embed and the serve selector agree on the bias/codebook without ever publishing or per-asset-storing it (the manifest carries only the bias commitment; rotating the master re-keys every asset). build_manifest assembles + validates a fingerprinted VariantManifestV1 from produced variants (canonical interleave + bias commitment), or returns the honest single-encode for an empty marked set. A round-trip test closes the mint->serve loop: build_manifest keyed by the derived secret produces a manifest that select_symbols (same secret) accepts, and distinct buyers select distinct variant sets. Pure functions, tested. Co-authored-by: Cursor <cursoragent@cursor.com>
… open Mark the pure core of chunks 3/4/5 as landed (selector, variant-set AAD weld encoder, manifest builder, per-asset secret KDF, all in ddrm-envelope::av/lib.rs, fail-closed + unit-tested) and spell out precisely what remains: the pipeline WIRING (ddrm-media-authority serve selection, decrypt-provider AAD rebuild, mint emit) plus the real perceptual DSP (bounded-placeholder seam now; certified embed swaps in post media-survival cert). Adds a "remaining wiring" section with exact files and the one thing needed to validate end-to-end (a gateway bring-up with a synthetic asset; real media only for the perceptual cert). Notes the interleave-application follow-up as tracked, not dropped. Co-authored-by: Cursor <cursoragent@cursor.com>
The local 2-of-3 stand-in nodes need the dev-modes legacy-receipt path to
authorize an offline recover (the live quorum uses wallet-signed grants); the
gateway dev script already builds the node this way. Without it the smoke fails
closed ("legacy receipt authorization is disabled") even on an unmodified tree.
With it the helper recovers a minted asset byte-identically (3/3 served).
Co-authored-by: Cursor <cursoragent@cursor.com>
…eam) embed_placeholder_variant appends an ignorable ISO-BMFF `free` box carrying the variant symbol AFTER the mdat, so the fragment stays valid/playable but byte-distinct per symbol; encrypt_fragment (CENC) and strip_senc (decrypt) both carry it through verbatim, so the selected variant is byte-distinct end-to-end and the symbol survives back to the clean fragment. read_placeholder_variant recovers it (the placeholder stand-in for the offline extractor). Explicitly NOT a watermark (no perceptual signal, no transcode survival); it makes mint->serve->select->weld real and testable, and the certified DSP embed swaps in behind the same interface post-cert. Tested end-to-end through the CENC rail on the real ffmpeg fixture. Co-authored-by: Cursor <cursoragent@cursor.com>
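The placeholder seam can be sketched in Python using the standard ISO-BMFF box layout (32-bit big-endian size, 4-byte type, payload). The one-byte payload and the rule that the marker must be the last top-level box are assumptions; the real `embed_placeholder_variant` / `read_placeholder_variant` are Rust and may frame the symbol differently.

```python
import struct


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Serialize a basic box: size (incl. 8-byte header), type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def embed_placeholder_variant(fragment: bytes, symbol: int) -> bytes:
    """Append an ignorable `free` box carrying the symbol after the mdat."""
    return fragment + make_box(b"free", bytes([symbol]))


def iter_boxes(buf: bytes):
    """Yield (type, payload) for top-level boxes.

    Simplification: 64-bit (size == 1) and to-end (size == 0) boxes are
    rejected instead of handled.
    """
    off = 0
    while off + 8 <= len(buf):
        size, box_type = struct.unpack_from(">I4s", buf, off)
        if size < 8 or off + size > len(buf):
            raise ValueError("malformed box")
        yield box_type, buf[off + 8 : off + size]
        off += size


def read_placeholder_variant(fragment: bytes):
    """Return the symbol if the last top-level box is a 1-byte `free` box."""
    last = None
    for box in iter_boxes(fragment):
        last = box
    if last is not None and last[0] == b"free" and len(last[1]) == 1:
        return last[1][0]
    return None
```

Players skip `free` boxes, so the fragment stays playable while two variants of the same segment differ in bytes, which is all the routing and AAD weld need in order to be tested.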
A gated integration test (cargo test -p ddrm-media --features av-variants) that
runs the exact functions the production wiring will call, on the real ffmpeg
fragmented-MP4 fixture + real CENC: mint per-segment {A,B} variants -> build
manifest -> per-buyer select_symbols -> read selected ciphertext -> weld
segment-digests + variant_set_commitment into the transcript AAD -> extract.
Proves: two buyers get distinct codewords -> distinct served bytes + distinct
welded AADs (identical only where symbols coincide); substituting a served
variant OR forging the manifest changes the AAD (fail-closed at the CEK unwrap);
decrypting a served variant recovers that buyer's symbol; and the single-encode
path is byte-identical (the fingerprint layer is strictly additive). The server
IPC wiring (creator mint / media-authority serve / decrypt-provider rebuild) now
plugs into proven libraries.
Co-authored-by: Cursor <cursoragent@cursor.com>
The re-seal AAD hardening (39fead5) bumped the recover possession-proof preimage from v1 -> v2 (added sha256(reseal_aad)). The live geo nodes still verify v1, so every recover proof from a freshly-built key-provider was rejected on all nodes (0-of-N served) -> 502 "could not open owned media from the dKMS quorum". The session handshake uses the unchanged session domain, so it passed, masking this as an open bug rather than a protocol skew. Revert the client + local node to v1 so opens succeed against the deployed quorum. The AAD-binding hardening (a real pre-mainnet invariant) is now STAGED, not active: it must ship together with a coordinated geo-node redeploy, never client-only on a branch that opens the live quorum. Adds the failure mode to the Carrier runbook symptom->cause table and a "Protocol compatibility invariant" section so this cannot silently recur. Co-authored-by: Cursor <cursoragent@cursor.com>
…live gateway Production wiring on top of the AV core (chunks 3/4/5): - encrypt-provider: opt-in `av_variants` on seal_segments_threshold emits per-segment byte-distinct variants + a bias-committed manifest, keyed by an asset secret derived from ELASTOS_AV_MASTER_B64 (the master never crosses the server boundary). Honest no-op when not provisioned. - creator.rs: forwards the emitted variant files + av-variants.json INTO the published DASH directory (inside the asset CID), gated by ELASTOS_AV_VARIANTS. - ddrm-media-authority: apply_variant_selection picks the buyer's variant from the wallet grant before the 2-of-3 recover/CENC weld; fail-closed on a bias-commitment (wrong-master) mismatch, honest single-encode fallback when there is no manifest / master / grant. Surfaces `fingerprinted` on the media descriptor. - run-creator-gateway.sh: enables AV with a persistent bias master shared by the mint boundary and the serve helper (both inherit the gateway env). Verified live: a minted asset's CID carries av-variants.json (fingerprinted, arity 2, bias commitment) + two byte-distinct variant segments; the open recovers 2-of-3 and serves the per-buyer variant. NOTE: variants are the bounded placeholder embed (ignorable ISO `free` box) -- routing + crypto weld proven; perceptual DSP survival is still the deferred certification step. Co-authored-by: Cursor <cursoragent@cursor.com>
…DID memoization Two serve-hot-path optimizations. Both are pure efficiency: no change to the authorization decision, the CEK containment/zeroization, or the served bytes. 1. decrypt-provider CENC: decrypt the mdat range IN PLACE instead of copying the mdat out, decrypting it, then rebuilding the whole segment. Collapses two full-segment copies into one per served segment (MB-scale per fragment). `decrypt_samples` keeps its exact Vec-returning signature (the PC2 conformance driver pins it); the live path uses the new `decrypt_samples_in_place`. Output is byte-identical — proven by the end-to-end golden segment test and the CEK-containment smoke check (143/143 decrypt-provider tests green). 2. gateway: memoize the boot-stable gateway runtime DID so per-request home launch-token verification no longer re-reads device.key + re-derives the ed25519 identity on every protected-asset fetch. The DID is deterministically derived from the device key (see docs/CARRIER.md) and is boot-stable. Only the POSITIVE result is cached — a missing identity keeps being re-checked, and the cached value can never widen authority. The signature / expiry / session-active checks that actually authorize a request stay fully per-request. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01VjFQt6DK9ZGnLs4ykUWsuX
…anchor verdict Commit 2 of the audit follow-up, run through the orchestrator pipeline. Scope shrank honestly: the headline finding H1 traced to safe-by-construction, so it is CLEARED (documented + pinned), not "fixed" with churn. M3 (real, reachable by untrusted input): parse_trun / parse_senc read a u32 `sample_count` straight from the (untrusted) segment and drove an unbounded pre-allocation AND read loop before the truncated-buffer read fails — a forged count (e.g. 0xFFFFFFFF), or a degenerate no-flags trun that reads 0 bytes/entry, OOMs via push-growth. Reject an implausible count up front (fail-closed) with a generous 1<<20 ceiling that never rejects real fMP4 (fragments carry at most a few thousand samples). The subsample count is a u16 and already self-bounding — left as-is. New tests assert a huge count fails closed and a normal count still parses. H1 (cleared, safe-by-construction): the forensic watermark anchor is derived from the client-supplied delegation signature before the gateway verifies it, which *looks* like it trusts an unverified sig. Traced to ground: a forged signature fails the dKMS node's own verify_access_grant (EIP-191 owner recovery) + on-chain hasAccessByContentId, so no CEK is recovered, no decrypt happens, and the watermark embed (only after a successful quorum recover) never runs — a forged anchor can never reach an egressed, decrypted frame. Pinned by a new access.rs invariant test (delegation_sig_from_wrong_wallet_fails_closed) and recorded in the PRINCIPLES_CONFORMANCE "do not re-churn" register so a future pass doesn't re-open it. Gate: just verify-capsules components all green — decrypt-provider 146, ddrm-envelope 76, ddrm-media-authority 15, python canonical weld PASS. No CEK-path or served-byte change; the cross-language AV golden weld is unaffected. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01VjFQt6DK9ZGnLs4ykUWsuX
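The M3 bound can be sketched as follows. This is a minimal illustration, not the actual `parse_trun` / `parse_senc` code: the `read_sample_count` helper, the `ParseError` type, and the assumption that the count sits at the front of the slice are hypothetical. Only the `1 << 20` ceiling comes from the commit message.

```rust
/// Ceiling from the commit message: generous for real fMP4 fragments.
const MAX_SAMPLE_COUNT: u32 = 1 << 20;

#[derive(Debug, PartialEq)]
enum ParseError {
    Truncated,
    ImplausibleSampleCount(u32),
}

/// Hypothetical helper: read a big-endian u32 sample count from the front of
/// `buf` and reject it before any allocation or read loop is sized from it.
fn read_sample_count(buf: &[u8]) -> Result<u32, ParseError> {
    if buf.len() < 4 {
        return Err(ParseError::Truncated);
    }
    let count = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]);
    if count > MAX_SAMPLE_COUNT {
        // Fail closed: an untrusted count never drives a pre-allocation.
        return Err(ParseError::ImplausibleSampleCount(count));
    }
    Ok(count)
}
```

Checking the count before the loop, rather than relying on the truncated-buffer read to fail, is what closes the degenerate case where each entry reads zero bytes.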
…fail-closed scope rules - plan emits elastos.inspect.gate-preview/v1 (capabilities, audit events, execution policy, dispatch:false) so inbox gate summaries show real authority again - revoke is an explicit unsupported_operation, not a silent fallthrough - provider_resource gains inspect_resource(op) so unknown inspect ops fail closed - restores the four inbox-approval gateway tests (fresh passkey, principal scoping, deny-without-dispatch) and provider authority/redaction tests - docs: Act path and runtime scope-rule expectations, corrected inspect/self routing Co-authored-by: Cursor <cursoragent@cursor.com>
…rces
- dkms-authority: deny_unknown_fields on the Request enum so hidden authority
fields fail closed; lockfiles pick up elastos-common 0.5.0
- creator/ddrm-viewer: reword raw chain/backend references so app capsules stop
claiming provider authority they route through the runtime
- library: replace platform-branded "Finder" wording with file-manager phrasing
- marketplace: classify providers via name.endsWith("-provider")
Co-authored-by: Cursor <cursoragent@cursor.com>
… post-merge truth - home-entropy-check: current home asset version, expanded library open allowlist, post-merge inspector routing, act-emitter README in the Users/self allowlist - check-wci-alignment: justified exclusions for chain-native crates, backend-scheme elacity pattern instead of the bare word - command-smoke/installed-command-audit: hermetic HOME on macOS and a portable timeout (timeout/gtimeout/perl alarm) so the gates run off-Linux - state.md: restore the canonical journey proof records lost in the merge - docs: unlink gitignored CLAUDE.md, point DDRM rail table at per-capsule wasm-smoke scripts Co-authored-by: Cursor <cursoragent@cursor.com>
filter/map instead of bool::then in filter_map for browser session listing, tail expression instead of return in the cfg-split supports_hibernation, and indented doc-comment link definitions in elastos-vz. Co-authored-by: Cursor <cursoragent@cursor.com>
…ricks `elastos home`
Root cause of the local-carrier-setup-smoke failure ("Capability request still
pending after 3s"): the G-ID flip fail-closes every identity gate for sessions
with no capsule identity, and /api/auth/attach created exactly such sessions
(vm_id: None). The managed-home flow then dead-ended three ways: capability
intake recorded no requester identity, the consent-broker's grant POST 403'd
fail-closed ("no requester capsule identity") in an infinite retry loop, and
even a minted token would have been unredeemable ("session has no capsule
identity"). Fail-closed did its job; the flow lost its identity plumbing.
Predates the 0.5 merge — the smoke was never re-run on Linux after G-ID landed
(the Mac cannot run it), so it slipped every gate until now.
Fix at the root seam: attach-authenticated sessions record an HONEST host
identity ("host-client" / "host-shell") — the attach secret is owner-only
(chmod 600), so the caller IS the host user; this is truthful identity, not
fabrication. Intake, grant mint, and token redemption now agree end-to-end.
No authority widening: grants still require consent-broker approval; tokens
still bind to the recorded identity; audit records it.
Proven live: `just local-carrier-setup-smoke` now passes on Linux (was the one
red step in `just verify`); replayed the failing grant against the live runtime
before/after (403 "no requester capsule identity" -> granted). Regression test
pins the identity on both scopes.
Gate: cargo test -p elastos-server --lib green (1044), clippy clean, fmt clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…t-scan invariant test The 0.5 merge left three first-party capsule providers declaring `provides: elastos://<name>/*` for names NOT in RESERVED_SUB_NAMES: `market` (content-market storefront — no boot fallback, route never exists), `object` (Library object authority) and `operator-drive-adapter` (both also register a boot main-provider but lose their VM sub-route). At capsule launch the supervisor's register_provider_route fails closed and the failure is warn-swallowed, so the provider silently goes dark — the same live-only class the dDRM-spine fix repaired, still open for these three. - Reserve the three names (strict superset; no capability removed). - Add pub is_reserved_sub_name() as the single-source-of-truth predicate. - Add test_all_capsule_provided_sub_schemes_are_reserved: scans every shipped capsule.json `provides` sub-scheme and asserts it is reserved — no boot needed. This is the general invariant the hardcoded dDRM-spine test only covered for three names; it would have caught all of this and reds on the next provider capsule that forgets to reserve its scheme. Gate: cargo test -p elastos-runtime --lib green (384), fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…ore 6 inbox tests
Intake bug (Ravi P16/P11, KNOWN_GAPS G3): create_inspect_action_request only
checked plan.status=="ok", but the inspector's plan returns
{status:"ok", data:{valid:false, error:"unknown_operation"}} for an operation
the target authority never declared. That created a PENDING inbox approval with
an EMPTY gate preview — prompting a human to approve an act whose authority is
invisible. Consent requires visibility.
- Reject at intake when plan.data.valid != true, BEFORE persisting: no record,
no notification, no approvable row, no dispatch_approved reachability.
- Restore the 6 inbox-approval regression tests dropped in the 0.5 merge,
grafted from origin/review/0.5.0 against the existing merged harness — inbox
suite 4 -> 10.
- Add inspect_action_rejects_undeclared_operation_before_inbox: asserts the
undeclared op is rejected AND leaves zero approvable Inbox rows (structural
fail-closed, not a hidden display string).
Gate: cargo test -p elastos-server --lib inbox suite green (11 incl. new guard).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
register_sub_provider was last-write-wins, so a launched capsule whose manifest declares `provides: elastos://encrypt/*` (or key/decrypt/wallet/…) could seize the CEK-escrow / key / signing route from the trusted boot provider -- ambient authority via registration order (Principle 3) and a break of the mediated key/decrypt plane (Principle 15).
- Pin the escrow+keys+signing+mint spine (encrypt, publish, media, key, decrypt, drm, rights, wallet, chain): once bound at boot, a later registration of the same still-live name is refused structurally (Err), checked under the write lock (race-free). Non-pinned reserved names keep last-write-wins for hot-reload / test double-registration.
- unregister_sub_provider frees the slot, so a genuine teardown→restart of the same provider re-mounts cleanly; only overwrite of a live pinned slot fails.
- register_sub_provider now routes its reserved-name check through the new is_reserved_sub_name() predicate (single source of truth; also clears the dead-code warning).
Validated empirically: `just local-carrier-setup-smoke` (full Linux boot + `elastos home`) passes with the guard live -- boot registers each pinned name exactly once, so nothing legitimate is refused. Test proves refuse-overwrite, original-stays-bound, and restart-after-unregister.
Gate: cargo test -p elastos-runtime --lib green; clippy -p elastos-runtime 0; smoke green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
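A minimal sketch of the first-writer-wins pin described above, assuming a plain map behind an `RwLock`. The `SubProviderRegistry` type, the `u64` provider handle, and the `lookup` method are hypothetical; the reserved-name check from the real `register_sub_provider` is omitted. Only the pinned name list and the refuse / unregister / re-register behaviour come from the commit message.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Names the commit pins once bound at boot.
const PINNED: &[&str] = &[
    "encrypt", "publish", "media", "key", "decrypt", "drm", "rights", "wallet", "chain",
];

/// Hypothetical registry: maps a sub-provider name to an opaque provider id.
struct SubProviderRegistry {
    routes: RwLock<HashMap<String, u64>>,
}

impl SubProviderRegistry {
    fn new() -> Self {
        Self { routes: RwLock::new(HashMap::new()) }
    }

    /// Pinned names are first-writer-wins; other names stay last-write-wins.
    fn register(&self, name: &str, provider: u64) -> Result<(), String> {
        // The liveness check and the insert share one write lock, so two
        // racing registrations cannot both observe an empty slot.
        let mut routes = self.routes.write().unwrap();
        if PINNED.contains(&name) && routes.contains_key(name) {
            return Err(format!("sub-provider '{}' is pinned and already bound", name));
        }
        routes.insert(name.to_string(), provider);
        Ok(())
    }

    /// Teardown frees the slot so a genuine restart can re-mount.
    fn unregister(&self, name: &str) {
        self.routes.write().unwrap().remove(name);
    }

    fn lookup(&self, name: &str) -> Option<u64> {
        self.routes.read().unwrap().get(name).copied()
    }
}
```

Doing the check and the insert under the same write guard is the part that makes the refusal race-free; a read-then-write sequence with two separate lock acquisitions would not be.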
…-auth funnel Two same-class hygiene fixes surfaced by the audit (both "one canonical path", Principle 10): 1. DDRM test env-lock: mint/buy/rights/owned_ledger each held their OWN `static ENV_LOCK`, so a lock only serialized a module against itself while the mutated `ELASTOS_DDRM_*` vars are process-global — a reader in one module could observe another module's mid-test mutation and fail closed (the exact nondeterministic class the trusted-auth-env guard fixed). Replace the four disjoint statics with one shared `api::ddrm_env_lock()` so all DDRM env mutation serializes on a single lock instance. 2. Trusted-auth funnel: `room_transport_identity_data_dir` was a byte-identical copy of `home_launch_auth_data_dir` (env read + test guard). Delegate to the canonical one so the two can't drift; the entropy-check-pinned `home_launch_auth_data_dir` symbol is unchanged. Gate: cargo test -p elastos-server --lib green (1051), fmt clean, 0 warnings. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
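The shared env-lock idea from item 1 can be sketched like this. The commit names `api::ddrm_env_lock()` but does not show its signature, so the guard-returning shape below is an assumption, as is the poison-recovery choice.

```rust
use std::sync::{Mutex, MutexGuard};

/// One process-wide lock instance, because the env vars it protects are
/// process-global too. Per-module statics would only serialize a module
/// against itself.
static DDRM_ENV_LOCK: Mutex<()> = Mutex::new(());

/// Hypothetical accessor: every test module that mutates the shared env
/// takes this same guard for the duration of the mutation.
fn ddrm_env_lock() -> MutexGuard<'static, ()> {
    // A panicking test poisons the mutex; the protected data is `()`, so
    // recovering the guard is safe and keeps later tests running.
    DDRM_ENV_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}
```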
…f-tier
Audit surfaced a three-way contradiction: docs said "/api/provider/inspect/self
is System-only", but the code routes self to the app/browser tier
("self" => &[BROWSER_CAPSULE_ID]) AND the entropy-checker simultaneously pinned
BOTH the BROWSER-self code and the stale "System-only" doc line.
Decision (owner): keep the self-tier — a legitimate KEEP transparency capability,
fail-closed by construction (gateway injects the authenticated principal_id,
client-supplied id ignored, authorize_view enforces caller == target under
InspectScope::SelfOnly), already covered by
inspect_self_returns_own_record_and_ignores_client_id and
inspect_self_token_cannot_reach_system_capsule_op.
- docs/CAPSULE_INSPECTOR.md + docs/INSPECTOR_TESTING.md: self is a live,
caller-bound, fail-closed route (not System-only).
- home-entropy-check.mjs: pin the new fail-closed self-tier language instead of
the stale "System-only" phrase, so code, docs, checker, and tests all agree
(Principle 12). No code/behavior change.
Gate: home-entropy-check PASS.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…ants
serde's container `deny_unknown_fields` does NOT apply to UNIT variants of an
internally-tagged enum, so the quorum authority's Request::Status / ::Shutdown
silently accepted `{"op":"status","smuggled":true}` — a small fail-open seam on
an untrusted protocol surface (Principle 11). The authority-carrying variants
(Hello/Recover/RotateShare/…) are struct variants and already fail closed; only
the two empty ones leaked.
- Convert Status/Shutdown to empty STRUCT variants so deny_unknown_fields covers
them; update the four match sites.
- Add empty_variants_reject_unknown_fields (clean parse; hidden field rejected).
Scoped to only the logical change (no whole-file reformat, per the shared-tree
lesson). Gate: cargo test -p dkms-authority green (25); no new clippy warnings.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…in KNOWN_GAPS Turn the remaining audit finding into a build-visible, tracked contract rather than prose (LESSONS.md: audit → gap registry). server_infra warn-swallows a register_sub_provider Err at boot for ~22 providers; the capability still fails closed at route time (not fail-open), but a spawned-but-unregisterable boot-critical provider goes silently dark with only a warn. Row records the anchor, the distinction (absent-binary=warn ok vs spawned-but-rejected=loud), the close criteria, and a pending ratchet (needs a boot failure-injection seam). The other remaining finding — carrier-service launch skipping the author- signature gate — is already tracked as AUD-1 RESIDUAL (b); not duplicated. Docs-only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…is session's fixes Registry-truth sweep (LESSONS.md: audits feed resolutions back — a doc that rots is a liability). Reconcile every row whose truth changed under this session's commits: - G-ID residual: drop `attach.rs:63` from the "None-vm_id follow-ups" list — attach host sessions now carry an honest host-shell/host-client identity (`279dac1`), closing the live-only managed-home dead-end the smoke caught. - PRINCIPLES_CONFORMANCE §A RESERVED_SUB_NAMES: mark it DESIGN-gap-only now — the acute risks are build-guarded (manifest-scan invariant `1fc2a14`; first-writer-wins pin `8b688fc`); drop the stale `:448-476` line ref. - Enforced invariants (+3): every provider `provides` sub-scheme is reserved (no silent-dark); boot-critical sub-providers pinned first-writer-wins; request_act intake fails closed on an undeclared op. inspect/self tier was already reconciled in `e51be7b`; DDRM env-lock is test-infra (no row). Docs-only. Gates: home-entropy + wci-alignment PASS. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…ratchet AUD-6 seam + first fix. Boot-critical sub-provider registration was warn-swallowed at ~19 server_infra sites: a spawned-but-unregisterable provider (an invariant violation → a dark mint/keys/signing path) left the runtime up with only a warn. - `encrypt` (CEK escrow — the crown jewel) now PROPAGATES its register_sub_provider failure (`?`, boot fails loud) instead of warn-swallow. Only the registration-rejected branch changes; absent-binary stays the outer warn (genuinely optional). Smoke-validated: real boot registers encrypt once, no Err, boot proceeds — `just local-carrier-setup-smoke` green. - `#[ignore]`d ratchet `aud6_boot_critical_sub_provider_registration_fails_loud` scans for the warn-swallow line per boot-critical scheme; run with --ignored it FAILS today, listing publish/media/key/decrypt/drm/rights/wallet/chain (encrypt absent = fixed). Flips green — delete #[ignore] — when the rest are classified critical-vs-optional and rewired. Non-blocking in normal CI (ignored). - KNOWN_GAPS AUD-6 updated: PARTIAL (encrypt), ratchet named. Gate: cargo test -p elastos-server --bin green (96 pass, 1 ignored); smoke green; server_infra.rs rustfmt-clean (scoped). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…t response paths (DoS) Audit swarm finding (Priya, HIGH): the primary Carrier request path used unbounded `read_line` on remote-controlled streams. `handle_file_stream` accepts every inbound CARRIER_ALPN connection with no peer auth and then read a whole line into memory, so a remote peer could OOM the node pre-auth with a newline-less flood. The same class was already fixed for the WASM/microVM bridges (BUG-6, bounded `read_bounded_line`, 1 MB cap) but never applied here. The client-side response readers (release_head, provider_invoke, gossip push/pull, operator send_request) had the same gap against a malicious source we dialed. Fix (fail-closed, no protocol change): expose the existing bounded reader `pub(crate)` and funnel every Carrier newline-delimited control read through one shared `read_bounded_carrier_line` helper (1 MB cap; oversized/truncated = error, not a giant alloc). Carrier bulk bytes ride the separate length-prefixed path (already capped at 200 MB), so the 1 MB bound only ever constrains small JSON control lines. Sites: carrier.rs handle_file_stream (inbound, HIGH) + 4 client response readers; operator_control.rs inbound handler + peer response. Gate: cargo build -p elastos-server green; clippy -p elastos-server --lib clean; 2 new regression tests (oversized flood refused, normal line round-trips) pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
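For illustration, a bounded newline-delimited read can be built from the standard library alone. This is a sketch under stated assumptions, not the project's `read_bounded_line` or `read_bounded_carrier_line`: the function name, signature, and error kind here are hypothetical, and only the 1 MB cap is taken from the commit message.

```rust
use std::io::{self, BufRead, Read};

/// Cap from the commit message for newline-delimited control lines.
const MAX_CONTROL_LINE: u64 = 1024 * 1024;

/// Hypothetical stand-in for the shared bounded reader: returns one line
/// (without its trailing newline) or an error, never an unbounded buffer.
fn read_bounded_line<R: BufRead>(reader: &mut R, cap: u64) -> io::Result<String> {
    let mut line = String::new();
    // `take` stops the read at `cap` bytes even if no newline ever arrives.
    reader.by_ref().take(cap).read_line(&mut line)?;
    if !line.ends_with('\n') {
        // Either the peer exceeded the cap or the stream ended mid-line.
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "oversized or truncated control line",
        ));
    }
    line.pop(); // drop the trailing '\n'
    Ok(line)
}
```

A newline-less flood therefore costs at most `cap` bytes of memory before the connection is refused.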
…only ops (T1)
Audit swarm finding (Sol, CONFIRMED): `handle_file_connection` accepts every
inbound CARRIER_ALPN connection with NO peer authentication, and
`validate_carrier_provider_invocation` is self-referential (it checks
caller-supplied envelope fields against each other, not against a
runtime-issued capability). So any anonymous remote peer could invoke the
whole provider_invoke matrix — confirmed harm: `content:publish`/`import_exact`
pin arbitrary bytes into the node's store under a caller-supplied
`principal_id` (unauthorized write + quota-attribution abuse); critical
caveat: the `key`/`decrypt`/`drm` targets were reachable too.
Fix (fail-closed, default-DENY): `carrier_provider_plane_allows_unauthenticated`
is a strict allowlist — only `content:{fetch,status,admission}` (non-mutating
reads: fetch bytes, read status, quota *decision*) pass. Every write
(publish/import_exact/import_object/ensure/unpublish/repair) and every
key/decrypt/drm/rights/availability op is refused with
`unauthorized_provider_operation` BEFORE `send_raw` ever runs.
Trade-off (user-approved "lock read-only now"): authenticated push-replication
and cross-node key/rights flows over the plane are disabled until real Carrier
peer authentication lands — tracked as G-CARRIER-PEER in KNOWN_GAPS. Widening
the allowlist without peer auth reopens T1.
Gate: cargo clippy -p elastos-server --lib clean; full carrier test module
57/57 pass; 2 new refusal tests (write op refused, key/decrypt/drm refused) +
existing content:fetch dispatch test still green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
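The default-deny shape of the T1 allowlist can be sketched as a single match. The function name and the `(target, operation)` string pair are assumptions for illustration; the real `carrier_provider_plane_allows_unauthenticated` signature is not shown in the commit message.

```rust
/// Hypothetical sketch: only the three non-mutating `content` reads pass.
/// Anything not listed is refused, including ops added in the future.
fn allows_unauthenticated(target: &str, operation: &str) -> bool {
    matches!(
        (target, operation),
        ("content", "fetch" | "status" | "admission")
    )
}
```

Listing what is allowed, rather than what is denied, means a newly added provider operation stays refused until someone deliberately widens the list.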
… (T3) Audit swarm finding (Nadia, HIGH, confirmed end-to-end): `validate_public_ip` checked only the native IPv6 predicates (loopback/unspecified/unique-local/ link-local), so IPv4-mapped IPv6 literals evaded every guard — `::ffff:169.254.169.254`, `::ffff:127.0.0.1`, `::ffff:192.168.1.1` all returned "public". The `url` crate preserves the mapped form through the host allowlist, DNS resolver, and connect; on a dual-stack host the kernel routes `::ffff:a.b.c.d` to the bare IPv4, so a capsule with a permissive `http_fetch` backend could read `http://[::ffff:169.254.169.254]/latest/meta-data/...` (cloud metadata / loopback services). Fix: in the V6 arm, normalize `to_ipv4_mapped()` (and the deprecated IPv4-compatible `::a.b.c.d` via `to_ipv4()`) FIRST and recurse into the full v4 private/loopback/link-local guard. Ordered so `::1`/`::` are still caught by the native predicates before the v4 fallback. Applied identically to exit-provider and net-provider (the two SSRF egress mediators). Gate: cargo test + clippy on both standalone capsule crates green; new regression test `validate_public_ip_blocks_ipv4_mapped_private_targets` (mapped metadata/loopback/RFC1918 refused; public v6 + public mapped v4 pass). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
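A minimal sketch of the T3 normalization, using only `std::net`. The function names are hypothetical and the IPv4 guard is reduced to the ranges the commit message names (private, loopback, link-local, plus unspecified); the real `validate_public_ip` may cover more.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

fn is_public_v4(ip: Ipv4Addr) -> bool {
    !(ip.is_loopback() || ip.is_private() || ip.is_link_local() || ip.is_unspecified())
}

fn is_public_v6(ip: Ipv6Addr) -> bool {
    // Native predicates first, so `::1` and `::` never reach the v4 fallback.
    if ip.is_loopback() || ip.is_unspecified() {
        return false;
    }
    // Unwrap `::ffff:a.b.c.d` and re-run the full IPv4 guard.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_public_v4(v4);
    }
    // Deprecated IPv4-compatible form `::a.b.c.d`.
    if let Some(v4) = ip.to_ipv4() {
        return is_public_v4(v4);
    }
    let first = ip.segments()[0];
    let unique_local = (first & 0xfe00) == 0xfc00; // fc00::/7
    let link_local = (first & 0xffc0) == 0xfe80; // fe80::/10
    !(unique_local || link_local)
}

fn is_public_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_public_v4(v4),
        IpAddr::V6(v6) => is_public_v6(v6),
    }
}
```

The ordering matters because `to_ipv4()` also matches `::1` (yielding `0.0.0.1`), which the IPv4 guard would not recognise as loopback.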
Audit swarm finding (Vera+Dmitri, HIGH, confirmed): the audit-chain signature was strippable via an unauthenticated `alg` downgrade. `compute_record_hash` hashes only `domain ‖ seq ‖ prev_hash ‖ event_json` — `alg` and `sig` are NOT in the preimage — and `verify_chain` ran the ed25519 check only `if rec.alg == "ed25519"`. So an offline editor with NO signing key could rewrite the entire event history, recompute every (public) record_hash, relink the chain, set `alg="none"`, drop `sig`, and pass: `verify_chain` returned Ok, `chain_attestation` reported verified=true, still advertising the real signer. This defeated the module's own tamper-evidence guarantee — the EU AI Act durable-custody claim. Fix (no on-disk format change): make the decision to check the signature independent of the forgeable `alg`. When a verifying key is supplied (custody / tamper-evidence mode — both production callers, with_file_verified and chain_attestation, derive the key from self.signer, present iff the log is signed), EVERY record MUST be ed25519-signed and verify; a non-ed25519 alg in a signed chain is a downgrade and is refused fail-closed. The keyless (memory/unsigned) path is unchanged and still refuses to report a signed record as verified without its key. Gate: cargo clippy -p elastos-runtime --lib clean; all 19 audit tests pass, incl. new `signature_downgrade_forgery_is_refused` (full forgery: event edited, record_hash recomputed + relinked, sig stripped → refused; hash-chain is internally consistent so ONLY the mandatory-signature rule catches it). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
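The decision rule, stripped of the actual cryptography, might look like the sketch below. Everything here is illustrative: the `Record` shape, the `key_present` flag, and the `verify` callback (standing in for the real ed25519 check) are assumptions, not the module's API.

```rust
/// Minimal record shape; field names follow the commit message.
struct Record<'a> {
    alg: &'a str,
    sig: Option<&'a [u8]>,
}

/// `verify` stands in for the real ed25519 signature check (hypothetical).
fn check_record(
    rec: &Record,
    key_present: bool,
    verify: impl Fn(&[u8]) -> bool,
) -> Result<(), &'static str> {
    if key_present {
        // Custody mode: whether to verify no longer depends on the
        // attacker-editable `alg` field.
        if rec.alg != "ed25519" {
            return Err("alg downgrade in a signed chain");
        }
        let sig = rec.sig.ok_or("missing signature")?;
        if !verify(sig) {
            return Err("bad signature");
        }
        Ok(())
    } else if rec.alg == "ed25519" {
        // Keyless path: never report a signed record as verified without its key.
        Err("signed record but no verifying key")
    } else {
        Ok(())
    }
}
```

The pre-fix logic corresponds to checking `rec.alg` first and skipping `verify` otherwise, which is exactly what let a stripped record pass.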
… charset guard (T6)
Two MEDIUM audit-swarm findings (Nadia):
T5 — exit-provider `http_fetch` auto-followed ureq's default 5 redirects. The
private agent has no IP-validating resolver on redirect hops, and the backend
host allowlist is only checked against the INITIAL URL, so an allowlisted host
could `302` the fetch to cloud metadata / any non-allowlisted host. Fix:
`.redirects(0)` on both agents — the mediator returns the 3xx to the caller
instead of following; the capsule re-issues `http_fetch` for the new URL, which
re-runs the full URL + host + allowlist + resolver validation per hop (each
egress individually capability-checked). All 29 exit-provider tests still pass.
T6 — the carrier `operation` was only checked non-empty, then interpolated into
`/api/provider/{scheme}/{operation}`; `Url::join` normalizes `..`, so
`x/../../capability/request` escaped the provider gate and reached arbitrary
local-API endpoints as the capsule's own token. Fix: restrict `operation` to a
single `[A-Za-z0-9_-]` segment in `carrier_invoke_dispatch`, rejecting
`/`/`.`/`%` etc. before it reaches the URL.
Gate: clippy clean on both crates; 8/8 carrier dispatch tests pass incl. new
`carrier_invoke_dispatch_rejects_path_traversal_operation` (traversal/dot/pct
refused, normal underscore op still parses); exit-provider 29/29 green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
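The T6 charset guard amounts to a one-line predicate. The sketch below assumes the check runs on the raw `operation` string before any URL is built; the function name is hypothetical.

```rust
/// Accept only a single non-empty `[A-Za-z0-9_-]` segment, so `/`, `.`,
/// and `%` can never reach the URL join.
fn is_valid_operation(op: &str) -> bool {
    !op.is_empty()
        && op
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'_' || b == b'-')
}
```

Rejecting `%` as well as `/` and `.` closes the percent-encoded variants of the same traversal.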
`just verify`'s `cargo fmt --check` step flagged four non-canonical lines in the test code added by the audit-fix chunks (assert! wrap, .replacen args, Cursor::new arg, for-loop array). Formatting only — no logic change. Applied by hand (scoped to the exact lines) to respect shared-tree discipline; scoped `cargo fmt -p elastos-runtime -p elastos-server --check` now clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
Doc-truth reconcile: add the audit-swarm callout to the KNOWN_GAPS opening so the registry reflects the six confirmed reachable defects fixed this pass (T1 carrier plane lock, T2 bounded reads, T3 SSRF, T4 audit downgrade, T5 redirects, T6 operation traversal), the cleared-as-sound surfaces, and the deferred roadmap (T7 crypto migration, perf ceilings, quality cleanups). The open residual (T1 peer-auth) is already the G-CARRIER-PEER row. Gate: home + browser entropy checks, WCI alignment, and git diff-check all pass on the doc change; full `just verify` was green on the code at HEAD. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…yte copy Both VM-launch overlay sites (rootfs.rs get_or_create_overlay and the inline copy in supervisor.rs) did a full tokio::fs::copy of the ~335 MB rootfs.ext4 on every launch. Replace both with a shared reflink_or_copy helper: a copy-on-write clone via `cp --reflink=always` — an O(1) metadata op on CoW filesystems (btrfs/xfs/zfs/bcachefs) — that transparently falls back to the exact same pure-Rust full copy on any failure (non-CoW FS, cross-device, or `cp` absent). Correctness is identical on both paths: the result is an independent writable file with identical contents (a reflink gives copy semantics, not a shared mutable file). Only the cost changes. New unit test asserts independence — writing the clone leaves the source untouched — so it holds whichever path the host filesystem takes. Audit-swarm finding (Berger, HIGH, safe, free): the standout no-measurement-gate latency win — a full image copy on the launch hot path with a free O(1) replacement. mkfs.ext4 is already shelled out from this crate, so external-tool use here matches the established pattern. Gate: full `just verify` green (fmt/clippy -D warnings/test/carrier smoke). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
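A synchronous sketch of the clone-then-fallback idea follows. The commit's helper replaces `tokio::fs::copy`; this version swaps in `std::fs::copy` so the example needs no async runtime, and its signature is an assumption.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::{Command, Stdio};

/// Try a copy-on-write clone first; fall back to a full byte copy on any
/// failure (non-CoW filesystem, cross-device, `cp` missing or too old).
fn reflink_or_copy(src: &Path, dst: &Path) -> io::Result<()> {
    let cloned = Command::new("cp")
        .arg("--reflink=always")
        .arg("--")
        .arg(src)
        .arg(dst)
        .stderr(Stdio::null()) // failure is expected on non-CoW hosts
        .status()
        .map(|status| status.success())
        .unwrap_or(false); // `cp` could not be spawned at all
    if cloned {
        return Ok(());
    }
    // Same observable result as the clone: an independent writable file.
    fs::copy(src, dst).map(|_| ())
}
```

Either path leaves `dst` as an independent file, so callers do not need to know which one ran.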
The GAP-8/AUD-2 custody write on the dDRM open path called audit.content_open(...) synchronously inside the async handler; content_open -> emit does a full fsync, so every open parked a tokio worker thread on disk I/O. Wrap it in spawn_blocking with owned clones of the record fields (the Arc<AuditLog> handle is cloned in). The fail-closed contract is preserved exactly: the open proceeds ONLY on Ok(Ok(())); an emit error (Ok(Err)) refuses it as before, and a join failure (Err) is now also treated as a write failure and refuses the open — content whose open cannot be durably, tamper-evidently recorded still does not happen. The fsync itself is unchanged (custody durability is not weakened); it just no longer blocks an async runtime thread. Audit-swarm finding (Vyukov, HIGH, safe): custody fsync on the async worker on the open hot path. Gate: full `just verify` green (fmt/clippy -D warnings/test/carrier smoke). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
…tors
content.rs and carrier.rs each carried byte-for-byte copies of three
security-invariant validators: the SSRF egress URL guard (reject inline creds,
allow only https or loopback http), the HTTP-header CRLF-injection guard, and
the content path-traversal guard. Duplicated security logic drifts silently —
tightening one copy leaves the other on the weaker rule (the same class that let
an SSRF gap exist in two places).
Extract the logic into one `net_validation` module (with unit tests) and reduce
the six local functions to trivial label-passing delegators. Zero call-site
churn (~28 callers unchanged) and byte-identical error messages — the label
parameter reproduces each surface's exact prefix ("operator alert" /
"carrier external endpoint" / "carrier authorization header"). Behavior is
unchanged; the security rule now lives in exactly one place per invariant.
Audit-swarm finding (matklad, MED): security-validator duplication / drift.
Gate: full `just verify` green (fmt/clippy -D warnings/test/carrier smoke);
3 new net_validation unit tests pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01FEL7iSfBWL2JiAFDy8fq5z
Produce a single MPEG-DASH/CENC-compliant asset (ISO/IEC 23001-7) for every media (DASH) mint, while keeping the server-decrypt rail's own player working by down-converting back to a plaintext-looking init at the fetch point.
- ddrm-envelope: shared `pssh` module -- single source of truth for producer, runtime decrypt read-path, and playback clients. ELASTOS_PQ_SYSTEM_ID (b6e254ef-0dc5-47fe-94e7-0e72ed1dc7b0); build_pssh (v1 box, default-KID + opaque .asset.protections JSON) / parse_pssh (v0/1, trailing-moov tolerant).
- ddrm-media: cenc_signal_init() (avc1->encv / mp4a->enca + sinf(frma/schm/tenc) + pssh moov child) and strip_cenc_signal() as a byte-exact inverse. Roundtrip tests assert strip(signal(x)) == x, no-op on unsignaled, fail-closed on double.
- encrypt-provider: CencSignalInits op -- pure public box surgery (no CEK/secret), wraps the runtime-built PSSH envelope and rewrites each per-track init; returns transformed inits + pssh_b64 for the MPD.
- creator (producer): after the threshold seal, build the PSSH envelope from dkms_protection, CENC-signal each init, and patch stream.mpd with <ContentProtection> (mp4protection:2011 + cenc:default_KID + per-system pssh).
- ddrm-media-authority: read_dash_init strips CENC signaling at the fetch point so the seal-bound AAD init and the runtime player's served init both match the plaintext init the mint sealed (no AAD mismatch).
- Flip on by default: drop the ELASTOS_DDRM_CENC_PSSH gate -- CENC signaling + MPD ContentProtection are now standard output. Additive; existing playback unchanged.
Squashed from: d012fc4 047d38f 4d26798 d6fb99f 3ac5fdc 4edfd9e
elastos-server 782+95 green; helper 15 green; fmt clean.
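For orientation, a version-1 `pssh` box has a fixed layout: 32-bit size, the `pssh` type, a version/flags word, the 16-byte system ID, a KID count followed by the KIDs, then a length-prefixed opaque payload. The sketch below builds that layout; it is not the `ddrm-envelope` `build_pssh` implementation, and the function name, signature, and example payload are assumptions. The system ID bytes are the UUID quoted in the commit message.

```rust
/// System ID from the commit message: b6e254ef-0dc5-47fe-94e7-0e72ed1dc7b0.
const ELASTOS_PQ_SYSTEM_ID: [u8; 16] = [
    0xb6, 0xe2, 0x54, 0xef, 0x0d, 0xc5, 0x47, 0xfe,
    0x94, 0xe7, 0x0e, 0x72, 0xed, 0x1d, 0xc7, 0xb0,
];

/// Hypothetical builder for a version-1 `pssh` box.
fn build_pssh_v1(system_id: &[u8; 16], kids: &[[u8; 16]], data: &[u8]) -> Vec<u8> {
    // size + type + version/flags + system id + kid count + kids + data size + data
    let size = 4 + 4 + 4 + 16 + 4 + 16 * kids.len() + 4 + data.len();
    let mut out = Vec::with_capacity(size);
    out.extend_from_slice(&(size as u32).to_be_bytes());
    out.extend_from_slice(b"pssh");
    out.extend_from_slice(&[1, 0, 0, 0]); // version 1, flags 0
    out.extend_from_slice(system_id);
    out.extend_from_slice(&(kids.len() as u32).to_be_bytes());
    for kid in kids {
        out.extend_from_slice(kid);
    }
    out.extend_from_slice(&(data.len() as u32).to_be_bytes());
    out.extend_from_slice(data); // opaque to players; e.g. protections JSON
    out
}
```

The base64 of the returned bytes is what would typically go into the MPD's per-system `pssh` element.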
…TY-2282)
Stop the dKMS quorum path from wedging and leaking processes under playback+reload, and make the local test suite pass off the Linux x86_64 gate.
- dkms-authority (Defect A): serve each accepted connection on its own thread (serve_unix_listener / serve_tcp_listener) so an idle/slow/leaked client can no longer head-of-line-block the daemon in read_frame; revoked_callers becomes daemon-lifetime Arc<Mutex> shared state (additive+idempotent); 30s per-conn read timeout on both transports. Regression test drives the real Unix accept loop (RED pre-fix, GREEN after); 35/35 green.
- key-provider: bound the Unix recover read in establish_dkms_session with the same DKMS_TCP_READ_TIMEOUT_MS (5s) the tcp/carrier branches use, so a wedged node fails closed within a bounded window. 18/18 green.
- dkms (Defect B): reap leaked quorum helper/provider processes -- add Drop for the helper Capsule (kills+reaps key-provider/decrypt-provider children on every path), and guard MediaAuthorityProc launch/launch_quorum with a ChildReaper so early-return/error paths no longer orphan the raw Child.
- browser: keep the runtime stream socket path within the macOS sun_path limit (104) -- fall back to a short "/tmp" base when temp_dir() would overflow, fixing the 6 browser-open route tests on macOS arm64 (Linux unaffected).
- test(elastos-server): key component-checksum fixtures by detect_platform() so verify/stamp and agent-binary tests run on any host without masking the check.
Squashed from: 50cdc46 0c22718 46a7ba4 d93f673 a9283b5
…view
Address correctness, security, and robustness findings across the DKMS and encryption/decryption workflows, each with regression tests.
- dkms-authority: revocations now share ONE live Arc<Mutex<HashSet>> across all connection threads (was a per-connection snapshot merged only on close), so a revoke binds every open connection immediately and "revocation outranks a live session" holds under concurrency. Unify the Unix/TCP accept loops into one generic serve_accept_loop with a MAX_ACTIVE_CONNECTIONS cap + RAII slot guard, bounding the thread/memory-exhaustion (slow-loris) vector on the network node.
- key-provider: distinguish a transport fault from a node rejection (NodeRecoverError). A warm pooled connection the node's idle timeout closed is re-established and retried ONCE; a genuine rejection still fails closed with no retry. Fixes the first open after a >30s idle gap failing below quorum.
- ddrm-media: drive the enca/encv choice off the authoritative hdlr handler type (fallback to an expanded audio-4CC allowlist), and make parse_codec_string use the same allowlist so the two classifiers can't diverge; an uncommon audio codec is no longer mis-signaled as video (non-compliant init + strip missize).
- ddrm-media-authority: read_dash_init propagates strip_cenc_signal errors instead of unwrap_or(raw), so a malformed init fails with a precise diagnosis rather than an opaque downstream decrypt/quorum failure.
- encrypt-provider: decode_kid16 validates length AND ASCII-hex charset before byte-slicing, rejecting a multibyte KID instead of panicking the capsule.
- elastos-server: browser stream sockets use a per-euid dir created 0700 and refuse any pre-existing dir not owned by us or group/other-writable, closing the world-writable /tmp squatting / socket-hijack vector.
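The `decode_kid16` fix (validate length and charset before slicing) can be sketched as follows. The function name comes from the commit message; the signature, the `String` error type, and the messages are assumptions made for illustration. The point is that slicing a `&str` at a fixed byte offset panics if the offset falls inside a multibyte character, so the ASCII check has to come first.

```rust
/// Decode a 32-character hex KID into 16 bytes, rejecting bad input with an
/// error instead of panicking. Signature and error type are illustrative.
pub fn decode_kid16(kid_hex: &str) -> Result<[u8; 16], String> {
    let bytes = kid_hex.as_bytes();
    if bytes.len() != 32 {
        return Err(format!("KID must be 32 hex chars, got {} bytes", bytes.len()));
    }
    // Reject anything outside [0-9a-fA-F]; this also rules out multibyte UTF-8.
    if !bytes.iter().all(|b| b.is_ascii_hexdigit()) {
        return Err("KID contains a non-hex character".to_string());
    }
    let mut out = [0u8; 16];
    for (i, slot) in out.iter_mut().enumerate() {
        // Safe to slice by byte offset: every byte is ASCII, so each index
        // is guaranteed to be a char boundary.
        *slot = u8::from_str_radix(&kid_hex[2 * i..2 * i + 2], 16)
            .map_err(|e| e.to_string())?;
    }
    Ok(out)
}
```

A 32-byte string that contains a two-byte character (for example `é` followed by 30 hex digits) passes the length check but fails the charset check, which is the case that would previously have hit the slicing panic.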
92573d2 to 6798942
The ci.yml on this line (inherited from flint-0.5) invokes 'just verify-ci' and 'just verify-capsules', but the justfile never got the recipes; both CI jobs fail on every run with 'justfile does not contain recipe'. Port the recipes from feat/ddrm-hardening-and-creator-parity, whose feature sets and paths all exist on this branch (verified locally: verify-capsules, alignment-check, command-smoke, candidate-command-audit all pass).
Summary
This branch brings the runtime to 0.5.0, headlined by standards-compliant MPEG-DASH / CENC media DRM (ELACITY-2283) and a hardening pass on the DKMS quorum key-authority plane (ELACITY-2282). Along the way it lands the creator publishing flow, per-capsule WASM resource limits, forensic A/V watermarking, a batch of serve/library performance work, and a security-review hardening pass.
`main` has been merged in (`--no-ff`) and all conflicts resolved.
What's in it
🎬 MPEG-DASH / CENC compliance (ELACITY-2283)
- Init segments carry CENC signaling (`encv`/`enca` sample entries carrying `sinf`/`schm`/`tenc`, `pssh` injection) so a stock CENC player / FFmpeg keys decryption off `tenc`.
- `enca`/`encv` choice is driven by the track's authoritative `hdlr` handler type.
- `ddrm-media`, `encrypt-provider`, `decrypt-provider`, `ddrm-envelope`, `elacity-player`, `ddrm-viewer`.
🔑 DKMS quorum reliability + hardening (ELACITY-2282)
🎨 Creator publishing flow
🧱 Per-capsule WASM resource limits
Engineacross capsules.🕵️ Forensic A/V variant watermarking
⚡ Performance
🛡️ Security-review hardening pass
A focused pass on the DKMS/encryption workflows, each with regression tests:
- `enca`/`encv` classification driven by the authoritative `hdlr` type, with an expanded codec fallback.
- `read_dash_init` propagates strip errors instead of masking them as opaque decrypt failures.
- `decode_kid16` rejects malformed (multibyte) KIDs instead of panicking the capsule.
- Browser stream sockets use per-euid `0700` dirs and refuse foreign-owned/writable dirs (closes a `/tmp` squatting vector).
📚 Audit & conformance
Merge with `main`
`main` was merged in (`--no-ff`); 25 files conflicted and were resolved considering both branches. Highlights:
- `wasm.rs`: unified ours' memory-clamp + epoch-termination with main's fuel metering, hostcall wiring, and wall-clock timeout into one `execute_wasm`.
- `gateway_browser_stream.rs`: folded ours' socket hardening into main's `browser_stream_socket_path(directory)` refactor (covers both stream + adapter-IPC sockets).
- `provider_resource.rs`: extended main's `peer` capability allowlist with ours' gossip ops.
- `gateway.rs`: env trusted-signer override (main) before the DID cache (ours).
- `runtime_control.rs`: main's portable `pid_is_alive()`; `chat/session.rs`: ours' fail-closed presence signing.
- `audit` recipe; `components.json` took `wallet-provider`.
Testing
- `elastos` workspace builds clean.
- `elastos-server` 928 tests pass; `elastos-compute` 16, `chat` 31, `ipfs-provider` 23, `dkms-authority` 27, `encrypt-provider` 31 (escrow) all pass.
- `components.json` validate.
Risk / compatibility
- One test (`recovery_kit_password_package_imports_with_password_only`) fails only under full-suite parallelism (passes in isolation); unrelated to this branch.