Phase 1: cluster write-federation - trust multiple writers (edge parse + master-trust script) #73
Open
ehsan6sha wants to merge 4 commits into
Edge (ipfs-cluster-container-init.d.sh): parse the new `ipfs-cluster-trustedpeers`
array from pools.fx.land/pools/{name}, fall back to the single `ipfs-cluster-peerid`
(backward-compatible), set consensus.crdt.trusted_peers to the full set, and keep the
bootstrap/tunnel/DNS pointed at the PRIMARY (first) peer so single-peer multiaddrs stay
valid. Deploys via OTA (watchtower + fula.sh) - no updater script.
Server op: update-scripts/phase-1-master-trust.sh appends a new writer's peer id to
CLUSTER_CRDT_TRUSTEDPEERS in the master systemd unit (Environment= + ExecStart -e);
additive, idempotent, backs up + restarts + verifies, halts without NEW_WRITER_PEERID.
Tests: tests/test-cluster-federation-parse.sh (jq parse + primary/split, 7/7) and
tests/test-phase-1-master-trust.sh (append to both lines, idempotency, halts, 6/6).
Part of #72.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
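The edge-side parse described above can be sketched roughly as follows. This is a minimal sketch, not the code in ipfs-cluster-container-init.d.sh: the JSON field names come from the commit message, while the function names (`trusted_peers_from_pool_json`, `to_csv`, `primary_peer`) are hypothetical and exist only for illustration. It assumes `jq` is installed and that the pool endpoint returns a flat JSON object.

```shell
# Sketch only. JSON field names are taken from the commit message above;
# the function names are hypothetical and not the script's real ones.

# Read pool JSON on stdin and print trusted peer ids, one per line:
# the new array when present and non-empty, else the legacy single field.
trusted_peers_from_pool_json() {
  jq -r '
    if ((."ipfs-cluster-trustedpeers" // []) | length) > 0
    then ."ipfs-cluster-trustedpeers"[]
    else (."ipfs-cluster-peerid" // empty)
    end'
}

# Join lines into the comma list intended for consensus.crdt.trusted_peers.
to_csv() { paste -sd, -; }

# The PRIMARY peer is the first entry of the comma list; only this value
# should be used for bootstrap/tunnel/DNS multiaddrs.
primary_peer() { printf '%s\n' "${1%%,*}"; }
```

A caller might pipe the endpoint response through `trusted_peers_from_pool_json | to_csv`, store the result as the trusted set, and pass it to `primary_peer` to get the one id that is safe to put in a single-peer multiaddr.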
…riter.sh)
Provisions a 2nd ipfs-cluster WRITER on a plain Ubuntu/Debian cloud box (no Fula
/uniondrive layout): installs Docker/curl/jq if missing, paths under /opt/fula-writer,
default kubo datastore (writer stores ~nothing via the tag:group allocator), mirrors the
master cluster env (secret=sha256(clustername), allocator, repl, FOLLOWERMODE=false),
auto-reads the master cluster/kubo identity + bootstrap addr from the pool endpoint,
joins via direct public bootstrap, prints the new cluster + kubo peer ids for
phase-1-master-trust.sh + the pool-server. Dry-run + halts without PUBLIC_HOST.
Tests: tests/test-phase-1-setup-writer.sh (dry-run: input validation, ip4/dns4 announce,
secret derivation, zero side effects - 7/7).
Part of #72.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
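Two of the derivations named in this commit, the secret and the ip4/dns4 announce address, can be illustrated as below. This is a sketch under assumptions, not the script's code: both function names are hypothetical, the secret is assumed to be the hex SHA-256 of the cluster name without a trailing newline, and 9096 (the usual ipfs-cluster swarm port) is assumed as the default port. It relies on `sha256sum` from GNU coreutils.

```shell
# Sketch only; names, newline handling and the default port are assumptions.

# secret = sha256(clustername), printed as 64 hex characters.
cluster_secret() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Build an announce multiaddr from PUBLIC_HOST: hosts made only of digits
# and dots are treated as IPv4 (/ip4/...), everything else as a DNS name
# (/dns4/...). This is a shape check, not full IPv4 validation.
announce_multiaddr() {
  case $1 in
    '' )         echo "PUBLIC_HOST is required" >&2; return 2 ;;
    *[!0-9.]* )  printf '/dns4/%s/tcp/%s\n' "$1" "${2:-9096}" ;;
    * )          printf '/ip4/%s/tcp/%s\n' "$1" "${2:-9096}" ;;
  esac
}
```

For example, `announce_multiaddr 203.0.113.7` would yield an `/ip4/...` address and `announce_multiaddr writer.example.org` a `/dns4/...` one; the empty-host branch mirrors the "halts without PUBLIC_HOST" behaviour in spirit only.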
…+ .env
Add update-scripts/lib/phase-common.sh - shared helpers for re-runnable phase scripts:
pc_load_env/pc_save_env (persist inputs; a CLI/env value wins over a saved one),
pc_prompt (interactive prompt showing the saved value as default, Enter keeps it;
non-interactive uses env/.env or halts - never guesses), pc_write_if_changed (rewrite +
restart only when the unit actually changed, with backup), detection helpers.
Refactor phase-1-setup-writer.sh + phase-1-master-trust.sh onto the lib: detect what is
already installed and skip/reuse it (Docker, kubo repo, cluster identity), rewrite systemd
units only when changed, prompt for params and remember them in ENV_FILE so a re-run just
updates what is needed.
Tests (all pass under WSL bash): test-phase-common 10/10, test-phase-1-setup-writer 9/9,
test-phase-1-master-trust 7/7 - incl. re-run-reuses-saved-value, non-interactive halt, and
a fixed set -u unbound-variable bug in a combined local declaration.
Part of #72.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
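The "rewrite only when changed, with backup" idea behind pc_write_if_changed might look something like the sketch below. The real helper's signature and backup naming are not shown in this PR text, so `write_if_changed_sketch`, its stdin-based interface, its return codes, and the `.bak` suffix are all assumptions made for illustration.

```shell
# Hypothetical stand-in for pc_write_if_changed; the real helper may differ.
# Usage: <new content on stdin> | write_if_changed_sketch TARGET
# Returns 0 if TARGET was created or rewritten, 1 if it was already identical.
write_if_changed_sketch() {
  target=$1
  tmp=$(mktemp) || return 2
  cat >"$tmp"                      # buffer the proposed content
  if [ -f "$target" ] && cmp -s "$tmp" "$target"; then
    rm -f "$tmp"
    return 1                       # unchanged: caller should skip the restart
  fi
  if [ -f "$target" ]; then
    cp -p "$target" "$target.bak"  # keep the previous version for rollback
  fi
  cat "$tmp" >"$target"            # write in place, preserving existing mode
  rm -f "$tmp"
  return 0
}
```

A caller would then gate the expensive step on the return code, for example running `systemctl daemon-reload` and a restart only when the function returns 0, which is what makes a re-run cheap when nothing changed.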
…n e2e suite
phase-1-setup-writer.sh fixes found by live e2e on a clean Ubuntu 24.04 box:
- bypass the kubo image auto-init entrypoint (--entrypoint ipfs); plain
  `docker run ... init` double-inits and fails on a fresh repo
- route `ipfs config` through the running daemon on re-runs (repo lock), one-shot
  otherwise (kubo_cfg helper); read peer id lock-free via jq
tests/e2e/phase-1: isolated-cluster e2e (sim master w/ shifted ports + trust preservation,
REAL setup-writer + master-trust runs, updated+old followers, drills D0-D4: failover write,
mixed-fleet, reconvergence, idempotent re-runs).
Result on test box: 14/14 pass.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
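The two fixes in this commit can be pictured with the sketch below. It is not the script's kubo_cfg helper: the function names, the container name argument, the `ipfs/kubo:latest` image tag, and the `/data/ipfs` mount point are assumptions for illustration. The first function needs only `jq`; the second needs Docker and is defined here without being run.

```shell
# Sketch only; names, image tag and mount point are assumptions.

# Read the kubo peer id straight from the repo's config file. This never
# takes the repo lock, so it works whether or not the daemon is running.
kubo_peer_id() {
  jq -r '.Identity.PeerID' "$1/config"
}

# Run `ipfs config ...`: through the running daemon's container when it is
# up (the daemon holds the repo lock), otherwise as a one-shot container
# that bypasses the image's auto-init entrypoint.
kubo_cfg_sketch() {
  name=$1 repo=$2
  shift 2
  if docker ps --format '{{.Names}}' | grep -qx "$name"; then
    docker exec "$name" ipfs config "$@"
  else
    docker run --rm --entrypoint ipfs -v "$repo:/data/ipfs" \
      ipfs/kubo:latest config "$@"
  fi
}
```

The same `--entrypoint ipfs` override would apply to the initial `init` call on a fresh repo, which is the double-init failure the first bullet describes.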
Phase 1 e2e - real-daemon acceptance: 14/14 PASS
Ran on a clean Ubuntu 24.04 x86 box (isolated test cluster, never touching prod): simulated master (prod-shaped systemd unit, shifted ports) + this PR's real
Fixes the e2e surfaced (in this PR)
Finding for a follow-up (not this PR)
Unit suites (
🤖 Generated with Claude Code
Part of #72.
What this delivers (standalone, verifiable)
Followers can trust more than one cluster writer, so a 2nd cloud writer's pins are accepted network-wide. Additive + backward-compatible - nothing changes until a 2nd writer is actually added and trusted.
Edge (deploys via OTA - no updater script)
File: docker/fxsupport/linux/ipfs-cluster/ipfs-cluster-container-init.d.sh
- Parses the new `ipfs-cluster-trustedpeers` array from the pool-info endpoint (companion: functionland/join-server#2), falling back to the existing single `ipfs-cluster-peerid` -> identical to today when the API returns only the legacy field.
- `consensus.crdt.trusted_peers` is set to the full set; `cluster.peer_addresses`, the `/x/fula-cluster` bootstrap and the DNS fallback all use the PRIMARY (first) peer, so single-peer multiaddrs never receive a comma-list (the easy-to-miss wiring bug).
Server op (non-OTA; master is systemd-managed)
update-scripts/phase-1-master-trust.sh - appends a new writer peer id to CLUSTER_CRDT_TRUSTEDPEERS in the master systemd unit (both Environment= and ExecStart -e), backs up, daemon-reload + restart, verifies. Additive, idempotent, reversible, halts without NEW_WRITER_PEERID. Datastore/identity/pinset untouched.
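The additive, idempotent append can be sketched as below. This is an illustration, not the contents of phase-1-master-trust.sh: the function name is hypothetical, the unit is assumed to carry the variable as `CLUSTER_CRDT_TRUSTEDPEERS=<comma list>` on both the Environment= and ExecStart lines, peer ids are assumed to be plain alphanumerics (safe inside a regex), and the backup suffix is invented. The daemon-reload, restart and verify steps are left out.

```shell
# Sketch only; function name, unit-file shape and .bak suffix are assumptions.
# Appends NEW to every CLUSTER_CRDT_TRUSTEDPEERS=... value in UNIT that does
# not already list it. Safe to re-run: a second call changes nothing.
append_trusted_peer() {
  unit=$1 new=$2
  if [ -z "$new" ]; then
    echo "NEW_WRITER_PEERID is required" >&2
    return 2
  fi
  key='CLUSTER_CRDT_TRUSTEDPEERS='
  val="[^[:space:]\"']"            # one character of the comma-separated value
  tmp=$(mktemp) || return 2
  # Skip lines that already contain NEW as a whole entry; on all other
  # lines, extend each existing value with ",NEW".
  sed -E "/${key}(${val}*,)?${new}(,|[[:space:]\"']|\$)/! s/(${key}${val}+)/\\1,${new}/g" \
    "$unit" >"$tmp"
  if cmp -s "$tmp" "$unit"; then
    rm -f "$tmp"                   # already trusted everywhere: no-op
    return 0
  fi
  cp -p "$unit" "$unit.bak"        # backup for rollback
  cat "$tmp" >"$unit"
  rm -f "$tmp"
}
```

Because the check is done per line, a unit where only one of the two lines was previously updated gets repaired on the next run instead of being skipped; restoring the `.bak` copy and restarting would be the corresponding rollback.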
Tests (pass locally under WSL bash + jq 1.7)
Data-safety
Additive trusted_peers only (pebble/pinset/identity/secret untouched); backups + documented rollback; edge change backward-compatible. Validate on one test device before fleet/OTA rollout.
Still pending in #72 (follow-ups, not in this PR)
Generated with Claude Code.