enm v0.5.229 — multi-chain UX overhaul + CR Council mode end-to-end#18

Merged
4HM3DMD merged 24 commits into main from enm-v229-council-mode
May 29, 2026

4HM3DMD commented May 27, 2026

Owner

Summary

Brings ENM to actual node.sh parity for both BPoS supernode and CR Council operators. Cumulative session work spanning v0.5.215 → v0.5.229(e). Verified live against pc2new mainnet.

Two devastating bug classes fixed end-to-end:

  1. Field-name typo: ENM read info.currentarbiters from getarbitersinfo, but the chain's actual JSON struct tag is arbiters (no "current" prefix). Verified against Elastos.ELA/servers/interfaces.go:884-892. Three call sites consumed a field that doesn't exist → empty array → every Council operator silently reported Inactive despite producing blocks.
  2. No CR Council awareness: /system/identity only ever queried getproducerinfo (BPoS producer registry) and never called listcurrentcrs. Every Council operator saw "BPoS supernode: not yet registered" regardless of their actual on-chain Committee binding. node.sh:1117 is the canonical reference: it queries listcurrentcrs separately from listproducers and surfaces both side by side.
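
A minimal sketch of how a consumer could guard against the field-name drift in item 1, assuming the getarbitersinfo result carries `arbiters` as an array of public-key strings. The helper name `readArbiters` and the error wording are hypothetical and are not ENM's actual code. The empty-string filter mirrors the defensive filter listed in this PR.

```javascript
// Hypothetical helper (not ENM's actual code): read the arbiter list from a
// getarbitersinfo result and fail loudly if the expected field is missing.
function readArbiters(info) {
  if (!info || typeof info !== 'object') {
    throw new TypeError('getarbitersinfo: result is not an object');
  }
  if (!Array.isArray(info.arbiters)) {
    // Listing the keys we did receive makes a renamed field obvious in logs.
    const keys = Object.keys(info).join(', ');
    throw new Error(`getarbitersinfo: missing "arbiters" array (got: ${keys})`);
  }
  // Drop empty-string slots rather than treating them as pubkeys.
  return info.arbiters.filter((pk) => typeof pk === 'string' && pk !== '');
}
```

Throwing on a missing field, instead of defaulting to an empty array, is the design choice that would have turned the silent "Inactive" report into a visible error.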

Plus a sanity catch: the "EnmRpcClient is not a constructor" regression. A bare require returned the whole module object instead of the named export, so every `new` call threw. This masked the field-name bug because every call short-circuited to its error fail-safe.
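
The constructor regression can be reproduced in isolation. The sketch below uses a stand-in object for what `require('./EnmRpcClient')` would return; the class body and the URL are placeholders chosen for illustration only.

```javascript
// Stand-in for the object `require('./EnmRpcClient')` returns when the class
// is a named export. The constructor argument is a placeholder.
class EnmRpcClient {
  constructor(url) {
    this.url = url;
  }
}
const moduleExports = { EnmRpcClient };

// Pre-fix pattern: the whole module object is treated as the class.
const WholeModule = moduleExports;
let preFixError = null;
try {
  new WholeModule('http://127.0.0.1:8080');
} catch (err) {
  preFixError = err; // a TypeError: a plain object cannot be constructed
}

// Fixed pattern: destructure the named export.
const { EnmRpcClient: Client } = moduleExports;
const client = new Client('http://127.0.0.1:8080');
```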

What's in this PR

  • Backend CR Committee plumbing — EnmRpcClient.listcurrentcrs/listnextcrs, new CrMembershipService (30s cache), cfg.global.council.installed schema, /install-council persists it
  • Frontend role-aware branching — 6 Council states + _renderCouncil() + Council-aware copy in identity card / chain card chip / overview summary
  • Safety cross-reference in spawn path — operator who unclaimed via Essentials but is still in the frozen nextarbiters[] slate gets demoted to FOLLOWER
  • Multi-chain overview redesign — mainchain hero + EVM cards with nested oracles + arbiter footer + responsive grid
  • Oracle + arbiter pairing on autoStart — full Council stack comes back together on reboot
  • Pubkey vs signing-address visual hierarchy
  • EVM reward address inline editor with EIP-55 validation + restart-now banner
  • Empty-string filter in arbiter arrays (defensive against interfaces.go:906-912 chain-side filter for non-Elected CRC members)
  • Deploy-marker graceful SIGTERM — fast-exit in ~1s; no more AppProcessManager 3-strike quarantine
  • False-positive F4 stall suppression
  • GET /system/role-debug diagnostic endpoint + Settings UI panel
  • npm run verify-rpc-shapes regression script (24 assertions)
  • Staged chain resume moved behind Danger Zone toggle
  • "Supernode" vocabulary cleaned up from role-neutral generic copy
  • Inline file:line citations to Elastos.ELA source

Also (separate small concern bundled): src/gui/src/UI/UIDesktop.js: gate the consumer Puter welcome modal off via PC2_SHOW_WELCOME=false.

Schema additions (backwards-compat)

  • cfg.global.council.installed (boolean, default false) + installedAt
  • miner.chainState / miner.isOnDuty / miner.inNextRotation (derived on /chains/:id for class B)
  • crMember + setupRole on /identity, /system/identity, /system/council-status, /chains/mainchain

Out of scope (deferred)

  • stdio:'ignore' on chain spawn — follow-up commit
  • F-rule for CR MemberState=Inactive — missing feature, documented inline
  • Shared identity cache (3-card fetch dedup)
  • Reorg-aware CrMembershipService cache invalidation

Test plan

  • node enm-server/scripts/verify-rpc-shapes.js → 24/24 pass
  • Live deploy on srv1682299 (pc2new) — all 8 chain processes alive, fast-exit verified at t=1s, /system/role-debug returns the right raw shape
  • currentCommitteeSize: 12 matches mainnet CR Committee size
  • Operator unclaim → chainState: "standby" correctly reflects nextarbiters queue position
  • Re-claim via Essentials → verify on-duty branch lights up
  • Hard browser refresh → visual UI verification

🤖 Generated with Claude Code

Elastos DAO and others added 3 commits May 27, 2026 22:19
Cumulative session work spanning v0.5.215 → v0.5.229 (e). Brings ENM
into actual node.sh parity for both BPoS supernode and CR Council
operators. Verified live against pc2new mainnet.

## Headline fixes

* The smoking-gun field-name typo in `getarbitersinfo` consumption.
  ENM read `info.currentarbiters`; the chain's JSON tag is `arbiters`
  (verified against Elastos.ELA/servers/interfaces.go:884-892, the
  `arbitersInfo` struct). Three call sites
  (EvmSidechainAdapter.detectProducerRole, routes/chains.js rotation
  endpoint, EnmRpcClient JSDoc) all read a field that doesn't exist →
  empty array → every Council operator silently reported as Inactive
  despite producing blocks. Verified by live curl 2026-05-27.

* The `EnmRpcClient is not a constructor` regression. The class is a
  named export; pre-fix the bare `const X = require('./X')` returned
  the whole module object → `new X(...)` threw → every
  detectProducerRole + every CrMembershipService call fell through to
  its error fail-safe → every UI surface showed "unknown" for hours
  before the field-name bug was diagnosed.

* Council membership detection wired end-to-end. New EnmRpcClient
  methods listcurrentcrs + listnextcrs (cite interfaces.go:2159-2179
  for the response shape). New CrMembershipService.detectCrMembership
  (30s cache, mirroring v228d's _producerRoleCache pattern). New
  schema field cfg.global.council.installed persisted by the
  /install-council orchestrator. /system/identity +
  /system/council-status + /chains/mainchain + /identity
  (settings-side) all extended with crMember + setupRole.

* Cross-reference safety in EvmSidechainAdapter.start. When
  setupRole='council' AND !isCrMember (operator unclaimed via
  Essentials) AND inNext=true (chain slate frozen with their pubkey
  still queued), shouldMine demotes to false. Honors operator intent
  (the unclaim) over the chain's frozen slate; the chain's PBFT layer
  would refuse Seal() anyway, so spawning with --mine just wastes
  CPU + log noise.

* Frontend role-aware branching. validator-registration-card now has
  6 new STATE_COUNCIL_* states + a _renderCouncil() method covering
  elected / inactive / impeached / next-term / unclaimed / no-term.
  node-identity-card subtitle + pubkey hint Council-aware. chain-card
  chip prefers crMember.state ("Council · Elected") over BPoS
  producerState ("Active"). Rotation strip appends "(unclaim
  pending; slate freeze in effect until next compute)" when an
  operator has unclaimed but is still queued.

* Pubkey vs signing-address visual hierarchy. NodePublicKey is now
  the primary identity surface (large value font, accent ring, big
  copy button); the keystore-derived signing address is demoted
  (smaller font, muted, compact padding). Pre-229 they rendered
  identically at 12px and competed for attention. Operator directive
  ("we need the public address more than the wallet address") drove
  this.
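
The cross-reference safety rule in EvmSidechainAdapter.start reduces to a three-way conjunction. A minimal sketch, assuming the three inputs are already resolved; `wantsMine` is a hypothetical parameter standing in for whatever the adapter would otherwise decide.

```javascript
// Sketch of the demotion rule. `wantsMine` is a hypothetical input that stands
// in for the adapter's default mining decision.
function shouldMine({ setupRole, isCrMember, inNext, wantsMine }) {
  // Operator unclaimed (no longer a CR member) but the frozen slate still
  // lists their pubkey: honor the unclaim and run as a follower.
  const unclaimedButQueued =
    setupRole === 'council' && !isCrMember && inNext === true;
  return unclaimedButQueued ? false : Boolean(wantsMine);
}
```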

## Operational hardening

* Multi-chain overview redesign (mainchain hero card + EVM-with-
  nested-oracle cards + arbiter compact footer + responsive grid).
  Usage cards row tightened ~40% default density + 4-tier responsive
  ladder (4-col / 2x2 / single-row-no-subs).

* Oracle + arbiter pairing on autoStart. If the EVM parent is
  enabled in cfg, the oracle is auto-included in the boot list even
  with cfg.oracle.enabled=false. If the mainchain + esc + eid + pg
  quartet is enabled, the arbiter is auto-included. Closes the "half-
  running stack" pattern where operators set enable flags wrong and
  the chain ran without its oracle.

* Cascade-start on manual /chains/:id/start for EVM chains: starts
  the matching oracle alongside (best-effort, non-blocking).

* Deploy-marker graceful SIGTERM in server.js. When
  .enm-deploy-in-progress is present, ENM fast-exits in ~1s
  (process.exit(0) after marking chains as manualStop, skipping the
  120s drain). SIGTERM-driven deploys no longer trip pc2-node's
  3-strike AppProcessManager quarantine.

* False-positive stall detector (F4) suppression: when local height
  is at or within 1 block of the network's max known height, no
  alert fires. Pre-fix, a 14-min mainnet quiet window during which
  ALL peers held the same block triggered a "restart your chain"
  proposal.

* EVM reward address inline editor (per-chain) + EVM Shared Settings
  page (apply to all 3 EVMs at once). EIP-55 validation
  client + server side. Restart-now banner after save when chain is
  alive. The reward address is the only operator-set EVM knob;
  everything else (mining, sync mode default) is derived.

* Staged chain resume gated behind a Danger Zone toggle (was
  auto-applied per-chain bash script — destructive). Operator now
  consents explicitly before any pc2-node-restart cascade.
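
The oracle + arbiter pairing rule above can be sketched as a pure function over the config. This is an illustration only: the flat `cfg[id].enabled` layout and the `<chain>-oracle` process ids are assumptions, and the sketch covers just the pairing rule, not explicit oracle or arbiter flags.

```javascript
// Assumed ids: three EVM chains, each with a `<id>-oracle` companion.
const EVM_CHAINS = ['esc', 'eid', 'pg'];

// Hypothetical helper: derive the autoStart boot list from enable flags.
function buildBootList(cfg) {
  const enabled = (id) => Boolean(cfg[id] && cfg[id].enabled);
  const boot = [];
  if (enabled('mainchain')) boot.push('mainchain');
  for (const id of EVM_CHAINS) {
    if (enabled(id)) {
      // The oracle rides along with its EVM parent regardless of its own flag.
      boot.push(id, `${id}-oracle`);
    }
  }
  // The arbiter joins only when the full quartet is enabled.
  if (enabled('mainchain') && EVM_CHAINS.every((id) => enabled(id))) {
    boot.push('arbiter');
  }
  return boot;
}
```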

## Diagnostics & regression guards

* New GET /system/role-debug endpoint (owner-gated). Dumps the raw
  chain RPC responses ENM consumes (getarbitersinfo, listcurrentcrs,
  getproducerinfo for the operator's pubkey) alongside ENM's parsed
  view. Operator (or future dev) can diff in 60s to spot the next
  field-name drift. Bypasses CrMembershipService cache for fresh data.
  Surfaced in the UI as Settings → Identity → "Role debug" panel
  with copy-JSON button.

* enm-server/scripts/verify-rpc-shapes.js: static fixture-based
  regression script (24 assertions). Asserts that the response field
  names ENM reads MATCH what ELA's Go struct tags emit. Run
  `npm run verify-rpc-shapes` from the enm-server dir.

* Inline file:line citations to Elastos.ELA source in EnmRpcClient
  JSDoc + EvmSidechainAdapter.detectProducerRole + CrMembershipService.
  The class-of-bug ("JSDoc was wrong, field name typo cascaded into
  three consumers") cannot recur silently.
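
A fixture-based shape check of the kind described above might look like the following sketch. The field names come from this PR's description; the `EXPECTED_FIELDS` table, the helper name, and the fixture layout are hypothetical and do not reproduce the actual verify-rpc-shapes.js.

```javascript
// Field names taken from this PR's description; everything else is illustrative.
const EXPECTED_FIELDS = {
  getarbitersinfo: ['arbiters', 'nextarbiters'],
  listcurrentcrs: ['crmembersinfo'],
};

// Returns the expected field names that are absent from a recorded fixture.
function missingFields(method, fixture) {
  const expected = EXPECTED_FIELDS[method];
  if (!expected) throw new Error(`no expectations recorded for ${method}`);
  return expected.filter((name) => !(name in fixture));
}
```

An empty result means the fixture still matches what the consumer reads; a non-empty one names the drifted field directly.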

## Vocabulary cleanup

* "Supernode" purged from role-neutral generic copy (Security tab
  intro, Advanced settings warning, Danger Zone nuke warning, setup
  wizard password warning). BPoS-card-specific "supernode" strings
  retained — they're shown only in the BPoS branch.

* Settings → "EVM chains" section relabel "Mining" → "Validator
  status (per chain)" with 5-state taxonomy (On-duty / Standby /
  Inactive / Detecting / Follower) — same vocabulary used in the
  per-chain Validator badge.

## Schema additions (backwards-compat; no field removed)

* cfg.global.council.installed (boolean, default false)
* cfg.global.council.installedAt (timestamp)
* miner.chainState / miner.isOnDuty / miner.inNextRotation (derived
  fields on /chains/:id for class B chains)
* crMember + setupRole (on /identity, /system/identity,
  /system/council-status, /chains/mainchain)

## Out of scope

* Shared identity cache to dedup the 3-card /system/identity fetch
  (deferred; cheap enough).
* Reorg-aware CrMembershipService cache invalidation (deferred; rare
  edge case).
* F-rule for CR MemberState=Inactive (missing feature parallel to
  F12; documented inline for the next F-rule pass).
* stdio:'ignore' on chain spawn (children currently die from SIGPIPE
  when ENM fast-exits; autoStart respawns within ~60s; proper fix
  is `stdio:'ignore'` on detached spawn so children fully detach
  from ENM's stdio fds — follow-up commit).

## Also in this PR (small, separate concern)

* src/gui/src/UI/UIDesktop.js: gate the consumer Puter "Welcome to
  your Personal Internet Computer" modal off with
  PC2_SHOW_WELCOME=false. PC2 is a node-management appliance, not a
  consumer desktop; the modal's copy is wrong for it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-ups to v0.5.229.

## 1. stdio:'ignore' on detached chain spawn (NativeProcessService:567)

Pre-230 ENM spawned children with `stdio:['pipe','pipe','pipe']`. The
stdin pipe was used (ela reads its keystore password from stdin per
node.sh:878 + the BPoS arbiter mode password feed in
ElaMainChainAdapter), but stdout/stderr pipes were never read on the
ENM side — they were just held open for the child's lifetime, with
ENM's runtime as the other end.

Consequence: every ENM restart (deploy SIGTERM, crash, OOM) closed
the stdout/stderr pipe FDs. The child's next write to stdout/stderr
delivered SIGPIPE → default handler is terminate → ALL 8 child
chains died on every ENM restart, even with detached:true. autoStart
then respawned them ~60s later via the oracle/arbiter pairing logic
shipped in v0.5.228. Operator-visible as "chains briefly down on
every ENM deploy."

Fix: `stdio:['pipe','ignore','ignore']`. stdin stays a pipe for the
password feed; stdout/stderr resolve to /dev/null inside the child.
The child can write to stdout/stderr forever without anyone closing
on them; ENM exiting becomes invisible to the child's stdio.
Combined with detached:true + child.unref(), children are now truly
long-lived across ENM lifecycle events. Chain binaries already write
their own logs via --logdir / --log flags, so dropping the unused
pipes loses nothing in observability terms.
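
A minimal sketch of the spawn shape described above, using Node's child_process API. The wrapper name `spawnChain` and its parameters are hypothetical; only the `stdio`, `detached`, and `unref()` combination comes from the description.

```javascript
const { spawn } = require('node:child_process');

// stdin stays a pipe for the password feed; stdout/stderr are discarded so the
// child never holds a pipe whose other end dies with the parent.
function chainSpawnOptions() {
  return { detached: true, stdio: ['pipe', 'ignore', 'ignore'] };
}

// Hypothetical wrapper; `bin`, `args` and `password` are supplied by the caller.
function spawnChain(bin, args, password, onError = () => {}) {
  const child = spawn(bin, args, chainSpawnOptions());
  child.on('error', onError);         // e.g. ENOENT when the binary is missing
  child.stdin.on('error', onError);   // e.g. EPIPE if the child exits early
  child.stdin.write(`${password}\n`); // feed the keystore password
  child.stdin.end();                  // close our end of the only remaining pipe
  child.unref();                      // let the parent exit without waiting
  return child;
}
```

Closing stdin after the password write matters here: an open pipe would keep a handle alive in the parent even after `unref()`.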

## 2. F28 — CR Council MemberState degraded (parallel to F12)

F12 fires when a BPoS producer-registry record shows state='Inactive'
(producer skipped rotation slots for too many consecutive rounds).
F28 is its CR Council sibling: fires when this node's CR Committee
record (in `listcurrentcrs.crmembersinfo[]`) shows MemberState !=
'Elected'. Same risk class (missed rounds → lost rewards → eventual
seat loss); same NEVER_AUTOMATIC tier (ENM can't recover — Activate
requires the operator's owner key).

Cited file:line for the MemberState enum:
  Elastos.ELA/cr/state/keyframe.go:24-42

State decision table:
  Elected     → quiet (steady state)
  Inactive    → WARN if impeachmentVotes=0, CRITICAL if >0
  Impeached   → CRITICAL (seat lost for term)
  Returned    → CRITICAL (voluntary withdrawal)
  Terminated  → CRITICAL (term ended)
  Illegal     → CRITICAL (caught misbehaving — deposit forfeited)
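
The decision table can be sketched as a small pure function. The `snap.cr.state` and `snap.cr.impeachmentVotes` field names are assumed from the surrounding description, and the WARN fallback for an unrecognized state is this sketch's own assumption rather than documented behavior.

```javascript
const CRITICAL_STATES = new Set(['Impeached', 'Returned', 'Terminated', 'Illegal']);

// Returns null (quiet), 'WARN' or 'CRITICAL' for a health snapshot.
function f28Severity(snap) {
  if (!snap || snap.chainId !== 'mainchain') return null; // mainchain only
  const cr = snap.cr;
  if (!cr || cr.isCrMember !== true) return null;         // CR members only
  if (cr.state === 'Elected') return null;                // steady state
  if (cr.state === 'Inactive') {
    return Number(cr.impeachmentVotes) > 0 ? 'CRITICAL' : 'WARN';
  }
  if (CRITICAL_STATES.has(cr.state)) return 'CRITICAL';
  return 'WARN'; // assumption: surface unknown non-Elected states conservatively
}
```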

Wiring:
- snap.cr populated by HealthChecker._fetchCrState (thin wrapper over
  CrMembershipService.detectCrMembership; reuses 30s cache)
- Hard-gated to mainchain (snap.chainId === 'mainchain')
- Hard-gated to actual CR members (cr.isCrMember === true)
- Defensive null-guard pattern same as F12's null-guard on
  snap.bpos.producer
- Filtered into HealthChecker's rule-runner alongside F12 + F25

Operator-facing copy is state-specific so the right recovery hint
surfaces in the audit log. Activate/Recover from Inactive points at
Essentials (where the operator's owner key lives).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On 2026-05-27 17:32:45 UTC, F26 (evm-fork-resync) auto-fired on EID at
block 27,835,801 / network tip 27,847,941 — that's 12k blocks behind tip
(~16h slow-sync lag, 99.96% synced) — and destroyed 1.3GB of chaindata
to "recover" from a fork that didn't exist. Post-wipe EID is now stuck
at block 574,384 (2% synced), proving the underlying stall wasn't a
fork at all — the wipe was treating an alarm, not a fire.

Forensics: ENM SQLite audit log records F26 has fired automatically
TWICE on this chain in the last 3 days (2026-05-25 04:05 + 2026-05-27
17:32), both times at heights >99.8% of network tip.

This patch removes every path that could let F26 do that again:

LAYER 1 — never auto-execute (HealthRules.js)
  - RULE_METADATA.F26.tier: AUTOMATED_SAFE → OWNER_CONFIRMS
  - detectF26 return tier: AUTOMATED_SAFE → OWNER_CONFIRMS
  - Operator confirms every destructive resync; no exceptions. The
    24h rate-limit + escalation logic stays in place to add context.

LAYER 2 — near-tip safety gate (HealthRules.js detectF26)
  - New F26_NEAR_TIP_BLOCKS_GUARD = 100_000 blocks (~5.8d of 5s blocks).
  - Reads snap.rpcSummary.peerMaxHeight or .networkHeight (already
    populated by EvmSidechainAdapter from eth_syncing.highestBlock).
  - If (peerTip - localHeight) < threshold, return null. Fail safe:
    if peerTip is unobservable, also return null (manual Chain Resync
    is the override path for the rare geth-claims-synced-but-forked
    case).
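
A sketch of the near-tip gate, assuming `rpcSummary` exposes `peerMaxHeight` or `networkHeight` as plain numbers. The function name is hypothetical; the constant and the fail-safe behavior follow the description above.

```javascript
const F26_NEAR_TIP_BLOCKS_GUARD = 100_000;

// Returns true only when the chain is provably far enough behind the peer tip
// for a resync proposal to be considered at all.
function passesNearTipGate(localHeight, rpcSummary) {
  const peerTip = rpcSummary?.peerMaxHeight ?? rpcSummary?.networkHeight;
  // Fail safe: an unobservable tip or height never permits a proposal.
  if (!Number.isFinite(peerTip) || !Number.isFinite(localHeight)) return false;
  return peerTip - localHeight >= F26_NEAR_TIP_BLOCKS_GUARD;
}
```

Run against the 2026-05-27 EID heights (27,835,801 local, 27,847,941 tip), the delta is 12,140 blocks, so the gate refuses.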

LAYER 3 — multi-tick consecutive-signature gate (HealthChecker +
    HealthRules.js)
  - New F26_CONSECUTIVE_TICKS_MIN = 3 (~90s of unbroken evidence).
  - HealthChecker maintains s.evmForkDetectedConsecutive: increments
    on positive probe, resets to 0 on negative probe OR any height
    advance OR no-stall (so a brief height blip clears it).
  - detectF26 requires the counter to reach the threshold before
    proposing a wipe.
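
The consecutive-tick counter could be maintained along these lines. The shape of the per-tick `observation` object is an assumption; the counter name and threshold follow the description above.

```javascript
const F26_CONSECUTIVE_TICKS_MIN = 3;

// Updates the per-chain state `s` for one tick and reports whether the
// threshold of unbroken evidence has been reached.
function updateForkCounter(s, observation) {
  const { probePositive, heightAdvanced, stalled } = observation;
  if (probePositive && !heightAdvanced && stalled) {
    s.evmForkDetectedConsecutive = (s.evmForkDetectedConsecutive || 0) + 1;
  } else {
    // Negative probe, any height advance, or no stall clears the evidence.
    s.evmForkDetectedConsecutive = 0;
  }
  return s.evmForkDetectedConsecutive >= F26_CONSECUTIVE_TICKS_MIN;
}
```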

LAYER 4 — stronger log signature (HealthChecker._probeEvmForkSignal)
  - DOWNLOADER_MIN_HITS bumped 3 → 10. Three "retrieved hash chain is
    invalid" lines is well within the noise floor of a peer churn.
  - All hits must be timestamped within the last 10 minutes
    (geth-format [MM-DD|HH:MM:SS.mmm] parser). Old log lines lingering
    in the 64KB tail can no longer trip the wipe.
  - Read buffer bumped 64 KB → 256 KB so the time-window filter has
    enough surface to find real hits in a verbose chain.
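
A sketch of the time-windowed log probe. It assumes timestamps in the `[MM-DD|HH:MM:SS.mmm]` form are UTC and belong to the current year, which may not hold on every host; the function names are hypothetical. Only hits inside the window are counted, so stale lines cannot contribute.

```javascript
const DOWNLOADER_MIN_HITS = 10;
const WINDOW_MS = 10 * 60 * 1000;
const SIGNATURE = 'retrieved hash chain is invalid';
const GETH_TS = /\[(\d{2})-(\d{2})\|(\d{2}):(\d{2}):(\d{2})\.(\d{3})\]/;

// Assumption: log timestamps are UTC and fall in the same year as `now`.
function parseGethTimestamp(line, now) {
  const m = GETH_TS.exec(line);
  if (!m) return null;
  const [month, day, hour, minute, second, ms] = m.slice(1).map(Number);
  return Date.UTC(now.getUTCFullYear(), month - 1, day, hour, minute, second, ms);
}

function countRecentHits(logTail, now = new Date()) {
  let hits = 0;
  for (const line of logTail.split('\n')) {
    if (!line.includes(SIGNATURE)) continue;
    const ts = parseGethTimestamp(line, now);
    if (ts === null) continue; // unparseable lines never count
    const age = now.getTime() - ts;
    if (age >= 0 && age <= WINDOW_MS) hits += 1;
  }
  return hits;
}

function forkSignalPresent(logTail, now = new Date()) {
  return countRecentHits(logTail, now) >= DOWNLOADER_MIN_HITS;
}
```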

LAYER 5 — pre-execution sanity recheck (SelfHealingEngine)
  - New _preWipeRecheck() runs at the moment the operator clicks
    confirm. Polls eth_blockNumber via EthRpcClient; if currentHeight
    > stuckHeight + 50 blocks, aborts with audit log:
      "Aborted pre-wipe: chain advanced from stuck height X to Y
       since the proposal was raised"
  - Also aborts if RPC is unreachable at confirm time (we can't prove
    the condition still exists; refuse to destroy data).
  - Closes the gap where an operator confirms a proposal that's been
    sitting in the dashboard for hours while peers re-converged.
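
The confirm-time recheck splits naturally into a pure decision and an async fetch. In this sketch `fetchBlockNumber` is a hypothetical async function standing in for the eth_blockNumber call; the 50-block tolerance and the abort wording follow the description above.

```javascript
const PRE_WIPE_ADVANCE_TOLERANCE = 50;

// Pure decision step. `currentHeight` is null when the RPC could not be reached.
function decidePreWipe(stuckHeight, currentHeight) {
  if (!Number.isFinite(currentHeight)) {
    return { proceed: false, reason: 'Aborted pre-wipe: RPC unreachable at confirm time' };
  }
  if (currentHeight > stuckHeight + PRE_WIPE_ADVANCE_TOLERANCE) {
    return {
      proceed: false,
      reason: `Aborted pre-wipe: chain advanced from stuck height ${stuckHeight} to ${currentHeight} since the proposal was raised`,
    };
  }
  return { proceed: true, reason: null };
}

// `fetchBlockNumber` is a hypothetical stand-in for an eth_blockNumber call.
async function preWipeRecheck(stuckHeight, fetchBlockNumber) {
  let currentHeight = null;
  try {
    currentHeight = await fetchBlockNumber();
  } catch (_err) {
    currentHeight = null; // treated as unreachable: refuse to destroy data
  }
  return decidePreWipe(stuckHeight, currentHeight);
}
```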

LAYER 6 — preserve nodekey across resync (EnmMaintenanceManager)
  - Before the rm sweep, copies data/{geth|pgp}/nodekey to a dotfile
    OUTSIDE the chaindata dir. Restores it after the wipe, before
    geth starts. Network identity stays stable; peers in our address
    book re-add us in seconds, not the 5-10 min that peer discovery
    takes for a brand-new node ID.
  - Verified against the 2026-05-27 wipe: the regenerated nodekey at
    17:32:45 was a contributing factor to the slow post-wipe peer
    convergence.
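
A sketch of the backup/wipe/restore order using Node's fs module. The helper name and both path parameters are hypothetical; the caller is assumed to pass a `backupPath` outside the directory being wiped.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// `dataDir` is the directory that holds `nodekey`; `backupPath` must live
// outside it so the recursive delete cannot touch the copy.
function wipePreservingNodekey(dataDir, backupPath) {
  const nodekeyPath = path.join(dataDir, 'nodekey');
  const hadKey = fs.existsSync(nodekeyPath);
  if (hadKey) fs.copyFileSync(nodekeyPath, backupPath);

  fs.rmSync(dataDir, { recursive: true, force: true });
  fs.mkdirSync(dataDir, { recursive: true });

  if (hadKey) {
    fs.copyFileSync(backupPath, nodekeyPath);
    fs.chmodSync(nodekeyPath, 0o600); // keep the key private
    fs.rmSync(backupPath);
  }
  return hadKey;
}
```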

NOT in this PR (separate work):
  - The underlying "EID stalls at block N" problem is independent of
    F26 — these fixes only prevent F26 from making it worse. The
    actual stall investigation (peer quality / DPoS state / SPV
    health) is a separate task.
  - Move-not-delete wipes + diagnostic dump on detect (Phase 3 of
    the plan) deferred — current changes are enough to make
    false-positive wipes impossible without operator complicity.

Tested:
  - node --check on all four modified files: passes
  - Hand-traced detectF26 against the 2026-05-27 EID conditions:
    near-tip gate (100k > 12k delta) blocks the fire even without
    the other new gates kicking in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4HM3DMD commented May 27, 2026

Owner Author

v0.5.231 added — F26 hardening (commit 50bf9de)

Investigation: on 2026-05-27 17:32:45 UTC, the F26 self-heal rule auto-fired on EID at block 27,835,801 / network tip 27,847,941 — only 12k blocks behind tip (~16h slow-sync lag, 99.96% synced) — and destroyed 1.3 GB of chaindata to "recover" from a fork that didn't exist. Post-wipe EID is now stuck at block 574,384 (2% synced), proving the underlying stall wasn't a fork at all.

Audit log shows F26 has auto-fired twice in the last 3 days (2026-05-25 04:05 + 2026-05-27 17:32), both times at heights >99.8% of network tip. This is the bug.

The six layers added in v0.5.231

| # | Layer | Effect |
| --- | --- | --- |
| 1 | Tier demote | F26 RULE_METADATA.tier + detectF26 return tier: AUTOMATED_SAFE → OWNER_CONFIRMS. F26 NEVER auto-executes a destructive resync again. Operator confirms every wipe. |
| 2 | Near-tip gate | New F26_NEAR_TIP_BLOCKS_GUARD = 100_000. If peerTip - localHeight < threshold, refuse to propose. Fail-safe: also refuse if peerTip unobservable. |
| 3 | Multi-tick gate | New F26_CONSECUTIVE_TICKS_MIN = 3. The fork log signature must persist across ≥3 consecutive medium ticks (~90s) before proposing. HealthChecker owns s.evmForkDetectedConsecutive, resets on negative probe or any height advance. |
| 4 | Stronger probe | _probeEvmForkSignal: DOWNLOADER_MIN_HITS 3 → 10, all hits must fall inside the last 10 min by geth-log timestamp parsing. Old log noise in the 64 KB tail no longer trips a wipe. Read buffer 64 KB → 256 KB so the time filter has real surface. |
| 5 | Pre-exec recheck | New SelfHealingEngine._preWipeRecheck(). At confirm-click time: re-polls eth_blockNumber, aborts if currentHeight > stuckHeight + 50 or RPC unreachable. Audit log records the abort reason. Closes the gap where a proposal sat in dashboard for hours while the chain recovered. |
| 6 | Nodekey preservation | EnmMaintenanceManager.chainResync now backs up `data/{geth |

What this PR closes

Hand-traced against the 2026-05-27 EID conditions: with v0.5.231, the near-tip gate alone (100k > 12k delta) blocks the auto-fire. Each remaining layer adds independent veto power — even if one is bypassed by a future code change, the others still hold.

What's STILL open (separate work)

  • EID is still stuck at block 574,384. v0.5.231 prevents F26 from making it worse, but doesn't fix the underlying stall. Need to investigate peer quality, DPoS state, or SPV health separately.
  • Move-not-delete wipes (Phase 3 of the plan) — deferred; current layers are enough to make false-positive wipes impossible without operator complicity.

Diff

5 files changed, 323 insertions(+), 47 deletions(-)

  • enm-server/package.json
  • enm-server/src/services/HealthRules.js (+128/-30)
  • enm-server/src/services/HealthChecker.js (+88/-9)
  • enm-server/src/services/SelfHealingEngine.js (+91/-0)
  • enm-server/src/services/EnmMaintenanceManager.js (+61/-0)

Before this commit the ENM app exposed FIVE overlapping destructive
endpoints with conflicting semantics + the "another pc2 inside the app"
reload bug after a full wipe:

  1. POST /maintenance/chain-resync   wipes ONE chain  (hardcoded mainchain)
  2. POST /maintenance/uninstall      removes bundle, keeps data
  3. POST /maintenance/nuke           removes bundle + wipes ALL data + keystore
  4. POST /identity/reset             rotates keystore standalone
  5. (PC2 desktop right-click → Uninstall, separate flow)

Council operators couldn't resync EID/ESC/PG (the card was hardcoded to
mainchain). Nuke deleted the bundle, leaving the iframe URL with nothing
to serve — pc2-node's fallback then loaded the pc2 desktop root INTO
the orphaned ENM iframe, the symptom the operator's been calling
"another pc2 inside the app". Identity reset rotated the keystore but
left the on-chain producer/CR registration orphaned (new pubkey doesn't
match), guaranteeing a follow-up wizard walk anyway.

v0.5.232 collapses these into two in-app actions:

  POST /maintenance/chain-resync     accepts { chainIds:[...], confirm:"RESYNC" }
                                     OR legacy { chainId, confirm:chainId }.
                                     Rejects arbiter + *-oracle (no chaindata).

  POST /maintenance/reset-everything (NEW) wipes ALL data and SIGKILLs ENM
                                     but KEEPS the bundle + the installed_apps
                                     row, so pc2-node's process supervisor
                                     respawns ENM in place with empty data —
                                     setup wizard reappears, iframe never
                                     loses its server. Confirm: "RESET EVERYTHING".

And retires three endpoints with 410 Gone responses pointing operators
at the right replacement:

  POST /maintenance/uninstall      → 410 "use pc2 desktop right-click"
  POST /maintenance/nuke           → 410 "use Settings → Reset ENM"
  POST /identity/reset             → 410 "use Settings → Reset ENM"
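
The two accepted request shapes for chain-resync can be validated with a sketch like the one below. It is written in plain JavaScript rather than the schema library the server uses, the allowlist of resyncable chains is inferred from the chain ids named in this PR, and the error strings are invented for illustration.

```javascript
// Assumed allowlist: arbiter and the *-oracle processes hold no chaindata.
const RESYNCABLE_CHAINS = new Set(['mainchain', 'esc', 'eid', 'pg']);

function validateChainResync(body) {
  if (!body || typeof body !== 'object') {
    return { ok: false, error: 'body must be an object' };
  }
  const multi = Array.isArray(body.chainIds);
  const single = typeof body.chainId === 'string';
  if (!multi && !single) {
    return { ok: false, error: 'chainId or chainIds is required' };
  }
  // Multi-chain uses the fixed phrase; legacy single-chain echoes the chain id.
  const chainIds = multi ? body.chainIds : [body.chainId];
  const expectedConfirm = multi ? 'RESYNC' : body.chainId;
  if (chainIds.length === 0) {
    return { ok: false, error: 'at least one chain is required' };
  }
  const rejected = chainIds.filter((id) => !RESYNCABLE_CHAINS.has(id));
  if (rejected.length > 0) {
    return { ok: false, error: `cannot resync: ${rejected.join(', ')}` };
  }
  if (body.confirm !== expectedConfirm) {
    return { ok: false, error: 'confirmation phrase mismatch' };
  }
  return { ok: true, chainIds };
}
```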

FRONTEND (Settings → Danger Zone): three cards instead of five.

  - Update (unchanged)
  - Resync chain data (NEW: mode-aware)
      Pre-v0.5.232 was hard-coded to mainchain. Now reads setupRole from
      /system/identity:
        BPoS   → single "Resync mainchain" button + typed-confirm "mainchain"
        Council → checkbox list {mainchain, esc, eid, pg} (default all),
                  + typed-confirm "RESYNC" + "Resync selected chains" button
        unknown → falls back to BPoS variant (safest default)
      Multi-chain runs serially through chainResync() in the backend;
      each chain gets its own audit-log row.
  - Reset ENM (NEW: full wipe, in-place restart)
      Replaces the old Remove app + Nuke + Reset keystore cards. After
      200 OK, the frontend setTimeout(location.reload, 6000) so the
      wizard appears automatically as soon as pc2-node respawns ENM.
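
The mode-aware branching of the resync card can be sketched as a lookup on setupRole. The returned object shape is hypothetical; the chain lists, confirm phrases, and the BPoS fallback follow the description above.

```javascript
// Maps the setupRole reported by /system/identity to a resync card variant.
function resyncCardVariant(setupRole) {
  if (setupRole === 'council') {
    return {
      mode: 'council',
      chains: ['mainchain', 'esc', 'eid', 'pg'], // all checked by default
      confirmPhrase: 'RESYNC',
    };
  }
  // 'bpos' and any unknown role share the single-chain variant.
  return { mode: 'bpos', chains: ['mainchain'], confirmPhrase: 'mainchain' };
}
```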

ALSO REMOVED from Settings → Identity:
  - The standalone "Reset keystore (new identity)" card and its
    150-line _doIdentityReset flow (preserved in git history). Rotating
    keys without wiping chain data is footgun-shaped — the unified
    Reset ENM does both atomically.

DEFENSIVE Phase 4 — boot-health retry loop in app.js:
  When the post-reset location.reload() fires, pc2-node may need a few
  extra seconds to respawn ENM (bundle is preserved, but :4180 isn't
  bound yet). The initial /health probe now retries on transient
  failures (no status / 5xx) up to 15 times every 2s (~30s window),
  with a "ENM restarting… (n/15)" spinner so the operator sees
  progress. A real failure surfaces the existing error pane unchanged.
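
A sketch of the boot-health retry loop. Here `probe` is a hypothetical async function that resolves to an HTTP status code, or to undefined (or rejects) when no response arrived; `sleep` and `onProgress` are injectable so the loop can be exercised without real delays.

```javascript
// Transient means: no status at all, or a 5xx from a server still starting.
function isTransient(status) {
  return status === undefined || status === null || (status >= 500 && status <= 599);
}

async function waitForHealth(probe, options = {}) {
  const {
    attempts = 15,
    delayMs = 2000,
    onProgress = () => {}, // e.g. update an "ENM restarting… (n/15)" spinner
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (let n = 1; n <= attempts; n += 1) {
    let status;
    try {
      status = await probe();
    } catch (_err) {
      status = undefined; // network error: treat as "no status"
    }
    if (!isTransient(status)) {
      // A definitive answer, good or bad, ends the loop immediately.
      return { ok: status >= 200 && status < 300, status, attempts: n };
    }
    onProgress(n, attempts);
    if (n < attempts) await sleep(delayMs);
  }
  return { ok: false, status: undefined, attempts };
}
```

With the defaults, 15 attempts spaced 2s apart give roughly the 30s window the description mentions.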

THE "ANOTHER PC2 INSIDE THE APP" BUG IS NOW STRUCTURALLY DEAD:
  pre-v0.5.232 nuke deleted:
    - the installed_apps row     → pc2-node forgot ENM
    - the bundle dir             → no static files to serve
    - the data dir + keystore
  So when the operator reopened the ENM tile, pc2-node's iframe URL
  had nothing → fell back to the pc2 desktop root, which loaded inside
  the orphan iframe. v0.5.232's reset-everything keeps the row + bundle
  intact and only touches data — the iframe always has ENM (or
  ENM-restarting-very-soon) to serve.

DIFF SUMMARY
  +626/-711 = NET -85 LOC across 8 files
    enm-server:
      package.json                    +1/-1
      routes/identity.js              +14/-122   (retired POST /reset)
      routes/maintenance.js           +185/-180  (multi-chain + reset-everything + 410s)
      services/EnmMaintenanceManager  +56/-83    (resetEverything + preserveBundle)
      services/EnmRequestSchemas      +21/-12    (multi-chain shape + reset body)
    frontend (puter app):
      js/app.js                       +39/-3     (boot retry loop)
      js/components/settings-tab.js   +247/-374  (3 cards instead of 5)
      js/strings.js                   +29/-29    (drop 14 keys + add 13 keys)

VERIFIED
  - node --check on all 8 files: passes
  - schemas: chain-resync .or('chainId','chainIds') guarantees one is set
  - hand-trace: BPoS operator on existing install sees the legacy
    "Resync mainchain" UX (typed-confirm "mainchain") — backward compat
  - hand-trace: Council operator sees 4-checkbox list with "RESYNC" gate
  - hand-trace: full reset script (preserveBundle=true) doesn't rm the
    bundle, doesn't DELETE the installed_apps row, kills all 8 chain
    children, wipes data dir + backups, then SIGKILLs ENM

NOT in this PR (out of scope)
  - PC2 desktop's own right-click → Uninstall flow (pc2-node-level, owned by Puter)
  - The underlying EID stall at block 574k (separate task)
  - v0.5.231's F26 hardening (still in place)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4HM3DMD commented May 27, 2026

Owner Author

v0.5.232 added — destructive ops consolidation (commit 7d56b62)

Five destructive endpoints → two. Fixes the "another pc2 inside the app" reload bug structurally.

Before this commit

| Endpoint | Purpose | Problem |
| --- | --- | --- |
| POST /maintenance/chain-resync | Wipe one chain | Hardcoded mainchain; Council couldn't resync EID/ESC/PG from UI |
| POST /maintenance/uninstall | Remove bundle, keep data | Duplicates PC2 desktop right-click |
| POST /maintenance/nuke | Remove bundle + data + keystore | Caused "another pc2 inside the app" bug (bundle deleted → orphan iframe loaded pc2 root) |
| POST /identity/reset | Rotate keystore standalone | Footgun: orphans on-chain registration |

After

| Endpoint | Purpose |
| --- | --- |
| POST /maintenance/chain-resync | Multi-chain: { chainIds:[], confirm:"RESYNC" }. Legacy single-chain still works for BPoS. Rejects arbiter/oracles. |
| POST /maintenance/reset-everything (NEW) | Wipes ALL data, KEEPS the bundle so pc2-node respawns ENM in place. confirm:"RESET EVERYTHING". |
| POST /maintenance/uninstall | 410 Gone → "use PC2 desktop right-click" |
| POST /maintenance/nuke | 410 Gone → "use Settings → Reset ENM" |
| POST /identity/reset | 410 Gone → "use Settings → Reset ENM" |

Frontend — Settings → Danger Zone

Three cards instead of five:

Resync chain data (mode-aware via /system/identity setupRole):

  • bpos → single "Resync mainchain" button + typed-confirm mainchain
  • council → 4-checkbox list {mainchain, esc, eid, pg} (default all) + typed-confirm RESYNC + "Resync selected chains" button
  • unknown → falls back to BPoS variant

Reset ENM (full wipe, in-place restart):

  • Typed-confirm: RESET EVERYTHING (case-sensitive)
  • On 200 OK: setTimeout(location.reload, 6000) so the wizard auto-appears
  • Backed by the new /reset-everything endpoint that preserves the bundle

Update unchanged.

Why "another pc2 inside the app" is now structurally dead

Pre-v0.5.232 nuke deleted ALL of: the installed_apps row, the bundle dir, the data dir, the keystore. When the operator reopened the ENM tile, the iframe URL had nothing → pc2-node's fallback served the pc2 desktop root INTO the orphaned ENM iframe.

v0.5.232's reset-everything keeps the row + bundle untouched. The iframe always has ENM (or ENM-restarting-very-soon) to serve. Belt-and-suspenders: the new app.js boot guard retries /health for up to 30s with an "ENM restarting…" spinner if pc2-node hasn't quite finished respawning by the time the auto-reload fires.

Diff: 8 files, +626/-711 = net -85 LOC

  • package.json: 0.5.231 → 0.5.232
  • routes/identity.js: -122 lines (POST /reset retired to 410)
  • routes/maintenance.js: multi-chain handler + reset-everything + 410s
  • services/EnmMaintenanceManager.js: resetEverything() + preserveBundle mode in _buildTeardownScript
  • services/EnmRequestSchemas.js: .or('chainId','chainIds') + reset body schema
  • js/app.js: boot health retry loop
  • js/components/settings-tab.js: 3 cards instead of 5, mode-aware resync UI
  • js/strings.js: drop 14 keys, add 13 keys, update intro

Verified

  • node --check on all 8 modified files
  • Hand-trace against the BPoS+Council UX matrix
  • Hand-trace against the teardown script (preserveBundle=true keeps installed_apps row + bundle)

Live deploy

enm-v0.5.232 tag pushed; bundle attached to GitHub Release; deploying to pc2new now.

Three small cleanups discovered when the operator asked to verify that
all PC2-side changes related to retired destructive ops had been
removed:

1. routes/identity.js — drop two orphans left when POST /identity/reset
   was retired in v0.5.232:
     - `RESET_CONFIRM_PHRASE` constant (only caller was the retired
       handler)
     - `_verifyAntiSnipe()` helper (only caller was the retired
       handler; SelfHealingEngine._verifyAntiSnipePassword is the
       canonical verifier)
   Both replaced with v0.5.232 marker comments so the diff reviewer
   can see what was retired without git-blame archaeology.

2. js/strings.js — drop 8 orphan `identity_reset_*` strings:
   title / help / confirm_label / btn / confirm_dialog / running /
   ok / password_warning. The only consumer was the standalone
   "Reset keystore" card, which v0.5.232 removed. Replaced with
   a marker comment.

3. js/components/audit-tab.js — add the friendly-name mapping for
   the new `POST /maintenance/reset-everything` endpoint ("Started
   full ENM reset"). The retired endpoints' mappings stay, suffixed
   with "(retired)", so historical audit rows (and any 410-rejection
   rows from operators with stale frontends) still render with
   operator-meaningful copy.
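
A rough sketch of the friendly-name lookup described in item 3. Only the "Started full ENM reset" label and the endpoint paths come from this commit; the labels for retired endpoints, their HTTP methods (other than POST /identity/reset), and the function name are placeholders.

```javascript
// Hypothetical mapping from "METHOD path" to operator-facing audit copy.
const FRIENDLY_NAMES = {
  'POST /maintenance/reset-everything': 'Started full ENM reset', // label from the commit
  // Retired endpoints keep a mapping so historical rows still render.
  // These three labels are illustrative placeholders, not the shipped strings.
  'POST /maintenance/nuke': 'Wiped ENM data (retired)',
  'POST /maintenance/uninstall': 'Uninstalled ENM (retired)',
  'POST /identity/reset': 'Reset keystore (retired)',
};

function friendlyAuditName(method, path) {
  const key = `${String(method).toUpperCase()} ${path}`;
  // Unknown endpoints fall back to the raw "METHOD path" so no row renders blank.
  return Object.prototype.hasOwnProperty.call(FRIENDLY_NAMES, key) ? FRIENDLY_NAMES[key] : key;
}
```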

VERIFIED: no other source file references the removed symbols:
  - `RESET_CONFIRM_PHRASE` — 0 live refs (1 marker comment)
  - `_verifyAntiSnipe`     — 0 live refs (1 marker comment)
  - `identity_reset_*`     — 0 live refs (1 marker comment)
  - `_doNuke / _doUninstall / _doIdentityReset` — 0 live refs

PC2-SIDE AUDIT (in response to operator's question):
The pc2-node side has NO ENM-specific destructive paths that need
removing. The teardown endpoint in ENM's bundle manifest
(`backend.teardown: { endpoint: '/api/enm/teardown' }` in
pc2-node/scripts/package-app.mjs:235) is STILL needed — it's what
pc2-node POSTs when the operator uses pc2 desktop right-click →
Uninstall (the pc2-level removal path that the operator explicitly
wanted preserved). The corresponding ENM /teardown handler at
enm-server/src/server.js:202 stays as-is; it gracefully stops
chains + flushes DB state before pc2 wipes the data dir.

`scripts/deploy-enm.sh` was checked for any references to the
retired endpoints (`/maintenance/uninstall`, `/maintenance/nuke`,
`/identity/reset`, `WIPE EVERYTHING`) — none found. The deploy
script's `DELETE /api/installed-apps/elastos-node-manager?purge=false`
+ `install-local` flow operates at the pc2-node level, not via the
ENM-app destructive endpoints we retired.

No PC2 customizations are obsolete.

DIFF SUMMARY (4 files, +27/-30 = net -3 LOC):
  - enm-server/package.json                          +1/-1
  - enm-server/src/routes/identity.js                +6/-19
  - js/components/audit-tab.js                       +9/-3
  - js/strings.js                                    +11/-7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@4HM3DMD

4HM3DMD commented May 27, 2026

Owner Author

v0.5.233 added — v0.5.232 follow-up cleanup (commit 5c5a252)

Operator asked to verify no PC2-side changes related to retired destructive ops were left behind. Three small orphan-ref cleanups + one missing audit-tab mapping.

PC2-side audit results

No pc2-node customizations are obsolete. The pc2-node side has NO ENM-specific destructive paths. The ENM /teardown endpoint (at enm-server/src/server.js:202) and its manifest declaration (backend.teardown: { endpoint: '/api/enm/teardown' } in pc2-node/scripts/package-app.mjs:235) must stay — they're called by the pc2 desktop right-click → Uninstall flow that the operator explicitly wanted preserved.

scripts/deploy-enm.sh checked: zero references to retired endpoints (/maintenance/uninstall, /maintenance/nuke, /identity/reset, WIPE EVERYTHING). Deploy script operates at the pc2-node level (/api/installed-apps/install-local), not via ENM-app destructive endpoints.

What this commit cleans up

  1. routes/identity.js — drop two orphan symbols left by v0.5.232:

    • RESET_CONFIRM_PHRASE constant
    • _verifyAntiSnipe() helper
  2. js/strings.js — drop 8 orphan identity_reset_* keys (title / help / confirm_label / btn / confirm_dialog / running / ok / password_warning). Only consumer was the standalone "Reset keystore" card retired in v0.5.232.

  3. js/components/audit-tab.js — add the friendly-name mapping for POST /maintenance/reset-everything ("Started full ENM reset"). Retired endpoints' mappings stay with "(retired)" suffix so historical audit rows render meaningfully.

Verified orphan-symbol freedom

After this commit:

  • RESET_CONFIRM_PHRASE — 0 live refs
  • _verifyAntiSnipe — 0 live refs
  • identity_reset_* — 0 live refs
  • _doNuke / _doUninstall / _doIdentityReset — 0 live refs

Diff: 4 files, +25/-31 = net -6 LOC

Tag enm-v0.5.233 pushed. Deploying to pc2new next.

…in names

Operator caught a branding bug in v0.5.232's new Resync card: the
checkbox list labeled PG as "Privacy", which is wrong. PG is a PUBLIC
EVM PBFT sidechain (see strings.js:438 / Session 28's explicit fix —
"PG is a PUBLIC EVM PBFT sidechain. The binary is closed-source
(operator-supplied SHA256 manifest gate), but the CHAIN ITSELF isn't
private in the permissioned-access sense the label implied").

This commit normalizes the chain display names ACROSS all the v0.5.231/
232 new features to match the canonical convention used elsewhere in
the app (verified against PgAdapter.js displayName, EscAdapter.js
displayName, EidAdapter.js displayName, and strings.js wizard copy).

Canonical UI display names (verified across the app):
  chainId         display name
  ---------------------------------------
  mainchain    →  "Main chain"          (space, capital M)
  esc          →  "ESC (Smart Chain)"   (NOT "Smart Contract")
  eid          →  "EID (Identity Chain)"
  pg           →  "PG"                  (NO parenthetical — public chain)

Changes:

1. settings-tab.js COUNCIL_CHAINS array (4 wrong labels):
   - 'ELA mainchain'         → 'Main chain'
   - 'ESC (Smart Contract)'  → 'ESC (Smart Chain)'
   - 'EID (Identity)'        → 'EID (Identity Chain)'
   - 'PG (Privacy)'          → 'PG'
   Added an inline comment citing strings.js:438 + Session 28 so a
   future drive-by edit doesn't reintroduce "PG (Privacy)".

2. strings.js v0.5.232 BPoS strings:
   - danger_resync_bpos_help: "ELA mainchain" → "Main chain"
   - danger_resync_bpos_btn:  "Resync mainchain" → "Resync Main chain"
   Typed-confirm value stays the literal "mainchain" — the backend's
   gate matches that lowercase chainId, so the placeholder + expected
   string are the literal token the operator types.

3. HealthRules.js detectF26 — v0.5.231 proposal copy used the raw
   lowercase snap.chainId in the operator-facing card text
   ("Confirm resync of eid (suspected fork wedge)"). v0.5.234 maps
   chainId to canonical display via an inline helper so the proposal
   reads "Confirm resync of EID (suspected fork wedge)" / "Confirm
   resync of Main chain (suspected fork wedge)". Same convention as
   the rest of the UI.

4. settings-tab.js Network pane label (pre-existing):
   "Mining address (ELA mainchain)" → "Mining address (Main chain)"
   Same wrong-branding pattern as v0.5.232's BPoS strings; while we
   were normalizing the new features it was sitting one screen over.

5. strings.js snapshot wizard hint (pre-existing):
   "Downloads ~10 GB so the ELA mainchain skips its 1–3 day genesis
   sync" → "Downloads ~10 GB so the Main chain skips its 1–3 day
   genesis sync". Same pattern.

6. settings-tab.js _refreshDangerResyncCard docstring — updated to
   match the new button text ("Resync Main chain", not "Resync
   mainchain") so a future reader isn't misled by the stale doc.
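
The naming convention in items 1 to 3 can be sketched as a single lookup. The display strings are the canonical names listed above and `chainDisplay` is the helper name mentioned in the diff summary, but this body is an assumption, as are `shortChainName`, `resyncProposalTitle`, and `typedConfirmMatches`. Deriving the short form by stripping the parenthetical is one guess at how "EID" is obtained from "EID (Identity Chain)".

```javascript
// Canonical display names, keyed by the lowercase chainId the backend uses.
const CHAIN_DISPLAY = Object.freeze({
  mainchain: 'Main chain',
  esc: 'ESC (Smart Chain)',
  eid: 'EID (Identity Chain)',
  pg: 'PG', // no parenthetical: PG is a public chain
});

function chainDisplay(chainId) {
  return Object.prototype.hasOwnProperty.call(CHAIN_DISPLAY, chainId)
    ? CHAIN_DISPLAY[chainId]
    : String(chainId); // unknown ids render as-is rather than blank
}

// Assumed short form: drop a trailing "(...)" so "EID (Identity Chain)" becomes "EID".
function shortChainName(chainId) {
  return chainDisplay(chainId).replace(/\s*\(.*\)$/, '');
}

function resyncProposalTitle(chainId) {
  return `Confirm resync of ${shortChainName(chainId)} (suspected fork wedge)`;
}

// The typed-confirm gate compares against the literal chainId, not the display name.
function typedConfirmMatches(chainId, typed) {
  return typed === chainId;
}
```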

Out of scope (pre-existing copy that's a style decision, not a bug):
  - Compound forms like "mainchain snapshot" / "mainchain block" /
    "mainchain data" / "the mainchain dashboard" — these are
    established compound nouns in the wizard / dashboard / multi-
    chain overview copy. Normalizing them is a separate copy-pass.
  - Any other pre-existing "mainchain" lowercase references in the
    wizard / setup-conversation / chain-card components that aren't
    clearly broken.

VERIFIED (working tree branding sweep):
  $ grep -nE "ELA mainchain|Smart Contract|PG \(Privacy|EID \(Identity\)|Resync mainchain" <files>
  → 0 hits

Diff summary (4 files, +32/-12 = net +20 LOC):
  - enm-server/package.json                              +1/-1
  - enm-server/src/services/HealthRules.js               +11/-2  (chainDisplay helper + 2 callsites)
  - js/components/settings-tab.js                        +12/-6  (COUNCIL_CHAINS labels + Mining-addr label + docstring)
  - js/strings.js                                        +8/-3   (BPoS help/btn + snapshot hint)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@4HM3DMD

4HM3DMD commented May 27, 2026

Owner Author

v0.5.234 added — UI branding fix: PG is NOT for privacy (commit 82e3da5)

Operator caught: the v0.5.232 Resync card labeled PG as "Privacy". Wrong. PG is a PUBLIC EVM PBFT sidechain — see strings.js:438 / Session 28's explicit fix: "PG is a PUBLIC EVM PBFT sidechain. The binary is closed-source (operator-supplied SHA256 manifest gate), but the CHAIN ITSELF isn't private in the permissioned-access sense the label implied."

This commit normalizes display names across all v0.5.231/232 new features to match the canonical convention.

Canonical UI display names (verified across the app)

chainId      Display name
-------------------------------------------------------
mainchain    "Main chain" (space, capital M)
esc          "ESC (Smart Chain)" (NOT "Smart Contract")
eid          "EID (Identity Chain)"
pg           "PG" (NO parenthetical, public chain)

Six fixes

  1. settings-tab.js COUNCIL_CHAINS array — 4 wrong labels:

    • 'ELA mainchain' → 'Main chain'
    • 'ESC (Smart Contract)' → 'ESC (Smart Chain)'
    • 'EID (Identity)' → 'EID (Identity Chain)'
    • 'PG (Privacy)' → 'PG'
      Inline comment citing strings.js:438 added to block drive-by regressions.
  2. strings.js v0.5.232 BPoS strings:

    • danger_resync_bpos_help: "ELA mainchain" → "Main chain"
    • danger_resync_bpos_btn: "Resync mainchain" → "Resync Main chain"
      Typed-confirm value stays the literal mainchain (backend gate is the lowercase chainId).
  3. HealthRules.js detectF26 — v0.5.231's proposal text used raw lowercase ${snap.chainId} ("Confirm resync of eid"). Now uses canonical display: "Confirm resync of EID" / "Confirm resync of Main chain". Same convention as the rest of the UI.

  4. settings-tab.js Network pane label (pre-existing fix while in the area):

    • 'Mining address (ELA mainchain)' → '(Main chain)'
  5. strings.js snapshot wizard hint (pre-existing fix while in the area):

    • "Downloads ~10 GB so the ELA mainchain skips..." → "Main chain skips..."
  6. settings-tab.js _refreshDangerResyncCard docstring — updated stale doc to match the new button text.

Out of scope (deferred for a separate copy-pass)

Compound forms like "mainchain snapshot" / "mainchain block" / "mainchain data" / "the mainchain dashboard" appear in pre-existing wizard / dashboard / multi-chain-overview copy. These are established compound nouns, not bugs — normalizing them is a coordinated copy review that's larger than this PR.

Verified

grep -nE "ELA mainchain|Smart Contract|PG \(Privacy|EID \(Identity\)|Resync mainchain" on the working tree returns 0 hits.

Diff: 4 files, +32/-12 = net +20 LOC

Tag enm-v0.5.234 pushed. Deploying to pc2new now.

Elastos DAO and others added 2 commits May 27, 2026 23:57
Comment-only follow-up to v0.5.234. The _refreshDangerResyncCard
docstring still said "Resync mainchain" / "{mainchain, esc, eid, pg}"
after v0.5.234 renamed the button to "Resync Main chain" + the
visible checkbox labels to {Main chain, ESC, EID, PG}. Source-only;
no runtime change, no version bump, no redeploy needed.

The other "PG (private chain)" reference in the same file (line ~2713)
intentionally STAYS — it documents the previously-wrong label as
historical context, with a Session 28 citation to block future
regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two operator directives (2026-05-28):
  1. "all ENM apps should be council ready, remove fast sync"
  2. "for SPV parity, check how node.sh does it and do it as it does"

Both ship together because #1 is only safe with #2 — full-sync from
genesis re-executes EID's DID tx at block 166,410, which needs the
arbiter context that only a lockstep SPV supplies.

== INVESTIGATION ==

node.sh has NO chaindata/SPV wipe at all — no resync/clear command
exists. SPV (header/store/spv_transaction_info.db/logs-spv) is a sibling
of geth under <chain>/data/ and is only log-rotated. So node.sh never
decouples them — a manual recovery would rm -rf <chain>/data (geth+SPV
together) → lockstep rebuild. node.sh runs PRODUCERS on --syncmode full
(esc_start:2152, eid_start:4390); followers default to geth's 'fast'.

The EID stall root cause (proven live 2026-05-27): the F26 wipe reset
geth to genesis but PRESERVED SPV at the mainchain tip, decoupling the
arbiter context → every header past 574,384 failed PBFT validation
forever. The joint geth+SPV wipe drove EID 574k → 4M+ in ~15 min.

== PHASE 1: SPV lockstep wipe (matches node.sh) ==

EnmMaintenanceManager.chainResync (Class B):
  - Wipe SPV (header/store/spv_transaction_info.db/logs-spv) + peers.json
    ALONGSIDE geth/pgp, instead of preserving them.
  - protectedPaths reduced to just data/keystore. nodekey still preserved
    via the v0.5.231 backup/restore.
  - Removes the "preserve SPV saves hours" logic — false: SPV bulk header
    re-sync did 404k → 1.75M mainchain blocks in 15 min; preserving it
    caused a PERMANENT stall, which is far worse.
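
A sketch of what the Phase 1 wipe plan amounts to, under stated assumptions: the directory and file names are the ones listed above, but the function name, its parameters, and the assumption that everything (including peers.json) sits directly under <chain>/data are illustrative. The nodekey backup/restore is not modeled.

```javascript
// Hypothetical wipe plan for a Class B chain: engine data and SPV go together.
// engineDir is 'geth' or 'pgp' depending on the chain (assumed parameter).
function classBResyncPlan(dataDir, engineDir = 'geth') {
  const root = dataDir.replace(/\/+$/, '');
  const under = (name) => `${root}/${name}`;

  const spv = ['header', 'store', 'spv_transaction_info.db', 'logs-spv'];
  const wipe = [engineDir, ...spv, 'peers.json'].map(under);
  const protectedPaths = [under('keystore')];

  // Guard: never let a wipe target be, or sit inside, a protected path.
  for (const target of wipe) {
    if (protectedPaths.some((keep) => target === keep || target.startsWith(`${keep}/`))) {
      throw new Error(`refusing to wipe protected path: ${target}`);
    }
  }
  return { wipe, protectedPaths };
}
```

Wiping the engine directory and the SPV store in the same plan is what keeps them in lockstep: neither can be rebuilt from genesis while the other stays at the tip.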

== PHASE 3: always-full EVM sync (council-ready) ==

EVM chains now ALWAYS run --syncmode full regardless of on-duty status,
so a Council node is always production-ready and never needs a fast→full
re-sync when it goes on-duty. Mining (--mine) stays on-duty-gated via
detectProducerRole — only the SYNC MODE is decoupled from role now.

  - EvmSidechainAdapter.buildSpawnArgs: always push --syncmode full for
    Class B (honors explicit 'archive'; coerces anything else incl. legacy
    'fast' → 'full'). Removed the fast-default / miner-only-full branch.
  - EvmSidechainAdapter.start(): removed the shouldMine→full / follower→
    fast flips. Producer status sets only miner.enabled. Legacy stored
    'fast' migrated to 'full' (and re-persisted). Fail-safe keeps full.
  - setup.js: install default sync.mode 'fast' → 'full'.
  - setup.js + chains.js validation: coerce a 'fast' request to 'full'
    (accepted for compat, error message now "full | archive").
  - EnmConfigSchema: classBSyncSchema default 'fast' → 'full' ('fast'
    kept in valid() only so pre-v0.5.235 configs still load; coerced at
    use).
  - Frontend settings-tab.js: removed the 'fast' option from both EVM
    sync-mode dropdowns (per-chain + EVM-shared); default 'full'; coerce
    legacy 'fast' to 'full' on display + apply.
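
The coercion rule above fits in a few lines. This is a sketch, not the adapter code: `coerceSyncMode` and `buildSyncArgs` are hypothetical names, and passing the flag as two argv entries is an assumption. Only the `--syncmode` and `--mine` flags and the full/archive/fast values come from the commit.

```javascript
// Honor an explicit 'archive'; everything else (including legacy 'fast',
// undefined, or junk) becomes 'full'.
function coerceSyncMode(stored) {
  return stored === 'archive' ? 'archive' : 'full';
}

// Sync mode no longer depends on role; only --mine stays gated on on-duty status.
function buildSyncArgs({ syncMode, onDuty }) {
  const args = ['--syncmode', coerceSyncMode(syncMode)];
  if (onDuty) args.push('--mine');
  return args;
}
```

A follower and an on-duty producer now differ by a single flag, so going on-duty never implies a re-sync.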

Why always-full is now SAFE for EID (the DID wedge):
  Block 166,410's DID tx wedge required a DIVERGED on-disk DID index
  (from unclean SIGKILL restarts) OR a decoupled SPV. A CLEAN from-genesis
  full-sync with the Phase-1 lockstep SPV builds a correct DID index →
  checkRegisterDID passes. node.sh proves clean full-sync of EID works
  (producers run it). Per operator (Option B), Phase-2 verification is
  post-deploy monitoring rather than a pre-merge gate.

== PHASE 4: migration ==

  - New installs / post-wipe: full from genesis (lockstep SPV).
  - Existing fast-synced chains (already past 166,410): flipping to full
    going forward is safe + needs no re-sync — geth runs full-mode forward
    without re-executing 166,410. They get the switch for free.
  - Deploys don't restart EVM chains (stdio:'ignore'), so tonight's EID
    fast-recovery continues to completion; always-full takes effect on its
    next restart, by which point it's at tip → no mid-sync fast→full hazard.

VERIFIED: node --check on all 6 files; residual-fast grep clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@4HM3DMD

4HM3DMD commented May 28, 2026

Owner Author

v0.5.235 added — always-full EVM sync (council-ready) + SPV lockstep wipe (commit c11faee)

Ships both operator directives together (they're interdependent — full-sync from genesis only works with a lockstep SPV).

Investigation: how node.sh handles SPV

node.sh has no chaindata/SPV wipe at all — no resync/clear command. SPV (header/store/spv_transaction_info.db/logs-spv) is a sibling of geth under <chain>/data/, only ever log-rotated. So node.sh never decouples them; a manual recovery = rm -rf <chain>/data (geth + SPV together → lockstep). node.sh runs producers on --syncmode full (esc_start:2152, eid_start:4390).

Phase 1 — SPV lockstep wipe (chainResync)

  • Class B now wipes header/store/spv_transaction_info.db/logs-spv + peers.json alongside geth/pgp. protectedPaths reduced to just keystore; nodekey still preserved (v0.5.231 backup/restore).
  • Removes the "preserve SPV saves hours" logic — proven false (SPV bulk re-sync did 404k→1.75M mainchain blocks in 15 min; preserving it caused a permanent stall).

Phase 3 — always-full EVM sync

EVM chains always --syncmode full regardless of on-duty status → always production-ready, never needs a fast→full re-sync when going on-duty. Mining (--mine) stays on-duty-gated; only sync mode is decoupled from role.

  • buildSpawnArgs: always full (honors archive; coerces legacy fast → full).
  • start(): removed the shouldMine→full / follower→fast flips; migrates stored fast → full.
  • setup.js install default fast → full; setup + chains validation coerce fast → full.
  • EnmConfigSchema default fast → full (fast kept in valid() for load-compat only).
  • Frontend: removed fast from both EVM sync-mode dropdowns; default full.

Why always-full is safe for EID (the DID wedge)

Block 166,410's DID-tx wedge required a diverged DID index OR a decoupled SPV. A clean from-genesis full-sync with the Phase-1 lockstep SPV builds a correct DID index → checkRegisterDID passes. node.sh proves clean full-sync of EID works (producers run it). Per operator (Option B), Phase-2 verification is post-deploy monitoring, not a pre-merge gate.

Phase 4 — migration

  • New installs / post-wipe: full from genesis (lockstep SPV).
  • Existing fast-synced chains (past 166,410): flip to full going forward, no re-sync needed (no 166,410 re-execution).
  • Deploys don't restart EVM chains (stdio:'ignore'), so tonight's EID fast-recovery finishes to tip; always-full takes effect on its next restart (by which point it's at tip → no mid-sync hazard).

Diff: 7 files, +129/-75

Tag enm-v0.5.235 pushed; deploying to pc2new now.

…ption)

Operator directive 2026-05-28: "lower-end recommended hardware should have
an option to run 2 chains at once and sync the rest when the first two are
fully synced — initial sync takes the most resources." Confirmed semantics:
a sliding window of 2 (always ≤2 heavy syncs; as one finishes the next
starts), chosen in the setup wizard.

This matters more after v0.5.235 made EVM chains full-sync (heavier than the
old fast-sync) — three simultaneous EVM full-syncs can saturate a small
VPS's CPU and get it paused by the provider.

== Backend: EnmStageSyncOrchestrator (NEW) ==

The existing EnmStageSync (utils-stage-sync.js) is FRONTEND-only and dies
when the tab closes — unusable for a multi-hour wizard sync. This is the
backend port its own header called "a future Phase 22.1." Model:

  - Sliding window of N (default 2) over HEAVY chains (class A mainchain +
    class B esc/eid/pg). Start ≤N; when one reaches the network tip (via
    SyncTracker blocksBehind <= 8), free its slot, start the next pending
    heavy chain.
  - Light services don't count against the window: an EVM chain's oracle
    (class C) is started alongside its parent ("run together"); the arbiter
    (class D) starts after all heavy chains are up.
  - IDEMPOTENT / RESUMABLE: no progress file — live chain states ARE the
    progress. On each (re)start it re-derives per heavy chain: synced→done,
    alive-but-behind→inflight (counts vs window), stopped→pending. A host
    reboot mid-stage resumes cleanly; once all synced a normal boot just
    starts everything.
  - STALL SAFETY (progress-based, not wall-clock — a legit full-sync is
    slow but progressing): if blocksBehind fails to decrease for ~20 min
    while still behind, free the slot so the rest proceed; the stuck chain
    keeps running and F-rule self-heal surfaces it.
  - Publishes 'stage-sync:status' SSE per phase for a future status panel.
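
The re-derivation step in the model above could look roughly like this. It is a sketch under assumptions: the chain objects (`id`, `alive`, `blocksBehind`) and the function names are invented for illustration, and treating a stopped chain as pending regardless of its last known height is one reading of "stopped→pending". The tip threshold of 8 and the default window of 2 come from the text.

```javascript
const TIP_THRESHOLD = 8; // "blocksBehind <= 8" counts as at the network tip

// Derive a heavy chain's stage purely from its live state (no progress file).
function classifyHeavyChain(chain) {
  if (!chain.alive) return 'pending';
  const atTip = typeof chain.blocksBehind === 'number' && chain.blocksBehind <= TIP_THRESHOLD;
  return atTip ? 'done' : 'inflight';
}

// Decide which pending chains may start now without exceeding the window.
function planNextStarts(heavyChains, windowSize = 2) {
  const buckets = { done: [], inflight: [], pending: [] };
  for (const chain of heavyChains) {
    buckets[classifyHeavyChain(chain)].push(chain.id);
  }
  const freeSlots = Math.max(0, windowSize - buckets.inflight.length);
  return { ...buckets, startNow: buckets.pending.slice(0, freeSlots) };
}
```

Because the plan is a pure function of current state, calling it again after a reboot or a slot release gives the same answer a progress file would have recorded.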

== Backend wiring ==

  - EnmAutoStart: when global.syncStrategy === 'staged', hand the bring-up
    to the orchestrator (window = global.stagedSync.concurrency, default 2)
    instead of startAllChains. Fail-safe: if the orchestrator can't start,
    fall back to concurrent so chains still come up.
  - EnmConfigSchema: global.syncStrategy 'concurrent'|'staged' (default
    concurrent) + global.stagedSync.concurrency (1-4, default 2). Added to
    defaultConfig too.
  - setup.js install-council: persists body.syncStrategy to
    cfg.global.syncStrategy before kicking the orchestrator + first
    autoStart, so it's in place when chains boot.
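
A sketch of the wiring, assuming synchronous entry points for brevity. `startStaged` and `startAllChains` are names that appear in this commit, but their signatures here are guesses, and `resolveSyncPlan` / `bringUp` are hypothetical. The real schema is Joi-based and would validate out-of-range values; this plain-JS version simply falls back to the default.

```javascript
// Resolve strategy + window from config, applying the documented defaults.
function resolveSyncPlan(globalCfg = {}) {
  const strategy = globalCfg.syncStrategy === 'staged' ? 'staged' : 'concurrent';
  const raw = globalCfg.stagedSync && globalCfg.stagedSync.concurrency;
  const concurrency = Number.isInteger(raw) && raw >= 1 && raw <= 4 ? raw : 2;
  return { strategy, concurrency };
}

// Hand bring-up to the orchestrator when staged; fall back to concurrent on failure
// so chains still come up.
function bringUp(globalCfg, { startStaged, startAllChains }) {
  const plan = resolveSyncPlan(globalCfg);
  if (plan.strategy === 'staged') {
    try {
      startStaged(plan.concurrency);
      return 'staged';
    } catch (_err) {
      // Fail-safe: orchestrator could not start, continue with concurrent bring-up.
    }
  }
  startAllChains();
  return 'concurrent';
}
```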

== Frontend: wizard Card 5 choice ==

  - setup-conversation.js Card 5 (Council only — BPoS runs one chain, so
    staging is moot): a radio group "Initial sync":
      • "Start all chains at once (fastest)" → concurrent (default)
      • "Conserve resources — sync 2 chains at a time" → staged
    Writes this._syncStrategy; _beginInstall ships it in the
    install-council POST body.

== Scope notes ==

  - BPoS unaffected (mainchain-only → one heavy chain → staged is a no-op).
  - Default behavior unchanged for everyone (concurrent) — purely additive,
    opt-in via the wizard.
  - Deferred (follow-ups, noted): a dedicated staged-progress banner (the
    SSE topic is already published; per-chain dashboard states convey
    progress today), and consolidating the manual Settings→Advanced
    "Stage-sync now" (frontend EnmStageSync) onto this backend orchestrator
    — they don't conflict (manual vs boot entry points).

VERIFIED: node --check on all 5 files; orchestrator exports
{startStaged,cancel,isRunning}; schema validate() preserves 'staged' +
defaults absent → 'concurrent'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@4HM3DMD

4HM3DMD commented May 28, 2026

Owner Author

v0.5.236 added — staged initial sync for constrained hardware (commit 75de2bb)

Wizard option for lower-end hardware: sync 2 chains at a time (sliding window), bringing up the rest as each finishes. Matters more after v0.5.235 made EVM chains full-sync (heavier) — 3 simultaneous EVM full-syncs can saturate a small VPS.

New: EnmStageSyncOrchestrator (backend)

The existing EnmStageSync is frontend-only (dies on tab close — unusable for a multi-hour wizard sync). This is the backend port:

  • Sliding window of N=2 over heavy chains (class A mainchain + B esc/eid/pg). Start ≤2; when one hits the network tip (SyncTracker.blocksBehind <= 8), free the slot, start the next.
  • Oracles pair with their parent (don't count against window); arbiter starts last.
  • Idempotent/resumable — no progress file; live chain states are the progress. Re-derives on every (re)start, so a host reboot mid-stage resumes.
  • Stall-safe (progress-based): if blocksBehind stops decreasing for ~20 min while still behind, free the slot so the rest proceed; stuck chain keeps running + F-rules surface it.
  • Publishes stage-sync:status SSE per phase.
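
The progress-based stall check could be sketched as below. This is one interpretation, not the orchestrator's code: it tracks the lowest blocksBehind seen so far and treats the chain as stalled once that value has not improved for 20 minutes while the chain is still behind. The tracker shape and function names are assumptions.

```javascript
const STALL_MS = 20 * 60 * 1000; // "~20 min" per the description above

function createStallTracker(nowMs) {
  return { best: Infinity, lastProgressAt: nowMs };
}

// Feed one blocksBehind sample; returns true when the slot should be freed.
function observeProgress(tracker, blocksBehind, nowMs, tipThreshold = 8, stallMs = STALL_MS) {
  if (blocksBehind < tracker.best) {
    tracker.best = blocksBehind;      // progress: the chain got closer to the tip
    tracker.lastProgressAt = nowMs;
  }
  const stillBehind = blocksBehind > tipThreshold;
  return stillBehind && nowMs - tracker.lastProgressAt >= stallMs;
}
```

Measuring against progress rather than total elapsed time means a slow but advancing full-sync never trips the check, no matter how many hours it takes.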

Wiring

  • EnmAutoStart: syncStrategy === 'staged' → hand bring-up to the orchestrator (fail-safe falls back to concurrent).
  • EnmConfigSchema: global.syncStrategy (concurrent|staged, default concurrent) + global.stagedSync.concurrency (1-4, default 2).
  • setup.js install-council: persists body.syncStrategy before first autoStart.
  • Wizard Card 5 (Council only): radio — "Start all chains at once (fastest)" vs "Conserve resources — sync 2 chains at a time"; ships in the install POST.

Scope

  • BPoS unaffected (one heavy chain → staged is a no-op).
  • Default unchanged for everyone (concurrent) — purely additive, opt-in.
  • Deferred follow-ups: dedicated staged-progress banner (SSE topic already published; per-chain dashboard states convey progress today) + consolidating the manual Settings→Advanced "Stage-sync now" onto this backend (they don't conflict — manual vs boot entry points).

Diff: 6 files, +441/-3 (new orchestrator is ~210 LOC)

Tag enm-v0.5.236; deploying to pc2new.

Elastos DAO and others added 15 commits May 28, 2026 21:22
POST /identity/reset was retired to a 410 stub in v0.5.232 (folded into
/maintenance/reset-everything) and its Joi body schema dropped from the
exports in the same change — but the const definition was left behind.

Dead-code audit (operator-requested, quadruple-checked before removal):
  [1] defined at EnmRequestSchemas.js — yes
  [2] referenced anywhere in src/ — NO (zero refs outside its own def)
  [3] referenced in tests — N/A (repo has no test files at all)
  [4] dynamic / bracket / string access (RequestSchemas[...]) — NONE
All four checks clean → safe to remove. Module still loads; the other 12
*Body/*Headers schemas are all exported + referenced.

Zero runtime behavior change (an unused Joi object) — no version bump,
no redeploy; rides the next release.

Audit also VERIFIED-SAFE (false alarms — NOT removed):
  - EnmStageSync (utils-stage-sync.js): looked orphaned on a first grep, but
    the full grep proved it's LIVE: settings-tab.js:2557/2561 (Settings →
    Advanced manual stage-sync) + loaded by index.html:217. Kept.
  - All other EnmRequestSchemas consts — each has ≥2 refs. Kept.
  - No orphaned fast-sync code from the v0.5.235 removal (coerced, not
    stranded). No unused requires in the v0.5.236 new files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Dead-code/refactor audit (operator-requested) found 4 hand-rolled
"skip-if-no-db → append → log-on-failure" wrappers around AuditLog.append:
  routes/maintenance.js:_audit, routes/identity.js:_audit,
  EnmAutoStart.js:safeAudit, EnmStageSyncOrchestrator.js:safeAudit

Added EnmAuditLog.safeAppend(db, log, entry) — the canonical null-guard +
try/catch wrapper (never throws; a lost audit row must not block the
authorised action). Callers still build their own entry (tier/ruleId/
defaults differ per caller, so field-building stays at the call site).
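
A sketch of the wrapper's shape, assuming a synchronous `append(db, entry)` that is injected here so the example stays self-contained. The real AuditLog.append signature is not shown in this commit and may be async; `makeSafeAppend` is a hypothetical factory, while the `safeAppend(db, log, entry)` signature and the false-on-null-db behavior are as described.

```javascript
// Build a safeAppend bound to a given append implementation (injected for illustration).
function makeSafeAppend(append) {
  return function safeAppend(db, log, entry) {
    if (!db) return false; // skip-if-no-db: nothing to write to
    try {
      append(db, entry);
      return true;
    } catch (err) {
      // A lost audit row must not block the authorised action: log quietly and move on.
      if (log && typeof log.debug === 'function') {
        log.debug('audit append failed', err && err.message);
      }
      return false;
    }
  };
}
```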

Migrated the TWO SERVICE wrappers (EnmAutoStart, EnmStageSyncOrchestrator)
to delegate — they used log.debug, matching safeAppend exactly, so behavior
is unchanged (same fields, same log level). Each drops ~6 lines of
boilerplate.

DELIBERATELY NOT migrated (would change behavior — flagged, not silently
altered): the two route _audit wrappers use log.WARN (not debug) + a lazy
getDb() resolver. Migrating them would downgrade their failure log to
debug. Left as-is; a behavior-preserving migration would need safeAppend to
take an optional log-level, which is a separate call.

Verified: 3 files node --check; safeAppend exported + returns false (no
throw) on null db. Behavior-preserving → no version bump / redeploy; rides
the next release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ault, one Sidechain settings

Navigation: delete the top-bar chain-selector dropdown. The multi-chain
overview is now the default Dashboard for a Council node with the 4 tabs
always visible; clicking a chain row drills into its dashboard with a
"Back to overview" control; a BPoS node lands on the single mainchain
dashboard (no overview/Back). Node mode is detected in PaneRouter (logic
ported from the removed selector); a static "Council node"/"BPoS node"
label replaces the dropdown. Drill-in is session-only.

Settings: now global (no longer per-chain). All sidechain config lives in
one "Sidechain settings" tab — one shared reward address + one shared sync
mode applied to all EVMs, read-only validator-status pills, and a per-chain
peers/bootnodes accordion (lazy-mounted). GET /chains/:id now returns
sync.mode so the tab shows real values. The dashboard EVM card is read-only;
all reward/sync editing lives only in Settings.

Logs: in-pane chain picker (replaces the static pill); also fixes a latent
bug where the overview's active chain 'all' hit /logs/all/tail.

Cleanup: delete chain-selector.js, the 5 dead per-class settings mounts, the
dormant inline reward editor + its CSS, and all selector CSS/strings
(~1500 net lines removed). Responsive down to compact width preserved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…" DAO Council/BPoS card

The deployed v0.5.237 overview had colliding/blank action buttons and no node
identity surfaced. Fixes:

- Row layout: .enm-overview-row converted from a fixed 6-column grid to flex.
  The grid's per-variant column counts didn't match the rendered cells (the EVM
  variant declared 5 columns for 6 cells), so the start/stop/restart buttons were
  crammed into a 14px track and collided with the sparkline (the "blank dark
  squares"). Flex lets the main column grow while the right cluster (sparkline ·
  uptime · actions · arrow) sizes to content and never collides, at any width.
  Action buttons lifted to an elevated surface + secondary glyph so they read as
  real controls.

- New "This node" identity card atop the overview: DAO Council membership
  (state + nickname) + BPoS producer status (state + name) side by side, plus
  mining key + address. Truthful for ENM's own node key — shows "Not a Council
  member" / "Not registered" with a hint that a Council node still registers
  BPoS separately (in Essentials) to earn staking rewards. Mirrors node.sh's
  CRC/BPoS status fields. Sourced from the existing /system/identity (crMember +
  producer) — no backend change.

- Relabel CR Council -> "DAO Council" in the overview.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…actions, labelled actions

Full overview redesign (designed in a local browser-preview mock, then ported
+ verified by rendering the real component with sample data at desktop + 375px):

- Health headline: one-line node verdict ("All services healthy" / "N services
  need attention · <names> · X of Y synced") + bulk Start all / Restart all.
- Vertical chain cards replace the flat horizontal rows: header (dot + name +
  state chip + optional "Update available" badge) -> block/height -> metrics
  (peers + RAM) -> folded collapsible oracle line -> labelled action footer
  (Start / Stop / Restart; Update only when the backend reports one). The
  mainchain hero gets a "Manage" link.
- Sparklines removed (the green-triangle noise); per-chain CPU + FD + disk
  removed from the overview (CPU is redundant with the top stat strip; the
  rest lives on the per-chain detail page).
- "Loud when not healthy": stopped/stalled chains get an amber accent and are
  named in the headline; healthy chains stay calm.
- Fully responsive to iPhone 7 (375px): cards stack, action buttons wrap to
  share rows, stat strip 2-up, health headline stacks.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…lth headline, routable oracle, tab-switch pause, dead-code sweep

Fixes 8 findings from the v0.5.237→239 overview-redesign review:

- Bulk "Restart all" now confirms via enmDestructiveModal (it stops then
  restarts every running chain); extract _runBulk that actually sets/clears
  the pane-wide _pendingAction guard (the old _onBulk checked it but never
  set it, so SSE could wipe the in-flight buttons).
- Health headline: bucket every non-{synced,syncing,starting,disabled} state
  (error/recovering/unconfigured/loading/stopped/stalled) into "needs
  attention" so an errored chain no longer reads "All services healthy"; the
  healthy-branch detail now reports the real synced count, not total.
- Overview nested oracle line routes into the oracle's own dashboard on click
  (carries data-chain-id; the › caret means "open") — it used to only toggle a
  collapse, leaving the oracle detail page unreachable from the overview.
- Overview pane pauses its 3s /system/usage poll + council:overview SSE while
  the operator is on another in-app tab and resumes on return
  (enmUseVisibilityPause only covers document.hidden, not tab switches).
- SPV stub now appends instead of pane.innerHTML= so it can't wipe the
  "← Back to overview" button on a Council drill-in race.
- Add the overview_pane.{health_*,bulk_*,manage,update_available} +
  chain_actions.update strings (the redesign shipped reading them via tFb
  fallbacks).
- Remove dead code: _summaryLine/_summaryLineV2, _metaHtml/_syncBadgeHtml,
  the whole sparkline cluster (_reconcile/_teardownSparkline(s), heightSeries,
  cssEscape); fix the stale _routeToChain JSDoc.

Verified live in the static preview: real EnmMultiChainOverviewPane renders
with no runtime error at narrow width — accurate health verdict, labelled
bulk actions, all 3 oracle lines routable, identity card intact.
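
A minimal sketch of the headline bucketing described above, not the shipped
code. The state names come from this commit; `healthVerdict` and the
`{ id, state }` entry shape are hypothetical.

```javascript
// States that count as healthy or intentionally idle (from the commit text).
const CALM_STATES = new Set(['synced', 'syncing', 'starting', 'disabled']);

// Hypothetical helper: `chains` is an array of { id, state } objects.
function healthVerdict(chains) {
  // Anything outside the calm set (error, recovering, unconfigured, loading,
  // stopped, stalled, or an unknown state) lands in "needs attention".
  const needsAttention = chains.filter((c) => !CALM_STATES.has(c.state));
  const synced = chains.filter((c) => c.state === 'synced').length;
  return {
    healthy: needsAttention.length === 0,
    needsAttention: needsAttention.map((c) => c.id),
    synced, // the real synced count, not chains.length
    total: chains.length,
  };
}

const verdict = healthVerdict([
  { id: 'mainchain', state: 'synced' },
  { id: 'esc', state: 'error' },
  { id: 'eid', state: 'syncing' },
]);
console.log(verdict.healthy, verdict.needsAttention, verdict.synced);
```

Using a deny-by-default set means a state added later is treated as unhealthy
until someone explicitly marks it calm.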

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The per-chain action buttons carry both .enm-overview-action (the legacy
icon-button class, still used as the row click-delegation hook → width:26px)
and .enm-ovx-act (the v0.5.239 labelled-button class). .enm-ovx-act overrode
height/padding but never set width, so the old width:26px leaked in: "Restart"
overflowed the 26px box and overlapped the Stop button ("⟳ Rest■ Stop").

Only reproduced above 430px — the <430px @media gives those buttons
flex-basis 50%, which masked it at narrow widths (and in the narrow-only
static preview used to verify v0.5.240). The bulk "Restart all" button was
unaffected because it carries .enm-ovx-act alone.

Fix: .enm-ovx-act now sets width:auto so it fully defines its own box.
Verified at 1100px in the static preview: Restart 85px / Stop 73px, 8px gap,
no overlap, label fully shown.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rged + overview button dual-class cleanup

Primary bug (operator-reported): drilling into an individual EVM chain, the
lower detail content (the EVM account row of the Mining & rewards card, the
footer, system-status) was clipped and only appeared after enlarging the
window.

Root cause: #enm-pane-dashboard is `display:flex; flex-direction:column`, and
its card children default to `flex-shrink:1`. When the stacked cards are taller
than the pane (common on an EVM drill-in with hero + Mining & rewards +
system-status, and at narrow widths where text wraps taller), flexbox SHRINKS
the cards to fit the pane instead of letting the pane's `overflow:auto` scroll.
`.enm-section-card`
has `overflow:hidden`, so the shrunk card clipped its own content. Enlarging the
window gave the pane enough height that no shrink was needed, revealing it.

Fix: `#enm-pane-dashboard > * { flex-shrink: 0 }` — every dashboard card keeps
its natural height and the pane scrolls. Verified in the static preview at 640px:
the EVM detail card went from a clipped 208px (content 371px) to its full 373px,
and the pane now scrolls (scrollH 1041 > clientH 876).

Bundled cleanup (removes the latent footgun behind the v0.5.241 button overlap):
the overview's per-chain action buttons no longer carry the legacy
`.enm-overview-action` icon-button class (width:26px) — they're solely the
labelled `.enm-ovx-act`. The row click-delegation hook moved to
`closest('.enm-ovx-act')`, and the dead `.enm-overview-action(s)` CSS was
deleted. Verified: clicking Restart fires _onAction('restart', chainId);
clicking the card body still routes via _routeToChain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The "This node" identity card showed: "A Council node still registers as a
BPoS producer separately (in Essentials) to earn staking rewards." That's
wrong. Verified against code:

- Elastos.ELA dpos/state/arbitrators.go (distributeWithNormalArbitratorsV3):
  an arbiter of type CRC whose crMember.MemberState == MemberElected is paid
  the per-block CRC reward automatically — no RegisterProducer involved. An
  elected Council member's node is auto-promoted to a CRC arbiter and earns
  as part of consensus.
- node.sh keeps CRC and BPoS as independent roles: register_crc (crc register
  tx, during the CR voting period) vs register_bpos (producer register v2).
  activate_crc simply calls activate_bpos — the shared `activate --nodepublickey`
  tx is the inactivity-recovery path (only after downtime flags the node),
  NOT a Council onboarding step.

So a Council node does not register as a BPoS producer to earn. Per operator
directive, the hint is removed outright (no replacement copy) — the DAO Council
pill already states the real status. Also drops the now-unused
overview_pane.identity.bpos_hint string and the dead .enm-identity-hint CSS.

Verified in the static preview: a CR-member-with-no-producer identity card
renders "DAO Council: Elected · GoldGuard" + "BPoS: Not registered" with no
hint and no error.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…tion (BL-1)

BL-3 (a11y): overview EVM/oracle cards were mouse-only — a keyboard or
screen-reader user couldn't open an EVM/oracle/arbiter dashboard (only the
mainchain hero's "Manage" button was reachable). Now every routable chain
card gets a native "Manage ›" button (the card stays NOT role=button to avoid
nesting interactive content), and oracle lines (no inner buttons) become
role=button + tabindex=0 + aria-label with an Enter/Space keydown handler.
Stale class JSDoc corrected.

BL-1 (update detection): the overview's per-chain `updateAvailable` was never
populated, so the Update badge/button was dormant for every chain — and the
existing GitHub-based mainchain scanner returns nulls on egress-locked hosts.
New EnmChainUpdateScanner mirrors node.sh's get_elastos_ver_latest: lists
https://download.elastos.io/elastos-<name>/?F=1 for ela/esc/eid/pg, picks the
newest version, compares against the installed `--version`
(ChainState.snapshotVerified). Self-throttled 6h refresh kicked fire-and-forget from the
overview tick; buildChainEntry only reads the synchronous cache so the
snapshot stays cheap (no new RPC/spawn on-tick). Validated live: installed
v0.2.7.1 etc. compare directly against the mirror's elastos-esc-v0.2.7.1 dirs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…lse positive

BL-2: the 9 Settings sections are now grouped under 4 nav-rail subheaders
(Node / Network & sidechains / Maintenance / Danger zone) instead of a flat
list. Sections are unchanged — just reordered into groups and a decorative
(aria-hidden) subheader emitted when the group changes. Default section stays
keyed ('network'), pills (narrow) keep working, role=tablist navigation
unaffected.

Scanner fix (BL-1 follow-up): EnmChainUpdateScanner.parseLatest accepted any
dir whose name started with a digit, so the mirror's commit-hash build
"elastos-ela-9dc17ff" parsed as major version 9 (parseInt) and outranked
v0.9.9.5 → mainchain falsely showed "update available". Tightened the filter
to dotted-numeric versions only (^v?\d+(\.\d+)+$), which also drops -hotfix/-rc
suffixed tags. Verified against the live mirror listing: latest now resolves
to v0.9.9.5 == installed → updateAvailable false.
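
A minimal sketch of the tightened filter, assuming the mirror listing has
already been reduced to directory names. The regex is the one quoted above;
`compareVersions`, `pickLatest` and the prefix argument are hypothetical names,
not the real parseLatest signature.

```javascript
// Dotted-numeric versions only: v0.9.9.5 passes, while 9dc17ff and
// v0.9.9.5-hotfix are rejected.
const VERSION_RE = /^v?\d+(\.\d+)+$/;

// Compare two dotted versions numerically, segment by segment.
function compareVersions(a, b) {
  const pa = a.replace(/^v/, '').split('.').map(Number);
  const pb = b.replace(/^v/, '').split('.').map(Number);
  const len = Math.max(pa.length, pb.length);
  for (let i = 0; i < len; i += 1) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical helper: `dirNames` are mirror directory names and `prefix` is
// e.g. "elastos-ela-". Returns the newest version string, or null if none.
function pickLatest(dirNames, prefix) {
  const versions = dirNames
    .filter((name) => name.startsWith(prefix))
    .map((name) => name.slice(prefix.length).replace(/\/$/, ''))
    .filter((v) => VERSION_RE.test(v));
  if (versions.length === 0) return null;
  return versions.reduce((best, v) => (compareVersions(v, best) > 0 ? v : best));
}

const latest = pickLatest(
  ['elastos-ela-v0.9.9/', 'elastos-ela-9dc17ff/', 'elastos-ela-v0.9.9.5/'],
  'elastos-ela-',
);
const updateAvailable = compareVersions(latest, 'v0.9.9.5') > 0;
console.log(latest, updateAvailable);
```

Filtering before comparing keeps the comparator simple: it never sees a
non-numeric segment, so `Number()` cannot yield NaN.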

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ticated)

Operators monitor a fleet of validators from one place. node.sh's `all_status`
gave a whole-node roll-up (every chain + service, version, active/inactive) and
the monitor reached it by being IP-whitelisted AND holding the RPC
user/password (ela's RpcConfiguration {User,Pass,WhiteIPList}). ENM already
computes that picture (CouncilOverviewService + per-chain ChainState version)
but only behind the owner token on loopback. This exposes it for monitoring.

New EnmStatusEndpoint: a read-only http.Server on 0.0.0.0:20920 serving GET
/status only (404 else), gated by the SAME RPC-access policy the operator
already configures: IP allow-list (whiteIPList; real socket peer;
X-Forwarded-For ignored; loopback always) + HTTP Basic-Auth (rpc user + decrypted
password). Binds ONLY while rpc.enabled in a Council install (default off ⇒ no
open port). Payload: { ts, node:{mode}, components:[{id,name,class,version,
active,state,height,networkHeight,peers,updateAvailable}] } for mainchain, esc,
eid, pg, the 3 oracles, and arbiter. No secrets in the body; geth/ela/arbiter
RPC untouched (nothing proxied).
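
A minimal sketch of the two-step gate (IP allow-list, then Basic-Auth), not
the EnmStatusEndpoint source. Assumptions: the policy shape, the 403/401
status codes, exact-IP matching only (CIDR matching omitted), and that
loopback bypasses the allow-list but still needs credentials.

```javascript
const crypto = require('node:crypto');

const LOOPBACK = new Set(['127.0.0.1', '::1', '::ffff:127.0.0.1']);

// Constant-time string compare; hashing first gives equal-length buffers.
function safeEqual(a, b) {
  const ha = crypto.createHash('sha256').update(String(a)).digest();
  const hb = crypto.createHash('sha256').update(String(b)).digest();
  return crypto.timingSafeEqual(ha, hb);
}

// Parse "Basic <base64(user:pass)>" into { user, pass }, or null.
function parseBasicAuth(header) {
  const m = /^Basic\s+([A-Za-z0-9+/=]+)$/i.exec(header || '');
  if (!m) return null;
  const decoded = Buffer.from(m[1], 'base64').toString('utf8');
  const idx = decoded.indexOf(':');
  if (idx < 0) return null;
  return { user: decoded.slice(0, idx), pass: decoded.slice(idx + 1) };
}

// Hypothetical policy shape: { allowIps: string[], user, pass }.
// `peerIp` must be the socket's remote address, never X-Forwarded-For.
function authorize(peerIp, authHeader, policy) {
  const ipOk = LOOPBACK.has(peerIp) || policy.allowIps.includes(peerIp);
  if (!ipOk) return { ok: false, status: 403 };
  const creds = parseBasicAuth(authHeader);
  const authOk = creds !== null
    && safeEqual(creds.user, policy.user)
    && safeEqual(creds.pass, policy.pass);
  return authOk ? { ok: true, status: 200 } : { ok: false, status: 401 };
}
```

In an `http.Server` handler the peer would be read from
`req.socket.remoteAddress` and the header from `req.headers.authorization`.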

Wiring: EnmFirewallManager.reconcileSourceRules(port, ipList) opens the status
port per-source-IP (defense-in-depth; the endpoint also enforces IP+auth
in-process). server.js builds/starts/stops it; config.js PUT /config/mainchain
reloads it on save and now rejects whitelist prefixes broader than /24 (v4) or /64
(v6). Frontend Access section wires the formerly-dead master enable toggle
(required to open RPC at all) and shows the monitor URL
http://<lan>:20920/status.
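
The prefix check on save can be sketched as below. This is an illustration
under assumptions: `isAcceptableEntry` is a hypothetical name, and treating a
bare IP as a single host is assumed. Only the /24 and /64 limits come from
this commit.

```javascript
const net = require('node:net');

// Returns true when a whitelist entry is acceptable: a bare IP, or a CIDR
// whose prefix length is at least 24 (IPv4) or 64 (IPv6).
function isAcceptableEntry(entry) {
  const [addr, prefix, ...rest] = String(entry).trim().split('/');
  if (rest.length > 0) return false; // more than one "/"
  const family = net.isIP(addr); // 4, 6, or 0 when not an IP address
  if (family === 0) return false;
  if (prefix === undefined) return true; // single host
  if (!/^\d+$/.test(prefix)) return false;
  const bits = Number(prefix);
  const max = family === 4 ? 32 : 128;
  const min = family === 4 ? 24 : 64;
  return bits >= min && bits <= max;
}

console.log(isAcceptableEntry('192.168.1.0/24')); // accepted
console.log(isAcceptableEntry('10.0.0.0/8'));     // rejected: broader than /24
```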

Verified: scanner/IP-match unit checks pass; settings _fillCreds reflects
enabled + resolves the monitor host from lanUrls.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
After a server restart the in-process binaryVersion cache (ChainState) is empty
until snapshotVerified runs, so GET /status briefly reported version:null for
most components. The handler now kicks a one-shot ChainState.snapshotVerified()
for any component whose cached version is null (guarded by a _warmed set so it
never re-spawns `--version`, incl. for oracle scripts with no resolvable
version), so versions fill in within a poll or two of enabling.
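
A minimal sketch of the warm-once guard, assuming a synchronous cache read
and a promise-returning probe. `createVersionWarmer`, `getCached` and
`snapshot` are hypothetical stand-ins; only the `_warmed`-set idea is from
this commit.

```javascript
// Hypothetical factory: `getCached(id)` reads the version cache synchronously;
// `snapshot(id)` stands in for ChainState.snapshotVerified() and may return a
// Promise.
function createVersionWarmer(getCached, snapshot) {
  const warmed = new Set(); // ids that have already been kicked once

  return function versionFor(id) {
    const cached = getCached(id);
    if (cached == null && !warmed.has(id)) {
      // Mark first, so a failed or empty probe is never retried.
      warmed.add(id);
      try {
        // Fire-and-forget: the response must not wait on, or fail with, the probe.
        Promise.resolve(snapshot(id)).catch(() => {});
      } catch (_) {
        // A synchronous throw from the probe is ignored as well.
      }
    }
    return cached ?? null;
  };
}
```

Because the id is added to the set before the probe runs, a component with no
resolvable version costs exactly one probe for the life of the process.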

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fixes the P1 findings from the final validator-readiness audit (no P0 found).

Reactivation (ActivateProducer is NODE-KEY signed — verified against
Elastos.ELA/core/transaction/activateproducertransaction.go:113/212):
- F12 (BPoS producer Inactive): copy now points to the in-app Activate
  control instead of falsely telling validators "owner key … ENM cannot do
  this for you" + steering them to ela-cli at the moment they lose their slot.
- Inactive CR Council member: new "Reactivate Council node" button on the
  validator card (POST /chains/mainchain/bpos/activate). F28 alert copy +
  head_sub_inactive corrected to match (was wrongly "via Essentials, ENM
  cannot do this"). Reactivation is offered only for the Inactive sub-state
  (Impeached/Returned/Terminated are terminal; Illegal is height-gated on-chain).
- Add 3 missing bpos_card.activate_* strings that rendered as literal
  "[bpos_card.activate_btn_active]" placeholders in the live activate flow.

Silent-earning-loss detection:
- New F29 (Class B, Council-only, alert-only): warns when an EVM sidechain
  fell back to FOLLOWER because producer status was UNREADABLE (mainchain RPC
  down / creds undecryptable) rather than genuinely off-duty. Adapter records
  the role decision; HealthChecker plumbs it; gated to can't-read sources.

Hardening to node.sh parity:
- Mainchain HttpInfo/Rest/Ws servers OFF (node.sh parity): these are unauthenticated
  0.0.0.0 listeners ENM never uses, needless attack surface on a host without a firewall.
- Import keystore → set dpos.enableArbiter=true (best-effort) so a keyless→keyed
  node actually signs.
- EVM archive mode = `--syncmode full --gcmode archive` (was invalid `--syncmode archive`).
- Binary installer: strict dotted-numeric version filter (no rc/hotfix/commit
  picks) + staged extract → smoke-test → atomic swap with .bak rollback.
- Council install preflight now runs the clock-skew probe (Card D.5) — a node
  outside ela's ~4.2s DPoS tolerance misses blocks on going on-duty.
- Boot-time, read-only systemd TimeoutStopSec assertion: warns if geth could
  be SIGKILLed mid-flush on stop/restart (the F26 corruption path).
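
The installer's smoke-test → atomic swap → .bak rollback step might look
roughly like the sketch below. Everything here is an assumption for
illustration: `swapWithRollback`, `rollback` and the callback-style
`smokeTest` are hypothetical names, and the staged file is assumed to sit on
the same filesystem as the live binary.

```javascript
const fs = require('node:fs');

// Hypothetical helper. `smokeTest(path)` returns true when the staged binary
// looks usable (for example, it answers `--version`).
function swapWithRollback(stagedPath, livePath, smokeTest) {
  if (!smokeTest(stagedPath)) {
    fs.rmSync(stagedPath, { force: true }); // discard; live binary untouched
    return { swapped: false, reason: 'smoke-test failed' };
  }
  const bakPath = `${livePath}.bak`;
  const hadLive = fs.existsSync(livePath);
  if (hadLive) fs.copyFileSync(livePath, bakPath); // keep the old binary
  // rename() replaces the target atomically when both paths share a filesystem,
  // so there is no moment at which livePath is missing.
  fs.renameSync(stagedPath, livePath);
  return { swapped: true, backup: hadLive ? bakPath : null };
}

// Restore the previous binary from its .bak copy, if one exists.
function rollback(livePath) {
  const bakPath = `${livePath}.bak`;
  if (!fs.existsSync(bakPath)) return false;
  fs.renameSync(bakPath, livePath);
  return true;
}
```

Copying to `.bak` rather than renaming the live file keeps the old binary in
place until the single rename that replaces it.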

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…estart-modal "not running"

Two operator-reported bugs:

1. Overview sometimes shows "Update available" for a chain you JUST updated to
   the latest. Root cause: EnmChainUpdateScanner caches {installed, latest,
   updateAvailable} and only re-reads `installed` on its own 6h cadence —
   nothing invalidated it when the binary changed. So after an update the cache
   kept the pre-update `installed` (→ updateAvailable:true) until the next 6h
   refresh. Fix: add EnmChainUpdateScanner.invalidate(chainId) (drops the entry
   + resets _lastAttemptAt so the next ensureFresh re-polls immediately), and
   call it from EnmBinaryDownloader._run's DONE phase — the moment the new
   binary actually goes live, covering update / reinstall / setup uniformly.
   (Verified vs the live mirror + GitHub: installed v0.9.9.5 = mirror max,
   GitHub latest v0.9.9 < installed, so the steady-state compare is correct;
   the false positive was purely the stale post-update cache window.)

2. Access section's "Restart mainchain to apply" modal said "the chain isn't
   currently running, so there's nothing to restart" even while the mainchain
   was running. Root cause: the probe read `envelope.data`, but api.get()
   already unwraps the {success,result} envelope and returns `result` directly,
   so `envelope.data` was always undefined → alive=false for every chain. Fix:
   use the same (env.result)||(env.data)||env unwrap every other caller uses.
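
The stale-cache window from item 1 and the invalidate fix can be reproduced
with a small sketch. Assumptions: `UpdateCache` is a hypothetical stand-in for
the scanner, the probe is synchronous, the throttle stamp is kept per chain,
and `installed !== latest` replaces the real version comparison.

```javascript
const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

// Hypothetical scanner. `probe(chainId)` returns { installed, latest };
// `now` is injectable so the throttle can be exercised without waiting.
class UpdateCache {
  constructor(probe, now = () => Date.now()) {
    this.probe = probe;
    this.now = now;
    this.entries = new Map(); // chainId -> { installed, latest, updateAvailable }
    this.lastAttemptAt = new Map(); // chainId -> ms timestamp of the last probe
  }

  // Re-probe only when the chain has never been probed or 6h have elapsed.
  ensureFresh(chainId) {
    const last = this.lastAttemptAt.get(chainId);
    if (last !== undefined && this.now() - last < SIX_HOURS_MS) return;
    this.lastAttemptAt.set(chainId, this.now());
    const { installed, latest } = this.probe(chainId);
    this.entries.set(chainId, { installed, latest, updateAvailable: installed !== latest });
  }

  // Called when a new binary goes live: drop the entry and the throttle stamp
  // so the next ensureFresh re-polls immediately.
  invalidate(chainId) {
    this.entries.delete(chainId);
    this.lastAttemptAt.delete(chainId);
  }

  get(chainId) {
    return this.entries.get(chainId) || null;
  }
}
```

Without `invalidate`, a second `ensureFresh` inside the 6h window returns
early and the pre-update `installed` value keeps `updateAvailable` true.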

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@4HM3DMD 4HM3DMD merged commit c9e40c3 into main May 29, 2026
2 checks passed
4HM3DMD pushed a commit that referenced this pull request May 29, 2026
Adds the Elastos Node Manager (ENM): a self-contained service-type PC2
app that installs, runs, and self-heals a full Elastos Council / BPoS
validator node (mainchain + ESC/EID/PG sidechains + 3 oracles + arbiter)
from the PC2 desktop.

Two parts:
- enm-server/ — standalone Express sidecar that supervises the chain
  processes, exposes /api/enm, and runs the health/self-heal engine.
- src/backend/apps/elastos-node-manager/ — the Puter app (the UI the
  operator opens from the desktop).

Plus build + deploy infra:
- .github/workflows/build-enm-bundle.yml — bundles the app for release.
- scripts/deploy-enm.sh — installs/upgrades the bundle on a PC2 node.

DEPENDS ON #18 (pc2-node service-type app support): ENM installs through
the service-app mechanism that PR adds. Merge #18 first.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>