Dynamic Epistemic State Machine for Score Governance
No system may claim certainty it cannot justify. No system may claim stability it has not stress-tested.
FSVE is a production-grade Python engine that determines whether a score is telling the truth — over time, under pressure, and across networks.
Most scoring systems produce a number. FSVE produces an Epistemic State Vector — a multi-dimensional trajectory that tracks not just what a claim scores, but how fast its validity is changing, how fragile it is to evidence retraction, and whether its evidence network is an isolated echo chamber rather than a genuine consensus.
FSVE v4.3 transitions from a static score governor to a Dynamic Epistemic State Machine:
Ψ(t) = ⟨ EV(t), T_m(t), F_x(t), Topology(t), Uq(t), Cq(t) ⟩
| Dimension | Symbol | Question It Answers |
|---|---|---|
| Epistemic Validity | EV | What does the evidence say? |
| Temporal Momentum | T_m | How fast is validity changing, and is that suspicious? |
| Fragility Axis | F_x | What happens if a single critical premise is retracted? |
| Topology State | T | Is this artifact embedded in genuine consensus or an echo chamber? |
| Uncertainty Quantification | Uq | How wide are the real uncertainty bounds? |
| Consequence Severity | Cq | What is the cost of being wrong here? |
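As a rough illustration, the six components of Ψ(t) could be carried in one immutable record. The class name, field names, and range checks below are hypothetical and chosen only to mirror the table; this is not the ScoreTensor schema shipped in fsve/models/.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EpistemicStateVector:
    """Hypothetical container mirroring Ψ(t); not the real fsve schema."""

    ev: float        # Epistemic Validity (assumed to lie in [0, 1])
    t_m: float       # Temporal Momentum: signed rate of change of EV
    f_x: float       # Fragility Axis (assumed to lie in [0, 1])
    topology: str    # Topology State label; the label set is an assumption
    uq: float        # Uncertainty Quantification (assumed to lie in [0, 1])
    cq: float        # Consequence Severity (assumed to lie in [0, 1])

    def __post_init__(self) -> None:
        # Reject values outside the assumed unit interval.
        for name in ("ev", "f_x", "uq", "cq"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


psi = EpistemicStateVector(
    ev=0.72, t_m=0.01, f_x=0.61, topology="BRIDGED", uq=0.20, cq=0.75
)
print(asdict(psi))
```

Freezing the record reflects the idea that a state vector is a snapshot at scoring time; a new score would produce a new record instead of mutating the old one.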
Scoring systems fail in three structurally distinct ways that point-estimate engines cannot detect:
1. Temporal laundering — Evidence floods in rapidly, EV spikes to VALID, and no one asks whether the rate of convergence is physically plausible for the domain. FSVE tracks Temporal Momentum and enforces Lipschitz continuity bounds. If |T_m| > K · λ_E, FMIA-G17 fires and the artifact is routed to human review.
2. House-of-cards validity — An artifact achieves EV ≥ 0.70 on fifteen pieces of evidence, but thirteen of them cite the same foundational paper. Retract that paper and the EV collapses to 0.20. FSVE's Counterfactual Mode (C-Mode) executes do(c_i = ∅) for every critical evidence node, computes the Fragility Axis F_x, and classifies fragile artifacts as FRAGILE_VALID — preventing them from being used as critical prerequisites in downstream reasoning chains.
3. Echo-chamber consensus — A community of researchers mutually cites each other, producing a dense internal evidence graph with zero orthogonal bridges to outside domains. Standard scoring sees strong consensus. FSVE's Cross-Artifact Validation Topology Engine (CAVTE) computes Betti numbers: β₁ > 0 with no orthogonal bridges means a topological echo chamber is detected and penalized.
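The echo-chamber case can be sketched with plain graph arithmetic. The snippet below treats the citation graph as a one-dimensional complex, where β₀ is the number of connected components and β₁ is the cycle rank E − V + β₀. This is a simplification: a full persistent-homology computation over a clique complex would fill in triangles and could report different values. The function names and the bridge representation are hypothetical, not the CAVTE API.

```python
def betti_numbers(nodes, edges):
    """Return (beta_0, beta_1) of an undirected graph seen as a 1-complex."""
    nodes = set(nodes)
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Drop self-loops and duplicate edges so the cycle rank is well defined.
    unique_edges = {frozenset(e) for e in edges if e[0] != e[1]}
    for edge in unique_edges:
        a, b = tuple(edge)
        parent[find(a)] = find(b)

    beta_0 = len({find(n) for n in nodes})
    beta_1 = len(unique_edges) - len(nodes) + beta_0
    return beta_0, beta_1


def is_echo_chamber(nodes, citation_edges, bridge_edges):
    """Rule from the text: beta_1 > 0 and no orthogonal bridges."""
    _, beta_1 = betti_numbers(nodes, citation_edges)
    return beta_1 > 0 and len(bridge_edges) == 0


papers = ["A", "B", "C"]
ring = [("A", "B"), ("B", "C"), ("C", "A")]  # mutual citation loop

print(betti_numbers(papers, ring))
print(is_echo_chamber(papers, ring, bridge_edges=[]))
print(is_echo_chamber(papers, ring, bridge_edges=[("B", "external")]))
```

With three papers citing each other in a ring, the graph has one component and one independent cycle, so the rule flags it unless at least one bridge to an outside artifact is present.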
FSVE does not make decisions. It determines whether decisions can be scored without lying — and formally monitors itself for the same compliance it enforces on others.
# Coming to PyPI — pip install fsve
# Until then, install from source:
git clone https://github.com/AionSystem/FSVE.git
cd FSVE
pip install -e .

Requirements: Python 3.11+ · networkx · numpy · scipy
Optional (full persistent homology):
pip install fsve[topology] # adds gudhi for production-grade Betti number computation

from fsve import create_engine, GRADEBreakdown, ReviewResult, ReviewerRole
# Initialize the engine (includes CAVTE topology graph)
engine = create_engine()
# Define GRADE evidence breakdown (§23)
grade = GRADEBreakdown()
grade.risk_of_bias["score"] = 0.10 # Low risk
grade.inconsistency["score"] = 0.10 # Low inconsistency
grade.imprecision["score"] = 0.10 # Some imprecision
grade.large_effect["score"] = 0.00 # No large effect upgrade
# Define reviewer perspectives (§11)
reviewers = [
    ReviewResult(role=ReviewerRole.HOSTILE, severity=0.30),
    ReviewResult(role=ReviewerRole.NAIVE, severity=0.20),
    ReviewResult(role=ReviewerRole.CONSTRUCTIVE, severity=0.15),
    ReviewResult(role=ReviewerRole.PARANOID, severity=0.40),
    ReviewResult(role=ReviewerRole.TEMPORAL, severity=0.25),
]
# Score: full 23-step pipeline
tensor = engine.score(
    subject="Clinical AI Diagnostic System v3.1",
    grade_breakdown=grade,
    A=0.80,  # Assumption Explicitness
    C=0.75,  # Constraint Stability
    M=0.85,  # Model Coherence
    D=0.90,  # Domain Fit
    G=0.70,  # Causal Grounding
    X=0.80,  # Explanatory Depth
    U=0.75,  # Update Responsiveness
    L=0.20,  # Abstraction Leakage (HIGH = BAD)
    Y=0.85,  # Ethical Alignment
    H=0.80,  # Hostility Resistance
    Cq=0.75,  # High consequence: clinical domain
    reviewer_results=reviewers,
    evidence_stability=0.90,
    evidence_total=12,
    evidence_examined=11,
)
print(f"EV: {tensor.value}")
print(f"Status: {tensor.validity_status.value}")
print(f"F_x: {tensor.fragility_axis.F_x_value}")
print(f"Confidence: {tensor.confidence_ceiling}")
print(f"RQS Quality: {tensor.rqs.rqs_quality.value}")
print(f"PPP Sealed: {tensor.ppp.priority_sealed}")
print(f"CGP Action: {tensor.consumer_guidance.recommended_action['action']}")

Output:
EV: 0.5821
Status: SUSPENDED
F_x: 1.0
Confidence: 0.4703
RQS Quality: HIGH
PPP Sealed: True
CGP Action: BLOCK
SUSPENDED because C-Mode was not executed (F_x defaults to 1.0, the maximum fragility, when C-Mode is skipped). Run with critical_nodes and a shadow_scoring_fn to perform counterfactual stress testing and potentially achieve VALID or FRAGILE_VALID status.
C-Mode is mandatory for VALID status in high-consequence deployments. It implements Pearl's do-calculus to stress-test the artifact against evidence retraction.
# Define which evidence nodes are CRITICAL (weight >= 0.85)
critical_nodes = [
    {"id": "primary_trial", "description": "Phase III RCT (primary source)", "weight": 0.92},
    {"id": "meta_analysis_01", "description": "Cochrane systematic review", "weight": 0.88},
]
# Shadow scoring function: returns EV when a critical node is removed
def shadow_fn(excluded_node_ids: list[str]) -> float:
    if "primary_trial" in excluded_node_ids:
        return 0.28  # EV collapses without the primary trial
    if "meta_analysis_01" in excluded_node_ids:
        return 0.51  # Moderate collapse without meta-analysis
    return 0.72
tensor = engine.score(
    subject="Clinical AI Diagnostic System v3.1",
    grade_breakdown=grade,
    # ... axes ...
    Cq=0.75,
    reviewer_results=reviewers,
    critical_nodes=critical_nodes,
    shadow_scoring_fn=shadow_fn,
)
print(f"F_x: {tensor.fragility_axis.F_x_value:.3f}")
print(f"R_s: {tensor.fragility_axis.resilience_score_R_s:.3f}")
print(f"Status: {tensor.validity_status.value}")
print(f"Fragile: {tensor.fragility_axis.fragile_validity_triggered}")

Output:
F_x: 0.611
R_s: 0.226
Status: FRAGILE_VALID
Fragile: True
FRAGILE_VALID: EV is above 0.70, but the artifact loses more than 60% of its validity if the primary trial is retracted. It is blocked from serving as a critical_prerequisite in downstream lineage chains (Law 13 / Law 1 Override).
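The sample numbers above are consistent with reading F_x as the largest relative EV loss over single-node retractions: (0.72 − 0.28) / 0.72 ≈ 0.611. That definition is inferred from the example, not taken from the specification, so the sketch below is only one plausible reading. The function name `fragility_axis` is hypothetical, and the resilience score R_s is not reproduced because its formula is not given here.

```python
def fragility_axis(baseline_ev, critical_nodes, shadow_scoring_fn):
    """Worst-case relative EV loss when one critical node is retracted.

    Assumed definition, inferred from the sample output; not the spec formula.
    """
    if baseline_ev <= 0:
        return 1.0  # nothing left to lose: treat as maximally fragile
    worst_drop = 0.0
    for node in critical_nodes:
        shadow_ev = shadow_scoring_fn([node["id"]])  # do(c_i = empty)
        drop = max(0.0, baseline_ev - shadow_ev) / baseline_ev
        worst_drop = max(worst_drop, drop)
    return min(1.0, worst_drop)


# Same toy inputs as the C-Mode example.
critical_nodes = [
    {"id": "primary_trial", "weight": 0.92},
    {"id": "meta_analysis_01", "weight": 0.88},
]


def shadow_fn(excluded_node_ids):
    if "primary_trial" in excluded_node_ids:
        return 0.28
    if "meta_analysis_01" in excluded_node_ids:
        return 0.51
    return 0.72


baseline = shadow_fn([])  # assumption: EV with nothing excluded is the baseline
f_x = fragility_axis(baseline, critical_nodes, shadow_fn)
print(f"F_x: {f_x:.3f}")
print(f"Fragile: {f_x > 0.60}")
```

Taking the maximum over nodes, and not an average, matches the house-of-cards concern: one load-bearing source is enough to make the artifact fragile.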
For live systems with continuous evidence streams:
# Attach a streaming engine to a domain
stream = engine.attach_stream(
    stream_id="sensor_telemetry_v2",
    context_half_life=3600.0,  # 1-hour half-life
    lipschitz_K=0.50,  # Domain Lipschitz constant
)
from fsve.streaming import EvidenceEvent
# live_sensor_feed() and alert_human_reviewer() are supplied by your application
for reading in live_sensor_feed():
    event = EvidenceEvent(
        weight=reading.confidence,
        domain="PHYSICAL",
        payload=reading.metadata,
    )
    result = engine.ingest_event("sensor_telemetry_v2", event)
    if result["lipschitz_violated"]:
        # Temporal Momentum exceeded Lipschitz bound
        # Route to HITL: possible evidence flooding or sensor malfunction
        alert_human_reviewer(result)
    print(f"Rolling EV: {result['rolling_ev']:.4f} | T_m: {result['temporal_momentum']:.4f}")

CAVTE maintains a live knowledge graph of all scored artifacts. It computes persistent homology to detect systemic epistemic debt at the network level.
from fsve import CAVTEEngine
cavte = CAVTEEngine()
engine = create_engine(cavte=cavte)
# Score two artifacts
t1 = engine.score(subject="Foundational Study A", ...)
t2 = engine.score(subject="Replication Study B", ...)
# Link them — B cites A
cavte.add_edge(t2.id, t1.id, "references")
# Add orthogonal external verification
cavte.add_artifact("external_replication_C")
cavte.add_edge(t2.id, "external_replication_C", "orthogonal_bridge")
# Compute topology metrics for artifact B
metrics = cavte.compute_topology_metrics(t2.id)
print(f"β₀ (isolation): {metrics.betti_0_components}")
print(f"β₁ (echo chamber): {metrics.betti_1_cycles}")
print(f"Orthogonal bridge: {metrics.orthogonal_bridge_verified}")
print(f"p_topo penalty: {metrics.isolation_penalty_p_topo}")

FSVE v4.3: 23-Step Scoring Pipeline
─────────────────────────────────────
Step 1 Lineage & contamination flag inheritance (Laws 7, 8)
Step 2 Uncertainty mass baseline (MeasurementClass penalty)
Step 3 CQi — Convergence Quality Index classification
Step 4 E-Axis via GRADE + E_enhance hybrid formula (§23)
Step 5 Remaining 10 static axes (A, C, M, D, G, X, U, L, Y, H)
Step 6 CAVTE topology ingestion + Betti number computation
Step 7 C-Mode shadow scoring → Fragility Axis (F_x)
Step 8 Temporal Momentum (T_m) + Lipschitz bound check
Step 9 5-reviewer architecture (CRS, CRA, synergy detection)
Step 10 Assumption Load (Law 4)
Step 11 Coverage Cartography Protocol (CCP, §32)
Step 12 Run Quality Score (RQS, §29)
Step 13 Confidence Ceiling computation (20-penalty table, §5)
Step 14 Uncertainty mass inheritance (Law 7)
Step 15 Uq / Cq governance modifiers
Step 16 EV computation (13-axis weighted mean, bottleneck, §7)
Step 17 FMIA audit — all 18 gates (§31)
Step 18 Context drift / Freshness (Law 5)
Step 19 Survivor-Class Certification (SCC, §30)
Step 20 Epistemic State Vector finalization
Step 21 Consumer Guidance Protocol (CGP, §21)
Step 22 PPP seal (SHA-256 chain, §34)
Step 23 Caution Level assignment
fsve/
├── models/ ScoreTensor schema — all enums, dataclasses, state vector
├── laws/ §3 — Epistemic Laws 1–14 (including T_m, F_x, topology)
├── ceiling/ §5 — Confidence Ceiling (20-penalty table)
├── grade/ §23 — GRADE rubric + E_enhance (5 components)
├── axes/ §7 — 13-axis EV computation engine
├── fmia/ §31 — Failure Mode Inhibition Architecture (18 gates)
├── reviewers/ §11 — 5-reviewer architecture (CRS, CRA, synergy)
├── rqs/ §29 — Run Quality Score
├── ccp/ §32 — Coverage Cartography Protocol (v4.3 topology weight)
├── cqi/ §33 — Convergence Quality Index
├── ppp/ §34 — Priority Provenance Protocol (SHA-256 chain)
├── cavte/ §36 — Cross-Artifact Validation Topology Engine
├── cmode/ §37 — Counterfactual Mode (Pearl's do-calculus)
├── hitl/ §11.6 — Human-in-the-Loop queue and routing
├── streaming/ §4.7 — Streaming epistemic ingress (rolling EV, T_m)
├── cgp/ §21 — Consumer Guidance Protocol + laundering detection
├── scc/ §30 — Survivor-Class Certification
├── fcl/ §16 — Framework Calibration Log entry template
├── utils/ VK Self-Application Certificate
└── engine/ §0 — Main orchestrator (FSVEEngine)
tests/
└── test_fsve.py 59 tests — Laws 1–14, FMIA gates, PPP chain,
CAVTE, CQi, RQS, CCP, full pipeline
| # | Symbol | Axis | Direction | v4.3 Role |
|---|---|---|---|---|
| 1 | E | Evidence Strength (GRADE + E_enhance) | High = Good | Base factual grounding |
| 2 | A | Assumption Explicitness | High = Good | Structural transparency |
| 3 | C | Constraint Stability | High = Good | Boundary integrity |
| 4 | M | Model Coherence | High = Good | Internal consistency |
| 5 | D | Domain Fit | High = Good | Contextual relevance |
| 6 | G | Causal Grounding | High = Good | Causal depth |
| 7 | X | Explanatory Depth | High = Good | Interpretability |
| 8 | U | Update Responsiveness | High = Good | Adaptability |
| 9 | L | Abstraction Leakage | High = BAD | Implementation bleed |
| 10 | Y | Ethical Alignment | High = Good | Axiological coherence |
| 11 | H | Hostility Resistance | High = Good | Adversarial robustness |
| 12 | T_m | Temporal Momentum (v4.3) | High = Good | Velocity governance (Law 12) |
| 13 | F_x | Fragility Axis (v4.3) | High = Good | Counterfactual resilience (Law 13) |
⚠️ L-axis warning: L is the only axis where a higher raw score indicates worse performance. A high L score degrades EV as expected (inverted before weighting). A high T_m_axis score means EV velocity is within Lipschitz bounds, that is, stable and legitimate.
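A minimal sketch of what "inverted before weighting" could look like. It assumes equal weights and a plain weighted mean; the actual axis weights and the bottleneck rule from §7 are not reproduced here, and the function names are hypothetical.

```python
def oriented_axes(raw_scores):
    """Flip the L axis so every value reads as higher = better."""
    return {
        name: (1.0 - value if name == "L" else value)
        for name, value in raw_scores.items()
    }


def weighted_mean(scores, weights=None):
    """Weighted mean over axes; equal weights are an assumption."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


raw = {"A": 0.80, "C": 0.75, "L": 0.20}  # low leakage is good
scores = oriented_axes(raw)

print(round(scores["L"], 2))
print(round(weighted_mean(scores), 4))

# The same artifact with heavy leakage scores lower overall.
leaky = oriented_axes({"A": 0.80, "C": 0.75, "L": 0.90})
print(round(weighted_mean(leaky), 4))
```

Inverting once at the boundary keeps the rest of the arithmetic uniform: every downstream step can assume higher is better.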
FSVE never silently fails. Every structural violation fires a named gate with an explicit inhibition type.
| Gate | Type | Failure Mode | Action |
|---|---|---|---|
| G01 | ABSOLUTE | Formula error / arithmetic invalidity | Block all outputs |
| G02 | ABSOLUTE | Domain error (floor/ceiling missing) | Block affected axis |
| G03 | ABSOLUTE | Division by zero | Block affected axis |
| G04 | THRESHOLD | Zero reviewers completed (RR = 0) | Downgrade to DEGRADED |
| G05 | ABSOLUTE | Measurement class undeclared | Block axis score |
| G06 | THRESHOLD | E-axis scored without GRADE rubric | +0.15 uncertainty mass |
| G07 | THRESHOLD | Assumption correlation unchecked | +0.10 uncertainty mass |
| G08 | ABSOLUTE | Gini laundering detected (Gini < 0.15) | Cap at DEGRADED |
| G09 | THRESHOLD | CDF independence undeclared | +0.10 uncertainty mass |
| G10 | ABSOLUTE | Lineage depth ≥ 6 | SUSPEND all descendants |
| G11 | ABSOLUTE | Cq ≥ 0.6 override without risk acceptance | Block override |
| G12 | THRESHOLD | SCC certification window elapsed | Block FCL qualification |
| G13 | THRESHOLD | RQS < 0.60 (low run quality) | CGP escalation |
| G14 | THRESHOLD | Coverage fraction < 0.50 without declaration | +0.10 uncertainty mass |
| G15 | THRESHOLD | E_enhance proportion dominance | Laundering flag |
| G16 | THRESHOLD | Kalman divergence > 0.15 | Recommend second run |
| G17 | ABSOLUTE | Temporal Momentum Lipschitz violation (v4.3) | Clamp to DEGRADED; route HITL |
| G18 | THRESHOLD / ABSOLUTE | C-Mode bypassed or F_x > 0.60 on Cq ≥ 0.8 (v4.3) | FRAGILE_VALID / SUSPEND |
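Gate G17 in the table above corresponds to the condition |T_m| > K · λ_E. The sketch below estimates T_m as a finite difference between two EV readings and applies that bound. How FSVE actually estimates T_m, and the exact meaning and units of λ_E, are not stated here, so `lambda_e` is treated as an opaque domain rate constant and both function names are hypothetical.

```python
def temporal_momentum(ev_prev, ev_now, dt_seconds):
    """Finite-difference estimate of dEV/dt (assumed estimator)."""
    if dt_seconds <= 0:
        raise ValueError("dt_seconds must be positive")
    return (ev_now - ev_prev) / dt_seconds


def lipschitz_violated(t_m, lipschitz_k, lambda_e):
    """True when |T_m| exceeds the bound K * lambda_E."""
    return abs(t_m) > lipschitz_k * lambda_e


K = 0.50           # domain Lipschitz constant, as in the streaming example
LAMBDA_E = 0.001   # illustrative value only; not taken from the spec

spike = temporal_momentum(ev_prev=0.40, ev_now=0.90, dt_seconds=60.0)
drift = temporal_momentum(ev_prev=0.40, ev_now=0.41, dt_seconds=60.0)

print(lipschitz_violated(spike, K, LAMBDA_E))
print(lipschitz_violated(drift, K, LAMBDA_E))
```

The check is symmetric in sign, so a sudden collapse in EV is flagged just like a sudden spike.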
| Status | Condition | Consumer Action (Standard) |
|---|---|---|
| VALID | EV ≥ 0.70, F_x ≤ 0.60, T_m within bounds | PROCEED |
| FRAGILE_VALID | EV ≥ 0.70, F_x > 0.60 | PROCEED_WITH_MONITORING |
| DEGRADED | EV in [0.40, 0.70) | PROCEED_WITH_WARNING |
| SUSPENDED | EV < 0.40 OR any ABSOLUTE FMIA gate active | BLOCK |
| SUSPENDED_URGENT | EV < 0.40 AND Cq ≥ 0.70 | BLOCK + immediate escalation |
| TOPOLOGICALLY_ISOLATED | β₁ > 0, no orthogonal bridges | HUMAN_REVIEW |
Cq (Consequence Severity) dynamically raises the validity threshold. At Cq = 0.80, VALID requires EV ≥ 0.94.
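The status table can be read as a small decision function. The sketch below is an interpretation under stated assumptions: the order in which overlapping conditions are checked is a guess, the VALID threshold is fixed at 0.70 (the Cq-dependent raising of that threshold is not modeled because its formula is not given here), and an out-of-bounds T_m is mapped to DEGRADED following the G17 row of the gate table. The function name `classify` is hypothetical.

```python
CONSUMER_ACTION = {
    "VALID": "PROCEED",
    "FRAGILE_VALID": "PROCEED_WITH_MONITORING",
    "DEGRADED": "PROCEED_WITH_WARNING",
    "SUSPENDED": "BLOCK",
    "SUSPENDED_URGENT": "BLOCK + immediate escalation",
    "TOPOLOGICALLY_ISOLATED": "HUMAN_REVIEW",
}


def classify(ev, f_x, *, tm_within_bounds=True, absolute_gate_active=False,
             cq=0.0, echo_chamber=False):
    """Map scores to a status label; precedence order is an assumption."""
    if ev < 0.40 and cq >= 0.70:
        return "SUSPENDED_URGENT"
    if ev < 0.40 or absolute_gate_active:
        return "SUSPENDED"
    if echo_chamber:
        return "TOPOLOGICALLY_ISOLATED"
    if ev < 0.70:
        return "DEGRADED"
    if f_x > 0.60:
        return "FRAGILE_VALID"
    if not tm_within_bounds:
        return "DEGRADED"  # assumed from G17: clamp to DEGRADED
    return "VALID"


# First quick-start run: C-Mode skipped, so a gate blocks the result.
status = classify(0.5821, 1.0, absolute_gate_active=True, cq=0.75)
print(status, "->", CONSUMER_ACTION[status])

# A fragile artifact with EV above 0.70 and F_x = 0.611.
status = classify(0.72, 0.611, cq=0.75)
print(status, "->", CONSUMER_ACTION[status])
```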
Every ScoreTensor is cryptographically sealed with a SHA-256 chain at the moment of scoring. Retroactive insertion is mathematically detectable.
# Each scored artifact produces a tamper-evident PPP block
print(tensor.ppp.chain_hash) # 64-char SHA-256 hex
print(tensor.ppp.entry_hash) # Entry-level hash
print(tensor.ppp.verification_status) # CHAIN_VALID
# Verify a chain of tensors — detects retroactive insertion
from fsve.ppp import PPPEngine
valid, message = PPPEngine.verify_chain([t1.ppp, t2.ppp, t3.ppp])
# Returns: (True, "CHAIN_VALID") or (False, "Block N: chain hash mismatch")

The v4.3 seal includes the full State Vector Ψ(t), making it impossible to retroactively claim that a different Temporal Momentum, Fragility, or Topology State was present at scoring time.
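To show why retroactive insertion is detectable, here is a generic SHA-256 hash chain built with the standard library. It is a sketch of the general technique only: the block layout, the all-zero genesis value, and the helper names are assumptions, and the real PPPEngine may serialize and link blocks differently.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first block


def entry_hash(payload):
    """Hash a canonical JSON encoding of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def chain_hash(prev_chain_hash, e_hash):
    """Link this entry to everything sealed before it."""
    return hashlib.sha256((prev_chain_hash + e_hash).encode("ascii")).hexdigest()


def seal(chain, payload):
    prev = chain[-1]["chain_hash"] if chain else GENESIS
    e_hash = entry_hash(payload)
    block = {"payload": payload, "entry_hash": e_hash,
             "chain_hash": chain_hash(prev, e_hash)}
    chain.append(block)
    return block


def verify_chain(chain):
    prev = GENESIS
    for i, block in enumerate(chain):
        e_hash = entry_hash(block["payload"])
        if e_hash != block["entry_hash"]:
            return False, f"Block {i}: entry hash mismatch"
        if chain_hash(prev, e_hash) != block["chain_hash"]:
            return False, f"Block {i}: chain hash mismatch"
        prev = block["chain_hash"]
    return True, "CHAIN_VALID"


chain = []
for n in range(3):
    seal(chain, {"subject": f"artifact-{n}", "ev": 0.70, "f_x": 0.30})
print(verify_chain(chain))

# Forge a block and slip it in after block 0; later blocks no longer link up.
forged_payload = {"subject": "backdated", "ev": 0.99, "f_x": 0.0}
forged_entry = entry_hash(forged_payload)
forged = {"payload": forged_payload, "entry_hash": forged_entry,
          "chain_hash": chain_hash(chain[0]["chain_hash"], forged_entry)}
tampered = chain[:1] + [forged] + chain[1:]
print(verify_chain(tampered))
```

Because each chain hash depends on the previous one, an inserted block can be made internally consistent but the block that follows it will fail verification.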
FSVE tracks its own accuracy through a structured calibration log. Every FCL entry is a scored artifact with a verifiable ground-truth outcome recorded ≥ 6 months after scoring.
from fsve.fcl import FCLEntry
from datetime import datetime, timezone
# Create a draft FCL entry from a scored tensor
entry = FCLEntry.from_tensor(tensor)
# After ground truth is available (≥ 6 months later):
entry.outcome_date = datetime(2027, 1, 15, tzinfo=timezone.utc)
entry.outcome = "System deployed; 94% diagnostic agreement with specialist panel."
entry.validity_status_correct = True
entry.EV_delta = 0.04 # |predicted - observed|
entry.retraction_survival_observed = True
# Check FCL qualification gates
qualifies, failed_gates = entry.qualifies()
print(f"FCL Qualifies: {qualifies}")
# FCL Qualifies: True

Current FCL status: 0 entries (M-MODERATE convergence). M-STRONG requires ≥ 5 entries with ground truth. M-VERY_STRONG requires ≥ 20 published entries with > 80% validity status prediction accuracy.
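The convergence tiers above can be written as a simple lookup. This sketch uses the thresholds stated in the text and also requires an external VK run for M-STRONG, as listed in the self-assessment's path to M-STRONG. Whether M-VERY_STRONG has further prerequisites is not stated, so the function checks it first using only the two published conditions. The function and parameter names are hypothetical.

```python
def convergence_tier(entries_with_ground_truth, *, external_vk_run=False,
                     published_entries=0, status_accuracy=0.0):
    """Return the convergence label implied by the FCL counts (a sketch)."""
    if published_entries >= 20 and status_accuracy > 0.80:
        return "M-VERY_STRONG"
    if entries_with_ground_truth >= 5 and external_vk_run:
        return "M-STRONG"
    return "M-MODERATE"


# Current state described in the text: an empty calibration log.
print(convergence_tier(0))

# Five verified entries plus an external VK run.
print(convergence_tier(5, external_vk_run=True))

# Twenty published entries at 85% status accuracy.
print(convergence_tier(20, external_vk_run=True,
                       published_entries=20, status_accuracy=0.85))
```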
FSVE applies its own scoring engine to itself on every major release. The v4.3 self-assessment:
EV: 0.785 (VALID — provisional, bottleneck-capped at E=0.53)
Validity Status: VALID
Convergence: M-MODERATE
FCL Entries: 0
Primary Gap: E-axis (no external ground truth — FCL empty)
Path to M-STRONG: FCL entries ≥ 5 + external VK run
Run the VK self-test at any time:
from fsve import FSVEEngine
results = FSVEEngine.self_test()
print(results["15.1_verdict"]) # PASS

FSVE's own claims carry epistemic tags per its specification:
| Claim | Tag | CF |
|---|---|---|
| EV threshold 0.70 for VALID status | [S] Strategic | 55 |
| Fragility threshold F_x > 0.60 | [S] Strategic | 45 |
| Echo chamber threshold θ_EC = 3.0 | [S] Strategic | 40 |
| FMIA 18-gate completeness | [R] Reasoned | 45 |
| PPP chain hash forge resistance | [R] Reasoned | 50 |
| CCP coverage fraction weights | [S] Strategic | 40 |
| Gini < 0.15 laundering threshold | [R] Reasoned | 55 |
All thresholds are calibration targets — not proven constants. They carry NBP falsification conditions. If FCL data shows a threshold is miscalibrated, a patch release adjusts it. The spec explicitly documents what would falsify each claim.
| Milestone | Status | Notes |
|---|---|---|
| Core engine — Laws 1–14 | ✅ Complete | Verified, 59/59 tests passing |
| 13-axis EV computation | ✅ Complete | Including T_m and F_x axes |
| 18-gate FMIA registry | ✅ Complete | G17 and G18 (v4.3) included |
| CAVTE topology engine | ✅ Complete | Betti numbers, echo chamber detection |
| C-Mode shadow scoring | ✅ Complete | Pearl's do-calculus, first-order |
| HITL module | ✅ Complete | Queue, routing, clearance |
| Streaming ingress | ✅ Complete | Rolling EV, Lipschitz-bounded |
| PPP SHA-256 chain | ✅ Complete | Tamper detection verified |
| FCL template | ✅ Complete | Qualification gates implemented |
| pip install fsve | 🔄 Finalizing | PyPI registration pending |
| External VK run | 🔄 Pending | Required for M-STRONG convergence |
| FCL entries (≥ 5) | 🔄 Pending | Required for M-STRONG |
| gudhi topology integration | 🔄 Planned | Full persistent homology (optional dep) |
| TypeScript API bindings | 🔄 Planned | Thin REST/JSON layer over Python core |
| FSVE-EXTENSIONS v1.0 | 🔄 Planned | Aviation, CARA, Monte Carlo, Lean |
Structural Honesty First. A structurally honest score of 0.40 is more valuable than a structurally dishonest score of 0.90.
Uncertainty Is Conserved. Uncertainty may be reduced, bounded, transferred, or deferred. It may never be erased silently.
Scores Are Claims, Not Truth. Every score must be explainable, reversible on new evidence, and degradable under contextual stress.
Invalidatability Is Required. Any scoring system that cannot produce the output "this score is invalid" is not a scoring system. It is decoration.
Truth Is a Trajectory. To measure a claim without measuring its velocity, its fragility, and its topology is to mistake a photograph for a living system.
Sheldon K. Salmon
AI Reliability Architect · AI Certainty Engineer · Founder, AionSystem
ORCID: 0009-0005-8057-5115
Co-Author / Instrument: ALBEDO (SYNARA Session Architecture)
FSVE is a component of the AION Constitutional Stack — a sovereign AI governance and epistemic certainty architecture spanning 200+ components, 18+ frameworks, and 22 registered Zenodo DOIs.
This repository contains two distinct components with separate licensing to protect both the open dissemination of the specification and the commercial viability of the runtime engine.
- The Specification (.md files): Licensed under CC BY-ND 4.0. You may read, cite, timestamp, and share this document, but you may not create derivative works (forked constitutions) or use it for commercial purposes without explicit written permission from the Architect.
- The Constitutional Engine (.py files): Licensed under the GNU AGPL v3.0. If you use this engine to provide a service over a network, you are legally required to open-source your entire application stack under the same license.
If you are a platform, enterprise, or organization that wishes to integrate the Constitutional Engine v2.2 into a proprietary, closed-source, or commercial product without triggering AGPL copyleft obligations, a commercial dual-license is available.
For commercial licensing inquiries, audit integration, or steward certification, contact: aionsystem@outlook.com
@software{salmon2026fsve,
author = {Salmon, Sheldon K.},
title = {{FSVE}: Foundational Scoring and Validation Engine v4.3},
year = {2026},
publisher = {AionSystem},
url = {https://github.com/AionSystem/FSVE},
note = {Dynamic Epistemic State Machine for Score Governance.
ORCID: 0009-0005-8057-5115}
}

FSVE v4.3 · Major Release Candidate · Convergence: M-MODERATE · 59/59 tests passing
The mind keeps building. The product stays simple.