8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,14 @@ project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]

### Added
+- **`refs` hub composition — validation pass (#4).** A hub's `refs:` now compose *other hubs*: a
+path relative to the referencing hub (`./resolve.md`), optionally `> symbol` to address one claim
+within the target (matched against that claim's `at:` anchor). `surf lint` blocks a ref that
+doesn't resolve to a loaded hub, points at its own hub, or names a claim the target lacks — the
+same fail-on-typo discipline as `covers`. The `check` verdict does **not** read `refs` yet
+(staleness does not propagate across hubs); this ships the validated navigation graph first, per
+the §9.3 unlock discipline. The repo's own hubs now declare their cross-hub `refs`, and the two
+prior doc-pointing `refs` were reclassified to prose links.
- **`surf lint` consolidation nudges (#142).** Two advisory warnings push hubs away from the
"claim-log" shape (one claim per function) and toward onboarding docs: a *claim-log* warning
when a hub has several claims and never once uses a multi-site `at:` list, and a *thin-prose*
11 changes: 8 additions & 3 deletions docs/guides/authoring-hubs.md
@@ -29,9 +29,14 @@ Prose a human (or agent) reads to understand this domain.
- **`at`** — the anchor: where the claim's logic lives (grammar below).
- **`hash`** — the seal. Absent until you `surf verify`; the gate treats a hashless claim as
*unverified*.
-- **`refs` / `covers`** — forward-declared and currently inert. `refs` (hub composition) is parsed
-but unused; `covers` (advisory file-scope globs) is parsed and lint-validated but never affects
-`surf check`. Leave them empty unless you have a reason — the features that consume them aren't
+- **`refs`** — hub composition: paths to *other hubs* this one builds on, written relative to this
+hub (`./resolve.md`), optionally `> symbol` to point at one claim within the target
+(`./resolve.md > resolve_nodes`, matched against that claim's `at:` anchor). `surf lint` blocks a
+ref that doesn't resolve to a hub, points at this hub, or names a claim the target lacks — so a
+typo can't rot silently. The `check` *verdict* doesn't read `refs` yet (a referenced hub going
+stale won't flag this one); refs are a validated navigation graph for now.
+- **`covers`** — advisory file-scope globs; parsed and lint-validated but never affects
+`surf check`. Leave it empty unless you have a reason — the feature that consumes it isn't
shipped.
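
The `refs` grammar described in the bullet above (a hub path relative to the referencing hub, optionally followed by `> symbol`) can be sketched in a few lines of Rust. This is an illustrative sketch only, not surf's implementation: `RefSketch`, `parse_ref_sketch`, and `resolve_rel` are hypothetical names, and the sketch assumes that `>` separates the path from the claim segments, that empty parts are errors, and that hub paths use `/` as the separator.

```rust
/// Hypothetical parsed form of one `refs` entry: a hub path plus optional claim segments.
#[derive(Debug, PartialEq)]
struct RefSketch {
    path: String,
    segments: Vec<String>,
}

/// Split an entry such as `./resolve.md > resolve_nodes` into a path and its
/// `>`-separated segments. Assumption: an empty path or an empty segment is an error.
fn parse_ref_sketch(raw: &str) -> Result<RefSketch, String> {
    let mut parts = raw.split('>').map(str::trim);
    let path = parts.next().unwrap_or("");
    if path.is_empty() {
        return Err(format!("ref \"{raw}\" has no hub path"));
    }
    let segments: Vec<String> = parts.map(str::to_string).collect();
    if segments.iter().any(|s| s.is_empty()) {
        return Err(format!("ref \"{raw}\" has an empty segment"));
    }
    Ok(RefSketch {
        path: path.to_string(),
        segments,
    })
}

/// Resolve a ref path against the directory of the referencing hub,
/// folding `.` and `..` lexically (no filesystem access).
fn resolve_rel(from_hub: &str, ref_path: &str) -> String {
    let mut stack: Vec<&str> = from_hub.split('/').collect();
    stack.pop(); // drop the referencing hub's file name, keep its directory
    for part in ref_path.split('/') {
        match part {
            "" | "." => {}
            ".." => {
                stack.pop();
            }
            other => stack.push(other),
        }
    }
    stack.join("/")
}
```

Under these assumptions, `parse_ref_sketch("./resolve.md > resolve_nodes")` yields the path `./resolve.md` with one segment, and `resolve_rel("hubs/a.md", "./b.md")` gives `hubs/b.md`. A doc-pointing entry such as `../docs/guides/stats.md` resolves to a path outside `hubs/`, which is why it cannot match a loaded hub.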

Where hubs live is configured by the `hubs` glob in `surf.toml` (default `hubs/*.md`); keep them
2 changes: 1 addition & 1 deletion hubs/anchor.md
@@ -6,7 +6,7 @@ anchors:
a 1-based `@N` positional suffix for genuine name collisions. Empty/zero/missing parts
are typed parse errors.
at: surf-core/src/anchor.rs > parse_anchor
-hash: 2:5499582e3a55
+hash: 2:a41aca45f340
refs: []
---

4 changes: 3 additions & 1 deletion hubs/cli-check.md
@@ -27,7 +27,9 @@ anchors:
clean anchors still stamped under v1, so run can nudge the one-time `surf verify` upgrade.
at: surf-cli/src/check.rs > check_workspace
hash: 2:4f5890aca70c
-refs: []
+refs:
+- ./cli-git.md
+- ./cli-verify.md
---

# surf check
3 changes: 2 additions & 1 deletion hubs/cli-git.md
@@ -32,7 +32,8 @@ anchors:
Best-effort: a pure mv with no content match may show as delete+add and go undetected.
at: surf-cli/src/git.rs > renamed_to
hash: 2:260267073598
-refs: []
+refs:
+- ./rename.md
---

# git helpers
2 changes: 1 addition & 1 deletion hubs/cli-reference.md
@@ -10,7 +10,7 @@ anchors:
before sealing.
at: surf-cli/src/main.rs > Command
hash: 2:1af394872add
-refs: ["../docs/reference/commands.md"]
+refs: []
---

# CLI reference surface
2 changes: 1 addition & 1 deletion hubs/cli-stats.md
@@ -20,7 +20,7 @@ anchors:
silent zero or a quietly-narrowed hub set.
at: surf-cli/src/stats.rs > compute
hash: 2:1422981eb9fa
-refs: ["../docs/guides/stats.md"]
+refs: []
---

# surf stats
5 changes: 4 additions & 1 deletion hubs/cli-workspace.md
@@ -11,7 +11,10 @@ anchors:
deduped.
at: surf-cli/src/workspace.rs > Workspace > hub_paths
hash: 2:c69c8264bcfd
-refs: []
+refs:
+- ./cli-check.md
+- ./cli-lint.md
+- ./config.md
---

# Workspace
3 changes: 2 additions & 1 deletion hubs/hash.md
@@ -35,7 +35,8 @@ anchors:
multiple sites combine order-sensitively, so the claim is stale if any listed span changes.
at: surf-core/src/hash.rs > combine_site_hashes
hash: 2:cbbbbc3b2237
-refs: []
+refs:
+- ./cli-verify.md
---

# Canonical hashing
7 changes: 5 additions & 2 deletions hubs/hub-format.md
@@ -12,7 +12,9 @@ anchors:
replaces/inserts only its hash line, so an unchanged hash is byte-identical.
at: surf-core/src/hub.rs > set_anchor_hash
hash: 2:29805baa85ea
-refs: []
+refs:
+- ./cli-lint.md
+- ./cli-check.md
covers:
- surf-core/src/hub.rs
---
@@ -23,7 +25,8 @@ A hub is the unit every command reads and writes: a `---`-fenced YAML frontmatte
machine-checkable `anchors`) followed by a markdown body (the prose a human or agent reads).
`parse_hub` is the contract everything else binds to — its shape is why `at:` can be a scalar or a
list, why `hash` is optional until verified, and why unknown fields are rejected (so a typo can't
-masquerade as a new field) while the forward-declared `refs`/`covers` are accepted but inert.
+masquerade as a new field) while `refs`/`covers` are accepted and lint-validated but never gate the
+`check` verdict.
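
The shape that paragraph describes (an `at:` that is either a scalar or a list, and a `hash` that is absent until verified) can be sketched as a small Rust type. This is a sketch under stated assumptions, not the `surf_core` definition: `AtSketch`, `ClaimSketch`, and `is_unverified` are hypothetical names, and only the `sites()` accessor mirrors a call that appears in the lint code of this diff.

```rust
/// Hypothetical stand-in for the `at:` field: one anchor or a list of anchors.
#[derive(Debug, Clone, PartialEq)]
enum AtSketch {
    One(String),
    Many(Vec<String>),
}

impl AtSketch {
    /// View either shape as a slice of sites, so callers loop the same way
    /// whether the hub wrote a scalar or a list.
    fn sites(&self) -> &[String] {
        match self {
            AtSketch::One(site) => std::slice::from_ref(site),
            AtSketch::Many(sites) => sites.as_slice(),
        }
    }
}

/// Hypothetical claim: `hash` stays `None` until the claim has been verified.
#[derive(Debug, Clone, PartialEq)]
struct ClaimSketch {
    claim: String,
    at: AtSketch,
    hash: Option<String>,
}

/// A claim without a hash counts as unverified.
fn is_unverified(claim: &ClaimSketch) -> bool {
    claim.hash.is_none()
}
```

Modelling the optional hash as `Option<String>` keeps "not yet sealed" distinct from "sealed with some value", which is the distinction the gate relies on.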

**The distinction that drives the design:** a human reviews every write, so edits must be
*surgical*. Writes go through the line-level editor (`set_anchor_hash` / `set_anchor_at`) rather
196 changes: 184 additions & 12 deletions surf-cli/src/lint.rs
@@ -105,14 +105,23 @@ fn lint_workspace(ws: &Workspace) -> Result<Vec<Finding>> {
     let mut unhealthy: HashSet<String> = HashSet::new();
     let mut owner: HashMap<String, String> = HashMap::new();

-    for loaded in ws.iter_hubs()? {
-        let rel = loaded.rel;
-        let hub = match loaded.hub {
+    // `refs` validation needs every other hub on hand, so load once and index the well-formed
+    // hubs by rel. A malformed hub is absent from the index (it gets its own block below), so a
+    // ref into it reads as "does not resolve to a hub" — which it effectively doesn't.
+    let loaded = ws.iter_hubs()?;
+    let hub_index: HashMap<&str, &surf_core::Hub> = loaded
+        .iter()
+        .filter_map(|l| l.hub.as_ref().ok().map(|h| (l.rel.as_str(), h)))
+        .collect();
+
+    for loaded_hub in &loaded {
+        let rel = loaded_hub.rel.as_str();
+        let hub = match &loaded_hub.hub {
             Ok(hub) => hub,
             Err(e) => {
                 findings.push(Finding {
                     severity: Severity::Block,
-                    hub: rel,
+                    hub: rel.to_string(),
                     claim: String::new(),
                     at: String::new(),
                     message: format!("invalid hub: {e}"),
@@ -125,7 +134,7 @@ fn lint_workspace(ws: &Workspace) -> Result<Vec<Finding>> {
             for site in claim.at.sites() {
                 let outcome = lint_site(
                     ws,
-                    &rel,
+                    rel,
                     &claim.claim,
                     site,
                     claim.hash.as_deref(),
@@ -138,11 +147,11 @@ fn lint_workspace(ws: &Workspace) -> Result<Vec<Finding>> {
                     owner
                         .entry(info.file.clone())
                         .and_modify(|h| {
-                            if rel < *h {
-                                *h = rel.clone();
+                            if rel < h.as_str() {
+                                *h = rel.to_string();
                             }
                         })
-                        .or_insert_with(|| rel.clone());
+                        .or_insert_with(|| rel.to_string());
                     if info.resolved {
                         covered.entry(info.file).or_default().insert(info.segments);
                     } else {
@@ -152,14 +161,15 @@ fn lint_workspace(ws: &Workspace) -> Result<Vec<Finding>> {
             }
         }

-        lint_covers(&rel, &hub, &mut findings);
-        lint_claim_log(&rel, &hub, &mut findings);
-        lint_thin_prose(&rel, &hub, &mut findings);
+        lint_covers(rel, hub, &mut findings);
+        lint_refs(rel, hub, &hub_index, &mut findings);
+        lint_claim_log(rel, hub, &mut findings);
+        lint_thin_prose(rel, hub, &mut findings);

         if hub.frontmatter.anchors.len() > MAX_ANCHORS_PER_HUB {
             findings.push(Finding {
                 severity: Severity::Warn,
-                hub: rel.clone(),
+                hub: rel.to_string(),
                 claim: String::new(),
                 at: String::new(),
                 message: format!(
@@ -282,6 +292,69 @@ fn lint_covers(rel: &str, hub: &surf_core::Hub, findings: &mut Vec<Finding>) {
     }
 }

+/// Validate a hub's `refs` composition (§9.3, #4). Each entry names another hub by a path
+/// relative to this one, optionally `> segment` to address a claim within it. A ref that doesn't
+/// resolve to a loaded hub, points at its own hub, or names a claim no anchor in the target
+/// matches is a structural error and blocks — the same fail-on-typo reasoning as `covers`. The
+/// verdict does not read `refs` yet (PR2), so lint is the only thing that acts on them.
+fn lint_refs(
+    rel: &str,
+    hub: &surf_core::Hub,
+    hub_index: &HashMap<&str, &surf_core::Hub>,
+    findings: &mut Vec<Finding>,
+) {
+    for raw in &hub.frontmatter.refs {
+        let mut block = |message: String| {
+            findings.push(Finding {
+                severity: Severity::Block,
+                hub: rel.to_string(),
+                claim: String::new(),
+                at: raw.clone(),
+                message,
+            });
+        };
+
+        let parsed = match surf_core::parse_ref(raw) {
+            Ok(r) => r,
+            Err(e) => {
+                block(format!("invalid `refs` entry \"{raw}\": {e}"));
+                continue;
+            }
+        };
+
+        let target_rel = crate::workspace::resolve_ref_path(rel, &parsed.path);
+        if target_rel == rel {
+            block(format!("ref \"{raw}\" points at its own hub"));
+            continue;
+        }
+        let Some(target) = hub_index.get(target_rel.as_str()) else {
+            block(format!(
+                "ref \"{raw}\" does not resolve to a hub (looked for `{target_rel}`) — `refs` compose hubs, not arbitrary files"
+            ));
+            continue;
+        };
+
+        if !parsed.segments.is_empty() {
+            let names: Vec<&str> = parsed.segments.iter().map(|s| s.name.as_str()).collect();
+            let matched = target.frontmatter.anchors.iter().any(|c| {
+                c.at.sites().iter().any(|site| {
+                    parse_anchor(site).is_ok_and(|a| {
+                        let anchor_names: Vec<&str> =
+                            a.segments.iter().map(|s| s.name.as_str()).collect();
+                        anchor_names.ends_with(&names)
+                    })
+                })
+            });
+            if !matched {
+                block(format!(
+                    "ref \"{raw}\" names a claim `{}` that no anchor in `{target_rel}` matches",
+                    names.join(" > ")
+                ));
+            }
+        }
+    }
+}
+
 /// Markdown link targets (`](target)`) in a fragment of text.
 fn link_targets(text: &str) -> impl Iterator<Item = &str> {
     text.split("](")
@@ -904,6 +977,105 @@ mod tests {
         assert_eq!(warn.at, "src/auth.rs");
     }

+    #[test]
+    fn refs_to_existing_hub_is_silent() {
+        let (_t, ws) = ws_with(&[
+            ("src/auth.rs", "pub fn greet() {}\n"),
+            (
+                "hubs/a.md",
+                "---\nsummary: x\nrefs:\n - ./b.md\nanchors:\n - claim: g\n at: src/auth.rs > greet\n---\n",
+            ),
+            ("hubs/b.md", "---\nsummary: y\n---\n# B\n"),
+        ]);
+        assert!(lint_workspace(&ws).unwrap().is_empty());
+    }
+
+    #[test]
+    fn refs_to_missing_hub_blocks() {
+        let (_t, ws) = ws_with(&[("hubs/a.md", "---\nsummary: x\nrefs:\n - ./gone.md\n---\n")]);
+        let f = lint_workspace(&ws).unwrap();
+        let block = f
+            .iter()
+            .find(|x| x.message.contains("does not resolve to a hub"))
+            .expect("expected a dangling-ref error");
+        assert_eq!(block.severity, Severity::Block);
+        assert_eq!(block.at, "./gone.md");
+    }
+
+    #[test]
+    fn refs_to_non_hub_file_blocks() {
+        // A doc path is not a hub — the reclassification trigger for the two ../docs refs (#4).
+        let (_t, ws) = ws_with(&[(
+            "hubs/a.md",
+            "---\nsummary: x\nrefs:\n - ../docs/guide.md\n---\n",
+        )]);
+        let f = lint_workspace(&ws).unwrap();
+        assert!(f
+            .iter()
+            .any(|x| x.severity == Severity::Block && x.message.contains("does not resolve")));
+    }
+
+    #[test]
+    fn self_ref_blocks() {
+        let (_t, ws) = ws_with(&[("hubs/a.md", "---\nsummary: x\nrefs:\n - ./a.md\n---\n")]);
+        let f = lint_workspace(&ws).unwrap();
+        let block = f
+            .iter()
+            .find(|x| x.message.contains("its own hub"))
+            .expect("expected a self-ref error");
+        assert_eq!(block.severity, Severity::Block);
+    }
+
+    #[test]
+    fn malformed_ref_blocks() {
+        let (_t, ws) = ws_with(&[(
+            "hubs/a.md",
+            "---\nsummary: x\nrefs:\n - '> dangling'\n---\n",
+        )]);
+        let f = lint_workspace(&ws).unwrap();
+        assert!(f
+            .iter()
+            .any(|x| x.severity == Severity::Block && x.message.contains("invalid `refs` entry")));
+    }
+
+    #[test]
+    fn claim_ref_matches_anchor_suffix() {
+        // `./b.md > greet` resolves: b.md has a claim anchored at `src/auth.rs > greet`.
+        let (_t, ws) = ws_with(&[
+            ("src/auth.rs", "pub fn greet() {}\n"),
+            (
+                "hubs/a.md",
+                "---\nsummary: x\nrefs:\n - ./b.md > greet\n---\n",
+            ),
+            (
+                "hubs/b.md",
+                "---\nsummary: y\nanchors:\n - claim: g\n at: src/auth.rs > greet\n---\n",
+            ),
+        ]);
+        assert!(lint_workspace(&ws).unwrap().is_empty());
+    }
+
+    #[test]
+    fn claim_ref_with_no_matching_anchor_blocks() {
+        let (_t, ws) = ws_with(&[
+            ("src/auth.rs", "pub fn greet() {}\n"),
+            (
+                "hubs/a.md",
+                "---\nsummary: x\nrefs:\n - ./b.md > nonexistent\n---\n",
+            ),
+            (
+                "hubs/b.md",
+                "---\nsummary: y\nanchors:\n - claim: g\n at: src/auth.rs > greet\n---\n",
+            ),
+        ]);
+        let f = lint_workspace(&ws).unwrap();
+        let block = f
+            .iter()
+            .find(|x| x.message.contains("no anchor in"))
+            .expect("expected a no-matching-claim error");
+        assert_eq!(block.severity, Severity::Block);
+    }
+
     fn agents_findings(ws: &Workspace) -> Vec<Finding> {
         lint_workspace(ws)
             .unwrap()
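
As a reading aid for the `lint_refs` hunk above, the three blocking conditions (self-reference, target not a loaded hub, no anchor matching the named claim) can be reproduced in isolation. This is a minimal sketch under stated assumptions, not the code in this PR: `RefCheck` and `check_ref` are hypothetical names, the index maps each hub path to plain lists of anchor segment names instead of `surf_core::Hub` values, and the file part of an anchor is assumed to be excluded from those segment names.

```rust
use std::collections::HashMap;

/// Outcome of checking one already-resolved ref (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum RefCheck {
    Ok,
    SelfRef,
    NotAHub,
    NoSuchClaim,
}

/// `index` maps each loaded hub's path to the segment names of its anchors.
/// Assumption: an anchor `src/auth.rs > Auth > greet` is stored as `["Auth", "greet"]`.
fn check_ref(
    from_hub: &str,
    target_hub: &str,
    wanted: &[&str],
    index: &HashMap<&str, Vec<Vec<&str>>>,
) -> RefCheck {
    if target_hub == from_hub {
        return RefCheck::SelfRef;
    }
    let Some(anchors) = index.get(target_hub) else {
        return RefCheck::NotAHub;
    };
    // A bare hub ref (no `> symbol`) only needs the target hub to exist.
    if wanted.is_empty() {
        return RefCheck::Ok;
    }
    // Suffix match: `greet` matches any anchor whose segment names end in `greet`.
    if anchors.iter().any(|names| names.ends_with(wanted)) {
        RefCheck::Ok
    } else {
        RefCheck::NoSuchClaim
    }
}
```

The suffix comparison relies on the standard slice method `ends_with`, the same call the hunk uses on `anchor_names`, so a ref can name just the innermost symbol or a longer trailing path of segments.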