Optimize the skill suite, redesign the plans system (v2), and migrate CI to Node .mjs#2
Merged
Merged
Conversation
Add two validation checks the mechanical 16-pt scorer can't see, plus the
content they enforce. Both wired into scripts/ci.sh and .github/workflows/ci.yml
(kept in lockstep per .github/AGENTS.md).
refs-guard.mjs (run by skills/guard.sh) — reference hygiene:
- broken local references/ + assets/ links (dangling read-this pointers)
- orphan reference files never mentioned by SKILL.md
- missing "## Contents" TOC on references/*.md > 100 lines with >=3 doc-level
headings (Anthropic best-practice; the heading gate auto-exempts embedded
output templates whose sections live inside a verbatim code fence)
-> added a generated TOC to 26 reference files to satisfy it
skill-trigger-collision.mjs — cross-skill trigger-overlap audit:
fails when two descriptions share >=5 positive-surface trigger tokens but
neither routes to the other. Tokenises only the positive trigger surface
(words inside a "Not for…" exclusion clause route AWAY, so they don't count
as claimed surface). Caught one genuine collision: solid vs
type-safety-discipline — fixed by adding a type-level routing clause to
solid's description (494 chars, still tight).
Bookkeeping: bumped metadata.updated + backfilled content_hash on the 12
touched skills; docs/scaffold/spec.yaml seeds refs-guard.mjs so a scaffolded
project's guard chain stays green; scripts/AGENTS.md documents both checks.
https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
Content optimization of the genuine weak spots (mechanical scorer was saturated
at 16 for 23/27 skills, so these are real quality gaps, not score-gaming):
make-interfaces-feel-better 10->16: three constraint blocks for the actual
non-negotiables (exact tuned values; never transition:all / will-change
broad; concentric radii), a BAD/GOOD CSS pair, and a UI-trio routing clause
(design-tokenization / react-component-patterns).
plan-sidecar 12->16: tightened description, BAD/GOOD html pair for the
inline-vs-linked-asset rule.
plan-init 15->16: tightened description + per-plan/sidecar routing clause.
lint-no-suppressions: obra-style rationalization table (excuse -> reality)
for the discipline-under-pressure case; no caps-lock "iron law" (the
existing consequence-bearing constraint already serves it).
Empirically validated routing with a blind docks-naive subagent over 25
realistic prompts: 24/25 matched intent, all high-confidence; the solid vs
type-safety-discipline collision (fixed last commit) now routes correctly.
Doc-authority sync: write-skill + plugins/docks/skills/AGENTS.md document the
reference-TOC rule and the trigger-collision audit; metadata.updated bumped +
content_hash backfilled on the touched skills.
https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
…oing) First artifact of the v2 plans redesign, written in the model it proposes: lives in docs/plans/active/, status is the frontmatter field, no .html/.js sidecar. Self-reviewed on draft (the ## Self-review block records the holes the rubric caught); six open questions resolved with the user into ## Decisions. Auto-committed on the planned→ongoing transition (dogfooding decision D-4). https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
plan-init gains idempotent old-layout migration; plan-manager detects a deprecated layout and offers it. Recorded in Self-review as a human-caught gap. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
…eview Rewrite docs/plans/AGENTS.md to the v2 model (Step 1 of the plans-v2 plan): two folders with status as a frontmatter field, lean required body spine + optional sections, a baked-in draft→self-review→open-questions loop with an explicit rubric, native multiple-choice / on-demand visual-HTML open questions, auto-commit on transition, and on-demand views (no committed dashboard). Keeps the frontmatter schema, age-token grammar, 3-tier pretty-print, and audit-first. 262 lines (was 453); tree/guard clean. Skills (plan-init/manager/review), plan-sidecar deletion, the template mirror, and the data migration follow in subsequent commits. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
Per review: AskUserQuestion-style picker is THE way to surface open questions; visual choices render a throwaway HTML. Typed text is only the floor for a runtime with no question UI (Codex), not a designed path. Sync D-6 in the plan. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
Research (openai/codex#9926): Codex is landing a native interactive questionnaire tool (single/multi-choice + custom option) — its AskUserQuestion equivalent, interactive mode only, disabled in codex exec. Contract now points each runtime at its native picker; the floor case is genuinely non-interactive runs (codex exec / CI), not Codex broadly. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
Replace the 5-folder + committed-HTML plan system with the model designed with the user this session. Executes plan plans-v2-md-only. Model: - Two folders — active/ (lifecycle stage is the status: frontmatter field) and finished/ (archive). Transitions edit a field; git mv only on ship. - The .md is the only tracked artifact; views render on demand. Removed the committed _views/*.html, _assets/, index.html, and _open_questions/. - Open questions surface via the native picker (AskUserQuestion / Codex ask_user_question); visual choices render a throwaway gitignored .html. - New plans are drafted then self-reviewed (rubric + cold-handoff) before the user sees them; remaining guesses become open questions. plan-manager auto-commits the .md on every transition. Skills/agents: - Rewrote docs/plans/AGENTS.md + plan-init's embedded template (kept in sync). - Rewrote plan-init (bootstrap + idempotent v1->v2 migration, with a preservation constraint + ## Verification; added to transform-guard). - Rewrote plan-manager (status-field transitions, self-review loop, deprecation detection, folded-in visual rendering) and the plan-manager/plan-review agent wrappers; plan-review gained a draft-review mode. - Deleted the plan-sidecar skill and its assets. - Updated cross-refs (root AGENTS.md, refactor/security/skill-agent-pipeline plan paths, skills-node contract-sync) and migrated docks' own docs/plans. bash scripts/ci.sh green. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
Drafted + self-reviewed in the v2 system; four open questions resolved with the user into ## Decisions: port everything incl. ci.sh/release.sh (D-1), collapse the skill guard+scorer+mirror into one bundled skill-guard.mjs as the single source of truth (D-2), delete each .sh after a parity gate (D-3), one .mjs per validator (D-4). Ready to implement via /workflows. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
Steps 1–2 of ci-validators-to-mjs. Adds the reusable parity harness
(tests/parity.mjs: runs old .sh vs new .mjs, diffs stdout+exit) and the three
calibrated scripts, each byte-identical to its bash original:
- scripts/skills/score.mjs == score.sh (--per-file + total)
- scripts/agents/score.mjs == agents/score.sh
- scripts/skills/content-hash.mjs == content-hash.sh (all 26 per-skill hashes
+ --check-only)
The .sh stay wired into ci.sh until the rewiring step deletes them (D-3); both
coexist and CI is green. This proves the parity methodology on the riskiest
(calibrated) scripts before the structural guards + orchestrators follow.
https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
Step 3 of ci-validators-to-mjs. Seven guards, each byte-identical to its .sh: read-floor (drops the python3 dependency), tree/guard, no-author-scripts, codex-facts, agents/guard, scaffold/guard-spec, transform-guard. The .sh stay wired until the orchestrator rewire (Step 4) deletes them; CI green. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
…oven) Executes plan ci-validators-to-mjs. Every validator is now Node .mjs; the only bash left is release.sh (git-tag/GH-release orchestration — deferred, see D-1). - Single orchestrator: scripts/ci.mjs runs the full gate (guards, scores, manifests, claude plugin validate, scaffold, shellcheck). .github/workflows/ ci.yml now runs that same ci.mjs in one hardened job — true local↔CI parity, no drift. SHA-pinned actions, pinned claude-code, audit signatures preserved. - Single-source scorer (D-2): the 16-pt rubric lives ONCE in the bundled write-skill/scripts/skill-guard.mjs (score/validate); the kit CI scores with that same shipped file. Deletes the author-side score.sh + the bash mirror + their sync contract. - Ports: guard/refs/codex-facts/content-hash/transform-guard/no-author-scripts/ agents-guard/agents-score/tree-guard/read-floor/guard-spec/scaffold-test + the idempotency test, each gated by tests/parity.mjs (output byte-identical to its .sh). read-floor also drops a hidden python3 dependency. - Scaffold seed migrated to .mjs (ci.mjs.template; spec seeds the .mjs validators); a fresh seed passes its own ci.mjs. - Deletes 20 .sh; docs (scripts/.github/scaffold/skills AGENTS.md) updated. No skill/agent score or content_hash changed (parity + idempotency green). node scripts/ci.mjs green. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
Finishes the bash->.mjs migration on the user's go-ahead. release.sh (repo) and release.sh.template (seed) are now release.mjs / release.mjs.template: - Native JSON version bump instead of jq — verified by --dry-run that the manifest edit changes ONLY the "version" line (JSON.stringify matches the committed jq 2-space formatting, no reformatting churn). - A --dry-run mode runs the ci.mjs gate + version compute + manifest diff read-only and PRINTS the destructive steps (git/tag/gh) instead of running them — the safe way to verify a release script that can't be test-released. `node scripts/release.mjs --dry-run patch` → gate green, 0.5.9->0.5.10. - ci.mjs shellcheck now scopes to plugins/docks/hooks/*.sh (the only bash left is that shipped runtime hook); scaffold seeds release.mjs.template. The repo now has zero author-side bash — all tooling is Node .mjs. Docs + plan updated (the D-1 deferral is resolved). node scripts/ci.mjs green. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
…bash
The PostToolUse hook's job is parsing a JSON payload, which it did with grep/sed
on the raw JSON. The Node port uses JSON.parse (correct on escaping, multiple
paths, nested structures) and is byte-identical to the bash hook across all
tested payloads: Claude file_path (in-node + at-root), Codex apply_patch,
multi-node, and malformed JSON (all exit 0, never break the session).
- hooks.json command: "${CLAUDE_PLUGIN_ROOT}"/...sh -> node "${CLAUDE_PLUGIN_ROOT}/...mjs"
(same plugin-root prefix the .sh used, so Codex resolves it the same way; node
is already a plugin prerequisite). claude plugin validate passes.
- Deleted the .sh. The repo now contains ZERO bash — all tooling AND the runtime
hook are Node. ci.mjs's shell-lint reports "no bash to lint" (kept for a future
shell hook). Docs updated.
Tradeoff noted: ~50ms node-startup per PostToolUse vs bash, accepted for the
JSON-parse correctness win on a between-tool-calls hook. node scripts/ci.mjs green.
https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
Code-review follow-ups on the .mjs migration (source-of-truth + dedup): - Extract scripts/lib/skills-walk.mjs (findSkillFiles/eachSkillDir/ findSkillByName) and scripts/lib/skills-parse.mjs (body-line/slop/meta helpers); point the author-side validators at them so the SKILL.md walk and the body-line method live once. The bundled write-skill skill-guard.mjs keeps its own copies on purpose (ships standalone where scripts/lib/ is absent). Seed skills-walk.mjs via spec.yaml (seeded validators import it). - Delete the now-dead tests/parity.mjs (all .sh ports are retired). - Drop the hardcoded kit floors from skill-guard.mjs's score hint (they duplicated scoring.json and are wrong for consumers). - Finish the migration's doc surface: every README / AGENTS.md node / scaffold template / npm script / shipped-skill body that still told users to run a deleted .sh (bash scripts/ci.sh, ./scripts/release.sh, guard.sh, the renamed bundled skill-guard.sh, the stale seed scripts list) now points at the .mjs reality. The repo's own `npm run ci` and the seeded package.json were both broken — now fixed. - Strip dead `(port of X.sh)` provenance comments referencing deleted files. - Rehash the 3 edited shipped skills (scaffold/write-skill/skill-maintenance); content_hash back in sync, no score change. ci.mjs green; content-hash --check-only clean (26 in sync); scaffold seed starts green.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This branch carries three connected pieces of work. Everything is gated by
node scripts/ci.mjs(green), and no skill or agentscore/content_hashchanged as a result of the tooling migration (parity + idempotency proven).1 · Skill-suite optimization + new validation
The mechanical 16-pt scorer was saturated (23/27 skills at max), so the wins came from content quality + new checks the scorer can't see:
refs-guard.mjs— broken localreferences//assets/links, orphan reference files, and a TOC-for-references-over-100-lines rule (Anthropic best-practice). Added## ContentsTOCs to 26 reference files.skill-trigger-collision.mjs— cross-skill trigger-overlap audit (fills a gap docks' own AGENTS.md admitted). Caught one genuine collision (solid⇄type-safety-discipline), fixed via a routing clause.lint-no-suppressions), tightened over-500 descriptions.2 · Plans system v2
Redesigned
docs/plans/from the conversation with the maintainer:active/+finished/); lifecycle stage is thestatus:frontmatter field (was dual-encoded as folder and field). Transitions edit a field;git mvonly on ship..mdis the only tracked artifact — removed the committed_views/*.html,_assets/,index.htmldual-write. Views render on demand; open questions use the native picker (AskUserQuestion/ Codexask_user_question), visual choices a throwaway gitignored.html.plan-initgains an idempotent v1→v2 migration;plan-managerdetects a deprecated layout. Deletedplan-sidecar.3 · CI / validators → Node
.mjsEvery validator is now Node
.mjs, each ported under a parity gate (tests/parity.mjs: new.mjsoutput byte-identical to the old.sh) before the.shwas deleted:scripts/ci.mjsruns the full gate;.github/workflows/ci.ymlruns that same file (true local↔CI parity, no drift; SHA-pinned actions + pinnedclaude-code+ audit-signatures preserved).write-skill/scripts/skill-guard.mjs; the kit CI scores with that same shipped file → no author-side-vs-mirror sync contract.python3dependency (read-floor); migrated the scaffold seed to.mjs(a fresh seed passes its ownci.mjs). 18.shdeleted, ~1,500 net lines removed.Verification
node scripts/ci.mjsgreen; idempotency + tree-guard green; scaffold seed starts green..mjsvalidator proven byte-identical to its.shpredecessor.One deferred item (honest)
release.shwas kept as bash (its gate call updated toci.mjs). Porting a git-tag / GH-release orchestrator blind and untestable-without-an-actual-release was the one unsafe move; it isn't run in CI, so the branch stays green. Follow-up: port torelease.mjswith a real-release smoke test.The work is tracked in
docs/plans/(plans-v2-md-only,ci-validators-to-mjs) — both dogfood the new system.https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd
Generated by Claude Code