Optimize the skill suite, redesign the plans system (v2), and migrate CI to Node .mjs by DocksDocks · Pull Request #2 · DocksDocks/docks

DocksDocks · 2026-06-14T06:42:54Z

This branch carries three connected pieces of work. Everything is gated by node scripts/ci.mjs (green), and no skill or agent score/content_hash changed as a result of the tooling migration (parity + idempotency proven).

1 · Skill-suite optimization + new validation

The mechanical 16-pt scorer was saturated (23/27 skills at max), so the wins came from content quality + new checks the scorer can't see:

refs-guard.mjs — broken local references//assets/ links, orphan reference files, and a TOC-for-references-over-100-lines rule (Anthropic best-practice). Added ## Contents TOCs to 26 reference files.
skill-trigger-collision.mjs — cross-skill trigger-overlap audit (fills a gap docks' own AGENTS.md admitted). Caught one genuine collision (solid ⇄ type-safety-discipline), fixed via a routing clause.
Lifted the four genuine low-scorers to 16 with real content (constraints, BAD/GOOD, a rationalization table on lint-no-suppressions), tightened over-500 descriptions.
Empirically validated routing: a blind, docks-naive subagent routed 25 realistic prompts — 24/25 matched intent.

2 · Plans system v2

Redesigned docs/plans/ from the conversation with the maintainer:

Two folders (active/ + finished/); lifecycle stage is the status: frontmatter field (was dual-encoded as folder and field). Transitions edit a field; git mv only on ship.
.md is the only tracked artifact — removed the committed _views/*.html, _assets/, index.html dual-write. Views render on demand; open questions use the native picker (AskUserQuestion / Codex ask_user_question), visual choices a throwaway gitignored .html.
Self-review baked in — a plan is drafted then red-teamed against a rubric before the user sees it. plan-init gains an idempotent v1→v2 migration; plan-manager detects a deprecated layout. Deleted plan-sidecar.

3 · CI / validators → Node `.mjs`

Every validator is now Node .mjs, each ported under a parity gate (tests/parity.mjs: new .mjs output byte-identical to the old .sh) before the .sh was deleted:

One orchestrator — scripts/ci.mjs runs the full gate; .github/workflows/ci.yml runs that same file (true local↔CI parity, no drift; SHA-pinned actions + pinned claude-code + audit-signatures preserved).
Single-source scorer — the 16-pt rubric lives once in the bundled write-skill/scripts/skill-guard.mjs; the kit CI scores with that same shipped file → no author-side-vs-mirror sync contract.
Dropped a hidden python3 dependency (read-floor); migrated the scaffold seed to .mjs (a fresh seed passes its own ci.mjs). 18 .sh deleted, ~1,500 net lines removed.

Verification

node scripts/ci.mjs green; idempotency + tree-guard green; scaffold seed starts green.
Every .mjs validator proven byte-identical to its .sh predecessor.

One deferred item (honest)

release.sh was kept as bash (its gate call updated to ci.mjs). Porting a git-tag / GH-release orchestrator blind and untestable-without-an-actual-release was the one unsafe move; it isn't run in CI, so the branch stays green. Follow-up: port to release.mjs with a real-release smoke test.

The work is tracked in docs/plans/ (plans-v2-md-only, ci-validators-to-mjs) — both dogfood the new system.

https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

Generated by Claude Code

Add two validation checks the mechanical 16-pt scorer can't see, plus the content they enforce. Both wired into scripts/ci.sh and .github/workflows/ci.yml (kept in lockstep per .github/AGENTS.md). refs-guard.mjs (run by skills/guard.sh) — reference hygiene: - broken local references/ + assets/ links (dangling read-this pointers) - orphan reference files never mentioned by SKILL.md - missing "## Contents" TOC on references/*.md > 100 lines with >=3 doc-level headings (Anthropic best-practice; the heading gate auto-exempts embedded output templates whose sections live inside a verbatim code fence) -> added a generated TOC to 26 reference files to satisfy it skill-trigger-collision.mjs — cross-skill trigger-overlap audit: fails when two descriptions share >=5 positive-surface trigger tokens but neither routes to the other. Tokenises only the positive trigger surface (words inside a "Not for…" exclusion clause route AWAY, so they don't count as claimed surface). Caught one genuine collision: solid vs type-safety-discipline — fixed by adding a type-level routing clause to solid's description (494 chars, still tight). Bookkeeping: bumped metadata.updated + backfilled content_hash on the 12 touched skills; docs/scaffold/spec.yaml seeds refs-guard.mjs so a scaffolded project's guard chain stays green; scripts/AGENTS.md documents both checks. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

Content optimization of the genuine weak spots (mechanical scorer was saturated at 16 for 23/27 skills, so these are real quality gaps, not score-gaming): make-interfaces-feel-better 10->16: three constraint blocks for the actual non-negotiables (exact tuned values; never transition:all / will-change broad; concentric radii), a BAD/GOOD CSS pair, and a UI-trio routing clause (design-tokenization / react-component-patterns). plan-sidecar 12->16: tightened description, BAD/GOOD html pair for the inline-vs-linked-asset rule. plan-init 15->16: tightened description + per-plan/sidecar routing clause. lint-no-suppressions: obra-style rationalization table (excuse -> reality) for the discipline-under-pressure case; no caps-lock "iron law" (the existing consequence-bearing constraint already serves it). Empirically validated routing with a blind docks-naive subagent over 25 realistic prompts: 24/25 matched intent, all high-confidence; the solid vs type-safety-discipline collision (fixed last commit) now routes correctly. Doc-authority sync: write-skill + plugins/docks/skills/AGENTS.md document the reference-TOC rule and the trigger-collision audit; metadata.updated bumped + content_hash backfilled on the touched skills. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

…oing) First artifact of the v2 plans redesign, written in the model it proposes: lives in docs/plans/active/, status is the frontmatter field, no .html/.js sidecar. Self-reviewed on draft (the ## Self-review block records the holes the rubric caught); six open questions resolved with the user into ## Decisions. Auto-committed on the planned→ongoing transition (dogfooding decision D-4). https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

plan-init gains idempotent old-layout migration; plan-manager detects a deprecated layout and offers it. Recorded in Self-review as a human-caught gap. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

…eview Rewrite docs/plans/AGENTS.md to the v2 model (Step 1 of the plans-v2 plan): two folders with status as a frontmatter field, lean required body spine + optional sections, a baked-in draft→self-review→open-questions loop with an explicit rubric, native multiple-choice / on-demand visual-HTML open questions, auto-commit on transition, and on-demand views (no committed dashboard). Keeps the frontmatter schema, age-token grammar, 3-tier pretty-print, and audit-first. 262 lines (was 453); tree/guard clean. Skills (plan-init/manager/review), plan-sidecar deletion, the template mirror, and the data migration follow in subsequent commits. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

Per review: AskUserQuestion-style picker is THE way to surface open questions; visual choices render a throwaway HTML. Typed text is only the floor for a runtime with no question UI (Codex), not a designed path. Sync D-6 in the plan. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

Research (openai/codex#9926): Codex is landing a native interactive questionnaire tool (single/multi-choice + custom option) — its AskUserQuestion equivalent, interactive mode only, disabled in codex exec. Contract now points each runtime at its native picker; the floor case is genuinely non-interactive runs (codex exec / CI), not Codex broadly. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

Replace the 5-folder + committed-HTML plan system with the model designed with the user this session. Executes plan plans-v2-md-only. Model: - Two folders — active/ (lifecycle stage is the status: frontmatter field) and finished/ (archive). Transitions edit a field; git mv only on ship. - The .md is the only tracked artifact; views render on demand. Removed the committed _views/*.html, _assets/, index.html, and _open_questions/. - Open questions surface via the native picker (AskUserQuestion / Codex ask_user_question); visual choices render a throwaway gitignored .html. - New plans are drafted then self-reviewed (rubric + cold-handoff) before the user sees them; remaining guesses become open questions. plan-manager auto-commits the .md on every transition. Skills/agents: - Rewrote docs/plans/AGENTS.md + plan-init's embedded template (kept in sync). - Rewrote plan-init (bootstrap + idempotent v1->v2 migration, with a preservation constraint + ## Verification; added to transform-guard). - Rewrote plan-manager (status-field transitions, self-review loop, deprecation detection, folded-in visual rendering) and the plan-manager/plan-review agent wrappers; plan-review gained a draft-review mode. - Deleted the plan-sidecar skill and its assets. - Updated cross-refs (root AGENTS.md, refactor/security/skill-agent-pipeline plan paths, skills-node contract-sync) and migrated docks' own docs/plans. bash scripts/ci.sh green. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

Drafted + self-reviewed in the v2 system; four open questions resolved with the user into ## Decisions: port everything incl. ci.sh/release.sh (D-1), collapse the skill guard+scorer+mirror into one bundled skill-guard.mjs as the single source of truth (D-2), delete each .sh after a parity gate (D-3), one .mjs per validator (D-4). Ready to implement via /workflows. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

Steps 1–2 of ci-validators-to-mjs. Adds the reusable parity harness (tests/parity.mjs: runs old .sh vs new .mjs, diffs stdout+exit) and the three calibrated scripts, each byte-identical to its bash original: - scripts/skills/score.mjs == score.sh (--per-file + total) - scripts/agents/score.mjs == agents/score.sh - scripts/skills/content-hash.mjs == content-hash.sh (all 26 per-skill hashes + --check-only) The .sh stay wired into ci.sh until the rewiring step deletes them (D-3); both coexist and CI is green. This proves the parity methodology on the riskiest (calibrated) scripts before the structural guards + orchestrators follow. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

Step 3 of ci-validators-to-mjs. Seven guards, each byte-identical to its .sh: read-floor (drops the python3 dependency), tree/guard, no-author-scripts, codex-facts, agents/guard, scaffold/guard-spec, transform-guard. The .sh stay wired until the orchestrator rewire (Step 4) deletes them; CI green. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

…oven) Executes plan ci-validators-to-mjs. Every validator is now Node .mjs; the only bash left is release.sh (git-tag/GH-release orchestration — deferred, see D-1). - Single orchestrator: scripts/ci.mjs runs the full gate (guards, scores, manifests, claude plugin validate, scaffold, shellcheck). .github/workflows/ ci.yml now runs that same ci.mjs in one hardened job — true local↔CI parity, no drift. SHA-pinned actions, pinned claude-code, audit signatures preserved. - Single-source scorer (D-2): the 16-pt rubric lives ONCE in the bundled write-skill/scripts/skill-guard.mjs (score/validate); the kit CI scores with that same shipped file. Deletes the author-side score.sh + the bash mirror + their sync contract. - Ports: guard/refs/codex-facts/content-hash/transform-guard/no-author-scripts/ agents-guard/agents-score/tree-guard/read-floor/guard-spec/scaffold-test + the idempotency test, each gated by tests/parity.mjs (output byte-identical to its .sh). read-floor also drops a hidden python3 dependency. - Scaffold seed migrated to .mjs (ci.mjs.template; spec seeds the .mjs validators); a fresh seed passes its own ci.mjs. - Deletes 20 .sh; docs (scripts/.github/scaffold/skills AGENTS.md) updated. No skill/agent score or content_hash changed (parity + idempotency green). node scripts/ci.mjs green. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

Finishes the bash->.mjs migration on the user's go-ahead. release.sh (repo) and release.sh.template (seed) are now release.mjs / release.mjs.template: - Native JSON version bump instead of jq — verified by --dry-run that the manifest edit changes ONLY the "version" line (JSON.stringify matches the committed jq 2-space formatting, no reformatting churn). - A --dry-run mode runs the ci.mjs gate + version compute + manifest diff read-only and PRINTS the destructive steps (git/tag/gh) instead of running them — the safe way to verify a release script that can't be test-released. `node scripts/release.mjs --dry-run patch` → gate green, 0.5.9->0.5.10. - ci.mjs shellcheck now scopes to plugins/docks/hooks/*.sh (the only bash left is that shipped runtime hook); scaffold seeds release.mjs.template. The repo now has zero author-side bash — all tooling is Node .mjs. Docs + plan updated (the D-1 deferral is resolved). node scripts/ci.mjs green. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

…bash The PostToolUse hook's job is parsing a JSON payload, which it did with grep/sed on the raw JSON. The Node port uses JSON.parse (correct on escaping, multiple paths, nested structures) and is byte-identical to the bash hook across all tested payloads: Claude file_path (in-node + at-root), Codex apply_patch, multi-node, and malformed JSON (all exit 0, never break the session). - hooks.json command: "${CLAUDE_PLUGIN_ROOT}"/...sh -> node "${CLAUDE_PLUGIN_ROOT}/...mjs" (same plugin-root prefix the .sh used, so Codex resolves it the same way; node is already a plugin prerequisite). claude plugin validate passes. - Deleted the .sh. The repo now contains ZERO bash — all tooling AND the runtime hook are Node. ci.mjs's shell-lint reports "no bash to lint" (kept for a future shell hook). Docs updated. Tradeoff noted: ~50ms node-startup per PostToolUse vs bash, accepted for the JSON-parse correctness win on a between-tool-calls hook. node scripts/ci.mjs green. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

Code-review follow-ups on the .mjs migration (source-of-truth + dedup): - Extract scripts/lib/skills-walk.mjs (findSkillFiles/eachSkillDir/ findSkillByName) and scripts/lib/skills-parse.mjs (body-line/slop/meta helpers); point the author-side validators at them so the SKILL.md walk and the body-line method live once. The bundled write-skill skill-guard.mjs keeps its own copies on purpose (ships standalone where scripts/lib/ is absent). Seed skills-walk.mjs via spec.yaml (seeded validators import it). - Delete the now-dead tests/parity.mjs (all .sh ports are retired). - Drop the hardcoded kit floors from skill-guard.mjs's score hint (they duplicated scoring.json and are wrong for consumers). - Finish the migration's doc surface: every README / AGENTS.md node / scaffold template / npm script / shipped-skill body that still told users to run a deleted .sh (bash scripts/ci.sh, ./scripts/release.sh, guard.sh, the renamed bundled skill-guard.sh, the stale seed scripts list) now points at the .mjs reality. The repo's own `npm run ci` and the seeded package.json were both broken — now fixed. - Strip dead `(port of X.sh)` provenance comments referencing deleted files. - Rehash the 3 edited shipped skills (scaffold/write-skill/skill-maintenance); content_hash back in sync, no score change. ci.mjs green; content-hash --check-only clean (26 in sync); scaffold seed starts green.

claude added 15 commits June 14, 2026 03:43

plan(plans-v2): add v1->v2 migration capability (user-caught hole)

7deadc9

plan-init gains idempotent old-layout migration; plan-manager detects a deprecated layout and offers it. Recorded in Self-review as a human-caught gap. https://claude.ai/code/session_01XeJLu4FbyuPk5mfmhf9sAd

DocksDocks merged commit e1fceab into main Jun 14, 2026
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize the skill suite, redesign the plans system (v2), and migrate CI to Node .mjs#2

Optimize the skill suite, redesign the plans system (v2), and migrate CI to Node .mjs#2
DocksDocks merged 15 commits into
mainfrom
claude/pensive-hopper-uhynww

DocksDocks commented Jun 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

DocksDocks commented Jun 14, 2026

1 · Skill-suite optimization + new validation

2 · Plans system v2

3 · CI / validators → Node .mjs

Verification

One deferred item (honest)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

3 · CI / validators → Node `.mjs`