Releases: FailproofAI/failproofai
v0.0.11-beta.8 — audit first-run fix: fire-and-forget runs, scan all history
0.0.11-beta.8
Fixes
- The `/audit` first run no longer fails on the first click, and an audit is no longer time-capped (#434). The first, cold scan used to abort after ~15s and bounce back to the empty state (a retry only worked because the first attempt had warmed the caches server-side). The run is now fire-and-forget with uncapped status polling, so an audit runs to completion however long it takes, and the default scan now covers your entire session history instead of just the last 30 days. The run-lock's 5-minute auto-expiry is removed so a long-but-healthy run is never cut short, and a run that can't persist its result now surfaces an error instead of silently reporting success.
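The fire-and-forget pattern with uncapped polling can be sketched roughly as below. This is a minimal illustration, not the project's code: `RunStatus`, `runAudit`, and the injected `start` / `getStatus` callbacks are hypothetical names, and the real status payload is not shown in these notes.

```typescript
// Hypothetical status shape; the real payload is not documented in the release notes.
type RunStatus = { state: "running" | "done" | "error"; message?: string };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Kick off the run, then poll with no overall deadline until it finishes or fails.
async function runAudit(
  start: () => Promise<void>,
  getStatus: () => Promise<RunStatus>,
  intervalMs = 1000,
): Promise<void> {
  await start(); // resolves once the run is accepted, not when it completes
  for (;;) {
    const status = await getStatus();
    if (status.state === "done") return;
    // A run that could not persist its result is reported as an error, not as success.
    if (status.state === "error") throw new Error(status.message ?? "audit failed");
    await sleep(intervalMs);
  }
}
```

The loop has no time cap of its own; only a terminal `done` or `error` status ends it.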
Docs
- Update translated docs for changed English sources (#433).
Published to npm under the `beta` dist-tag (`npm i failproofai@beta`).
v0.0.11-beta.7 — audit re-audit bar removed; re-audit forces a fresh scan
0.0.11-beta.7
Fixes
- Remove the top-of-page `[ re-audit ]` bar from the audit page (#431): on the empty/expired path it stacked a second "run an audit" CTA that read as broken, and on loaded reports the freshness strip earned little. Re-auditing still works from `[ run audit ]` on the empty state and `[ re-audit now ]` at the bottom of a report; the sticky progress strip, soft-refresh-on-success, and 7-day cache TTL are untouched.
- Re-audit now forces a genuinely fresh scan (#432): `[ re-audit now ]` sends `noCache: true`, so it bypasses the per-transcript cache and re-scans every transcript from scratch instead of silently returning the identical cached result. The empty-state first run stays on the fast cached path; a failed re-audit leaves the prior report intact.
Published to npm under the `beta` dist-tag (`npm i failproofai@beta`).
v0.0.11-beta.6 — audit 7-day cache TTL, top-of-page re-audit bar, classifier hardening
Features
- 7-day cache TTL on both audit caches. Per-transcript cache (`src/audit/cache.ts`) gains a `cachedAt` field and a `CACHE_TTL_MS = 7d` reject-on-read check (schema bump 2 → 3 forces a clean re-scan of pre-existing entries). Dashboard cache (`src/audit/dashboard-cache.ts`) reuses the existing `isCacheStale(cachedAt, 7d)` helper on the read path so a week-old result is never silently served. (#428)
- Top-of-page re-audit bar. New `TopAuditBar` renders above the IdentitySection with the last-audit timestamp (`audited 3d ago`), an amber `expires in 14h — re-audit to refresh` chip inside the final 24h of the TTL, and a `[ re-audit ]` button. Modes: `cached`, `expired`, `empty`. (#428)
- Sticky progress strip + soft refresh during re-audit. Pink hard-offset banner pinned to the top of the viewport during a run, mm:ss elapsed timer, CSS-only edge pulse. On `RerunError` it swaps to a red strip with copy keyed off `RerunError.kind`. Success path soft-refreshes the dashboard cache via `getAuditResultAction()`, with no more `window.location.reload()`. (#428)
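A reject-on-read TTL check of the kind described for the per-transcript cache might look like the sketch below. Only `cachedAt`, `CACHE_TTL_MS`, and the schema bump to 3 come from the notes; `CacheEntry`, `SCHEMA_VERSION`, and `readFresh` are hypothetical names used for illustration.

```typescript
const CACHE_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days
const SCHEMA_VERSION = 3; // assumed constant name; the notes only mention the 2 -> 3 bump

// Hypothetical entry shape for illustration.
interface CacheEntry<T> {
  schemaVersion: number;
  cachedAt: number; // epoch milliseconds; TTL metadata, not part of the cache key
  value: T;
}

// Returns the cached value, or undefined when the entry must be re-scanned.
function readFresh<T>(entry: CacheEntry<T> | undefined, now: number = Date.now()): T | undefined {
  if (!entry) return undefined;
  if (entry.schemaVersion !== SCHEMA_VERSION) return undefined; // pre-bump entries are rejected
  if (now - entry.cachedAt >= CACHE_TTL_MS) return undefined; // a week old or older: reject on read
  return entry.value;
}
```

Rejecting on read rather than sweeping on a timer means a stale entry costs nothing until something asks for it.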
Fixes
- Goldfish classifier hardening. PR #426's `GOLDFISH_ENTROPY` retune exposed that normalised lift entropy can't tell "every cluster at typical baseline" apart from "real scatter". Adds `GOLDFISH_MIN_SECOND_LIFT = 1.3` so goldfish only fires when ≥2 clusters genuinely over-index; uniform-at-baseline profiles fall through to the existing argmax. (#429)
- Stop the Next.js 16 dev-overlay "signal is aborted without reason" warning. `lib/fetch-with-timeout.ts` swaps the manual `AbortController + setTimeout` (which called `controller.abort()` with no reason and silently dropped any caller-supplied `init.signal`) for platform `AbortSignal.timeout()` composed with `AbortSignal.any()`. (#428)
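The timeout fix above combines two platform APIs. A minimal sketch of that composition, assuming a runtime that provides both `AbortSignal.timeout()` and `AbortSignal.any()`; `composeSignal` is a hypothetical helper and the body is not the project's actual `lib/fetch-with-timeout.ts`:

```typescript
// Combine a timeout signal with an optional caller-supplied signal so neither is dropped.
function composeSignal(timeoutMs: number, callerSignal?: AbortSignal | null): AbortSignal {
  const timeout = AbortSignal.timeout(timeoutMs); // aborts with a TimeoutError reason
  return callerSignal ? AbortSignal.any([timeout, callerSignal]) : timeout;
}

// Illustrative wrapper; the 15s default mirrors the value quoted elsewhere in these notes.
async function fetchWithTimeout(
  input: string | URL,
  init: RequestInit = {},
  timeoutMs = 15_000,
): Promise<Response> {
  return fetch(input, { ...init, signal: composeSignal(timeoutMs, init.signal) });
}
```

Because `AbortSignal.timeout()` always aborts with a reason, the "aborted without reason" case cannot arise from the timeout path.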
Docs
- Update the dashboard + audit-CLI docs for the new TTL behaviour and the top-of-page re-audit bar. Reword `cachedAt` as TTL metadata (not part of the cache key). Fix a stale `failproof policy add` typo to `failproofai policy add`. (#428)
- Translation refresh for changed English sources. (#427)
Full details in CHANGELOG.md under 0.0.11-beta.6.
v0.0.11-beta.5 — /audit persona fix: behavior-calibrated archetypes
Fixes
- Behavior-calibrated `/audit` archetypes: the persona classifier no longer collapses nearly every agent onto "the explorer". The lift denominator now uses empirical firing shares instead of catalog weights, so a persona wins only when it fires more often than it does for a typical agent; `block-read-outside-cwd` is dropped from the signal map (off by default + ubiquitous ambient reads), and the goldfish entropy threshold is retuned. Real-world distribution now spreads across all 8 personas instead of ~100% explorer. (#426)
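The idea of dividing by empirical firing shares can be illustrated with a small sketch. The function names and all numbers below are hypothetical and chosen only to show why a persona that fires often, but less often than is typical, should not win:

```typescript
type Shares = Record<string, number>;

// lift = this agent's firing share / the share a typical agent shows for the same persona.
function computeLift(agent: Shares, empiricalBaseline: Shares): Shares {
  const lift: Shares = {};
  for (const [persona, share] of Object.entries(agent)) {
    const baseline = empiricalBaseline[persona] ?? 0;
    lift[persona] = baseline > 0 ? share / baseline : 0; // no baseline: no evidence of over-indexing
  }
  return lift;
}

// Plain argmax over the lift values.
function winner(lift: Shares): string | undefined {
  let best: string | undefined;
  for (const [persona, value] of Object.entries(lift)) {
    if (best === undefined || value > lift[best]) best = persona;
  }
  return best;
}

// Illustrative numbers only, not taken from the project.
const agentShares: Shares = { explorer: 0.5, cowboy: 0.3, ghost: 0.2 };
const typicalShares: Shares = { explorer: 0.6, cowboy: 0.1, ghost: 0.3 };
const topPersona = winner(computeLift(agentShares, typicalShares));
```

Here `explorer` has the largest raw share but sits below its baseline, so `cowboy` (about three times its baseline) wins instead.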
Docs
- Document that contributors must build the project before the in-repo dev hooks resolve the `failproofai` import against `dist/index.js`. (#426)
Full details in CHANGELOG.md under 0.0.11-beta.5.
v0.0.11-beta.4 — /audit share-card hotfix (desktop intent + correct domain)
/audit share-card hotfix
Two fast follow-ups on the /audit share flow introduced in 0.0.11-beta.3.
Fixes
- Desktop "share on X" / "share on LinkedIn" no longer open the Windows share dialog. `lib/share-card.ts` `shareCardNative()` early-returns `false` on non-mobile devices (detected via `navigator.userAgentData.mobile` with a UA-string fallback for Safari / Firefox + a `maxTouchPoints` check for iPadOS 13+), so the ShareDock falls through to its existing clipboard + `x.com/intent/tweet` / `linkedin.com/sharing/share-offsite` path. Mobile keeps the one-tap system share sheet because there the OS sheet actually surfaces the X / LinkedIn apps as targets (#425).
- Share templates linked to the wrong domain. Every X / LinkedIn template embedded `https://failproof.ai`, but the actual marketing site is `befailproof.ai`, so every shared post linked to a dead URL. Updated `SITE_URL` in both `app/audit/_components/share-templates.ts` and `app/audit/_components/share-dock.tsx`, plus the bare `failproof.ai` mention in the 4th X template; tightened the template test to assert the new domain so a regression fails fast (#425).
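The three-step mobile detection described in the first fix could be sketched as follows. The `NavigatorLike` shape, the function name, and the specific regular expressions are assumptions for illustration; the project's actual checks may differ.

```typescript
// Minimal navigator-like shape so the check can be exercised outside a browser.
interface NavigatorLike {
  userAgent: string;
  maxTouchPoints?: number;
  userAgentData?: { mobile?: boolean };
}

function isMobileDevice(nav: NavigatorLike): boolean {
  // 1. Chromium-based browsers expose a direct hint via User-Agent Client Hints.
  if (typeof nav.userAgentData?.mobile === "boolean") return nav.userAgentData.mobile;
  // 2. Safari and Firefox lack userAgentData, so fall back to the UA string.
  if (/Android|iPhone|iPad|iPod|Mobile/i.test(nav.userAgent)) return true;
  // 3. iPadOS 13+ reports a desktop "Macintosh" UA but still has a multi-touch screen.
  return /Macintosh/.test(nav.userAgent) && (nav.maxTouchPoints ?? 0) > 1;
}
```

In a browser this would be called as `isMobileDevice(navigator)`; a `false` result is what lets the desktop path skip the system share sheet.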
Full diff: v0.0.11-beta.3...v0.0.11-beta.4
Full changelog: https://github.com/FailproofAI/failproofai/blob/main/CHANGELOG.md#00114-beta4--2026-06-10
v0.0.11-beta.3 — /audit dashboard, email-OTP auth, pixel-craft design system
/audit dashboard, email-OTP auth, unified pixel-craft design system
This release ships the in-app /audit dashboard, email-OTP auth across CLI + dashboard, persistent re-audit reminders delivered via SES, and a brutalist pixel-craft design system unified across every dashboard page. Plus a deep correctness/efficiency hardening pass, a supply-chain security CI gate, and the usual telemetry coverage expansion.
Highlights
/audit dashboard
- New in-app report at `/audit` that turns the existing `failproofai audit` data into a personality-driven diagnostic. Every audited agent is classified into one of 8 archetypes (optimist, cowboy, explorer, goldfish, paranoid architect, precision builder, hammer, ghost) via a weighted classifier with full 47/47 signal coverage (every builtin policy + every audit-only detector).
- Rewritten score + classifier engine. Personas are evenly reachable (Monte-Carlo over 50k simulated users confirms every persona lands at 10–18% share). Scores are rate-normalised against a reference volume and use a saturating exponential curve (`cap·(1−e^(−p/k))`) so no two hit-counts collide on a fixed value. S/A/B/C/D/F grade bands. `projectedScore` previews the post-enable uplift.
- Six sections: Identity (archetype hero with 8×8 pixel sigil + meta grid), Show-off CTA, Strengths (real numbers from the audit), Score + cohort leaderboard with distribution histogram, Findings (per-policy cards: what happened / cost / evidence / fix), Prescribed Policies (with projected-score uplift callout), and a "re-audit in 7 days" return loop.
- Persona variant catalog. Every archetype has 4–6 deterministically-seeded copy variants (taglines, descriptions, signature blocks, "common in" / "primary risk" / closing lines) keyed by a behaviour fingerprint, so two agents that land on the same archetype see different language but the same render is byte-identical across reloads.
- Shareable PNG poster. "Make poster" captures the identity frame via html2canvas at scale 2 (`failproofai-<archetype>-<YYYY-MM-DD>.png`). Floating share-dock renders X / LinkedIn / save buttons stacked vertically with personalised templates (5 quirky for X, 5 measured for LinkedIn). Image attachment routes through `navigator.share({ files })` → clipboard → download, picking the best route the browser allows.
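The saturating curve quoted in the score bullet, `cap·(1−e^(−p/k))`, is easy to state in code. The sketch below uses a hypothetical function name and illustrative `cap` and `k` values, since the notes do not give the real constants:

```typescript
// cap * (1 - e^(-p / k)): rises quickly for small p, then flattens as it approaches `cap`.
function saturating(p: number, cap: number, k: number): number {
  return cap * (1 - Math.exp(-p / k));
}

// Illustrative constants only; the project's actual cap and k are not stated.
const CAP = 100;
const K = 10;
const curve = [0, 1, 2, 5, 10, 20].map((p) => saturating(p, CAP, K));
```

Unlike a hard clamp such as `min(p, cap)`, the exponential form is strictly increasing, so distinct hit-counts map to distinct values rather than piling up at the ceiling.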
Email-OTP auth (CLI + dashboard)
- New `failproofai auth login | logout | whoami` CLI subcommand wired to the Rust `failproof-api-server` (`/v0/auth/login/request`, `/login/verify`, `/token/refresh`, `/logout`, `/me`). Tokens persist to `~/.failproofai/auth.json` at mode `0600` with auto-refresh within a 60s leeway window.
- Dashboard `AuthDialog` proxies the same flow through four new Next routes (`/api/auth/{status,login-request,login-verify,logout}`) so the refresh token never reaches the browser; only `{authenticated, user}` does.
- `FAILPROOF_API_URL` (default `https://api.befailproof.ai`) and `FAILPROOFAI_AUTH_DIR` (default `~/.failproofai`) for overrides.
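One reading of "auto-refresh within a 60s leeway window" is that a token is refreshed once it is within 60 seconds of expiry. A minimal sketch under that assumption, with hypothetical names:

```typescript
const REFRESH_LEEWAY_MS = 60_000; // 60s, as quoted above

// True when the access token has expired or will expire within the leeway window.
function needsRefresh(expiresAtMs: number, nowMs: number = Date.now()): boolean {
  return expiresAtMs - nowMs <= REFRESH_LEEWAY_MS;
}
```

Refreshing slightly early avoids sending a request with a token that expires while the request is in flight.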
Persistent re-audit reminders
- New `~/.failproofai/next-audit.json` (mode `0600`, separate from `auth.json` so the reminder is independent of token refresh) + dashboard `/api/auth/reminder` GET/POST/DELETE.
- Reminders forward to the api-server's SES-backed scheduler (`POST/DELETE /v0/reminders`) so the audit nudge is actually delivered as email. The local file remains the dashboard/CLI source-of-truth.
Unified pixel-craft design system
- The audit page's brutalist pixel-craft tokens (`--bg`, `--ink`, `--accent-pink`, `--accent-green`, `--font-mono` → JetBrains Mono, `--font-display` → Bitcount Prop Single) are now declared once in `app/globals.css` and repoint every Tailwind alias (`bg-card`, `text-foreground`, `border-border`, `--radius: 0`, …) at the audit palette. `/policies`, `/projects`, and `/audit` now share the same chrome (pink corner brackets, dashed frames, green eyebrow captions) without rewriting any component markup.
- Dashboard chrome scales to fill ultrawide monitors via `clamp(720px, 96vw, 1840px)`. Base font bumped 13px → 16px. Opt-in `:focus-visible` ring system. Navbar redesigned around `.app-header` with version chip + current-section eyebrow.
Reliability + efficiency
- Tier-A correctness pass. Concurrent refresh-token-exchange dedup (silent-logout bug fix), audit run-lock auto-expiry (5 min), JWT strict-base64url validation, `AbortSignal.any` fallback for Node < 20.3 / older Bun, dashboard cache schema-version rejection.
- Tier-B refactor pass. New shared `lib/fetch-with-timeout.ts` + `lib/atomic-write.ts`; ~30 LOC of copy-paste deleted across `auth-dialog.tsx`, `rerun-button.tsx`, `api-server-client.ts`, `auth-store.ts`, `dashboard-cache.ts`.
- Tier-C polish. Memoised `detectorsTriggered + missing` scan, rAF-coalesced scroll handler, memoised archetype-variant picker, 5s throttle on focus + visibilitychange status refresh.
- Max-effort code-review hardening: corrected `failproof policy add` → `failproofai policy add` on every finding card, `app/layout.tsx` favicon fix, `whoAmI()` 401-retry only wipes on unambiguous 401, `Retry-After` clamped to `[0, 86400]`, `AuthApiError(status: 0)` → 504 mapping, +12 more.
- Reminder fetch + rerun loop now use `fetchWithTimeout(15s)` so a hung route can't permanently disable the CTA.
- Audit-aware atomic writes for `auth.json`, `next-audit.json`, and `audit-dashboard.json` (temp-file-then-rename with mode `0600` enforcement on both temp and final paths).
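The temp-file-then-rename pattern from the last bullet can be sketched with Node's `fs` module. This is an illustration under stated assumptions, not the contents of `lib/atomic-write.ts`: `atomicWrite` and the temp-file naming scheme are hypothetical.

```typescript
import { chmodSync, mkdtempSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write to a sibling temp file, then rename over the target so readers never see a partial file.
function atomicWrite(path: string, data: string, mode = 0o600): void {
  const tmp = `${path}.${process.pid}.tmp`; // same directory, so the rename stays on one filesystem
  writeFileSync(tmp, data, { mode }); // `mode` only applies when the file is created
  chmodSync(tmp, mode); // enforce the mode even if the umask or a stale temp file altered it
  renameSync(tmp, path); // atomic replace on POSIX filesystems
  chmodSync(path, mode); // enforce the mode on the final path as well
}

// Usage against a throwaway directory.
const demoDir = mkdtempSync(join(tmpdir(), "atomic-write-"));
const demoTarget = join(demoDir, "auth.json");
atomicWrite(demoTarget, JSON.stringify({ example: true }));
```

A crash between the write and the rename leaves the previous file untouched, which is the property that matters for token and reminder files.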
Telemetry
- 5 funnel-gap events on `/audit`: `audit_dashboard_viewed`, `audit_reminder_cta_{shown,clicked}`, `audit_auth_dialog_{opened,dismissed,succeeded}`, `audit_rerun_failed`, `api_server_unreachable`.
- `audit_user_identity_linked` from the CLI (`src/auth/cli.ts`) so OTP sign-ins via `failproofai auth login` are joined to pre-auth instance events.
- `cli_policy_${action}_failure` events for the `policy add|remove` failure path.
- Every PostHog event across all 4 channels (hooks/audit, server, web UI, npm-lifecycle) now stamped with `product: "failproofai-oss"` (#380).
- Raw verified email sent to PostHog (replacing the SHA-256 `email_hash`) for stronger verified-account → device identity stitch; still gated by `FAILPROOFAI_TELEMETRY_DISABLED=1`.
Infra
- New `bump-platform-submodule.yml` workflow auto-bumps the `failproofai/oss` gitlink in `FailproofAI/platform` on every merge into this repo's `main`, race-safe with a rebase-and-retry loop (#394).
- Supply-chain security CI gate: OSV-Scanner (`bun.lock` scanned against OSV.dev + OpenSSF malicious-packages feed) on every PR / push / weekly. Socket GitHub App behavioral early-warning layer. Blocks on any known-vulnerable or malicious dependency. 18 pre-existing transitive advisories remediated (#391).
- Default api-server base URL flipped to `https://api.befailproof.ai`.
Fixes
- CI: `bump-platform-submodule` SIGPIPE fix (#423). The first-line extraction `printf '%s\n' "$COMMIT_SUBJECT" | head -n 1` raced under `set -o pipefail` on multi-KB squash-merge commit bodies. Replaced with pure-bash parameter expansion.
- Treat GitHub `neutral` check-run conclusions as non-failing in `require-ci-green-before-stop` (Socket Security on external-contributor PRs) (#410).
- Drop literal `━━` escape sequences rendering as visible text in the `/policies` activity-tab eyebrow labels.
- Submodule-bump workflow auth: `Authorization: bearer …` only authenticates GitHub's REST API; git-over-HTTPS smart-protocol needs `Basic x-access-token:<pat>` (#395).
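For the last fix, the Basic scheme means the `x-access-token:<pat>` pair is base64-encoded, as in standard HTTP Basic authentication. A small sketch of building that header; the function name is hypothetical and this is not taken from the workflow itself:

```typescript
// Build an HTTP Basic header for git-over-HTTPS using the x-access-token username convention.
function gitBasicAuthHeader(pat: string): string {
  const credentials = Buffer.from(`x-access-token:${pat}`, "utf8").toString("base64");
  return `Authorization: Basic ${credentials}`;
}
```

A header of this shape is typically passed to git through its `http.extraHeader` configuration option; how the workflow wires it in is not shown in these notes.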
Dependencies
- Swap Vitest DOM environment from `happy-dom` (single-maintainer, 2024 critical CVE) to `jsdom` (6 maintainers, ~7× weekly downloads, perfect Snyk maintenance score). Test suite (1691 tests across 82 files) stays green (#419).
Docs
- New `docs/cli/auth.mdx` covering `failproofai auth login|logout|whoami`, on-disk `auth.json` shape, env-var table, troubleshooting, plus a "Persistent re-audit reminder" section.
- README logo updated to the new `fa_updated_full.svg` wordmark (EN + 14 translated READMEs) (#387).
- README supply-chain badge changed from live OSV-Scanner status to a static "supply chain: secure" badge, still linked to the workflow runs (#393).
Tests
- +40 tests covering previously-untested audit + auth modules: `__tests__/audit/{archetypes,findings,strengths,scoring,distribution,dashboard-cache,replay,share-templates}.test.ts`, `__tests__/lib/{auth-store,auth-store-refresh,api-server-client,share-card,fetch-with-timeout,atomic-write}.test.ts`, `__tests__/api/audit-state.test.ts`.
- Full suite: 1777 tests passing.
Full diff: v0.0.11-beta.2...v0.0.11-beta.3
Full changelog: https://github.com/FailproofAI/failproofai/blob/main/CHANGELOG.md#00113-beta3--2026-06-09
v0.0.11-beta.2 — `failproofai audit`, first-run prompt, telemetry coverage
Pre-release. Tracks every commit between v0.0.11-beta.1 (2026-05-20) and current main.
Highlights
- `failproofai audit` (beta): retrospective scan of past agent sessions. New CLI command that walks transcripts from all 7 supported CLIs (Claude / Codex / Copilot / Cursor / OpenCode / Pi / Gemini), replays every tool-use event through the 39 builtin policies, and runs each through 8 new audit-only detectors for patterns not yet enforced in real time. Output is a GTM-oriented ANSI table (split into "✓ already protected" vs "○ slipping through" with per-row install CTAs) plus a sectioned, shareable markdown report at `./failproofai-audit.md`. Flags + output may still change between beta releases.
- First-run install prompt on bare `failproofai`. PostHog showed only ~10% of npm-installed users ever ran `failproofai policies --install`; the no-args dashboard launch now detects "zero hooks installed across any detected CLI" and offers the existing interactive policy selection inline. Non-TTY (CI, piped) falls through with a stderr hint. Opt-out via `FAILPROOFAI_NO_FIRST_RUN=1`.
- PostHog telemetry coverage closed. 16 new server-side + 12 new web-UI events plug the gaps surfaced by the May audit: CLI install/uninstall outcomes, hook stdin/payload errors, builtin policy crashes (`policy_evaluation_error`, distinct from `custom_hook_error`), config validation warnings, postinstall lifecycle (`first_install`, `version_changed`), web dashboard interactions, and more.
Features
- `failproofai audit` (#377): scan past agent transcripts and report how often the agent did things failproofai is built to stop.
  - Replays through 39 builtin policies + 8 audit-only detectors: `redundant-cd-cwd`, `prefer-edit-over-read-cat`, `prefer-edit-over-sed-awk`, `prefer-write-over-heredoc`, `sleep-polling-loop`, `find-from-root`, `git-commit-no-verify`, `reread-after-edit`
  - Flags: `--cli`, `--project`, `--since`, `--policy`, `--limit`, `--show-examples`, `--report`, `--no-report`, `--json`, `--no-cache`
  - Output: ANSI table (split into "already protected" vs "slipping through" sections with per-row install CTAs) + shareable markdown report
  - Per-transcript cache at `~/.failproofai/cache/audit/` auto-invalidates on policy/detector code changes
  - 4 PostHog events emitted (`audit_started`, `audit_pattern_detected`, `audit_install_cta_shown`, `audit_completed`); strict slug/count/boolean-only privacy contract, honors `FAILPROOFAI_TELEMETRY_DISABLED=1`
- First-run install prompt (#378): bare `failproofai` invocation detects an unconfigured machine and offers the install flow inline; new `src/hooks/first-run-nudge.ts` module + 4 PostHog events to measure the uplift. Opt-out: `FAILPROOFAI_NO_FIRST_RUN=1`.
- PostHog telemetry expansion (#376): 16 server-side + 12 web-UI events covering CLI lifecycle, hook errors, policy evaluation failures, config validation warnings, multi-scope warnings, beta-policy installs, postinstall lifecycle, and dashboard interactions. All honor `FAILPROOFAI_TELEMETRY_DISABLED=1`.
Breaking
- Removed undocumented cloud auth + event relay subsystem (#374). Deletes `src/auth/` (OAuth 2.0 device-flow login against `api.befailproof.ai`, `~/.failproofai/auth.json` token store) and `src/relay/` (WebSocket event relay daemon, sanitized JSONL queue at `~/.failproofai/cache/server-queue/`, PID tracking). Strips the `failproofai login` / `logout` / `whoami` / `relay start|stop|status` / `sync` subcommands and the internal `--relay-daemon` mode. Users who ran `failproofai login` should also wipe `~/.failproofai/{auth.json,cache/server-queue,relay.pid}` and stop any running relay daemon by hand; new auth/cloud surface will land in a follow-up.
Docs
- New `docs/cli/audit.mdx` (beta) + nav entry, registered in `docs/docs.json` English section. Translation-sync workflow (#371) will add localized pages.
- First-run prompt documented in README, `docs/introduction.mdx`, and a new "First-run prompt" section in `docs/cli/environment-variables.mdx` (with `FAILPROOFAI_NO_FIRST_RUN=1` opt-out).
Quality
- +62 tests (1623 → 1685 total). New `__tests__/audit/` covers per-detector positive/negative cases, replay through real builtins, and an end-to-end fixture-transcript run via `runAudit()`.
- New `lib/format-date.ts` unit tests (#373).
- Refactored per-CLI tool-name + tool-input canonicalization out of `src/hooks/handler.ts` into `src/hooks/tool-name-canonicalize.ts` so the live handler and audit replay share one implementation.
- 0 lint errors, `tsc --noEmit` clean, 7 CI jobs (build / docs / quality / test × 3 / test-e2e) green.
Upgrade notes
- Audit users: `failproofai audit --since 30d` is a good first run. The markdown report at `./failproofai-audit.md` is shareable in Slack/PRs.
- Anyone using cloud auth/relay: see the Breaking section. Clean up `~/.failproofai/{auth.json,cache/server-queue,relay.pid}` manually.
- CI consumers: telemetry is opt-out. Set `FAILPROOFAI_TELEMETRY_DISABLED=1` to silence all events.
Full changelog: v0.0.11-beta.1...v0.0.11-beta.2
v0.0.11-beta.1
0.0.11-beta.1 — 2026-05-20
Breaking
- Default policy namespace renamed from `exospherehost` to `failproofai`. Configs that explicitly reference builtins as `exospherehost/<name>` must update to `failproofai/<name>`. Flat-name shorthand (e.g. `"sanitize-jwt"`) continues to work unchanged because it auto-resolves to the new default namespace. Builtin docs (EN + 14 translations) updated to show the new namespace.
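The resolution rule described above (flat names pick up the default namespace, explicit names are left alone) can be sketched as below. Both functions are hypothetical illustrations; in particular, `migrateLegacyName` is a helper a config owner might write, not something these notes say the tool provides.

```typescript
const DEFAULT_NAMESPACE = "failproofai";
const LEGACY_NAMESPACE = "exospherehost";

// Flat names resolve against the default namespace; namespaced names are returned unchanged.
function resolvePolicyName(name: string): string {
  return name.includes("/") ? name : `${DEFAULT_NAMESPACE}/${name}`;
}

// Hypothetical one-off migration for configs that spelled out the old namespace.
function migrateLegacyName(name: string): string {
  const legacyPrefix = `${LEGACY_NAMESPACE}/`;
  return name.startsWith(legacyPrefix)
    ? `${DEFAULT_NAMESPACE}/${name.slice(legacyPrefix.length)}`
    : name;
}
```

Under this rule only configs that wrote `exospherehost/<name>` explicitly need editing, which matches the upgrade guidance above.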
Docs
- Rename GitHub org URLs across `package.json` metadata, README CI badge (EN + 14 translated READMEs), CONTRIBUTING, in-app "Star us" banners (`bin/failproofai.mjs`, `scripts/launch.ts`, navbar, reach-developers component), Mintlify `docs/docs.json`, and 30 translated docs (`package-aliases.mdx` issues link + `examples.mdx` repo-tree link) to reflect the `exospherehost` → `failproofai` org rename. X social handle in `docs/docs.json` updated from `x.com/exospherehost` to `x.com/failproofai`.
Fixes
- Remove orphan `exospheresmall` token from the Next.js proxy matcher in `proxy.ts`. No asset by that name exists in the repo.
v0.0.10 — 7-CLI policy enforcement: Claude, Codex, Copilot, Cursor, Gemini, OpenCode, Pi
First stable release of the 7-CLI cycle. failproofai now enforces policies across all major terminal coding agents:
| CLI | Config path | Stop semantics |
|---|---|---|
| Claude Code | `.claude/settings.json` | exit-2 force-retry |
| OpenAI Codex | `.codex/hooks.json` | exit-2 force-retry |
| GitHub Copilot | `.github/hooks/failproofai.json` | `{decision:"block",reason}` JSON force-retry |
| Cursor Agent | `.cursor/hooks.json` | `{followup_message}` JSON force-retry |
| Gemini CLI | `.gemini/settings.json` | `{decision:"block",reason}` JSON force-retry |
| OpenCode | `.opencode/plugins/failproofai.mjs` + `.opencode/opencode.json` | in-process plugin |
| Pi | `.pi/settings.json` + bundled `pi-extension/` | `before_agent_start` next-turn injection |
Highlights this cycle
- Per-CLI multi-select control panel in the dashboard `/policies` Configure tab: install / uninstall the diff across all 7 CLIs in one round-trip, with brand-colored per-row status pills, a 7-segment coverage strip, and pre-checked detected CLIs for one-click adoption (#344).
- Pi `Stop` policy enforcement via `before_agent_start` system-prompt injection: works around Pi's `AgentEndEvent` having no Result type by capturing the deny `reason` and gating the next user turn (#341).
- OpenCode + Pi tool-input canonicalization: two-layer (shim + handler) so `block-read-outside-cwd`, `block-env-files`, and `block-secrets-write` actually fire on `read` / `write` / `edit` calls. Existing user-scope shims auto-upgrade on the next failproofai version bump without a re-install (#337, #340).
- Per-CLI `Stop` semantics docs: new "Per-CLI Stop semantics" subsection in `docs/built-in-policies.mdx` with a 7-row table + Pi-limitation callout so users enabling `require-*-before-stop` understand what they'll see on each CLI (#342).
- Dashboard restyle: single dark theme, project pages keyed by encoded cwd, full Gemini session UUIDs, plain-text startup line replacing the ASCII wordmark (#319, #335, #336, #338).
- `release-prep-check` workflow policy + dated `## <version> — <YYYY-MM-DD>` CHANGELOG headings so every PR ships release-ready (no `## Unreleased` drift) (#335).
See CHANGELOG.md for the complete per-beta breakdown across the 13 betas in this cycle.
v0.0.10-beta.12
[luv-342] feat: enforce Pi Stop policies via before_agent_start hando…