Skip to content

Releases: FailproofAI/failproofai

v0.0.11-beta.8 — audit first-run fix: fire-and-forget runs, scan all history

11 Jun 08:48
9440f98

Choose a tag to compare

0.0.11-beta.8

Fixes

  • The /audit first run no longer fails on the first click, and an audit is no longer time-capped (#434) — the first, cold scan used to abort after ~15s and bounce back to the empty state (a retry only worked because the first attempt had warmed the caches server-side). The run is now fire-and-forget with uncapped status polling, so an audit runs to completion however long it takes, and the default scan now covers your entire session history instead of just the last 30 days. The run-lock's 5-minute auto-expiry is removed so a long-but-healthy run is never cut short, and a run that can't persist its result now surfaces an error instead of silently reporting success.

Docs

  • Update translated docs for changed English sources (#433).

Published to npm under the beta dist-tag (npm i failproofai@beta).

v0.0.11-beta.7 — audit re-audit bar removed; re-audit forces a fresh scan

11 Jun 07:00
6a410a4

Choose a tag to compare

0.0.11-beta.7

Fixes

  • Remove the top-of-page [ re-audit ] bar from the audit page (#431) — on the empty/expired path it stacked a second "run an audit" CTA that read as broken, and on loaded reports the freshness strip earned little. Re-auditing still works from [ run audit ] on the empty state and [ re-audit now ] at the bottom of a report; the sticky progress strip, soft-refresh-on-success, and 7-day cache TTL are untouched.
  • Re-audit now forces a genuinely fresh scan (#432) — [ re-audit now ] sends noCache: true, so it bypasses the per-transcript cache and re-scans every transcript from scratch instead of silently returning the identical cached result. The empty-state first run stays on the fast cached path; a failed re-audit leaves the prior report intact.

Published to npm under the beta dist-tag (npm i failproofai@beta).

v0.0.11-beta.6 — audit 7-day cache TTL, top-of-page re-audit bar, classifier hardening

11 Jun 04:48
d68a584

Choose a tag to compare

Features

  • 7-day cache TTL on both audit caches. Per-transcript cache (src/audit/cache.ts) gains a cachedAt field and a CACHE_TTL_MS = 7d reject-on-read check (schema bump 2 → 3 forces a clean re-scan of pre-existing entries). Dashboard cache (src/audit/dashboard-cache.ts) reuses the existing isCacheStale(cachedAt, 7d) helper on the read path so a week-old result is never silently served. (#428)
  • Top-of-page re-audit bar. New TopAuditBar renders above the IdentitySection with the last-audit timestamp (audited 3d ago), an amber expires in 14h — re-audit to refresh chip inside the final 24h of the TTL, and a [ re-audit ] button. Modes: cached, expired, empty. (#428)
  • Sticky progress strip + soft refresh during re-audit. Pink hard-offset banner pinned to the top of the viewport during a run, mm:ss elapsed timer, CSS-only edge pulse. On RerunError it swaps to a red strip with copy keyed off RerunError.kind. Success path soft-refreshes the dashboard cache via getAuditResultAction() — no more window.location.reload(). (#428)

Fixes

  • Goldfish classifier hardening. PR #426's GOLDFISH_ENTROPY retune exposed that normalised lift entropy can't tell "every cluster at typical baseline" apart from "real scatter". Adds GOLDFISH_MIN_SECOND_LIFT = 1.3 so goldfish only fires when ≥2 clusters genuinely over-index; uniform-at-baseline profiles fall through to the existing argmax. (#429)
  • Stop the Next.js 16 dev-overlay "signal is aborted without reason" warning. lib/fetch-with-timeout.ts swaps the manual AbortController + setTimeout (which called controller.abort() with no reason and silently dropped any caller-supplied init.signal) for platform AbortSignal.timeout() composed with AbortSignal.any(). (#428)

Docs

  • Update the dashboard + audit-CLI docs for the new TTL behaviour and the top-of-page re-audit bar. Reword cachedAt as TTL metadata (not part of the cache key). Fix a stale failproof policy add typo to failproofai policy add. (#428)
  • Translation refresh for changed English sources. (#427)

Full details in CHANGELOG.md under 0.0.11-beta.6.

v0.0.11-beta.5 — /audit persona fix: behavior-calibrated archetypes

10 Jun 14:38
85d2365

Choose a tag to compare

Fixes

  • Behavior-calibrated /audit archetypes — the persona classifier no longer collapses nearly every agent onto "the explorer". The lift denominator now uses empirical firing shares instead of catalog weights, so a persona wins only when it fires more than a typical agent; block-read-outside-cwd is dropped from the signal map (off by default + ubiquitous ambient reads), and the goldfish entropy threshold is retuned. Real-world distribution now spreads across all 8 personas instead of ~100% explorer. (#426)

Docs

  • Document that contributors must build the project before the in-repo dev hooks resolve the failproofai import against dist/index.js. (#426)

Full details in CHANGELOG.md under 0.0.11-beta.5.

v0.0.11-beta.4 — /audit share-card hotfix (desktop intent + correct domain)

10 Jun 10:36
9b4190a

Choose a tag to compare

/audit share-card hotfix

Two fast follow-ups on the /audit share flow introduced in 0.0.11-beta.3.

Fixes

  • Desktop "share on X" / "share on LinkedIn" no longer open the Windows share dialog. lib/share-card.ts shareCardNative() early-returns false on non-mobile devices (detected via navigator.userAgentData.mobile with a UA-string fallback for Safari / Firefox + a maxTouchPoints check for iPadOS 13+), so the ShareDock falls through to its existing clipboard + x.com/intent/tweet / linkedin.com/sharing/share-offsite path. Mobile keeps the one-tap system share sheet because there the OS sheet actually surfaces the X / LinkedIn apps as targets (#425).
  • Share templates linked to the wrong domain. Every X / LinkedIn template embedded https://failproof.ai, but the actual marketing site is befailproof.ai — so every shared post linked to a dead URL. Updated SITE_URL in both app/audit/_components/share-templates.ts and app/audit/_components/share-dock.tsx, plus the bare failproof.ai mention in the 4th X template; tightened the template test to assert the new domain so a regression fails fast (#425).

Full diff: v0.0.11-beta.3...v0.0.11-beta.4
Full changelog: https://github.com/FailproofAI/failproofai/blob/main/CHANGELOG.md#00114-beta4--2026-06-10

v0.0.11-beta.3 — /audit dashboard, email-OTP auth, pixel-craft design system

09 Jun 13:32
848dc81

Choose a tag to compare

/audit dashboard, email-OTP auth, unified pixel-craft design system

This release ships the in-app /audit dashboard, email-OTP auth across CLI + dashboard, persistent re-audit reminders delivered via SES, and a brutalist pixel-craft design system unified across every dashboard page. Plus a deep correctness/efficiency hardening pass, a supply-chain security CI gate, and the usual telemetry coverage expansion.

Highlights

/audit dashboard

  • New in-app report at /audit that turns the existing failproofai audit data into a personality-driven diagnostic. Every audited agent is classified into one of 8 archetypesoptimist, cowboy, explorer, goldfish, paranoid architect, precision builder, hammer, ghost — via a weighted classifier with full 47/47 signal coverage (every builtin policy + every audit-only detector).
  • Rewritten score + classifier engine. Personas are evenly reachable (Monte-Carlo over 50k simulated users confirms every persona lands at 10–18% share). Scores are rate-normalised against a reference volume and use a saturating exponential curve (cap·(1−e^(−p/k))) so no two hit-counts collide on a fixed value. S/A/B/C/D/F grade bands. projectedScore previews the post-enable uplift.
  • Six sections: Identity (archetype hero with 8×8 pixel sigil + meta grid), Show-off CTA, Strengths (real numbers from the audit), Score + cohort leaderboard with distribution histogram, Findings (per-policy cards: what happened / cost / evidence / fix), Prescribed Policies (with projected-score uplift callout), and a "re-audit in 7 days" return loop.
  • Persona variant catalog. Every archetype has 4–6 deterministically-seeded copy variants (taglines, descriptions, signature blocks, "common in" / "primary risk" / closing lines) keyed by a behaviour fingerprint, so two agents that land on the same archetype see different language but the same render is byte-identical across reloads.
  • Shareable PNG poster. "Make poster" captures the identity frame via html2canvas at scale 2 (failproofai-<archetype>-<YYYY-MM-DD>.png). Floating share-dock renders X / LinkedIn / save buttons stacked vertically with personalised templates (5 quirky for X, 5 measured for LinkedIn). Image attachment routes through navigator.share({ files }) → clipboard → download, picking the best route the browser allows.

Email-OTP auth (CLI + dashboard)

  • New failproofai auth login | logout | whoami CLI subcommand wired to the Rust failproof-api-server (/v0/auth/login/request, /login/verify, /token/refresh, /logout, /me). Tokens persist to ~/.failproofai/auth.json at mode 0600 with auto-refresh within a 60s leeway window.
  • Dashboard AuthDialog proxies the same flow through four new Next routes (/api/auth/{status,login-request,login-verify,logout}) so the refresh token never reaches the browser — only {authenticated, user} does.
  • FAILPROOF_API_URL (default https://api.befailproof.ai) and FAILPROOFAI_AUTH_DIR (default ~/.failproofai) for overrides.

Persistent re-audit reminders

  • New ~/.failproofai/next-audit.json (mode 0600, separate from auth.json so the reminder is independent of token refresh) + dashboard /api/auth/reminder GET/POST/DELETE.
  • Reminders forward to the api-server's SES-backed scheduler (POST/DELETE /v0/reminders) so the audit nudge is actually delivered as email. The local file remains the dashboard/CLI source-of-truth.

Unified pixel-craft design system

  • The audit page's brutalist pixel-craft tokens (--bg, --ink, --accent-pink, --accent-green, --font-mono → JetBrains Mono, --font-display → Bitcount Prop Single) are now declared once in app/globals.css and repoint every Tailwind alias (bg-card, text-foreground, border-border, --radius: 0, …) at the audit palette. /policies, /projects, and /audit now share the same chrome — pink corner brackets, dashed frames, green eyebrow captions — without rewriting any component markup.
  • Dashboard chrome scales to fill ultrawide monitors via clamp(720px, 96vw, 1840px). Base font bumped 13px → 16px. Opt-in :focus-visible ring system. Navbar redesigned around .app-header with version chip + current-section eyebrow.

Reliability + efficiency

  • Tier-A correctness pass. Concurrent refresh-token-exchange dedup (silent-logout bug fix), audit run-lock auto-expiry (5 min), JWT strict-base64url validation, AbortSignal.any fallback for Node < 20.3 / older Bun, dashboard cache schema-version rejection.
  • Tier-B refactor pass. New shared lib/fetch-with-timeout.ts + lib/atomic-write.ts; ~30 LOC of copy-paste deleted across auth-dialog.tsx, rerun-button.tsx, api-server-client.ts, auth-store.ts, dashboard-cache.ts.
  • Tier-C polish. Memoised detectorsTriggered + missing scan, rAF-coalesced scroll handler, memoised archetype-variant picker, 5s throttle on focus + visibilitychange status refresh.
  • Max-effort code-review hardening: corrected failproof policy addfailproofai policy add on every finding card, app/layout.tsx favicon fix, whoAmI() 401-retry only wipes on unambiguous 401, Retry-After clamped to [0, 86400], AuthApiError(status: 0) → 504 mapping, +12 more.
  • Reminder fetch + rerun loop now use fetchWithTimeout(15s) so a hung route can't permanently disable the CTA.
  • Audit-aware atomic writes for auth.json, next-audit.json, and audit-dashboard.json (temp-file-then-rename with mode 0600 enforcement on both temp and final paths).

Telemetry

  • 5 funnel-gap events on /audit: audit_dashboard_viewed, audit_reminder_cta_{shown,clicked}, audit_auth_dialog_{opened,dismissed,succeeded}, audit_rerun_failed, api_server_unreachable.
  • audit_user_identity_linked from the CLI (src/auth/cli.ts) so OTP sign-ins via failproofai auth login are joined to pre-auth instance events.
  • cli_policy_${action}_failure events for the policy add|remove failure path.
  • Every PostHog event across all 4 channels (hooks/audit, server, web UI, npm-lifecycle) now stamped with product: "failproofai-oss" (#380).
  • Raw verified email sent to PostHog (replacing the SHA-256 email_hash) for stronger verified-account → device identity stitch; still gated by FAILPROOFAI_TELEMETRY_DISABLED=1.

Infra

  • New bump-platform-submodule.yml workflow auto-bumps the failproofai/oss gitlink in FailproofAI/platform on every merge into this repo's main, race-safe with a rebase-and-retry loop (#394).
  • Supply-chain security CI gate: OSV-Scanner (bun.lock scanned against OSV.dev + OpenSSF malicious-packages feed) on every PR / push / weekly. Socket GitHub App behavioral early-warning layer. Blocks on any known-vulnerable or malicious dependency. 18 pre-existing transitive advisories remediated (#391).
  • Default api-server base URL flipped to https://api.befailproof.ai.

Fixes

  • CI: bump-platform-submodule SIGPIPE fix (#423). The first-line extraction printf '%s\n' "$COMMIT_SUBJECT" | head -n 1 raced under set -o pipefail on multi-KB squash-merge commit bodies. Replaced with pure-bash parameter expansion.
  • Treat GitHub neutral check-run conclusions as non-failing in require-ci-green-before-stop (Socket Security on external-contributor PRs) (#410).
  • Drop literal ━━ escape sequences rendering as visible text in the /policies activity-tab eyebrow labels.
  • Submodule-bump workflow auth: Authorization: bearer … only authenticates GitHub's REST API; git-over-HTTPS smart-protocol needs Basic x-access-token:<pat> (#395).

Dependencies

  • Swap Vitest DOM environment from happy-dom (single-maintainer, 2024 critical CVE) to jsdom (6 maintainers, ~7× weekly downloads, perfect Snyk maintenance score). Test suite (1691 tests across 82 files) stays green (#419).

Docs

  • New docs/cli/auth.mdx covering failproofai auth login|logout|whoami, on-disk auth.json shape, env-var table, troubleshooting, plus a "Persistent re-audit reminder" section.
  • README logo updated to the new fa_updated_full.svg wordmark (EN + 14 translated READMEs) (#387).
  • README supply-chain badge changed from live OSV-Scanner status to a static "supply chain: secure" badge, still linked to the workflow runs (#393).

Tests

  • +40 tests covering previously-untested audit + auth modules: __tests__/audit/{archetypes,findings,strengths,scoring,distribution,dashboard-cache,replay,share-templates}.test.ts, __tests__/lib/{auth-store,auth-store-refresh,api-server-client,share-card,fetch-with-timeout,atomic-write}.test.ts, __tests__/api/audit-state.test.ts.
  • Full suite: 1777 tests passing.

Full diff: v0.0.11-beta.2...v0.0.11-beta.3
Full changelog: https://github.com/FailproofAI/failproofai/blob/main/CHANGELOG.md#00113-beta3--2026-06-09

v0.0.11-beta.2 — `failproofai audit`, first-run prompt, telemetry coverage

22 May 04:57
252c843

Choose a tag to compare

v0.0.11-beta.2 — failproofai audit, first-run prompt, telemetry coverage

Pre-release. Tracks every commit between v0.0.11-beta.1 (2026-05-20) and current main.

Highlights

  • failproofai audit (beta) — retrospective scan of past agent sessions. New CLI command that walks transcripts from all 7 supported CLIs (Claude / Codex / Copilot / Cursor / OpenCode / Pi / Gemini), replays every tool-use event through the 39 builtin policies, and runs each through 8 new audit-only detectors for patterns not yet enforced in real time. Output is a GTM-oriented ANSI table (split into "✓ already protected" vs "○ slipping through" with per-row install CTAs) plus a sectioned, shareable markdown report at ./failproofai-audit.md. Flags + output may still change between beta releases.
  • First-run install prompt on bare failproofai. PostHog showed only ~10% of npm-installed users ever ran failproofai policies --install; the no-args dashboard launch now detects "zero hooks installed across any detected CLI" and offers the existing interactive policy selection inline. Non-TTY (CI, piped) falls through with a stderr hint. Opt-out via FAILPROOFAI_NO_FIRST_RUN=1.
  • PostHog telemetry coverage closed. 16 new server-side + 12 new web-UI events plug the gaps surfaced by the May audit — CLI install/uninstall outcomes, hook stdin/payload errors, builtin policy crashes (policy_evaluation_error, distinct from custom_hook_error), config validation warnings, postinstall lifecycle (first_install, version_changed), web dashboard interactions, and more.

Features

  • failproofai audit (#377) — scan past agent transcripts and report how often the agent did things failproofai is built to stop. Replays through 39 builtin policies + 8 audit-only detectors:
    • redundant-cd-cwd, prefer-edit-over-read-cat, prefer-edit-over-sed-awk, prefer-write-over-heredoc, sleep-polling-loop, find-from-root, git-commit-no-verify, reread-after-edit
    • Flags: --cli, --project, --since, --policy, --limit, --show-examples, --report, --no-report, --json, --no-cache
    • Output: ANSI table (split into "already protected" vs "slipping through" sections with per-row install CTAs) + shareable markdown report
    • Per-transcript cache at ~/.failproofai/cache/audit/ auto-invalidates on policy/detector code changes
    • 4 PostHog events emitted (audit_started, audit_pattern_detected, audit_install_cta_shown, audit_completed); strict slug/count/boolean-only privacy contract, honors FAILPROOFAI_TELEMETRY_DISABLED=1
  • First-run install prompt (#378) — bare failproofai invocation detects an unconfigured machine and offers the install flow inline; new src/hooks/first-run-nudge.ts module + 4 PostHog events to measure the uplift. Opt-out: FAILPROOFAI_NO_FIRST_RUN=1.
  • PostHog telemetry expansion (#376) — 16 server-side + 12 web-UI events covering CLI lifecycle, hook errors, policy evaluation failures, config validation warnings, multi-scope warnings, beta-policy installs, postinstall lifecycle, and dashboard interactions. All honor FAILPROOFAI_TELEMETRY_DISABLED=1.

Breaking

  • Removed undocumented cloud auth + event relay subsystem (#374). Deletes src/auth/ (OAuth 2.0 device-flow login against api.befailproof.ai, ~/.failproofai/auth.json token store) and src/relay/ (WebSocket event relay daemon, sanitized JSONL queue at ~/.failproofai/cache/server-queue/, PID tracking). Strips the failproofai login / logout / whoami / relay start|stop|status / sync subcommands and the internal --relay-daemon mode. Users who ran failproofai login should also wipe ~/.failproofai/{auth.json,cache/server-queue,relay.pid} and stop any running relay daemon by hand; new auth/cloud surface will land in a follow-up.

Docs

  • New docs/cli/audit.mdx (beta) + nav entry, registered in docs/docs.json English section. Translation-sync workflow (#371) will add localized pages.
  • First-run prompt documented in README, docs/introduction.mdx, and a new "First-run prompt" section in docs/cli/environment-variables.mdx (with FAILPROOFAI_NO_FIRST_RUN=1 opt-out).

Quality

  • +62 tests (1623 → 1685 total). New __tests__/audit/ covers per-detector positive/negative cases, replay through real builtins, and an end-to-end fixture-transcript run via runAudit().
  • New lib/format-date.ts unit tests (#373).
  • Refactored per-CLI tool-name + tool-input canonicalization out of src/hooks/handler.ts into src/hooks/tool-name-canonicalize.ts so the live handler and audit replay share one implementation.
  • 0 lint errors, tsc --noEmit clean, 7 CI jobs (build / docs / quality / test × 3 / test-e2e) green.

Upgrade notes

  • Audit users: failproofai audit --since 30d is a good first run. The markdown report at ./failproofai-audit.md is shareable in Slack/PRs.
  • Anyone using cloud auth/relay: see the Breaking section. Clean up ~/.failproofai/{auth.json,cache/server-queue,relay.pid} manually.
  • CI consumers: telemetry is opt-out — set FAILPROOFAI_TELEMETRY_DISABLED=1 to silence all events.

Full changelog: v0.0.11-beta.1...v0.0.11-beta.2

v0.0.11-beta.1

20 May 23:44
ac948e4

Choose a tag to compare

v0.0.11-beta.1 Pre-release
Pre-release

0.0.11-beta.1 — 2026-05-20

Breaking

  • Default policy namespace renamed from exospherehost to failproofai. Configs that explicitly reference builtins as exospherehost/<name> must update to failproofai/<name>. Flat-name shorthand (e.g. "sanitize-jwt") continues to work unchanged because it auto-resolves to the new default namespace. Builtin docs (EN + 14 translations) updated to show the new namespace.

Docs

  • Rename GitHub org URLs across package.json metadata, README CI badge (EN + 14 translated READMEs), CONTRIBUTING, in-app "Star us" banners (bin/failproofai.mjs, scripts/launch.ts, navbar, reach-developers component), Mintlify docs/docs.json, and 30 translated docs (package-aliases.mdx issues link + examples.mdx repo-tree link) to reflect the exospherehostfailproofai org rename. X social handle in docs/docs.json updated from x.com/exospherehost to x.com/failproofai.

Fixes

  • Remove orphan exospheresmall token from the Next.js proxy matcher in proxy.ts — no asset by that name exists in the repo.

v0.0.10 — 7-CLI policy enforcement: Claude, Codex, Copilot, Cursor, Gemini, OpenCode, Pi

10 May 16:40
5839fb8

Choose a tag to compare

First stable release of the 7-CLI cycle. failproofai now enforces policies across all major terminal coding agents:

CLI Config path Stop semantics
Claude Code .claude/settings.json exit-2 force-retry
OpenAI Codex .codex/hooks.json exit-2 force-retry
GitHub Copilot .github/hooks/failproofai.json {decision:"block",reason} JSON force-retry
Cursor Agent .cursor/hooks.json {followup_message} JSON force-retry
Gemini CLI .gemini/settings.json {decision:"block",reason} JSON force-retry
OpenCode .opencode/plugins/failproofai.mjs + .opencode/opencode.json in-process plugin
Pi .pi/settings.json + bundled pi-extension/ before_agent_start next-turn injection

Highlights this cycle

  • Per-CLI multi-select control panel in the dashboard /policies Configure tab — install / uninstall the diff across all 7 CLIs in one round-trip, with brand-colored per-row status pills, a 7-segment coverage strip, and pre-checked detected CLIs for one-click adoption (#344).
  • Pi Stop policy enforcement via before_agent_start system-prompt injection — works around Pi's AgentEndEvent having no Result type by capturing the deny reason and gating the next user turn (#341).
  • OpenCode + Pi tool-input canonicalization — two-layer (shim + handler) so block-read-outside-cwd, block-env-files, and block-secrets-write actually fire on read/write/edit calls. Existing user-scope shims auto-upgrade on the next failproofai version bump without a re-install (#337, #340).
  • Per-CLI Stop semantics docs — new "Per-CLI Stop semantics" subsection in docs/built-in-policies.mdx with a 7-row table + Pi-limitation callout so users enabling require-*-before-stop understand what they'll see on each CLI (#342).
  • Dashboard restyle: single dark theme, project pages keyed by encoded cwd, full Gemini session UUIDs, plain-text startup line replacing the ASCII wordmark (#319, #335, #336, #338).
  • release-prep-check workflow policy + dated ## <version> — <YYYY-MM-DD> CHANGELOG headings so every PR ships release-ready (no ## Unreleased drift) (#335).

See CHANGELOG.md for the complete per-beta breakdown across the 13 betas in this cycle.

v0.0.10-beta.12

10 May 05:33
ccc5546

Choose a tag to compare

v0.0.10-beta.12 Pre-release
Pre-release
[luv-342] feat: enforce Pi Stop policies via before_agent_start hando…