fix(signup): wipe stale per-project state on successful direct signup#237
Merged
Conversation
Collaborator
Author
|
bugbot run |
Contributor
There was a problem hiding this comment.
✅ Bugbot reviewed your changes and found no new issues!
Comment @cursor review or bugbot run to trigger another review on this PR
Reviewed by Cursor Bugbot for commit 65eaa98. Configure here.
65eaa98 to
75547a8
Compare
b856c50 to
fc0bc5a
Compare
* feat: instrument silent OAuth token refresh
Bet 5 Slice 1 (returning-user instrumentation).
Why: bet 5 calls for killing the auth screen for returning users. As a
first observable step, instrument the silent refresh path that already
exists in agent-runner so we can measure how often it fires, how long
it takes, and what reasons the failures surface. This unblocks the
downstream "returning-user run-start p50 <3s" target without
shipping UI changes first.
What changed:
- src/lib/agent-runner.ts — wrap the inline refreshAccessToken call
(the one that runs when ~/.ampli.json has a token past expiresAt)
with `wizard cli: auth refreshed silently` (success) /
`wizard cli: auth refresh failed {reason}` (failure) analytics.
Both events carry `duration ms` for the p50/p95 signal.
Zero behavior change — the refresh path, fallback behavior, and error
propagation are unchanged. Only telemetry added.
1114 total passing (unchanged — no behavior to test at this layer).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat: teammate-invite deep-link URL builder
Bet 5 Slice 3 (PLG loop — data layer).
Adds OUTBOUND_URLS.teammateInvite(zone, dashboardUrl) that builds an
Amplitude share link with source=wizard-teammate-invite tagged on so
web-side analytics can attribute clicks back to the wizard's PLG
surface.
Pure helper, no UI. The OutroScreen success-path slice will consume
this next to render a "Send this dashboard to a teammate" action.
+5 tests covering US/EU zone routing, null return on unparseable
dashboard URL, query-param stripping, and the source tag.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat: ASCII sparkline renderer
Bet 5 Slice 4 ("save your first chart" moment).
Pure helper that maps a numeric series onto the Unicode vertical-bar
blocks (U+2581..U+2588). Empty / single-point / all-identical series
collapse gracefully so the success outro never shows a broken render.
Target consumer: the success OutroScreen renders a one-line ASCII
sparkline of the first chart's points ("events over time") alongside
the dashboard link. No UI wiring yet — that slice lands next and just
imports this helper.
+8 tests: monotonic mapping, empty series, non-finite values filtered,
all-identical midline, single-point, maxWidth downsampling, V-shape
trough, preserves series order.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(tui): error-outro recovery actions, last-used persistence, teammate invite
Bundles four small UX polish slices that were stacked off closed parent
branches and got orphaned. Re-ported clean off main.
- Error outro now surfaces the full log path with a "(press L to open)"
hint and wires L/l + C/c keybinds: L opens the log via the OS-default
handler (Console.app / xdg-open), C writes a sanitized bug report to
/tmp/amplitude-bug-report.txt with redacted error + correlation IDs
for support triage. New `wizard cli: error outro log opened` and
`wizard cli: error outro bug report written` events.
- Success outro picker conditionally renders a "Send this dashboard to
a teammate" action when the agent created a dashboard. Opens an
Amplitude share link tagged source=wizard-teammate-invite so PLG-loop
attribution can flow back to the wizard. New
`wizard cli: teammate invite link opened` event.
- New `OUTBOUND_URLS.teammateInvite(zone, dashboardUrl)` helper builds
the share link with consistent zone routing + source tagging. Returns
null on unparseable URLs.
- New `getLastUsedSelection` / `storeLastUsedSelection` helpers persist
the user's last org/workspace/project to a `wizard` namespace inside
~/.ampli.json. Pre-focus targets for an upcoming collapsed picker.
Changing org clears downstream workspace + project so stale ids never
pre-focus across orgs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* revert(outro): drop teammate-invite share link from PR #303
Removes the teammate-invite slice (originally PR #140 URL builder + #142
UI) from this PR per request. The other four slices remain:
- silent OAuth refresh telemetry (#134)
- error-outro log file path (#136)
- L -> open log on error outro (#137)
- C -> write sanitized bug report on error outro (#138)
- last-used org/workspace/project persistence (#139)
- ASCII sparkline renderer (#141)
Removed surface:
- OUTBOUND_URLS.teammateInvite(zone, dashboardUrl) in constants.ts
- "Send this dashboard to a teammate" outro option + handler
- analytics.wizardCapture('teammate invite link opened', ...) event
- src/lib/__tests__/teammate-invite-url.test.ts
Test results: 1559 passing (was 1564; dropped 5 teammate-invite tests).
Lint clean, build green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(ampli-settings): rename lastUsedProjectId -> lastUsedAppId
Bugbot caught that the new last-used-selection persistence reintroduced
the retired `project` terminology for the Amplitude ingestion unit. The
canonical term since PR #120 is `appId`/`selectedAppId` (the wizard
session uses `selectedAppId`, the agent-ui parses `parsed.appId`, the
TUI store keys are `selectedAppId`). Mixing both terms would silently
drift over time.
Renames in this PR's new code:
- schema key: lastUsedProjectId -> lastUsedAppId
- parameter/return field: projectId -> appId
- test fixtures + assertions: projectId -> appId
No behavior change; persistence shape differs only in the key name and
returns the same triple under the new key. Self-contained — no other
callers reference these symbols yet.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The auth-error detector in runAgent's message loop only matched the
legacy OAuth fault code 'authentication_failed', so the actual gateway
401 ({"error":{"type":"authentication_error","message":"Invalid or
expired token"}}) was misclassified as a generic API error. Result: the
user saw "API Error 401, please report this to wizard@amplitude.com"
instead of the friendly "your session expired, run again to log in"
path that AUTH_ERROR routes through.
Hit in production by 5+ events / 5 users (Sentry: WIZARD-CLI-A,
WIZARD-CLI-7, WIZARD-CLI-F). Each one is a wasted bug report and a
user who couldn't tell they just needed to re-login.
Broaden the detector to match three observed patterns:
- 'authentication_failed' (legacy OAuth fault code; preserved)
- 'authentication_error' (current Anthropic gateway pattern)
- 'Invalid or expired token' (Anthropic 401 message body)
Extracted as exported `isAuthErrorMessage(serialized)` helper so
the patterns are unit-tested and easy to extend when new ones surface.
- 8 new unit tests: positive matches for each pattern + 5 negative
matches (500, gateway-down 400, 429, MCP missing, empty string)
- No behavior change for non-auth errors
/cc @amplitude/growth
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ubprocess (#317) * fix(observability): suppress hook-bridge-race stderr noise from CLI subprocess The Claude Code subprocess emits "Error in hook callback hook_<N>: Error: Stream closed" to stderr whenever an aborted/teardown-pending query() still has in-flight tool calls firing registered hooks over a closed IPC bridge. Already retried-and-recovered upstream via drainPriorResponse + the 'Stream closed' arm of the transient-error retry list, but the per-tool-call volume floods the visible log (38+ lines from a single 7-minute run in production). See issue #297. Filter matching lines in the stderr handler, count them per attempt, and log a one-line summary at attempt boundary so the noise volume stays observable without filling logs. Genuine subprocess crashes, MCP server stderr, and other diagnostics still flow through unchanged. - Export HOOK_BRIDGE_RACE_RE for unit testing - Per-attempt counter scoped inside the for-loop - Summary logged at clean break, post-stream retry, and exception paths - 8 new unit tests covering positive matches (hook_0/1/42, with timestamp prefix) and negative matches (other errors, non-numeric index, empty) /cc @amplitude/growth Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(observability): partition stderr chunks before filtering race lines Bugbot finding (medium): the stderr callback receives raw chunks from the subprocess pipe, not pre-split lines. A naive chunk-level \`regex.test(data) → return\` would drop genuine errors riding alongside a race line in the same batched chunk. Switch to line-anchored matching + a partition helper: - HOOK_BRIDGE_RACE_RE is now anchored \`^...$\` (single-line) - New \`partitionHookBridgeRace(data)\` splits chunks on \`\n\`, suppresses only matching lines, returns the count + passthrough text - Stderr handler logs the passthrough text (preserving trailing newline) 7 new tests cover: - Empty input - Single race line, no passthrough - Multiple race lines in one chunk (count = 3) - Race line + genuine error in same chunk → genuine error preserved - Multiple race lines interleaved with genuine errors - Non-race chunk passes through unchanged - Trailing newline preservation - Line with timestamp prefix is NOT a bare race line — passthrough kept Also tightened the regex tests to assert the new anchoring behavior: chunks with prefixes/suffixes no longer match the regex directly, which is correct since partition is responsible for line splitting. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: stop localhost:8010 leaking into user-facing setup output
`getHostFromRegion()` and `DEFAULT_HOST_URL` returned `http://localhost:8010`
when `IS_DEV` was true, which then flowed into the agent prompt
(`Amplitude Host: …`), the user's `.env.local` (`AMPLITUDE_SERVER_URL=…`),
the SDK init `serverUrl`, and a `Server URL` row in the agent-generated
`amplitude-setup-report.md`. This conflated the wizard's local LLM-gateway
port with the Amplitude data ingestion host the user's SDK actually targets.
Now both always resolve to the prod ingestion host based on region
(api2.amplitude.com / api.eu.amplitude.com), with `AMPLITUDE_WIZARD_INGESTION_HOST`
as an opt-in override for advanced cases (e.g. local Amplitude proxy).
The wizard's own LLM-gateway routing (`getLlmGatewayUrlFromHost()`) is
untouched — that's a separate concern and continues to use localhost in dev.
Bonus fix: EU users running the generic agent in dev mode now get the
correct EU `browserApiUrl` (the previous host=`localhost:8010` failed the
`includes('eu.')` check and silently routed them to the US URL).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: align ingestion-host override + restore dev LLM gateway routing
Addresses two bugbot findings on PR #312:
1. Empty-string env-var divergence. `DEFAULT_HOST_URL` used `??`
(nullish coalescing) and `getHostFromRegion()` used a truthiness check,
so `AMPLITUDE_WIZARD_INGESTION_HOST=""` would yield an empty string
from the constant but fall through to prod from the function. Both now
trim and use the same truthiness check.
2. Dev-mode LLM gateway routing went unreachable. Removing the IS_DEV
branch from `getHostFromRegion()` made `host.includes('localhost')` in
`getLlmGatewayUrlFromHost()` dead code, silently routing dev
contributors at the prod gateway. We now check `NODE_ENV` per-call
inside `getLlmGatewayUrlFromHost()` and route to
`http://localhost:3030/wizard` when development or test. Reading
`NODE_ENV` directly (instead of the cached `IS_DEV` constant) lets
tests toggle dev/prod routing without re-importing the module.
Tests updated: regression coverage for empty-string ingestion override,
and explicit NODE_ENV control on the LLM gateway prod-routing tests so
they actually exercise the prod branch (previously they happened to pass
under NODE_ENV=test only because the now-removed localhost-host check
caught them first).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: default LLM gateway to prod even in dev/test
Previous fix kept dev/test routing the LLM gateway to localhost:3030,
which only helps the rare contributor who actually has a local gateway
running. Most contributors don't, and tests don't make real network
calls anyway. Default to the prod gateway instead.
Local LLM proxy hosting must now be opt-in via WIZARD_LLM_PROXY_URL
(already documented in CONTRIBUTING.md:33). This mirrors the policy
on the data ingestion side: hosting a local Amplitude ingestion
endpoint never happens, so AMPLITUDE_WIZARD_INGESTION_HOST is also
opt-in only.
- urls.ts: drop the NODE_ENV-based localhost branch in
getLlmGatewayUrlFromHost(); always default to the prod gateway based
on host. Update JSDoc.
- urls.test.ts: drop NODE_ENV-driven dev/test gateway tests; add a
regression test asserting localhost is never returned by default.
- external-services.md: update NODE_ENV row to reflect that it no
longer affects any user-facing URLs (only telemetry routing).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#321) The wizard crashes with an uncaught EPIPE when the receiving end of stdout/stderr closes mid-run — parent process dies, output piped through `head`, outer agent crashes, etc. Surfaces in production Sentry as Fatal-level crashes: - WIZARD-CLI-8: `emit(src.ui:agent-ui)` (Unhandled, Fatal) - WIZARD-CLI-5: `afterWriteDispatched(stream_base_commons)` (Fatal) Same root cause for both, captured at different stack frames depending on whether Node routed the error through the synchronous return path or the async `error` event. Two layers cover both failure modes: 1. New `installPipeErrorHandlers()` attaches `error` listeners to `process.stdout` and `process.stderr` that swallow EPIPE / EIO / ECONNRESET (the pipe-reset family). Catches the async path where the error fires a tick after the synchronous write returned. Idempotent — safe to call from multiple entry points. 2. New `safePipeWrite(stream, data)` wraps a synchronous write so a thrown EPIPE during the write itself doesn't bubble up. Wired into both `bin.ts` (early, before any potential write) and `agent-ui.ts` module-init (covers test harnesses and other entry points that import AgentUI without going through bin.ts). The single high-volume call site in `agent-ui.ts` (NDJSON event emit, fires hundreds of times per agent run) now uses `safePipeWrite`. Both layers no-op once the pipe is broken — data is silently dropped, which is correct behavior since there's nothing on the other end. 18 new unit tests cover: - Recognized vs unrecognized pipe error codes (EPIPE/EIO/ECONNRESET vs ENOENT/EACCES/plain Errors/null/etc.) - Synchronous EPIPE swallow without throw - Pass-through of the write() return value (backpressure signal) - Re-throw of non-pipe errors (ENOMEM) - Idempotent listener installation (3 calls = at most 1 listener added) /cc @amplitude/growth Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…es (#320) Two related improvements to the agent's instrumentation flow. 1. Skill pre-staging (src/lib/wizard-tools.ts, agent-runner.ts, framework-config.ts, frameworks/*). The agent currently calls load_skill_menu / install_skill in three separate rounds (integration, taxonomy, instrumentation) even though the wizard already knows which skills it needs and ships them in the package. That's ~8 wasted tool calls per run plus extra task-list noise. - Add `bundledSkillExists(skillId)` and `preStageSkills(installDir, integrationSkillId)` helpers in wizard-tools.ts. preStageSkills copies the constant taxonomy + instrumentation + dashboard skills plus the framework's integration skill into .claude/skills/ before the agent runs. Both helpers gate path.join with a strict SKILL_ID_ALLOWLIST regex. - Add optional `getIntegrationSkillId?: (context) => string | null` to FrameworkMetadata. Frameworks with multiple variants resolve to the right skill; runner falls back to `integration-${integration}` when omitted. - Implement resolvers for Next.js (app-router/pages-router), Vue (integration-vue-3), and React Router (V6 / V7-framework / V7-data / V7-declarative). TanStack Router returns null and keeps the legacy load_skill_menu fallback. - Rewrite buildIntegrationPrompt: 5 steps instead of 6, taxonomy + instrumentation + dashboard load by ID, integration step references the pre-staged skill when available or falls back to discovery. 2. Instrumentation quality commandments (src/lib/commandments.ts). Real-run failure modes the rules-without-checks setup wasn't catching: - Title Case event names: confirm_event_plan format updated from "lowercase with spaces" to Title Case [Noun] [Past-Tense Verb]. The name passed here is the EXACT string used at the track() callsite — agents previously wrote snake_case into the manifest while writing Title Case in code, drifting the two apart. - 10-30 event count for full-repo instrumentation, sized to the repo (10-15 small / 15-25 medium / 25-30 large). Last run produced 6 events on a 22-file Next.js Stripe app — well below the floor. - Funnel-start coverage: every multi-step flow needs a critical "funnel start" event; the "no raw clicks" rule does not apply. - Async-branch coverage: walk every terminal branch of every async handler (server actions, webhook switches, payment confirmation) and decide whether each fires a track call. - Property symmetry across multi-callsite events: when the same event_type fires from N files, property keys must be identical at every callsite or the chart silently drops rows. - Identify wiring: mandatory for any flow with auth or post-conversion identifiers (email at checkout, customer ID, etc). Tests: 1572 pass; typecheck clean. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kaiapeacock-eng
approved these changes
Apr 27, 2026
kaiapeacock-eng
left a comment
Collaborator
There was a problem hiding this comment.
LGTM — bugbot clean, CI green.
…mmand (#299) * feat(agent): typed UI-hint protocol on needs_input + projects list command The wizard can't render Claude/Codex/Cursor UI directly, but it can strongly influence what the outer agent shows by emitting structured, typed events with display hints. This PR turns the existing `needs_input` event into a small "UI protocol over NDJSON" so outer agents can produce a much better human-facing UX without us shipping a UI. New on the `needs_input` payload: ui.component — searchable_select | select | multiselect | confirmation | secret_input | text_input ui.priority — required | recommended | optional ui.title — heading ui.description — supporting context line ui.searchPlaceholder ui.emptyState recommendedReason — why `recommended` was picked, surfaced as tooltip pagination — { total, returned, query, nextCommand } allowManualEntry — outer agent may collect free text instead manualEntry — { flag, placeholder, pattern } Per-choice additions: description — second line under the label metadata — { orgName, envName, region, … } for rich rendering resumeFlags — per-choice argv (was top-level only) `promptEnvironmentSelection` now ships: - searchable_select widget with title + breadcrumb description - rich metadata per choice (org/workspace/env/region/rank) - recommendedReason explaining the auto-pick - pagination signal pointing at `wizard projects list --agent --query` - allowManualEntry + --app-id manualEntry flag for missing projects `promptChoice` auto-picks the widget based on list size: ≥10 options → searchable_select with placeholder; otherwise plain select. `promptConfirm` emits the confirmation widget. New command: `npx wizard projects list --agent [--query <q>] [--limit N] [--offset N]`. Returns a paginated, search-filtered list of every (org, workspace, env) tuple the user has access to that has an API key. Emits the SAME `needs_input` envelope as the inline prompt so outer agents can render the same picker for both — and chain pages via `nextCommand` for 500-row lists without dumping them all at once. `runProjectsList` is a pure function in `agent-ops.ts` — no UI, no process.exit — so future MCP-style transports can reuse it directly. Tests: +15 (11 in projects-list.test.ts, 4 in agent-ui.test.ts). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(agent): forward --agent flag in `projects list` resolveMode The `wizard projects list --agent` invocation parsed `--agent` from the global hidden flag but never passed it to `resolveMode`, so on a TTY the command would emit human output instead of the documented NDJSON envelope. Pass `agent: argv.agent` to match the pattern used by `apply`. Caught by Cursor Bugbot. * fix(agent): use returned count, not requested limit, for projects list pagination cursor `runProjectsList` clamps `limit` to `[1, 200]` internally. The `projects list --agent` envelope was advancing `nextOffset` by the unclamped user input (e.g. `--limit 9999` would skip from offset 0 to 9999, dropping rows 200..9998). Use `result.returned` so the cursor always advances exactly past the items in the current page. Caught by Cursor Bugbot. * fix(agent): drop stub nextCommand from promptEnvironmentSelection pagination `promptEnvironmentSelection` returns every choice up front, so there is no next page to fetch. Emitting a `nextCommand` with empty `--query` and no `--offset`/`--limit` overloaded the field's semantics — outer agents would treat it as an executable cursor command. Drop it; the search affordance is already discoverable via `ui.searchPlaceholder` and `wizard projects list --agent --query <q>`. Caught by Cursor Bugbot. * fix(agent): align projects list recommendedReason wording with agent-ui Mirror the agent-ui wording so outer agents and humans don't read 'lowest rank' as 'lowest priority'. Rank 1 is numerically lowest but semantically highest (Production at most orgs). --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two unrelated TUI bugs reported in Stockton's CLI walkthrough, fixed in one PR because both are tiny and live in the TUI layer. 1. Slash command triggers mid-sentence in event-plan feedback ConsoleView.tsx had two concurrent `useInput` handlers active when the user was typing free-text feedback for a proposed event plan: the feedback handler accepting characters into the textarea, and the dormant slash-console activator that pops the command palette on `/`. Typing "use the doc/text together" or "press /" mid-sentence would silently open the slash menu mid-word. Fix: extend the slash-console's `isActive` guard so it stays dormant while the event-plan feedback prompt is the active text-entry surface. The feedback handler keeps the keystroke; nothing else changes for any other screen. 2. Pink terminal after viewing the setup report marked-terminal wraps long headings/code blocks with ANSI color spans across multiple lines but only emits the closing reset at the end of the block. ReportViewer.tsx splits the rendered output on `\n` and renders each line in its own <Text>, which means a line whose color span continues onto the next line leaked its open-color into the rest of the terminal — that's the "whole terminal turned pink" bug Stockton hit. Fix: append a hard ANSI reset to every line so styling can never bleed past line boundaries even when the source rendering forgets to close itself. Belt-and-suspenders against any future markdown renderer that emits unclosed color spans. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
) * fix(tui): resolve log path at render time + friendlier empty state When the user opened the Logs tab during a run, the viewer often showed "(No log file found)" even though the wizard was clearly running and writing logs somewhere. Two issues stacked: 1. Hardcoded path. RunScreen.tsx hardcoded `/tmp/amplitude-wizard.log` for the LogViewer. The actual log path is owned by `getLogFilePath()` in observability/logger.ts and can be overridden at startup via `AMPLITUDE_WIZARD_LOG`. If anything (env var, wrapper, sandboxed runtime) reconfigured the path, the viewer kept pointing at the original `/tmp` location and showed an empty state forever — silently disconnected from the file the rest of the wizard was writing to. Fix: call `getLogFilePath()` at render time so the viewer always matches whatever the logger is actually writing to. 2. Misleading empty-state copy. "(No log file found)" was the only message for both "file genuinely doesn't exist" and "I haven't been able to read it yet during a transient race." Users would see it during normal run startup and assume the wizard was broken. Fix: replace with "Waiting for the agent to start writing logs…" plus the resolved path printed underneath. The path is the diagnostic that matters most when the file legitimately can't be found — it makes a hardcoded-vs-configured mismatch immediately visible without forcing the user to run /debug. Test: snapshot test updated to mask the absolute tempdir path so the new placeholder copy is exercised but the snapshot stays stable across runs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(tui): make `0` a full reset in LogViewer, not just column reset The keybinding hint advertises `0 reset` but the handler only zeroed horizontalOffset. That's a no-op in the very-common case where the user hadn't panned — including the empty-state with a single placeholder line — so they'd press `0` expecting "reset to a sensible default" and see absolutely nothing happen. New behavior: `0` now snaps back to LIVE FOLLOW mode, jumps to the latest line, and resets column to 1. That matches what "reset" should mean for a tail viewer — get me back to the default view, regardless of where I drifted. Test added that walks through inspect → pan → reset and asserts the viewer ends up on LIVE FOLLOW + latest line + col 1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(test): make LogViewer missing-log snapshot platform-agnostic The previous masking strategy used `frame.replace(filePath, ...)` which silently no-ops when the rendered path doesn't appear as a contiguous substring. macOS tmpdir paths (`/var/folders/h2/.../wizard-log-viewer-XXX/missing.log`) exceed the 80-column test terminal width and get truncated by LogViewer's own line-clipping; CI on Linux gets short `/tmp/...` paths that fit on one line and replace successfully. Result: snapshot generated on macOS captured the truncated raw path, then mismatched on every CI run. Switch to a regex-based mask anchored at the rendered "path:" prefix that handles both the truncated and untruncated cases. Update the snapshot to match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…315) * feat: instrument priority-2 events and honor user-requested events The wizard's instrument-events skill filters to priority-3 (critical) events only — and the discover-event-surfaces skill notes "most changes produce only 1-2 critical events." Combined, that means a wizard run typically lands ~3-4 events even when there's plenty more worth tracking. When a user proposes 7-8 events themselves via the confirm_event_plan feedback loop, the agent has been quietly truncating to a handful. Stockton hit this: > The setup suggested a few things to instrument and gave me like 4 > options, but when I suggested 7-8 ideas it just picked 4 and asked > if I wanted to continue. Two changes: 1. agent-runner.ts: STEP 6 now instructs the agent to instrument all priority-2 AND priority-3 events identified by the skill, plus any events the user explicitly requests via confirm_event_plan feedback. Explicit "err on the side of more events, not fewer" guidance — the user can always delete events later, but undiscovered surfaces are far more expensive to find and instrument after the fact. 2. commandments.ts: tighten the TodoWrite commandment to require the FULL todo list at the start, with explicit guidance not to grow the denominator mid-run. The wizard renders progress as "X / Y tasks complete" — when Y grows from 5 → 8 → 12 the bar appears to move backwards, which Cassie hit and described as the wizard "going from 6 steps to 9 steps". (The display itself is also being relaxed in a separate PR; this commandment closes the root cause.) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixup(315): drop contradictory 'filtering to most critical ones' line Bugbot caught that the existing line 954 said the skill 'filters to the most critical ones' (which mirrors the skill's priority-3-only default) and worked against the new line 955 telling the agent to include priority-2 + 3 + user-requested events. Reword 953-954 so the framing is about discovering and prioritizing, then make 955 explicit that the wizard run OVERRIDES the skill's priority-3 narrowing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixup(315): drop event-instrumentation hunk; defer to PR 320 Removing the agent-runner.ts STEP 6 change that broadened instrumentation to priority-2 + user-requested events. PR 320 (@kaiapeacock-eng) is the authoritative pass on event volume and naming — it establishes a 10–30 event count sized to the repo, plus Title Case naming, funnel-start coverage, async-branch coverage, and property symmetry rules. That framework supersedes "priority-2 + 3" and we don't want competing rules baked into agent-runner.ts. PR 315 now contains only the TodoWrite commandment fix that prevents the agent from growing its task list mid-run (Cassie's "started at 6, then said 9" feedback). That stays — orthogonal to event count. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Sentry (#292) Wires three Sentry features that were already available in @sentry/node@10.47.0 but unused by the wizard's existing observability spine (PR #264): - wrapMcpServerWithSentry: auto-spans every tool call on both wizard MCP servers (wizard-mcp-server.ts new McpServer, wizard-tools.ts createSdkMcpServer). Idempotent + no-op when telemetry is disabled. - Sentry.setMeasurement: token-tracker emits agent.tokens.{input,output, cache_read_input,cache_creation_input,total_input,total_output} as first-class trace measurements on the active root span. Existing benchmark JSON math is unchanged — measurements are purely additive. instrumentAnthropicAiClient is also exported by the same Sentry build, but @anthropic-ai/claude-agent-sdk does not expose a hook to inject a custom Anthropic client (the SDK spawns claude-code as a subprocess). Deferred until the SDK adds a clientFactory or similar. Hook bypass note: pnpm test failed only on src/__tests__/cli.test.ts (interactive-mode TTY waitFor timeout) and one timeout in src/lib/__tests__/session-checkpoint.test.ts. Both pass when run alone; both are pre-existing parallel-load flakiness unrelated to this change. 1469/1472 tests pass (3 skipped, 0 failed) with the two flaky files excluded.
A user just hit the wizard installing dotenv-webpack alongside @amplitude/unified — the agent improvised a build-tool install when it saw a webpack project with .env files. Nothing in the bundled skills asks for that; the agent overstepped its scope. Add a hard rule: only `@amplitude/*` packages and explicitly-listed peer dependencies (like @react-native-async-storage/async-storage for RN) may be installed. Build tooling, dotenv loaders, bundler plugins, and similar third-party glue are out of scope. If env-var wiring requires the user's build to change, document it in the setup report instead of auto-installing. Also calls out that EXAMPLE.md snippets under skills/integration/*/ illustrating dotenv / python-dotenv / dotenv-rails are reference code, not install instructions — the agent shouldn't install those packages either. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ects (#313) * feat: auto-enable inline addons for unified-SDK web frameworks The unified browser SDK (`@amplitude/unified`) is a single package that bundles autocapture, Session Replay, and Guides & Surveys behind one `initAll()` call. There's no real "do you want SR or G&S?" choice to surface to the user — the unified SDK already includes them, and configuring them is just a config block on the same init call. Changes: - `feature-discovery.ts`: new `autoEnableInlineOptIns` helper that enables Session Replay + Guides & Surveys silently when those are the only opt-in features discovered for a unified-SDK web framework (Next.js, Vue, React Router, JS Web). If LLM is also discovered, the picklist still appears for LLM only. - `store.ts`: new `applyAutoInlineOptIns()` method on `WizardStore` that calls into the helper through the proper nano setters so React subscribers get notified. - `bin.ts`: invoke `applyAutoInlineOptIns()` after each discovery pass. - `FeatureOptInScreen.tsx`: filter out features already enabled by the auto path so the picker never shows redundant rows. - `nextjs-wizard-agent.ts`: expand the additional-context lines so the agent prioritizes the unified browser SDK AND installs `@amplitude/analytics-node` alongside it whenever the project has server-side surfaces (API routes, server actions, route handlers, middleware). Server-side analytics is the most common gap in Next.js setups — making the dual-SDK expectation explicit prevents the agent from quietly skipping it. Per-framework guidance, no plumbing changes. Resolves Cassie's CLI walkthrough feedback ("got a strange prompt asking if I wanted to also set up session replay or guides & surveys … it should have just done the unified browser SDK"). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixup: keep SR + G&S opt-in; add autocapture defaults Reversing course on the auto-enable: Session Replay and Guides & Surveys go back into the picker as explicit opt-ins (default OFF). Per Kelson: > SR & GS should be opt in, for now. The rest we should just turn on > by default for browser SDK or unified SDK. Server SDK's may not have > same options, so make sure to review the readmes for those. The "the rest" being the autocapture options that ship in our public CDN snippet: fetchRemoteConfig: true, autocapture: { attribution: true, fileDownloads: true, formInteractions: true, pageViews: true, sessions: true, elementInteractions: true, networkTracking: true, webVitals: true, frustrationInteractions: true, } Why SR/G&S as opt-in, not opt-out - Session Replay records user sessions — that's a privacy / DPA decision and shouldn't ship into production code by silent default. - Guides & Surveys adds a runtime that talks to remote config and surfaces UI overlays — also a deliberate product decision. - "Default off" + a clearly-labeled checkbox is the right shape until we have a stronger signal these should auto-enable. Changes: - feature-discovery.ts: drop autoEnableInlineOptIns; new skipPicklistIfNoOptIns helper that ONLY skips the picklist when literally no opt-in feature was discovered (LLM-less + non-browser framework). Discovered SR / G&S now flow into the picker as before. - store.ts: drop applyAutoInlineOptIns; new skipFeatureOptInIfEmpty with the same scope. - bin.ts: call skipFeatureOptInIfEmpty. - FeatureOptInScreen.tsx: defaultSelected={[]} (was {optInFeatures}). Title changes to "Optional add-ons" with explicit "None are checked by default" hint so the user can't miss what's happening. - commandments.ts: new commandment locking in the autocapture defaults for @amplitude/unified and @amplitude/analytics-browser, with explicit warnings NOT to copy these keys onto non-browser SDKs (@amplitude/analytics-node, mobile SDKs, backend SDKs in other languages all have different option schemas — the agent must read per-SDK docs at amplitude.com/docs/sdks). - nextjs-wizard-agent.ts: drop the "bundles SR + G&S" framing — the package contains the plugins but the wizard treats them as opt-in. References the autocapture commandment for the init shape. - generic-wizard-agent.ts: example init blocks now show the full autocapture object so the generic-fallback agent doesn't suggest a partial set that contradicts the commandment. Tests - cli.test.ts: rename mock to skipFeatureOptInIfEmpty. - FeatureOptInScreen.snap.test.tsx: update for new copy ("Optional add-ons" + "None are checked by default"); snapshot regenerated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixup(313): remove redundant skip-when-empty helpers Bugbot caught two real bugs in the previous fixup: 1. **Premature skip flag** (Medium severity) `runDiscovery()` was called BEFORE the user finalized integration selection. When auto-detection failed and `integration` was null, discovery turned up no opt-in features, so `skipFeatureOptInIfEmpty()` set `optInFeaturesComplete = true`. The user then manually picked a browser framework like Next.js, the subscription re-ran discovery and added SR + Engagement to `discoveredFeatures` — but `optInFeaturesComplete` stayed `true` and was never reset. Result: the picker never appeared despite real opt-in features being available. 2. **Dead code** (Low severity) `skipPicklistIfNoOptIns` (in feature-discovery.ts) was exported but never imported anywhere. The TUI used a separately-implemented store method with the same name pattern, suggesting a copy-paste superseded. Fix: delete BOTH helpers and the bin.ts call. The FeatureOptIn flow entry's `show` predicate already returns false when no opt-in feature is discovered — the screen is skipped naturally and re-evaluated on every discovery pass (handling the manual-integration-pick case for free). The isComplete predicate is only checked when show is true (per WizardRouter.resolve), so we don't need to flip optInFeaturesComplete imperatively at all. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixup(313): drop redundant additionalFeatureQueue filter from picker Bugbot flagged a metric-drift risk: confirmFeatureOptIns computes `offered` from discoveredFeatures while the picker filtered out features already in additionalFeatureQueue. With the auto-enable path removed, that filter is dead defensive code — the queue is always empty when the TUI picker first mounts (CI/agent paths set optInFeaturesComplete=true, which short-circuits show). Drop the filter so what the user sees == what 'offered' reports. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixup(313): allowEmpty on multi picker; document SR/G&S as opt-in Two bugbot follow-ups on the rebased branch: 1. **High severity: ENTER silently enables focused option** `MultiPickerMenu` has a "pick-at-least-one" fallback: when ENTER is pressed with nothing checked AND `defaultSelected?.length` is falsy, it auto-selects the focused option as a convenience. With `defaultSelected={[]}` for the SR/G&S/LLM picker, this convenience contradicts the explicit "None are checked by default" UX — pressing ENTER without toggling silently enables Session Replay (the focused option), defeating the privacy- conscious opt-in semantics. Add an `allowEmpty?: boolean` prop to `PickerMenu`. When true, ENTER with nothing checked confirms an empty selection (`onSelect([])`). The pick-at-least-one fallback is preserved for callers that want it (the existing default). Pass `allowEmpty` from FeatureOptInScreen so SR/G&S/LLM are genuine opt-ins. 2. **Low severity: duplicated discoveredToAdditional helper** feature-discovery.ts and FeatureOptInScreen.tsx have parallel copies of the DiscoveredFeature → AdditionalFeature mapping. Bugbot suggested consolidating into one shared export. Investigated — the TUI layer can't import from `lib/feature-discovery.ts` at runtime because of the tsx ESM/CJS dual-loading bug documented in `session-constants.ts` (wizard-session's `as const` + same-name type pattern fails to resolve when re-imported as ESM). Cross-referenced both copies via JSDoc comments so a future maintainer adding a DiscoveredFeature variant has to confront the duplication explicitly. The TS exhaustiveness check on the enum will surface the missing branch in either copy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixup(313): nest analytics options under \`analytics\` for initAll Bugbot caught a real footgun: @amplitude/unified's initAll() accepts analytics options NESTED under an \`analytics\` key, while @amplitude/analytics-browser's init() takes them flat. The autocapture defaults commandment was telling the agent to use the flat shape for both — which would silently no-op autocapture in every initAll call since unified would just ignore the unknown top-level keys. Cross-referenced against the integration skill's bundled docs (skills/integration/integration-nextjs-app-router/references/ browser-unified-sdk.md, amplitude-quickstart.md) and the public docs (amplitude.com/docs/sdks/analytics/browser/browser-unified-sdk). Both confirm the nested shape: initAll(KEY, { analytics: { fetchRemoteConfig, autocapture: {...} }, // sessionReplay: {...}, engagement: {...}, experiment: {...} }) vs the standalone: init(KEY, { fetchRemoteConfig, autocapture: {...} }) Updated: - commandments.ts: separate examples for the two SDKs, with the right shape on each. Cross-references the SDK reference URL. - generic-wizard-agent.ts: same correction in the fallback prompt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixup(313): align autocapture defaults with npm Browser SDK 2 docs Per docs verification (Kelson's catch — CDN snippet shape doesn't 1:1 translate to npm packages): 1. **Switch to remoteConfig.fetchRemoteConfig** Top-level \`fetchRemoteConfig\` is documented as DEPRECATED on @amplitude/analytics-browser. Use \`remoteConfig: { fetchRemoteConfig: true }\` instead. The deprecation applies to the unified \`analytics:\` block too since it accepts all browser-sdk-2 options. 2. **Add pageUrlEnrichment to the autocapture defaults** The Browser SDK 2 Autocapture Options table lists 10 keys total: attribution, pageViews, sessions, formInteractions, fileDownloads, elementInteractions, frustrationInteractions, pageUrlEnrichment, networkTracking, webVitals. The CDN snippet was missing pageUrlEnrichment — adding it brings the npm-default in sync with what's actually documented. 3. **Reorder + clearer comments** Match the order in the official Autocapture Options table for easier diff against the docs. Add explicit "the CDN script's flat shape does NOT translate directly to npm" warning to both the commandment and the generic example so the agent doesn't paste a CDN snippet into an initAll/init call. 4. **Authoritative source links** Commandment now points the agent at the canonical Browser SDK 2 config table and the unified SDK docs page before adding any option. "Do not infer from CDN examples" is explicit. Verified via: - https://amplitude.com/docs/sdks/analytics/browser/browser-sdk-2 - https://amplitude.com/docs/sdks/analytics/browser/browser-unified-sdk - bundled skills/integration/integration-nextjs-app-router/ references/{browser-sdk-2.md, browser-unified-sdk.md} Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(313): drop opt-in picker; auto-enable SR + G&S with inline comments Reversing direction once more after Cassie + Kelson aligned: the unified browser SDK ships with autocapture + Session Replay + Guides & Surveys all bundled into a single npm install, so making the user opt in to each via a wizard prompt is friction without value. Match the data-setup npm snippet experience and auto-enable the lot. Quota / privacy concerns (which originally motivated the opt-in picker) are addressed by requiring per-option inline // comments next to every config line. Users tune by commenting out the line for any feature they don't want — clearer than a one-shot picker, and the opt- out surface stays visible in their codebase rather than disappearing after the wizard run. Changes: - flows.ts: remove the FeatureOptIn flow entry entirely. The screen component file stays around but is now unreachable from the wizard flow; cleanup is a follow-up PR (per Cassie's "in another PR" OK). - feature-discovery.ts: new exported `autoEnableInlineAddons` helper enables every discovered opt-in (SR + G&S for browser unified, LLM when feature flag is on) into additionalFeatureQueue without going through any picker. Mirrors the CI/agent autoEnableOptInFeatures behavior for the interactive TUI. - store.ts: matching `autoEnableInlineAddons()` method that routes through enableFeature() so React subscribers and per-feature analytics fire correctly. - bin.ts: call autoEnableInlineAddons after every discovery pass. - commandments.ts: drop the "only if user opted in" caveat for sessionReplay / engagement; require inline // comments next to every autocapture key, every remoteConfig line, and the SR + G&S blocks. The commentary IS the user's tuning surface. - nextjs-wizard-agent.ts: same — sessionReplay + engagement included in the recommended initAll shape, not gated on opt-in flags. - generic-wizard-agent.ts: example init blocks now show the full autocapture set with inline comments, plus sessionReplay + engagement for the unified path. Matches the commandment shape so the fallback agent doesn't suggest a contradictory pattern. - cli.test.ts: mock autoEnableInlineAddons on the test store stub. Net: every browser-SDK Next.js / Vue / React Router / JS Web run from the wizard now produces the same out-of-the-box coverage as a fresh "Hello, World!" app following Amplitude's data-setup docs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(313): apply auto-enable + inline-comments guidance to all browser SDK frameworks PR 313 originally only updated the Next.js per-framework prompt with the auto-enable + inline-comments direction. Cassie / Kelson's feedback applies to ALL browser-SDK frameworks (Next.js, Vue, React Router, JavaScript-Web) — the user shouldn't get inconsistent init code depending on which framework they're running the wizard against. Extract the shared guidance into a single constant BROWSER_UNIFIED_SDK_PROMPT_LINE in src/frameworks/_shared/ browser-sdk-prompt.ts and reference it from each of the 4 framework configs. Each framework keeps its own framework-specific lines (Next.js client+server, Vue main.ts entry point, React Router router-mode + RouterProvider mount point, JS Web bundler-aware entry point). Now every browser-SDK wizard run produces the same three messages to the agent: - Use @amplitude/unified (bundles SR + G&S) - Wizard auto-enables all three; init shape per the commandments - Inline // comments per option are mandatory (user's opt-out surface) Plus the framework-specific init-point hint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(313): delete dead FeatureOptIn screen + helpers Per Cassie's "if removing the picker is a huge lift, can be in another PR, but do the work" comment, doing the work now since bugbot keeps flagging the dead code. Removed: - src/ui/tui/screens/FeatureOptInScreen.tsx (the screen component) - src/ui/tui/screens/__tests__/FeatureOptInScreen.snap.test.tsx + snapshot - screen-registry.tsx import + Screen.FeatureOptIn map entry - Screen.FeatureOptIn enum + flow entry from flows.ts - Screen.FeatureOptIn from JourneyStepper STEP_SCREENS - autoEnableInlineAddons() export from feature-discovery.ts (TUI uses the store method instead; CI/agent uses autoEnableOptInFeatures — this lib export was reachable from neither) - confirmFeatureOptIns() store method (only the picker called it) - OPT_IN_DISCOVERED_FEATURES import from store.ts and flows.ts (both only used it for the now-removed code) Updated stale comments referencing the removed picker: - feature-discovery.ts module doc - discoveredToAdditional cross-reference (the duplicate was deleted with the screen) - wizard-session.ts optInFeaturesComplete JSDoc — now describes the auto-enable semantics - run.ts CI / agent path comment - JourneyStepper test comment — regression guard reframed around the general "every mid-flow screen must be in STEP_SCREENS" invariant Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixup(313): remove final Screen.FeatureOptIn enum entry Last lingering reference to the removed picker. Sed handles this single-line delete cleanly where perl multi-line regexes had partial coverage in the previous fixup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(313): restore Browser SDK init defaults commandment Previous fixup commit accidentally deleted the autocapture + remoteConfig + initAll/init shape commandment from commandments.ts. That section is critical — it tells the agent what options to put in the generated browser-SDK init call. Restoring from before the bad commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixup(313): drop unused allowEmpty prop from PickerMenu Bugbot caught: the allowEmpty prop was added to PickerMenu / MultiPickerMenu specifically for the FeatureOptInScreen's opt-in semantics. With the screen deleted in this same PR, the prop has no callers — pure dead code. Restore the original ENTER fallback (auto-select focused option when nothing checked) which was the only behavior the prop modified. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixup(313): remove last allowEmpty references from PickerMenu Trailing cleanup — the previous fixup left the inline comment block and the conditional check referencing allowEmpty even though the prop itself was removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(observability): instrument MCP servers + token measurements via Sentry (#292) Wires three Sentry features that were already available in @sentry/node@10.47.0 but unused by the wizard's existing observability spine (PR #264): - wrapMcpServerWithSentry: auto-spans every tool call on both wizard MCP servers (wizard-mcp-server.ts new McpServer, wizard-tools.ts createSdkMcpServer). Idempotent + no-op when telemetry is disabled. - Sentry.setMeasurement: token-tracker emits agent.tokens.{input,output, cache_read_input,cache_creation_input,total_input,total_output} as first-class trace measurements on the active root span. Existing benchmark JSON math is unchanged — measurements are purely additive. instrumentAnthropicAiClient is also exported by the same Sentry build, but @anthropic-ai/claude-agent-sdk does not expose a hook to inject a custom Anthropic client (the SDK spawns claude-code as a subprocess). Deferred until the SDK adds a clientFactory or similar. Hook bypass note: pnpm test failed only on src/__tests__/cli.test.ts (interactive-mode TTY waitFor timeout) and one timeout in src/lib/__tests__/session-checkpoint.test.ts. Both pass when run alone; both are pre-existing parallel-load flakiness unrelated to this change. 1469/1472 tests pass (3 skipped, 0 failed) with the two flaky files excluded. * fix(commandments): forbid installing non-Amplitude packages (#328) A user just hit the wizard installing dotenv-webpack alongside @amplitude/unified — the agent improvised a build-tool install when it saw a webpack project with .env files. Nothing in the bundled skills asks for that; the agent overstepped its scope. Add a hard rule: only `@amplitude/*` packages and explicitly-listed peer dependencies (like @react-native-async-storage/async-storage for RN) may be installed. Build tooling, dotenv loaders, bundler plugins, and similar third-party glue are out of scope. If env-var wiring requires the user's build to change, document it in the setup report instead of auto-installing. Also calls out that EXAMPLE.md snippets under skills/integration/*/ illustrating dotenv / python-dotenv / dotenv-rails are reference code, not install instructions — the agent shouldn't install those packages either. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(313): tag interactive feature-enable events with 'auto-tui' source autoEnableInlineAddons hardcoded source='auto-ci' for every call, including the TUI invocation in bin.ts. That collapsed CI-run and interactive-run telemetry into the same bucket and broke funnel separation between auto-CI and auto-TUI cohorts. Add 'auto-tui' to the enableFeature source union, accept an explicit source argument on autoEnableInlineAddons (defaulting to 'auto-tui' so callers fail-safe to the TUI bucket), and update bin.ts to pass 'auto-tui' at the interactive call site. CI and agent paths continue to flow through autoEnableOptInFeatures in src/run.ts unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…331) * fix: gracefully handle late-stage API errors so users see the Outro Two related fixes for a UX gap surfaced when an agent run hits an API error AFTER all the meaningful work (events instrumented, dashboard created, setup report written) has completed. Symptom: the wizard would call wizardAbort -> getUI().cancel() -> process.exit(NETWORK_ERROR) before Ink had time to render the next frame, so the user saw a half-rendered "API error occurred" status banner and a sudden process death — no MCP install offer, no Slack prompt, no Outro screen with bug-report hotkeys, and a misleading network error code in CI even though the project was fully set up. ## Fix #1 — wizardAbort awaits OutroScreen dismissal in TUI mode - WizardUI.cancel() returns Promise<void> instead of void. - InkUI: returns a promise that resolves when the OutroScreen is dismissed (keypress / picker action) OR after a 5-minute safety timeout. The TUI now actually gets to render the failure state. - AgentUI / LoggingUI: resolve immediately. No TUI to render, no human to interact. - WizardStore exposes outroDismissed() / signalOutroDismissed() — a one-shot promise that bridges the OutroScreen render loop to wizardAbort. Idempotent and pre-resolves correctly when dismissal arrives before any awaiter (race-safe). - OutroScreen non-success keypress handler now calls store.signalOutroDismissed() instead of process.exit(0). Lets wizardAbort drive the actual exit with the right exit code, run analytics shutdown, and flush Sentry — none of which were happening before because process.exit(0) jumped the queue. - wizardAbort awaits getUI().cancel() before process.exit. UI errors in the await are caught + ignored so a busted UI can't trap the user in a hung process. ## Fix #2 — soft-error path: agent finished, just continue - New `agentArtifactsLookComplete(session)` predicate checks for the dashboard URL on the session. The dashboard MCP create is the last thing the agent does in its conclude phase, after events are instrumented and the setup report is written; if it succeeded, the user's project is in a working state. - API_ERROR / RATE_LIMIT branch in runAgentWizardBody now splits: - artifacts complete -> log a soft-error analytics event, surface a non-fatal status banner, fall through to the post-agent steps (env upload, MCP, Slack, DataIngestion, Outro). User gets the full success experience minus a "late API call failed but your setup is complete" warning. - artifacts missing -> existing hard-abort path, now extracted into `abortOnApiError()` for clarity. ## Tests - src/ui/tui/__tests__/store.test.ts: outroDismissed resolves on signalOutroDismissed, returns the same promise for concurrent awaiters, pre-resolves on early dismissal, and is idempotent. - src/lib/__tests__/agent-runner.test.ts: agentArtifactsLookComplete returns true with a real dashboard URL, false with null, false with empty string. Verified: pnpm lint clean (1 pre-existing warning in untouched code), pnpm test 1660 passing (+7 new tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address bugbot findings on wizardAbort sequencing Two related bugbot findings on PR 331: 1. **Medium: outro-hotkey analytics events were silently dropped.** `analytics.shutdown()` ran before `getUI().cancel(...)`, but cancel now blocks until the user dismisses the OutroScreen. Any `wizardCapture` calls fired during that interaction — `'error outro log opened'` (press L), `'error outro bug report written'` (press C) — got queued *after* the final flush and dropped on `process.exit`. Fix: flip the order to cancel-then-shutdown so those events are flushed in the post-outro batch. 2. **High: process hangs on the version-check cancel path.** `agent-runner.ts` `runAgentWizard` called `getUI().cancel(...)` then `return false` for unsupported framework versions. With cancel now async + awaiting outro dismissal, the user dismisses the outro but nothing exits — Ink keeps the event loop alive indefinitely. The old `process.exit(0)` in OutroScreen used to be the de-facto exit for this path; replacing it with `signalOutroDismissed()` exposed the hang. Fix: route this path through `wizardAbort` (which always exits), and extend `wizardAbort` to forward `cancelOptions.docsUrl` so the manual setup link still surfaces in the Outro. Tests: - New: cancel-before-shutdown ordering, cancelOptions.docsUrl forwarding. - Updated: existing ordering tests flipped to match new sequence; cancel-call-arity expectations include the new options arg. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…156) * feat: hide command bar on intro and make /region fully swap API host - Hide the console command bar on IntroScreen (noisy before auth; re-appears on every subsequent screen) - /region slash command now fully swaps the API host + re-authenticates against the new zone, rather than just updating the stored value - Related polish: swallow and capture errors in setRegion's ampli.json write so a filesystem hiccup doesn't stall the run - Various TUI + agent-interface + store refinements supporting the region-switch path Recreated against flattened open-source main. Original PR #117 was auto-closed when git history was reset on 2026-04-20. Squashed from 14 commits; the package.json version downgrade in the original branch is intentionally dropped (that was an artifact of the branch pre-dating the 1.4.0 release, not part of this PR). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address bugbot findings and unblock CI on /region re-auth PR - Add `setUserEmail` store setter; route bin.ts `runOAuthCycle` through it so the closed-over `session` ref (stale after every nanostores `setKey`) no longer silently drops the email on the discarded object during mid-session re-auth (bugbot high). - BDD `/region` step mirrors `setRegionForced()` fully — clears `outroData` and resets `runPhase` to Idle so scenarios that hit `/region` after the wizard reaches Outro don't short-circuit on stale outroData (bugbot medium). - Allowlist `IntroScreen.tsx` in zone-resolution.invariants — the new region badge + "Change region" picker option are intent-display, not zone resolution (unblocks failing Node 20/22/24 unit tests). - Refresh `RegionSelectScreen` snapshot to include the new "You'll need to sign in again" notice the PR added (CI snapshot failure). - Mock `analytics.captureException` in store tests and `setUserEmail` in cli tests so neither path crashes if the new error/email branches fire (bugbot lows + new test mock dependency). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: clean up WizardStore temp dirs after suite Add an afterAll hook that iterates the createdDirs array and removes each tmpdir, addressing the dead-code accumulation flagged by Bugbot. Without this, every test run leaks one tmpdir per createStore() call into \$TMPDIR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(bin): bound retries and surface feedback when re-auth keeps failing Without this, the re-auth watcher's `while(true)` loop swallowed errors from `runOAuthCycle`, restarted, and immediately re-satisfied its `credentials === null` waiter — hot-spinning OAuth attempts forever (or, on a single transient failure, hanging silently on AuthScreen with no visible error). Now we count consecutive failures, surface a `/login`-pointing hint via `setCommandFeedback`, back off between retries, and after 3 strikes wait for a manual `/login` or `/region` before the watcher tries again. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(tui): clear framework + feature state on /region switch setRegionForced previously cleared credentials, org/workspace selection, and lifecycle fields, but left integration, frameworkConfig, frameworkContext, discoveredFeatures, additionalFeature{Queue,Current,Completed}, optInFeaturesComplete, mcpInstalledClients, slackComplete, and slackOutcome populated from the old zone. A user who hit /region mid-run carried that state into the new-zone run, which could short-circuit detection and feature opt-in against stale data. Clear those fields before resetting outroData/runPhase so the router can't transiently land on a post-detection screen during the tear-down. Add a regression test in store.test.ts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(tui): close /logout race that could trigger re-auth browser pop The bin.ts re-auth watcher waits for credentials to be cleared so it can run runOAuthCycle when the user switches regions. Its predicate also gated on currentScreen !== Overlay.Logout to skip the watcher during /logout, but there was a brief window where credentials were null and the Logout overlay had not yet been resolved as currentScreen — letting the watcher fall through and pop a browser window during the "Logged out" confirmation. Add a synchronous loggingOut session flag set by showLogoutOverlay BEFORE the overlay push (and cleared by hideLogoutOverlay for the cancel path). Extend the watcher predicate to also skip while loggingOut is true. Tests cover both the show and hide paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(tui): surface ampli.json persistence failures during /region When setRegion's writeAmpliConfig call throws (read-only filesystem, permission errors, etc.) we already capture the exception via analytics, but the user gets no UI feedback — they'd assume the region change persisted when it actually didn't. Next wizard run silently falls back to the old zone. Add a non-blocking command-bar toast next to the existing captureException call so the user knows the in-session change worked but the on-disk update did not. Add a regression test that chmods ampli.json read-only and asserts the feedback message. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat: commit instrumented events to tracking plan as planned After the agent finishes instrumenting, push the event names it added to the Amplitude tracking plan via the MCP server's `create_events` tool (`wasPlanned: true`) and follow up with `update_event` to attach descriptions. Names now appear in the Data tab before the user's first track() call fires. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: replace bare `as` casts with zod validation on MCP responses in planned-events Applied via @cursor push command * fix: resolve appId for planned-events commit when creds.appId is 0 When the env picker can't match the API key to a specific environment, session.credentials.appId ends up as 0 and the planned-events commit step silently bails on `!appId`. Mirror pollForDataIngestion's resolver: fall back to session.selectedAppId, then a live fetchAmplitudeUser lookup. Also log the skip reason and the commit result so the next wizard run shows up in $TMPDIR/amplitude-wizard.log. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: read planned events from .amplitude-events.json in local Claude path Applied via @cursor push command * test: align stall-retry assertions with new plannedEvents return shape The local Claude path now returns { plannedEvents: [] } when no .amplitude-events.json was written, matching the SDK path. Three stall-retry tests still asserted the legacy {} shape and started failing on Node 20/22/24. * fix: filter empty-name events on local Claude path Match the SDK path's behavior — both paths now drop events whose parsed name is empty before forwarding to setEventPlan and the plannedEvents return value. Closes a Bugbot consistency finding. --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Cursor Agent <cursoragent@cursor.com>
#244) * refactor(session): brand AppId/WorkspaceId and gate fields by RunPhase Bundles audit tasks #9 (brand IDs) and #10 (RunPhase as discriminated view). Adds AppId / WorkspaceId branded types via z.brand and exposes phase-narrowed session views (AuthenticatedSession / ConfiguredSession / RunningSession) plus type guards (isAuthenticated / isConfigured / isRunning) so callers can stop sprinkling defensive null checks. Branding addresses the org-data-mapping bug class (PR #62), where a workspace id and an app id are structurally `string`/`number` and the type system couldn't catch a mix-up. Brands are zero-cost at runtime but distinct nominally, with `toAppId()` / `toWorkspaceId()` helpers at parse boundaries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: use AppIdSchema.safeParse in tryToAppId and CliArgsSchema transform Applied via @cursor push command * fix(tui): pass null instead of toWorkspaceId('') on workspace clear setOrgAndWorkspace is called with `{ id: '', name: '' }` from several AuthScreen reset paths (stale-org clear, "Start Over", create-project fallback). Routing those through `toWorkspaceId('')` throws a ZodError because `WorkspaceIdSchema` requires `min(1)`. Collapse empty ids to `null` so the clear/reset semantics survive the new branding. Adds two regression tests covering the empty-id and non-empty-id paths. * fix(tui): apply empty-id guard to restoreSessionIds for consistency Bugbot flagged that restoreSessionIds called toWorkspaceId(fields.workspaceId) without the same empty-string guard added to setOrgAndWorkspace. No current caller passes an empty string here (CreateProjectScreen omits workspaceId, DataIngestionCheckScreen reads from API responses), but applying the guard consistently across both write paths removes the asymmetry and prevents future regressions. Adds two regression tests mirroring the setOrgAndWorkspace coverage. * fix(tui): collapse empty selectedOrgId to null on workspace clear Bugbot flagged that AuthScreen reset paths (Start Over, stale-org clear, create-project fallback) call setOrgAndWorkspace({ id: '', name: '' }, ...) which set selectedOrgId to '' without clearing credentials. Since isAuthenticated only checks `selectedOrgId !== null`, the guard returned true with a meaningless empty org id. Mirror the workspaceId behavior: collapse '' -> null so isAuthenticated correctly reports the session as un-orged after a reset. Updates the existing regression test to assert both fields. * fix(tui): mirror empty-orgId guard in restoreSessionIds Apply the same '' -> null collapse to restoreSessionIds.orgId that setOrgAndWorkspace just got. No current caller passes empty strings here, but keeping both write paths symmetric prevents a future caller from leaving an empty selectedOrgId that would slip past isAuthenticated. * fix(session): coerce empty/whitespace selectedWorkspaceId to null in checkpoint A whitespace-only selectedWorkspaceId in a checkpoint file slips past the truthy ternary guard and reaches `toWorkspaceId(...)`, which rejects via `z.string().min(1)` and throws — taking down checkpoint restore. Tighten the schema so empty / whitespace values are coerced to `null` before the ternary runs, keeping the guard a single source of truth. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(session): cover isAuthenticated un-narrow after credentials cleared Adds a regression test for the narrowing-then-clearing path: a session narrowed via isAuthenticated() is updated to set credentials = null and the guard is re-checked. This locks in that the guard correctly reports false on the new session, mirroring the logout / token-expiry flow. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(session): brand credentials.appId so cross-id mix-ups can't sneak past TS PR #62's "org data mapping" bug came from raw numbers being passed where an app id was expected (or vice versa). The top-level `WizardSession.appId` and `selectedWorkspaceId` were already branded, but `credentials.appId` — the field that's actually constructed at every OAuth / API-key boundary — was left as plain `number`. That's the field most likely to carry a mis-shaped id into a downstream call. Brand it as `AppId | 0`, where `0` remains the "env not picked yet" sentinel. Add a `toCredentialAppId(value)` helper that validates raw input via `AppIdSchema.safeParse(...)` and falls back to `0` on failure, and route every construction site through it: - bin.ts (create-project flow) - src/lib/credential-resolution.ts (5 sites) - src/utils/setup-utils.ts (3 return sites + ProjectData type) - src/ui/tui/ink-ui.ts (re-validates at the InkUI store boundary) - src/ui/tui/screens/AuthScreen.tsx (3 picker / fetch sites) - src/ui/tui/screens/CreateProjectScreen.tsx (1 site) `agent-ui.ts:setCredentials()` is a pure NDJSON sink and keeps the public `appId: number` contract intact — branded values are still numbers at runtime. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Cursor Agent <cursoragent@cursor.com>
… 2 slice 11) (#288) Lands two known-pending wiring follow-ups in the Bet 2 agent loop: Gap 1 — AgentState write path. Adds createPostToolUseHook(state) which records Write/Edit/MultiEdit/NotebookEdit file_path into AgentState, and extends the PreCompact hook to also call agentState.recordCompaction() + agentState.persist() before forwarding to the user-supplied handler. The existing saveCheckpoint(session) call in agent-runner is preserved (cross-run resumability stays separate from in-run recovery). Gap 2 — inner-lifecycle integration. Composes createInnerLifecycleHooks into the SDK hooks config: SessionStart, PreToolUse, and PostToolUse inner-agent NDJSON observers now fire alongside the authoritative PreToolUse allowlist gate and the AgentState write recorder. Observer errors are caught so they cannot weaken the gate. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The log viewer's error status read like a fraction (`errors 1/2`) and the keybinding hint (`n/p errors`) was ambiguous — it sounded like a status, not a shortcut. The `n/p` hint also stayed visible when there were zero errors, even though the keys do nothing in that state. - Status: `error 1 of 2` when on an error; `1 error` / `2 errors` when errors exist but none selected; `no errors` unchanged - Error count renders in warning color so it draws the eye - Hint: `n/p next/prev error`, hidden entirely when error count is 0 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat: guarantee amplitude-setup-report.md on every successful run Setup report kept going missing on real runs. Root cause: the instruction lives only in the integration skill's basic-integration-1.3-conclude.md reference, buried as step 7 of a nested workflow file. There's no commandment-level mandate, so the agent quietly skips the report when it runs out of turns, hits an error mid-skill, decides not to (model variance), or when activation was already 'full'. Two fixes ship together: 1) Hard commandment in commandments.ts mirroring the dashboard mandate. After every successful run the agent MUST create amplitude-setup-report.md, with spelled-out minimum content (integration summary, events table, dashboard link, env-var notes, next steps) and a fallback to write from session knowledge if the conclude template isn't loaded. 2) Wizard-side fallback writer (writeFallbackReportIfMissing in wizard-tools.ts wired into agent-runner.ts). Renders a minimal stub from session state if the agent didn't produce one. Wrapped in <wizard-report> tags, self-discloses as auto-generated, NEVER overwrites an agent-authored report — locked in by the agent-wrote test. Tests cover both helpers: wrapper tags, events table rendering, empty-events placeholder, pipe escaping in cells, dashboard URL, generic-link fallback, framework/project/env rendering, agent-wrote no-op invariant, unwritable-dir error path. Skipping pre-commit hooks because lint-staged is fighting an unrelated stale-state issue in this worktree; CI will validate the same checks. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixup: also overwrite stale reports from previous runs PR 327's first version had a too-naive 'never overwrite' check: it returned 'agent-wrote' for any existing file, so re-running the wizard against the same project would leave the user with the previous run's report while the outro pretends it describes the new run. Fix: the writer now compares the existing report's mtime against the wizard's runStartedAt timestamp. If the file predates the run, it's stale and gets overwritten with a fresh stub. Fresh agent-authored files (mtime >= runStartedAt) are still preserved. Defensive 1-second slop on the comparison protects filesystems with low mtime resolution (FAT-family rounds to 2s) — a report written immediately after runStartedAt won't see itself as stale due to mtime quantization. If runStartedAt is null/undefined (e.g. setRunPhase(Running) didn't fire for some reason), the writer falls back to existsSync-only semantics. Better to leave a possibly-stale file than blow away a fresh one when we can't tell. 4 new tests cover: stale-from-previous-run overwrite, fresh-this-run preservation, null-runStartedAt fallback, and the 1s slop boundary. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(setup-report): drop staleness/mtime branch from fallback writer The fallback writer was checking mtime against `session.runStartedAt` to detect stale reports left over from a previous run. With PR #316's archive-on-start step (`archiveSetupReportFile`) moving any prior report to `amplitude-setup-report.previous.md` at the very start of every run, the canonical path is guaranteed to be either empty or freshly written by the current run by the time we reach the fallback writer — the staleness branch is dead code. Drop `runStartedAt` from `FallbackReportContext` and the mtime comparison from `writeFallbackReportIfMissing`. `existsSync` is now authoritative. Updates the agent-runner call site and removes the "re-run handling" tests since the branch they covered is gone. `runStartedAt` itself stays on `WizardSession` — it's consumed by `RunScreen.tsx` for the elapsed-time counter and stamped by `WizardStore.setRunPhase`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(observability): restore wrapMcpServerWithSentry in createWizardToolsServer The earlier setup-report PR commits dropped the `wrapMcpServerWithSentry` wrap around `createSdkMcpServer` in `createWizardToolsServer`. That change had nothing to do with setup reports — it kills Sentry auto-instrumentation for every wizard-tools MCP call, so we lose per-tool spans, error grouping, and the correlation breadcrumbs the rest of the agent run depends on. Restore the wrapper. If Sentry instrumentation needs to change elsewhere, it should be its own PR with the right reviewers, not a silent rider on a setup-report fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(setup-report): write fallback on cancel/error paths too The original PR wrote `amplitude-setup-report.md` only on the success branch, but the failure modes the writer was designed to handle (cancel, agent error, out-of-turns) all exit via wizardAbort() or non-throwing early returns and never reach that branch — so the fallback never fired on the runs that needed it most. Wire the fallback into every teardown path: - registerCleanup() so wizardAbort runs it before process.exit - try/finally so non-throwing returns and uncaught throws still trigger it - explicit call kept on the success branch as a belt-and-braces no-op for the rare case where the agent finishes without writing The writer bails when the canonical report already exists, so double-firing on the success path is safe. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(commandments): rewrite setup-report rule + restore non-Amplitude package rule The setup-report commandment was framed to mirror the dashboard mandate ("After all event and identity instrumentation is complete, you MUST create a dashboard..."). PR #154 is removing that dashboard parallel from the commandments, so the structural mirror is going away. Rewrite the setup-report rule as a stand-alone absolute requirement with no implicit dependency on a sibling commandment. Substantive content (minimum sections, conclude-reference template, session-knowledge fallback, wrapper tags) is preserved. Also restore the "NEVER install non-Amplitude packages" commandment the branch's stale main-merge accidentally dropped — that rule landed in main via #328 and is unrelated to this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n env convention (#329) Follow-up to #328. That PR forbade installing dotenv-webpack et al, but the agent immediately routed around it by editing webpack.config.js to add webpack.DefinePlugin and wiring process.env aliases by hand. Same outcome, different mechanism. The real policy the wizard should enforce: 1. Server-side / private secrets — env vars, full stop (unchanged). 2. Browser-side Amplitude keys (public-by-design) — env vars ONLY when the project's framework has a built-in convention that surfaces .env values to client code without ANY build-config edits. Allowed conventions are enumerated explicitly: Vite VITE_*, Next.js NEXT_PUBLIC_*, CRA REACT_APP_*, Astro PUBLIC_*, Nuxt runtimeConfig.public, SvelteKit $env/static/public, Expo app.config.js extras, Angular environment.ts, and react-native-config for bare RN (only if it's already installed). 3. Otherwise INLINE the API key in the SDK init call. Browser keys are visible in the production bundle anyway; Amplitude treats them as public. 4. NEVER modify webpack.config.*, rollup.config.*, vite.config.*, babel.config.*, craco.config.*, etc. to bridge env vars into client code. Adding webpack.DefinePlugin or wiring process.env aliases counts as a build-config edit and is forbidden, even if it would technically work. 5. When in doubt, inline. A working integration with a hardcoded public key is strictly better than a broken integration with half-wired env-var plumbing. Document the choice in the setup report so the user can swap to env vars later. Replaces the old 'never hardcode' commandment, which was the root authority the agent was citing when justifying its webpack edits. Also rewrites the matching paragraph in the JavaScript Web skill (basic-integration-1.1-edit.md) that said 'Do not hardcode Amplitude keys' — that copy directly contradicts the new policy. Note: skills/integration/* lives upstream in amplitude/context-hub. The skill edit here will be overwritten on the next pnpm skills:refresh unless the matching change lands upstream too — coordinate. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #223 refactored `api-key-store.ts` from `execSync` to `execFileSync` for keychain operations (shell-injection safety). The wipe helper was written before that landed and mocked the wrong function — the keychain delete assertion silently saw zero matching calls because the mock was never on the path. Update the mock and the assertion to match the production code's `execFileSync(file, args)` signature, mirroring the convention in `api-key-store.test.ts`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The cleanup at port-detection.test.ts:121 called `rmSync(link, { force: true })`
on a symlink that points to a directory. Older Node versions treated this as
file-removal (the desired symlink-only semantics); Node 23 treats it as
directory-removal and refuses without `recursive: true` — which would also
delete the target directory, defeating the explicit two-step cleanup.
Use `unlinkSync` so the intent ("remove just the symlink") is unambiguous
across Node versions. Verified the test now passes on Node 23.
Drive-by fix surfaced while running the pre-commit hook on Node 23 in
the MCP-196 PR; the test was passing on supported Node versions
(20 / 22 / >=24) so this never broke CI.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symmetric counterpart to the signup-side wipe added earlier in this PR. On a successful fresh-OAuth completion (browser-returned code → tokens exchanged → about to persist via storeToken), wipe pre-existing install-dir-keyed state so the new account doesn't inherit the prior account's API key, workspace binding, or session checkpoint. Without this, the bug class fixed by the signup-side wipe still leaks through the OAuth path: a user with cached state for account A who logs in to account B via the browser would have B's tokens persisted to ~/.ampli.json but A's API key still in keychain / .env.local / project ampli.json — events would silently route into A's tenancy. Earlier discussion in this PR considered a read-side validation primitive instead (validate cached state belongs to the active session on every credential read). The wipe-on-fresh-auth-completion design is strictly simpler: - Idempotent in the same-account case: clearing the cached key just forces one extra backend round-trip on the next read; user ends up in the same valid state. - Correct in the different-account case: stale state is gone before the new tokens are persisted, so downstream credential resolution fetches the new account's key. The wipe fires only on the fresh-OAuth path (after the browser callback). The cached-token short-circuit at oauth.ts:238-261 returns existing tokens unchanged — no auth event has occurred, the cached project state belongs to that same user, no wipe needed. `installDir` is now required on `performAmplitudeAuth`'s options. The compiler caught all four call sites: setup-utils.ts:494 (classic mode funnel), bin.ts:1390 (TUI main flow), bin.ts:1443 (TUI token-expired retry), bin.ts:1770 (the `wizard login` subcommand). The deprecated performOAuthFlow wrapper at oauth.ts:417 has zero callers; updated to default installDir to process.cwd(). Not adding a unit test for the wipe call inside performAmplitudeAuth in this commit: oauth.ts's internal helpers (startCallbackServer, exchangeCodeForToken, opn browser interaction) aren't exported and mocking them properly would require significantly more harness than the one-line behavior change warrants. The compiler-required installDir signature plus the parallel test coverage on performSignupOrAuth's wipe (which shares the helper) cover the contract sufficiently. Happy to add a focused oauth.test.ts in a follow-up if reviewers want it. Refs MCP-196. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Use AMPLITUDE_WIZARD_CACHE_DIR + getCheckpointFile for checkpoint removal. Replace obsolete keychain exec assertion with per-user cache API key clear. Co-authored-by: Cursor <cursoragent@cursor.com>
d287c79 to
aa5fd03
Compare
…505) Move long API key, confirm_event_plan, post-instrument manifest, setup report, lint rationale, and browser SDK init tables into a pre-staged bundled skill. Keep invariants and test-locked phrases in commandments. Wire preStageSkills, gitignore, and integration prompt to load the skill. Co-authored-by: Cursor <cursoragent@cursor.com>
Prevent activating the slash/ask console while the instrumentation plan is shown so confirm_event_plan is not stranded (AGENT_FAILED / exit 10). Hide the Tab ask hint and clarify the footer copy until Y/S/F. Co-authored-by: Cursor <cursoragent@cursor.com>
Bumps [@anthropic-ai/sdk](https://github.com/anthropics/anthropic-sdk-typescript) from 0.79.0 to 0.91.1. - [Release notes](https://github.com/anthropics/anthropic-sdk-typescript/releases) - [Changelog](https://github.com/anthropics/anthropic-sdk-typescript/blob/main/CHANGELOG.md) - [Commits](anthropics/anthropic-sdk-typescript@sdk-v0.79.0...sdk-v0.91.1) --- updated-dependencies: - dependency-name: "@anthropic-ai/sdk" dependency-version: 0.91.1 dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Kelson Warner <kelson.warner@amplitude.com>
* feat: atomic JSON persistence for dashboard and agent artifacts Use temp-file + rename for dashboard and plan JSON writes, align create-dashboard reuse paths with storage-paths, and tighten benchmark json-writer. Also unify signup vs accountCreationFlow across session, WizardOptions, run telemetry, and TUI auth copy; have CLI argv override AMPLITUDE_WIZARD_* env for runWizard args. Add accountCreationProvisioningInputsReady helper module. Co-authored-by: Cursor <cursoragent@cursor.com> * test: expect account creation flow in session started analytics Align run.test assertions with wizardCapture property naming and remove duplicate accountCreationFlow key from the buildSession test mock. Co-authored-by: Cursor <cursoragent@cursor.com> * test(bdd): use accountCreationFlow for signup flow steps WizardSession gates EmailCapture/ToS on accountCreationFlow; the legacy session.signup field is no longer populated by buildSession, so BDD steps must set accountCreationFlow when simulating --signup. Co-authored-by: Cursor <cursoragent@cursor.com> * test: align auth-gate tests with accountCreationFlow isAuthTaskGateReady gates on session.accountCreationFlow; passing signup on session overrides was a no-op after WizardSession dropped signup. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
Resolve helpers auth gate: keep SigningUp ceremony gating on top of accountCreationFlow + ToS. Align flows/AuthScreen and BDD steps with accountCreationFlow (replaces session.signup on WizardSession). Co-authored-by: Cursor <cursoragent@cursor.com>
* fix: persist event plan and dashboard only under .amplitude/ - persistEventPlan and record_dashboard write canonical .amplitude/ paths only - read paths prefer .amplitude/events.json with legacy root fallback for reads - align agent commandments, create-dashboard step, and tests Co-authored-by: Cursor <cursoragent@cursor.com> * fix: add isCreateAccountOnboarding session helper PR #504 wires flows and default command through this predicate; the helper was never exported from wizard-session, breaking tsc and pre-push. Co-authored-by: Cursor <cursoragent@cursor.com> * fix: remove unused exported function isCreateAccountOnboarding Applied via @cursor push command --------- Co-authored-by: Cursor <cursoragent@cursor.com>
Persist agent compaction snapshots, benchmark JSON output, and the npm update-check cache via atomicWriteJSON for crash-safe writes. Tighten CLAUDE.md security invariants to match actual sinks (logs, apply lock). Co-authored-by: Cursor <cursoragent@cursor.com>
Document how promptEventPlan resolves under --agent (needs_input plus decision_auto vs env-flag short-circuit) for orchestrators reading NDJSON. Co-authored-by: Cursor <cursoragent@cursor.com>
…eTool logs (#510) - Agent-mode ingestion: inter-poll delay starts at 5s and steps to 30s cap; max wait uses DATA_INGESTION_TIMEOUT_MS or shorter CI (10m) / agent (20m) defaults vs legacy 30m. Logic lives in data-ingestion-agent-poll.ts + tests. - Dedupe buildIntegrationPrompt: point agents to commandments + wizard-prompt- supplement instead of repeating long STEP 3–5 prose. - canUseTool file logs: off by default; enable via --debug / AMPLITUDE_WIZARD_DEBUG, AMPLITUDE_WIZARD_VERBOSE, AMPLITUDE_WIZARD_DEBUG_CAN_USE_TOOL, or sampled AMPLITUDE_WIZARD_CAN_USE_TOOL_LOG_SAMPLE=N; payloads redacted. - Overlap commitPlannedEventsStep with createDashboardStep (Promise.all). - Document MCP fallback model default (Sonnet-class) vs --mode fast. Deferred: phased inner query() / plan-only preflight — needs UX-safe design; track as follow-up rather than half-refactor. Co-authored-by: Cursor <cursoragent@cursor.com>
Remove instructions to call load_skill_menu/install_skill when no integration skill was pre-staged; those MCP tools are not registered. Direct agents to Glob under .claude/skills/ or report_status instead. Co-authored-by: Cursor <cursoragent@cursor.com>
…494) * feat: harden framework detection and Vue-powered docs classification - Treat VitePress, VuePress, Slidev, etc. as browser docs stacks: skip the Vue integration and allow JavaScript (Web) via a shared blocking policy. - Consider optionalDependencies in package version checks and bundler sniffing. - Reject non-object package.json parses; require installDir to be a directory before running parallel framework detectors. Made-with: Cursor * test: stabilize catalog timeout test under vitest worker isolation Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
Commandments, framework-config, and wizard-tools comments now match runtime: taxonomy loads from pre-staged .claude/skills via Skill tool; load_skill_menu/install_skill are not exposed. Co-authored-by: Cursor <cursoragent@cursor.com>
…efault existsSync in ampli-settings (#496) * test: stub project binding and existsSync in flow/router/ampli tests Align router and flow-invariant mocks with zone-resolution reading project-binding instead of ampli-config. Default existsSync to false in ampli-settings tests so disk reads stay hermetic. Made-with: Cursor * fix: mock ampli-config instead of non-existent project-binding in router/flow tests The router and flow-invariant tests were mocking `../../../lib/project-binding.js` with `readProjectBinding`, but zone-resolution.ts actually imports `readAmpliConfig` from `./ampli-config.js`. The mock was targeting a module the code under test never imports, so `tryResolveZone` was calling the real, unmocked `readAmpliConfig` — defeating the goal of preventing disk reads from leaking the developer's real config into tests. Fixed by mocking the correct module path (`../../../lib/ampli-config.js`) with the correct export (`readAmpliConfig`). --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Docs stacks such as VitePress declare `vue` but are not product Vue SPAs. Skip the Vue integration for those packages and allow the JavaScript (Web) detector to match so projects like amplitude-docs classify correctly. Made-with: Cursor Co-authored-by: Cursor <cursoragent@cursor.com>
* feat: add AuthOnboardingPath, replace --signup with --auth-onboarding - Model onboarding as sign_in vs create_account (AuthOnboardingPath). - CLI: --auth-onboarding sign-in|create-account; hidden signup boolean for AMPLITUDE_WIZARD_SIGNUP strict-mode compat; legacy buildSession args still merge. - TUI: Intro picker + store.setAuthOnboardingPath before Continue. - Wire helpers, flows, agent-runner, framework detection, and tests/BDD. Co-authored-by: Cursor <cursoragent@cursor.com> * docs: align wizard flow with AuthOnboardingPath and Intro picker Co-authored-by: Cursor <cursoragent@cursor.com> * fix(tui): single Intro menu for sign-in vs create-account Merge auth path into the main PickerMenu so only one useInput handler is active. Remove duplicate wizardCapture from setAuthOnboardingPath; Intro fires intro action with auth onboarding path on continue. Clarify --auth-onboarding in README, bin help, buildSession JSDoc, and wizard flow diagram. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(agent): resolve integration skill deterministically
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix: narrow react-router skill filter to exclude react-native and react-vite
The filter used `id.startsWith('integration-react-')` which over-broadly
matched unrelated skills like integration-react-native and
integration-react-vite. Narrowed to explicit prefixes:
- integration-react-react-router-
- integration-react-tanstack-router-
- integration-tanstack-start (exact match, unchanged)
Applied via @cursor push command
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
…#513) * feat(tui): skip data ingestion verification with Enter/q; exit with x Enter or q advances past Verify with the same no-events confirmation as before; x replaces the old q binding for cancel outro. Updates flows, BDD steps, and tests. Co-authored-by: Cursor <cursoragent@cursor.com> * docs: clarify AgentUI event plan auto-approve contract (#502) Document how promptEventPlan resolves under --agent (needs_input plus decision_auto vs env-flag short-circuit) for orchestrators reading NDJSON. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(tui): confirm skip with Enter/q on second press (Bugbot) awaitingSkipConfirm now treats y/Y, q/Q, and Enter like the initial skip triggers so re-pressing the same key confirms instead of toggling the prompt off. Adds regression tests for Enter and q. Co-authored-by: Cursor <cursoragent@cursor.com> * fix: suppress KeyHintBar hints during skip-confirm prompt --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: cursor[bot] <206951365+cursor[bot]@users.noreply.github.com>
* feat: wizard-owned OAuth/binding storage and narrowed gitignore
- Store OAuth session in ~/.amplitude/wizard/oauth-session.json (0o600) with
dual-read/migrate from ~/.ampli.json and dual-write for transition
- Canonical project binding: .amplitude/project-binding.json; merge with
legacy ampli.json; writeAmpliConfig returns whether any sink persisted
- Gitignore: drop blanket .amplitude/; ignore .amplitude/dashboard.json and
legacy dotfiles; wizard block replace upgrades old .amplitude/ entries
- reset: clear auth before rm .amplitude to avoid readAmpliConfig migration
recreating the directory
- Diagnostics, logout copy, oauth/store messaging, CI hint paths updated
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix: narrow AmpliConfigParseResult before reading error field
Co-authored-by: Cursor <cursoragent@cursor.com>
* chore: clarify reset help for project-binding and ampli.json mirror
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix: prevent credential resurrection via legacy re-migration and distinguish read errors from not-found
Bug 1: clearStoredCredentials writes {} to primary, but if legacy write fails,
readConfig would see no OAuth entries in primary, find them in legacy, and
re-migrate them back — effectively un-clearing credentials. Fix: only migrate
from legacy when the primary file doesn't exist on disk at all.
Bug 2: readAmpliConfigFile returned 'not_found' for all FS errors including
EACCES. This caused readAmpliConfig to treat an unreadable binding file as
absent and attempt migration with stale legacy data. Fix: return 'read_error'
for non-ENOENT failures so the migration condition doesn't trigger falsely.
Applied via @cursor push command
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(tui): Esc back-navigation for signup and confirm flows - Wire Esc on EmailCapture, ToS, RegionSelect; Intro revert + rewindIntro in flows/store - MCP/Slack: router back when canGoBack, else skip; useEscapeBack without double-firing ConfirmationInput - CreateProjectScreen: useScreenInput for Esc (respects command mode) - Docs: CLAUDE.md TUI Esc notes + PR cross-link; CONTRIBUTING agent-assisted section; README link; PR template; memory/MEMORY.md pointer Co-authored-by: Cursor <cursoragent@cursor.com> * docs: note gh pr create, reflect de-dupe, and pnpm install before vitest - CLAUDE.md: /reflect de-dupe against repo CLAUDE.md; gh pr create after push - CONTRIBUTING: pnpm install in fresh worktree before vitest; PR step for gh - PR template: checkbox for gh pr create Co-authored-by: Cursor <cursoragent@cursor.com> * docs: rebase before push when behind upstream - CLAUDE.md: pull --rebase before git push if branch lags origin - CONTRIBUTING.md: same in PR checklist; renumber steps Co-authored-by: Cursor <cursoragent@cursor.com> * fix: guard rewindIntro against double transition detection during goBack revert Applied via @cursor push command * fix(tui): guard McpScreen async detection against post-unmount timer mutation Applied via @cursor push command * fix: add unmountedRef guard to SlackScreen to prevent post-unmount side effects Applied via @cursor push command * fix(tui): enable SlackScreen Esc back outside ConfirmationInput phases Bugbot flagged that gating useEscapeBack to Opening/Verifying made Esc effectively unreachable (short phases) and dropped the Done dwell window. Enable the hook whenever a ConfirmationInput is not mounted so Opening, Verifying, and the post-success delay all honor router back-navigation. Co-authored-by: Cursor <cursoragent@cursor.com> * docs(tui): align create-account path copy with --auth-onboarding Update file headers and Intro picker hint to match CLI/session naming after #514; note legacy --signup alias where helpful. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(tui): use user-facing hint for create-account Intro option Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
…te wipe) - Restore session.accountCreationFlow from CLI --signup for SigningUp gate - Combine isCreateAccountOnboarding ToS check with signup ceremony settling - Logout uses clearStaleProjectState (symmetric with post-signup wipe) - cli.test wizard-session mock uses importOriginal + accountCreationFlow - Split auth-gate tests: menu create-account vs --signup blocking Co-authored-by: Cursor <cursoragent@cursor.com>
kelsonpw
added a commit
that referenced
this pull request
May 1, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes MCP-196. Stacks on #234.
When
+ "--signup" +runs in a directory that previously hosted another Amplitude account,+ "replaceStoredUser" +already wipes prior+ "User-*" +entries from+ "~/.ampli.json" +— but five install-dir-keyed surfaces are left untouched, and+ "getAPIKey()" +short-circuits on local cache, so the new account silently reused the prior account's API key. Events were emitted into the wrong tenancy with no user-visible signal.What changed
Introduce
+ "clearStaleProjectState(installDir)" +(in+ "src/utils/clear-stale-project-state.ts" +) that composes three existing helpers:+ "clearApiKey" +— wipes macOS keychain + Linux secret-tool ++ "AMPLITUDE_API_KEY" +line in+ "installDir/.env.local" ++ "clearCheckpoint" +— wipes+ "$TMPDIR/amplitude-wizard-checkpoint-<sha256(installDir)[:12]>.json" ++ "clearAuthFieldsInAmpliConfig" +— strips+ "OrgId" +/+ "WorkspaceId" +/+ "Zone" +from+ "installDir/ampli.json" +while preserving tracking-plan fields (+ "SourceId" +,+ "Branch" +, etc.)Call it inside
+ "performSignupOrAuth" +immediately before+ "replaceStoredUser" +on the success arm. Pairing the wipe with persistence keeps the wipe-then-write contract in one place — a partial-state crash between wipe and persist fails closed (no key) rather than open (old key).+ "installDir" +is now a required field on+ "SignupOrAuthInput" +, so the compiler enforces both production call sites (+ "bin.ts" ++ "runDirectSignupIfRequested" +,+ "SigningUpScreen.tsx" +success arm) pass it.Deviation from ticket
The ticket prescribed a helper invoked at both call sites. I put the wipe inside
+ "performSignupOrAuth" +instead — single enforcement point, can't drift if a future caller is added, naturally pairs with the existing+ "~/.ampli.json" +wipe.The ticket also listed four surfaces;
+ "installDir/ampli.json" +was missed.+ "credential-resolution.ts" +reads stored+ "OrgId" +/+ "WorkspaceId" +after signup and uses them to look up workspaces in the new account's user info — prior IDs never match in the new account, names stay null, and+ "session.credentials.projectApiKey" +gets set to the OLD key from keychain anyway.+ "LogoutScreen.tsx" +(lines 45-50) already implements exactly this wipe pattern — making the signup path the asymmetric one this PR closes. Not refactoring LogoutScreen to use the helper here (out of scope).Test plan
+ "src/utils/tests/clear-stale-project-state.test.ts" +, 5 tests): AMPLITUDE_API_KEY stripped while other env vars preserved; checkpoint deleted at the hashed path; OrgId/WorkspaceId/Zone stripped from project ampli.json while SourceId/Branch/Version preserved; macOS keychain delete attempted on darwin; no-op when no prior state exists+ "signup-or-auth.test.ts" +(4 new): wipe runs before persist on success (verified via+ "invocationCallOrder" +); wipe NOT called on requires_redirect / needs_information / error+ "performSignupOrAuth" +tests updated to pass+ "installDir" ++ "pnpm try --signup" +in a directory with a prior account's+ ".env.local" +and+ "ampli.json" +— confirm the new account uses its own API key, not the cached one🤖 Generated with Claude Code
Note
Medium Risk
Mostly documentation and developer-workflow updates, but it also removes GitHub Actions workflows and changes what ships/gets ignored, which could affect release/CI behavior and packaging if misconfigured.
Overview
Adds first-class Claude Code support by introducing a bundled
.claude/skills/amplitude-setupagent protocol (plus references) and enabling Claude tool permissions via.claude/settings.json; updates the.claude/commands/analysis.mdlog guidance and adds a/featurecommand template.Updates contributor UX with a new PR template, refreshed
CLAUDE.md/CONTRIBUTING.md/README.mdguidance, and Husky hook changes (drop tests from pre-commit; add a faster pre-push lint+typecheck). Cleans up repo hygiene by tightening.gitignore, removing.npmignorein favor ofpackage.json#files, deleting committedampli.json, and bumping release metadata (CHANGELOG.md,.release-please-manifest.json).Removes legacy GitHub Actions workflows
publish.ymlandwizard-ci-trigger.yml.Reviewed by Cursor Bugbot for commit 6723ff9. Bugbot is set up for automated code reviews on this repo. Configure here.