fix(telemetry): stop unauthenticated + duplicate CLI rollups polluting PostHog (v1.48.0)#169

Open
krish221997 wants to merge 1 commit into main from fix/cli-telemetry-accuracy

@krish221997 (Collaborator)

Three telemetry fixes in src/lib/analytics.ts (+ src/lib/config.ts). Context: a single anonymous, unauthenticated agent (CLI v1.47.5 on a cloud VM, looping actions execute) produced ~84% of all CLI events and emitted some batches up to 8×, polluting PostHog and inflating the bill.

Fixes

  1. Don't record auth-required commands run without an identity. An unauthenticated CLI can still invoke actions execute — it just fails "Not configured" — and we were counting those under an anonymous device id. Pre-auth funnel commands (login, init, guide, …) are still recorded (allowlist, fail-closed). Every event now carries an authenticated boolean for clean dashboard/ingest filtering.
  2. Deterministic $insert_id for rollups (hash of the exact entries) so a re-emitted batch dedupes on PostHog ingest instead of double-counting. (The previous random id defeated the dedup the code comment promised.)
  3. Atomic claim of the usage log before flushing (claimUsageLog — rename-aside) so under concurrent CLI processes exactly one flush owns a batch, eliminating duplicate CLI Usage Rollup events.
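
To make fix 2 concrete, here is a minimal sketch of a content-derived rollup id. It is an illustration under assumptions, not the code in this PR: the `UsageEntry` fields, the `rollupInsertId` name, the SHA-256 choice, and the 32-character truncation are all hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of one usage-log entry; the real fields may differ.
interface UsageEntry {
  command: string;
  timestamp: number;
  authenticated: boolean;
}

// Derive the id from the exact entries in the batch. The same batch always
// hashes to the same value, so a re-emitted rollup carries the same id.
function rollupInsertId(entries: UsageEntry[]): string {
  // Serialize as ordered tuples so the result does not depend on key order.
  const canonical = JSON.stringify(
    entries.map((e) => [e.command, e.timestamp, e.authenticated]),
  );
  return createHash("sha256").update(canonical).digest("hex").slice(0, 32);
}
```

The returned string would be sent as the rollup's `$insert_id`. A random id, by contrast, differs on every emit, so ingest has nothing to match a repeated batch against.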

Verification

  • 16/16 analytics unit tests (10 existing + 6 new); typecheck clean.
  • 12-process concurrency test: 300 commands → 300 rollups, 0 duplicates, 0 loss, no over-count.
  • Real-key run: authenticated actions execute → recorded (authenticated:true, env:live); unauthenticated actions execute → 0 events; unauthenticated login → kept (authenticated:false).
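
The three real-key outcomes above match a fail-closed allowlist gate along these lines. This is a sketch only: `PRE_AUTH_COMMANDS`, `shouldRecord`, and matching on the first word of the command are assumptions, and the allowlist holds just the three commands this description names (the rest are elided here).

```typescript
// Hypothetical allowlist of pre-auth funnel commands.
const PRE_AUTH_COMMANDS: ReadonlySet<string> = new Set(["login", "init", "guide"]);

interface RecordDecision {
  record: boolean;        // whether to emit the event at all
  authenticated: boolean; // value of the `authenticated` flag on the event
}

function shouldRecord(command: string, hasIdentity: boolean): RecordDecision {
  if (hasIdentity) {
    return { record: true, authenticated: true };
  }
  // Fail closed: without an identity, only allowlisted commands are recorded.
  // Anything unknown or new is dropped by default.
  const topLevel = command.trim().split(/\s+/)[0] ?? "";
  return { record: PRE_AUTH_COMMANDS.has(topLevel), authenticated: false };
}
```

With this gate, `shouldRecord("actions execute", false)` yields `record: false`, while `shouldRecord("login", false)` yields `record: true` with `authenticated: false`.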

Note

This stops upgraded CLIs from emitting anonymous/duplicate telemetry. It does not retroactively stop already-deployed older CLIs (e.g. the current bot) — those must be filtered at PostHog ingest (by IP/distinct_id now; by the authenticated flag going forward).

…PostHog (v1.48.0)

Three fixes to lib/analytics.ts (+ config.ts), all unit-tested:

1. Don't record auth-required commands run WITHOUT an identity. An
   unauthenticated CLI can still invoke e.g. `actions execute` (it just
   fails "Not configured"), and those were counted under an anonymous
   device id — one such agent produced ~84% of all CLI events. Pre-auth
   funnel commands (login/init/guide/…) are still recorded. Every event now
   carries an `authenticated` flag for clean dashboard/ingest filtering.

2. Deterministic $insert_id for rollups (hash of the exact entries) so a
   re-emitted batch dedupes on PostHog ingest instead of double-counting.

3. Atomic claim of the usage log before flushing (rename-aside) so under
   concurrent CLI processes exactly one flush owns a batch — eliminating the
   duplicate "CLI Usage Rollup" events (one user was emitting the same
   500-command batch up to 8x).
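
The rename-aside claim in item 3 can be sketched as follows. `claimUsageLog` is the name used in this PR, but the signature, the claimed-file naming, and the error handling below are assumptions for illustration. It relies on `rename` being atomic within one filesystem, which holds on POSIX systems.

```typescript
import { mkdtempSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { randomUUID } from "node:crypto";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Try to take exclusive ownership of the usage log by renaming it aside.
// Returns the claimed path, or null when there is nothing to claim
// (another process already renamed it, or no log exists yet).
function claimUsageLog(logPath: string): string | null {
  const claimed = `${logPath}.${process.pid}.${randomUUID()}.claimed`;
  try {
    renameSync(logPath, claimed);
    return claimed;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "ENOENT") {
      return null; // lost the race or empty log: skip this flush
    }
    throw err;
  }
}

// Usage: two back-to-back claims on the same log; only the first succeeds.
const dir = mkdtempSync(join(tmpdir(), "usage-"));
const log = join(dir, "usage.log");
writeFileSync(log, "actions execute\n");
const first = claimUsageLog(log);
const second = claimUsageLog(log);
console.log(first !== null, second === null); // prints "true true"
rmSync(dir, { recursive: true, force: true });
```

Only the process that gets a non-null path reads and flushes that file, so each batch has a single owner. Commands logged after the rename go into a fresh log and belong to the next flush.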

Verified: 16/16 analytics unit tests; a 12-process concurrency test (300
commands -> 300 rollups, 0 duplicates, 0 loss, no over-count); and a
real-key run (authenticated actions execute -> recorded; unauthenticated
actions execute -> 0 events; unauthenticated login -> kept, authenticated=false).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>