Skip to content

Add normalized token accounting evidence labels #8

Description

@TheKrush

Summary

Add a normalized token accounting model with explicit evidence labels so PromptFuel can report token usage consistently across providers without blurring measured, derived, estimated, and unavailable values.

The goal is to make token totals, cache utilization, provider-reported totals, and future cost/rate-limit estimates clearer and harder to double-count.

Current behavior

Provider token fields can have different meanings across sources. Input, output, cached input, cache creation, reasoning tokens, provider totals, and derived totals are not consistently represented as separate concepts with evidence labels.

Desired behavior

PromptFuel should normalize token data into disjoint components and derive totals from those components. Every value that is not directly measured should carry an evidence label.

Requirements

  • Add normalized fields for:
    • uncachedInputTokens
    • cacheReadInputTokens
    • cacheWriteInputTokens
    • outputTokens
    • reasoningOutputTokens, when available
    • logicalInputTokens, derived
    • totalLogicalTokens, derived
    • providerReportedTotalTokens, when available
    • cacheHitRatio, derived when supportable
    • estimatedCacheSavings, only when supportable
    • rateLimitImpactTokens, only when supportable
    • billedEquivalentTokens, only when supportable
  • Add evidence labels:
    • measured
    • derived
    • estimated
    • heuristic
    • unavailable
  • Keep logical token volume, billing impact, and rate-limit impact as separate concepts.
  • Avoid universal formulas that assume cached tokens are always additive.
  • Preserve raw/provider-reported fields needed for debugging and cross-checking.
  • Add tests for providers where cached tokens are disjoint fields and providers where cached tokens are subsets of input.
  • Document provider-specific mapping caveats.

Non-goals / constraints

  • Do not implement a full dashboard in this issue unless required for validation.
  • Do not infer billing or rate-limit impact without enough provider/pricing evidence.
  • Do not remove existing provider fields that are still useful for diagnostics.
  • Do not present estimated values as measured.

Suggested implementation

Create or extend a normalized token accounting layer that maps provider-specific usage fields into disjoint normalized components. Derive totals from normalized fields and store provider-reported totals separately for anomaly detection.

Acceptance criteria

  • Normalized token components are disjoint by construction.
  • Derived totals are calculated from normalized components.
  • Provider-reported totals are stored separately where available.
  • Evidence labels are attached to measured, derived, estimated, heuristic, and unavailable values.
  • Tests cover provider-specific cache token semantics.
  • Documentation explains double-counting risks and mapping rules.

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions