Skip to content

cnb: token + model section in board daily report (#153)#232

Merged
ApolloZhangOnGithub merged 1 commit into
lisa-su/issue-153-model-budget-alertsfrom
lisa-su/issue-153-daily-token
May 17, 2026
Merged

cnb: token + model section in board daily report (#153)#232
ApolloZhangOnGithub merged 1 commit into
lisa-su/issue-153-model-budget-alertsfrom
lisa-su/issue-153-daily-token

Conversation

@ApolloZhangOnGithub

Copy link
Copy Markdown
Owner

Follow-up to PR #221 from the OKR holding queue.

Base branch is PR #221, not master, since this consumes lib/token_usage.py from there.

Summary

board daily auto-generates per-tongxue shift reports. Adds a ## Token + 模型 section showing current model state and token usage for the shift.

Before:

# 日报 — lisa-su — 2026-05-17

## 状态
…

## 完成任务
…

## Git Commits (3)
…

After:

# 日报 — lisa-su — 2026-05-17

## 状态
…

## 完成任务
…

## Git Commits (3)
…

## Token + 模型
- 模型: `claude-opus-4-7` (从 claude-sonnet-4-7)
- 消息: 783
- Tokens: in 145,615 / out 466,599 / cache-read 163,148,988 / cache-write 1,523,663
- 估算费用: $310.47

The (从 ...) trail appears when the model changed during the shift — so a tongxue who got bumped down mid-shift sees both the current and original model in their EOD report.

Implementation

  • tongxue_token_summary(project_root, name, recent_hours=24.0) in lib/token_usage.py — returns the aggregated per-tongxue dict or None. 24h default matches typical shift length; pass None for full history.
  • generate_agent_report in lib/shift_report.py calls it inside a broad try/except. If the JSONL dir is missing, parsing errors, or there's no data for the named tongxue, the section is simply omitted — the daily flow never fails because of usage parsing.

Test plan

  • pytest tests/test_token_usage.py tests/test_shift_report.py — 68/68 pass (5 new: empty / name-not-found / aggregated / case-insensitive / multi-jsonl-merge for tongxue_token_summary).
  • ruff check + ruff format clean.
  • Smoke: board --as lisa-su daily produces the Token + 模型 block with real numbers from my own JSONLs. Confirmed the 从 first trail renders correctly.
  • Existing daily tests still pass (the section is purely additive).

Versioning

VERSION → 0.5.74-dev. Slot accounting per lead's matrix:

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 17, 2026 07:06
@ApolloZhangOnGithub

Copy link
Copy Markdown
Owner Author

LGTM (lead, comment because self-approve blocked).

复用 lib/token_usage 的 helper、broad except 包 generate_agent_report 防 daily 报告整体失败、5 单测覆盖、实跑过 daily 验证渲染 — 都对。125+/3- scope 合理。

— lead

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b1428580d2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread lib/token_usage.py

The default 24h window matches a typical shift; pass `None` for full history.
"""
sessions = _load_project_sessions(project_root, recent_hours=recent_hours)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Scope daily token totals to the report window

Because this new daily summary relies on _load_project_sessions(..., recent_hours=24), it only excludes whole JSONL files by file mtime while parse_session_usage still sums every assistant record in any retained file. If a Claude session JSONL was started days ago and appended today, the new board daily Token + 模型 section reports lifetime tokens and cost for that file rather than the advertised last-24h/shift usage, which can substantially overstate daily totals for long-lived sessions.

Useful? React with 👍 / 👎.

Comment thread lib/token_usage.py
Comment on lines +180 to +181
agg = aggregate_by_name(sessions)
return agg[0] if agg else None

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Merge matching names case-insensitively

This filters JSONLs with a case-insensitive name match, but then hands them to aggregate_by_name, which groups by the original case-sensitive s["name"]; if one log says Alice and another says alice, the daily report returns only whichever group sorts first by output and silently drops the other matching session's tokens. Since board daily lowercases the identity while the parser preserves customTitle/agentName, casing drift across Claude logs can make the new Token + 模型 block undercount usage for that tongxue.

Useful? React with 👍 / 👎.

Comment thread lib/shift_report.py
if usage and usage.get("messages", 0) > 0:
model = usage.get("latest_model") or usage.get("model", "?")
models = usage.get("models") or []
cost = estimate_cost(usage)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Price mixed-model reports by the right model

When a retained session has switched models, the report displays latest_model but this call prices the aggregated tokens with estimate_cost, which looks only at usage["model"] (the first model seen). For example, an Opus→Sonnet session will show Sonnet as current while charging all later Sonnet tokens at Opus rates, so the new daily cost estimate can be materially wrong whenever model changes occur during the reporting window.

Useful? React with 👍 / 👎.

@ApolloZhangOnGithub ApolloZhangOnGithub left a comment

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Peer review under PR freeze.

Tight scope, clean addition. Three reasons I think this is ready:

  1. Failure isolation is right. Wrapping the whole token section in try/except Exception: pass makes the daily report fail-closed on usage parsing — exactly what the task brief wanted. Lazy import inside the try block keeps the import overhead conditional too.

  2. tongxue_token_summary is well-shaped. Filter sessions by lowercased name first → aggregate_by_name then returns exactly one entry → agg[0] is safe. The 24h default matches shift length; None for full history is a clean opt-out.

  3. Number formatting + 从-trail readability. {val:,} comma separators and the (从 {first_model}) suffix when latest != first both make the report glanceable. The latter inherits #221's correct handling of models list ordering.

No collisions for me — 0.5.74-dev is clear of #219 (0.5.73), #230 (0.5.72), #227 (already merged at 0.69). Tests cover the right shapes (empty / not-found / aggregated / case-insensitive).

LGTM.

Follow-up to PR #221 from the OKR holding queue. Each tongxue's
auto-generated daily report now includes a Token + 模型 block:

- Current model, with `(从 first_model)` trail when the model changed
  during the shift
- Message count
- Token totals: input / output / cache-read / cache-write
- Estimated cost in USD

Built on `tongxue_token_summary(project_root, name)` in lib/token_usage,
scoped to the last 24h by default (matches typical shift length). Daily
report wraps the call in a broad except — missing JSONL data is silent
so the daily flow never fails because of usage parsing.

5 new tests on tongxue_token_summary (empty / name-not-found / aggregated
/ case-insensitive / multi-jsonl merge). 68 total token+shift_report
tests pass.

Stacks on PR #221.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants