Catalog refresh, incremental persistence, and a manual refresh tool #16

Merged
msitarzewski merged 3 commits into main from feat/catalog-refresh-and-incremental-persistence on Jun 22, 2026

@msitarzewski (Owner)

Three independent changes, one per commit.

1. Catalog refresh (feat(catalog))

  • Adds current frontier: Claude Opus 4.8/4.7, GPT-5.5, GPT-5.4 mini, Gemini 3.5 Flash
  • Drops deprecated o3 (per the feed's status field)
  • Corrects Anthropic GA context windows: Opus 4.6 + Sonnet 4.6 → 1M; Sonnet 4.5 stays 200k (its 1M is beta-gated)
  • Adds gpt-5.5 to NO_TEMPERATURE_MODELS + _REASONING_EFFORT_MODELS after empirically confirming it rejects temperature

Sourcing: model IDs from live provider APIs, pricing cross-checked against the truefoundry/models feed, temperature behavior verified against the live OpenAI endpoint. No fabricated values.
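
The temperature gating described in the last bullet could look roughly like the sketch below. The set names `NO_TEMPERATURE_MODELS` and `_REASONING_EFFORT_MODELS` come from this PR; the helper `build_request_params`, its parameters, and the `"medium"` default are hypothetical, and the sets here contain only the one model this PR mentions.

```python
from __future__ import annotations

from typing import Any, Optional

# Set names are from the PR; the contents list only the model the PR mentions.
NO_TEMPERATURE_MODELS: frozenset = frozenset({"gpt-5.5"})
_REASONING_EFFORT_MODELS: frozenset = frozenset({"gpt-5.5"})


def build_request_params(
    model: str,
    temperature: Optional[float] = None,
    reasoning_effort: str = "medium",
) -> dict[str, Any]:
    """Hypothetical helper: omit parameters a model is known to reject."""
    params: dict[str, Any] = {"model": model}
    # Only send temperature to models that accept it.
    if temperature is not None and model not in NO_TEMPERATURE_MODELS:
        params["temperature"] = temperature
    # Reasoning-effort models get an effort level instead.
    if model in _REASONING_EFFORT_MODELS:
        params["reasoning_effort"] = reasoning_effort
    return params
```

Filtering at request-build time keeps callers from needing to know which models reject `temperature`; they can pass one and let the catalog sets decide.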

2. Incremental persistence (feat(persist))

  • New IncrementalPersister: thread created active up front → each round committed as it finishes → finalized complete at the end
  • A crash mid-run now leaves a real partial thread instead of nothing
  • WebSocket streams a thread_started event with the real thread ID for mid-run deep-linking
  • Consolidates three duplicated persist paths (CLI / WS / batch) into one shared module (net −211 lines in the affected files)
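
The lifecycle above (create active, commit each round, finalize complete) can be sketched as follows. This is an illustration under stated assumptions, not the code in `memory/persist.py`: only the class name `IncrementalPersister`, the `active`/`complete` statuses, and the `thread_started` event name come from this PR. The method names, the `InMemoryStore` stand-in, and the event payload shape are hypothetical.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Thread:
    thread_id: str
    status: str = "active"
    rounds: list = field(default_factory=list)


class InMemoryStore:
    """Hypothetical stand-in for the real storage layer."""

    def __init__(self) -> None:
        self.threads: dict[str, Thread] = {}


class IncrementalPersister:
    """Sketch only: the real class's methods and signatures may differ."""

    def __init__(self, store: InMemoryStore) -> None:
        self.store = store
        self.thread_id: Optional[str] = None

    def start(self) -> str:
        # Create the thread up front so it exists even if the run crashes.
        self.thread_id = uuid.uuid4().hex
        self.store.threads[self.thread_id] = Thread(self.thread_id)
        return self.thread_id

    def commit_round(self, round_data: dict) -> None:
        # Persist each round as soon as it finishes.
        self._thread().rounds.append(round_data)

    def finalize(self) -> None:
        # Mark the thread complete once every round has been committed.
        self._thread().status = "complete"

    def _thread(self) -> Thread:
        if self.thread_id is None:
            raise RuntimeError("start() must be called first")
        return self.store.threads[self.thread_id]


def thread_started_event(thread_id: str) -> dict:
    """Hypothetical payload shape for the WebSocket thread_started event."""
    return {"type": "thread_started", "thread_id": thread_id}
```

If a run fails after the first `commit_round`, the store still holds an `active` thread with that round, which is the partial-thread behavior the second bullet describes.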

3. Manual catalog refresh tool (feat(scripts))

  • scripts/refresh_catalog.py — propose-only, run on demand
  • Diffs catalog vs feed + live APIs, hashes a per-model projection against catalog_snapshot.json to show what changed since last run, discovers new frontier models, and empirically probes OpenAI temperature (the one field the feed gets wrong)
  • It never modifies catalog.py
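
The snapshot comparison could work along the lines of the sketch below. It assumes `catalog_snapshot.json` maps each model ID to a hex digest and that the projection covers the fields named in `PROJECTED_FIELDS`; the PR states neither, so both are hypothetical, as are the function names.

```python
from __future__ import annotations

import hashlib
import json

# Hypothetical field choice; the PR does not list which fields are projected.
PROJECTED_FIELDS = ("context_window", "input_price", "output_price", "status")


def project_hash(entry: dict) -> str:
    """Hash a stable JSON encoding of the projected fields only."""
    projection = {name: entry.get(name) for name in PROJECTED_FIELDS}
    # sort_keys and fixed separators keep the encoding deterministic.
    encoded = json.dumps(projection, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def changed_since_snapshot(catalog: dict, snapshot: dict) -> list:
    """Return model IDs whose hash is new or differs from the snapshot."""
    return sorted(
        model_id
        for model_id, entry in catalog.items()
        if snapshot.get(model_id) != project_hash(entry)
    )
```

Hashing a projection rather than the whole entry means edits to fields outside the projection do not show up as changes, which keeps the report focused on price, context, and status.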

Testing

  • 1663 Python tests pass (6 new for incremental persistence), mypy clean (63 files), ruff clean, frontend build unaffected
  • The refresh tool was run live against all keyed providers; it caught two context-window mistakes during development (Opus 4.6 was still listed at 200k, and Sonnet 4.5 had been over-corrected to its beta maximum)

Not included (follow-up)

  • REST /api/ask still uses its lite persist path; unifying it onto the incremental path is the noted next step.

🤖 Generated with Claude Code

https://claude.ai/code/session_01EkrekgzMAQko92UkjnXhHL

msitarzewski and others added 3 commits June 21, 2026 19:22
Add Claude Opus 4.8/4.7, GPT-5.5, GPT-5.4 mini, and Gemini 3.5 Flash;
drop deprecated o3. Correct Anthropic GA context windows (Opus 4.6 and
Sonnet 4.6 -> 1M; Sonnet 4.5 stays 200k, its 1M is beta-gated). Add
gpt-5.5 to NO_TEMPERATURE_MODELS and _REASONING_EFFORT_MODELS after
empirically confirming it rejects temperature.

Model IDs sourced from live provider APIs, pricing cross-checked against
the truefoundry/models feed, temperature behavior verified against the
live OpenAI endpoint.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01EkrekgzMAQko92UkjnXhHL
Introduce IncrementalPersister (memory/persist.py): the thread is created
up front as "active", each round is committed as soon as it finishes, and
the thread is finalized to "complete" at the end. A crash mid-run now
leaves a real partial thread instead of nothing. The WebSocket streams a
thread_started event with the real thread ID so clients can deep-link
mid-run.

Consolidates the three previously-duplicated persist paths (CLI, WS, and
the batch convenience) into one shared module and adds
ConsensusContext.snapshot_round() so a finished round can be persisted
before it is archived.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01EkrekgzMAQko92UkjnXhHL
scripts/refresh_catalog.py diffs the model catalog against the
truefoundry/models feed and the live provider APIs. It reports
price/context/status changes, hashes a per-model field projection against
catalog_snapshot.json to show what changed since the last run, discovers
new frontier models via the live APIs, and empirically probes OpenAI
temperature support (the one field the feed gets wrong). Propose-only: it
never modifies catalog.py.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01EkrekgzMAQko92UkjnXhHL
@msitarzewski msitarzewski merged commit de8b994 into main Jun 22, 2026
3 checks passed
@msitarzewski msitarzewski deleted the feat/catalog-refresh-and-incremental-persistence branch June 22, 2026 00:23
msitarzewski added a commit that referenced this pull request Jun 22, 2026
docs: capture this session's work in the memory bank (PRs #16-#21)