fix(copilot): capture resolved model for auto-routed runs#268
Conversation
4bb9fe4 to
c249521
Compare
jrob5756
left a comment
There was a problem hiding this comment.
Reviewed at the PR head in an isolated checkout. Lint, format, typecheck, and the full copilot suite (88 tests) pass, and the approach matches how claude_agent_sdk.py already reads msg.model from the SDK, so nothing here blocks merge.
My main note is one behavioral point. The capture isn't limited to model: auto: assistant.usage.model is always set, so the resolved name replaces the configured model on every run. For an explicitly pinned priceable model that can swap in a name the pricing table doesn't recognize (some dotted Copilot IDs like claude-sonnet-4.5 and gemini-2.5-pro aren't in it), and then cost_usd goes None and quietly drops out of the budget total in usage.py, so spend can be undercounted with no error. Suggestion inline.
Two smaller things. claude.py still reports the configured model rather than the SDK-served one, so the three providers now diverge on this. Anthropic has no auto-routing so it's a reasonable choice, but AGENTS.md treats model as a parity-tracked field, so a line in the PR description would make the divergence intentional. And the parsed-JSON branch that agents with an output schema take isn't covered by the new tests, so its resolved_model propagation could regress unnoticed.
|
Updated. The main execution path now only uses the SDK I also threaded the configured agent model into the Copilot follow-up path and applied the same precedence there:
I also updated the Validation:
|
The resolved-model precedence refactor accidentally dropped cache_write_tokens from the main execution path's AgentOutput, leaving `cache_write` unused (ruff F841) and silently zeroing cache-write token accounting in pricing/usage — diverging from the follow-up path and claude.py. Restore the argument, and collapse the follow-up-path model expression so `ruff format --check` passes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jrob5756
left a comment
There was a problem hiding this comment.
All five review comments are addressed (auto-gating, docstring, follow-up path threads agent.model, and tests now exercise execute() including the explicit-priceable + unpriceable-resolved precedence case). Pushed db188c8 to restore the dropped cache_write_tokens on the main-path AgentOutput (fixed the F841 lint + a silent cache-write accounting regression) and a format fix. CI is green.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #268 +/- ##
=======================================
Coverage ? 86.48%
=======================================
Files ? 69
Lines ? 12157
Branches ? 0
=======================================
Hits ? 10514
Misses ? 1643
Partials ? 0 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Bump version 0.1.19 -> 0.1.20 and finalize the changelog. Changelog (0.1.20): - Added: Hermes provider (#235), cost budget enforcement (#212), external-workflow-friction knobs — output_mode / max_parse_recovery_attempts / gate-respond CLI / Windows paths (#234), templated reasoning.effort & context_tier (#263), scoped applyTo instruction loading (#238). - Fixed: Copilot model attribution for auto-routed runs (#268), claude-agent-sdk default tool preset when tools: omitted (#269). Re-locked uv.lock to record 0.1.20. Quality gates green locally (ruff, ty, pytest excl. real_api/performance: 3673 passed). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
This PR captures the concrete model resolved by the Copilot SDK for auto-routed runs.
Previously, when an agent was configured with
model: auto, Conductor recordedAgentOutput.modelas"auto"even though the Copilot SDK had resolved the request to a concrete model. Token usage was available, but downstream accounting could not attribute that usage to the actual model.This change captures
event.data.modelfrom the SDKassistant.usageevent, stores it asSDKResponse.resolved_model, preserves it through response reconstruction paths, and uses it forAgentOutput.modelwhen available.What changed
resolved_modeltoSDKResponse.assistant.usage.resolved_modelthrough_execute_sdk_call.AgentOutput.assistant.usageevent extraction.Validation
Unit tests:
uv run pytest tests/test_providers/test_copilot.py::TestCopilotProviderResolvedModel -v— passeduv run pytest tests/test_providers/test_copilot.py -v— passedLocal smoke tests with Copilot:
model: autostructured-output run resolved to a concrete model instead of"auto".model: autono-schema/raw-output run resolved to a concrete model instead of"auto".Observed resolved models included
gpt-5.3-codex/gpt-5.4-minidepending on the run.Notes
This PR does not attempt to calculate GitHub Copilot credit or USD cost. Copilot pricing/accounting is plan and policy dependent and may change independently from the SDK. The change records the concrete resolved model so downstream pricing or accounting logic can make an informed decision instead of receiving
"auto".