Skip to content

fix(copilot): capture resolved model for auto-routed runs#268

Merged
jrob5756 merged 3 commits into
microsoft:mainfrom
ldavidgomez:fix/copilot-auto-resolved-model
Jun 26, 2026
Merged

fix(copilot): capture resolved model for auto-routed runs#268
jrob5756 merged 3 commits into
microsoft:mainfrom
ldavidgomez:fix/copilot-auto-resolved-model

Conversation

@ldavidgomez

Copy link
Copy Markdown
Contributor

Summary

This PR captures the concrete model resolved by the Copilot SDK for auto-routed runs.

Previously, when an agent was configured with model: auto, Conductor recorded AgentOutput.model as "auto" even though the Copilot SDK had resolved the request to a concrete model. Token usage was available, but downstream accounting could not attribute that usage to the actual model.

This change captures event.data.model from the SDK assistant.usage event, stores it as SDKResponse.resolved_model, preserves it through response reconstruction paths, and uses it for AgentOutput.model when available.

What changed

  • Add resolved_model to SDKResponse.
  • Capture the resolved model from assistant.usage.
  • Preserve resolved_model through _execute_sdk_call.
  • Use the resolved model when constructing AgentOutput.
  • Keep existing fallback behavior when no resolved model is available.
  • Add tests for:
    • resolved model propagation;
    • fallback behavior;
    • cost calculation when the resolved model is priceable;
    • SDK assistant.usage event extraction.

Validation

Unit tests:

  • uv run pytest tests/test_providers/test_copilot.py::TestCopilotProviderResolvedModel -v — passed
  • uv run pytest tests/test_providers/test_copilot.py -v — passed

Local smoke tests with Copilot:

  • model: auto structured-output run resolved to a concrete model instead of "auto".
  • explicit resolved-model run succeeded.
  • model: auto no-schema/raw-output run resolved to a concrete model instead of "auto".

Observed resolved models included gpt-5.3-codex / gpt-5.4-mini depending on the run.

Notes

This PR does not attempt to calculate GitHub Copilot credit or USD cost. Copilot pricing/accounting is plan and policy dependent and may change independently from the SDK. The change records the concrete resolved model so downstream pricing or accounting logic can make an informed decision instead of receiving "auto".

@ldavidgomez ldavidgomez force-pushed the fix/copilot-auto-resolved-model branch from 4bb9fe4 to c249521 Compare June 24, 2026 11:38
@ldavidgomez ldavidgomez marked this pull request as ready for review June 24, 2026 11:51

@jrob5756 jrob5756 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed at the PR head in an isolated checkout. Lint, format, typecheck, and the full copilot suite (88 tests) pass, and the approach matches how claude_agent_sdk.py already reads msg.model from the SDK, so nothing here blocks merge.

My main note is one behavioral point. The capture isn't limited to model: auto: assistant.usage.model is always set, so the resolved name replaces the configured model on every run. For an explicitly pinned priceable model that can swap in a name the pricing table doesn't recognize (some dotted Copilot IDs like claude-sonnet-4.5 and gemini-2.5-pro aren't in it), and then cost_usd goes None and quietly drops out of the budget total in usage.py, so spend can be undercounted with no error. Suggestion inline.

Two smaller things. claude.py still reports the configured model rather than the SDK-served one, so the three providers now diverge on this. Anthropic has no auto-routing so it's a reasonable choice, but AGENTS.md treats model as a parity-tracked field, so a line in the PR description would make the divergence intentional. And the parsed-JSON branch that agents with an output schema take isn't covered by the new tests, so its resolved_model propagation could regress unnoticed.

Comment thread src/conductor/providers/copilot.py Outdated
Comment thread src/conductor/providers/copilot.py Outdated
Comment thread src/conductor/providers/copilot.py Outdated
Comment thread tests/test_providers/test_copilot.py Outdated
Comment thread tests/test_providers/test_copilot.py Outdated
@ldavidgomez

Copy link
Copy Markdown
Contributor Author

Updated.

The main execution path now only uses the SDK resolved_model when the configured agent model is None or "auto", so explicitly configured models remain authoritative for reporting and pricing.

I also threaded the configured agent model into the Copilot follow-up path and applied the same precedence there:

  • auto / default-routed agents use the SDK resolved model when available;
  • explicitly configured agent models are preserved;
  • provider default remains the final fallback.

I also updated the resolved_model docstring to clarify that it can be None when the usage event is absent or when the event does not carry a usable model name.

Validation:

  • uv run pytest tests/test_providers/test_copilot.py -k resolved_model -q
  • uv run pytest tests/test_providers/test_copilot.py -q

The resolved-model precedence refactor accidentally dropped
cache_write_tokens from the main execution path's AgentOutput,
leaving `cache_write` unused (ruff F841) and silently zeroing
cache-write token accounting in pricing/usage — diverging from the
follow-up path and claude.py. Restore the argument, and collapse the
follow-up-path model expression so `ruff format --check` passes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@jrob5756 jrob5756 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All five review comments are addressed (auto-gating, docstring, follow-up path threads agent.model, and tests now exercise execute() including the explicit-priceable + unpriceable-resolved precedence case). Pushed db188c8 to restore the dropped cache_write_tokens on the main-path AgentOutput (fixed the F841 lint + a silent cache-write accounting regression) and a format fix. CI is green.

@jrob5756 jrob5756 merged commit d0a8b89 into microsoft:main Jun 26, 2026
9 checks passed
@codecov-commenter

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 75.00000% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@5d769eb). Learn more about missing BASE report.

Files with missing lines Patch % Lines
src/conductor/providers/copilot.py 75.00% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #268   +/-   ##
=======================================
  Coverage        ?   86.48%           
=======================================
  Files           ?       69           
  Lines           ?    12157           
  Branches        ?        0           
=======================================
  Hits            ?    10514           
  Misses          ?     1643           
  Partials        ?        0           

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

jrob5756 added a commit that referenced this pull request Jun 27, 2026
Bump version 0.1.19 -> 0.1.20 and finalize the changelog.

Changelog (0.1.20):
- Added: Hermes provider (#235), cost budget enforcement (#212),
  external-workflow-friction knobs — output_mode / max_parse_recovery_attempts /
  gate-respond CLI / Windows paths (#234), templated reasoning.effort &
  context_tier (#263), scoped applyTo instruction loading (#238).
- Fixed: Copilot model attribution for auto-routed runs (#268),
  claude-agent-sdk default tool preset when tools: omitted (#269).

Re-locked uv.lock to record 0.1.20. Quality gates green locally
(ruff, ty, pytest excl. real_api/performance: 3673 passed).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants