fix(perf): support composite (dual-encoder) models in winml perf by xieofxie · Pull Request #866 · microsoft/winml-cli

xieofxie · 2026-06-10T09:21:48Z

Problem

winml perf crashed on composite (dual-encoder) models such as SigLIP/CLIP:

winml perf --ep openvino --device cpu -m google/siglip-base-patch16-224 \
  --task zero-shot-image-classification
...
AttributeError: 'WinMLModelForZeroShotImageClassification' object has no attribute 'io_config'

PerfBenchmark assumed every model exposes a single io_config / _session. Composite
models have neither — they orchestrate multiple sub-models (e.g. an image encoder and a
text encoder), each with its own ONNX session. The failure is device-independent: the
(model_type, task) registry routes SigLIP to the composite class regardless of --device.

Fix

Make PerfBenchmark composite-aware while leaving the single-session path's measurement
semantics untouched:

- _aggregate_io_config(): unions the sub-models' inputs (deduped by name, order
  preserved). Their union is exactly the composite forward() kwargs, so random-input
  generation and the info display work unchanged.
- End-to-end timing: composites time the full forward() pass (both encoders + the
  similarity step) via an external PerfStats. Single-session models keep recording
  pure-ORT time inside session.perf(). The monitored loop now takes a run-iteration
  callable so both paths share it.
- Device / EP / task: resolved from a representative sub-model.
- _probe_composite_outputs(): runs one forward() and introspects the result so the
  reported outputs are the composite's real task-level tensors (e.g. logits_per_image)
  instead of a deduped union of sub-model ONNX outputs. Best-effort: falls back to the
  aggregated view if the probe fails.
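
The aggregation step described above (union of sub-model inputs, deduplicated by name, first-seen order preserved) can be sketched as follows. This is a minimal illustration, not the code in this PR: the function name `aggregate_inputs` and the dict layout of each sub-model config are assumptions made for the example.

```python
from collections.abc import Iterable


def aggregate_inputs(sub_configs: Iterable[dict]) -> list[dict]:
    """Union sub-model input specs, deduplicated by name, first-seen order kept."""
    seen: dict[str, dict] = {}
    for config in sub_configs:
        for spec in config.get("inputs", []):
            # setdefault keeps the first spec recorded for a given input name;
            # dicts preserve insertion order, so the output order is stable.
            seen.setdefault(spec["name"], spec)
    return list(seen.values())


# Hypothetical io_config contents for the two encoders of a dual-encoder model.
image_encoder = {
    "inputs": [{"name": "pixel_values", "shape": [1, 3, 224, 224], "dtype": "float32"}]
}
text_encoder = {
    "inputs": [{"name": "input_ids", "shape": [1, 64], "dtype": "int32"}]
}

merged = aggregate_inputs([image_encoder, text_encoder])
print([spec["name"] for spec in merged])
```

Because the merged list is keyed by input name, it can be handed to the same random-input generator that a single-session model uses.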

The output describer (_describe_outputs) is architecture-agnostic (handles HF
ModelOutput / dict / sequence / single tensor) — no model-specific field names.
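
As a rough illustration of an architecture-agnostic describer with a best-effort probe, the sketch below dispatches on the container type only and treats anything with a `.shape` attribute as a tensor. The names `describe_outputs`, `probe_outputs`, and `FakeTensor` are hypothetical and not taken from the PR; the mapping branch is assumed to cover HF ModelOutput because that class behaves like an ordered dict.

```python
from collections.abc import Callable, Mapping, Sequence
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeTensor:
    """Stand-in for a real tensor; only the shape matters here."""
    shape: tuple[int, ...]


def describe_outputs(result: Any) -> list[tuple[str, list[int]]]:
    """Return (name, shape) pairs without relying on model-specific field names."""
    if hasattr(result, "shape"):  # single tensor
        items = [("output_0", result)]
    elif isinstance(result, Mapping):  # dict-like results
        items = [(str(name), value) for name, value in result.items()]
    elif isinstance(result, Sequence) and not isinstance(result, (str, bytes)):
        items = [(f"output_{i}", value) for i, value in enumerate(result)]
    else:
        items = []
    # Skip entries that are not tensor-like (e.g. None placeholders).
    return [(name, list(value.shape)) for name, value in items if hasattr(value, "shape")]


def probe_outputs(
    forward: Callable[..., Any],
    inputs: dict[str, Any],
    fallback: list[tuple[str, list[int]]],
) -> list[tuple[str, list[int]]]:
    """Run one forward pass and describe it; return the fallback view on any failure."""
    try:
        described = describe_outputs(forward(**inputs))
    except Exception:
        return fallback
    return described or fallback


def failing_forward(**kwargs: Any) -> Any:
    raise RuntimeError("probe failed")


print(describe_outputs({"logits_per_image": FakeTensor((1, 1))}))
print(probe_outputs(failing_forward, {}, [("aggregated_output", [1, 768])]))
```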

Result

Device:      cpu / OpenVINOExecutionProvider
Task:        zero-shot-image-classification
Inputs:      pixel_values   [1, 3, 224, 224]   float32
             input_ids      [1, 64]            int32
Outputs:     logits_per_image     [1, 1]
             logits_per_text      [1, 1]
             text_embeds          [1, 768]
             image_embeds         [1, 768]

Tests

tests/unit/commands/test_perf_composite.py (new, 15 cases) covers io_config aggregation,
the output describer/probe, input generation, device/EP/task resolution, and the
full-forward() timing path. Existing test_perf_cli.py / test_perf_module.py (31 cases)
still pass — no regression.

🤖 Generated with Claude Code

`winml perf` assumed every model exposes a single `io_config`/`_session`, so composite models (CLIP/SigLIP zero-shot-image-classification) crashed with `AttributeError: ... has no attribute io_config` during input generation.

Make `PerfBenchmark` composite-aware:

- `_aggregate_io_config()` unions the sub-models' inputs (their union is exactly the composite forward() kwargs) for input generation/display.
- Time the full `forward()` pass via an external PerfStats; single-session models keep recording pure-ORT time inside session.perf(). The monitored loop is refactored to take a run-iteration callable so both paths share it.
- Device/EP/task are resolved from a representative sub-model.
- `_probe_composite_outputs()` runs one forward() and introspects the result so reported outputs are the composite task-level tensors (e.g. logits_per_image) rather than a deduped union of sub-model ONNX outputs.

Add tests/unit/commands/test_perf_composite.py covering aggregation, output describing/probing, input generation, device/EP/task resolution, and the full-forward timing path.
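
The shared-loop idea in the commit message (one monitored loop that accepts a run-iteration callable, with composite timing recorded externally) might look roughly like the sketch below. `ExternalStats` is a hypothetical stand-in for PerfStats, whose real interface is not shown in this PR page, and `timed_forward` and `monitored_loop` are illustrative names only.

```python
import time
from collections.abc import Callable
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ExternalStats:
    """Hypothetical stand-in for PerfStats: collects per-iteration latencies."""
    samples_ms: list[float] = field(default_factory=list)

    def record(self, elapsed_ms: float) -> None:
        self.samples_ms.append(elapsed_ms)


def timed_forward(
    forward: Callable[..., Any], inputs: dict[str, Any], stats: ExternalStats
) -> Callable[[], None]:
    """Wrap a composite forward() so each call is timed end to end."""
    def run_iteration() -> None:
        start = time.perf_counter()
        forward(**inputs)  # both encoders plus the similarity step
        stats.record((time.perf_counter() - start) * 1000.0)
    return run_iteration


def monitored_loop(run_iteration: Callable[[], None], iterations: int) -> None:
    """The loop does not time anything itself; it only drives the callable."""
    for _ in range(iterations):
        run_iteration()


def fake_forward(**kwargs: Any) -> dict[str, Any]:
    """Placeholder for a composite model's forward()."""
    return {"logits_per_image": None}


stats = ExternalStats()
monitored_loop(timed_forward(fake_forward, {"pixel_values": None}, stats), iterations=5)
print(len(stats.samples_ms))
```

Under this split, a single-session model would pass a callable that simply invokes its session and lets the session record its own timing, so the loop body stays identical for both paths.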

+    from collections.abc import Callable, Iterable
+
    from ..models.winml.base import WinMLPreTrainedModel
+    from ..models.winml.composite_model import WinMLCompositeModel


xieofxie requested a review from a team as a code owner June 10, 2026 09:21

hualxie added 2 commits June 11, 2026 09:28

Merge remote-tracking branch 'origin/main' into hualxie/fix_siglip

3993ce2

mypy

0bd7bb4

github-advanced-security AI found potential problems Jun 11, 2026

Comment thread src/winml/modelkit/commands/perf.py

from collections.abc import Callable, Iterable

from ..models.winml.base import WinMLPreTrainedModel

from ..models.winml.composite_model import WinMLCompositeModel

xieofxie wants to merge 3 commits into main from hualxie/fix_siglip
