Skip to content

fix: use resolved path and cache_key in WinMLAutoModel.from_onnx#856

Open
chinazhangchao wants to merge 5 commits into
mainfrom
chinazhangchao/fix-winmlautomodel-cache-dir-name
Open

fix: use resolved path and cache_key in WinMLAutoModel.from_onnx#856
chinazhangchao wants to merge 5 commits into
mainfrom
chinazhangchao/fix-winmlautomodel-cache-dir-name

Conversation

@chinazhangchao

Copy link
Copy Markdown
Contributor

Fixes #827

Problem

Two cache-naming bugs in WinMLAutoModel.from_onnx introduced after PR #659:

  1. Cache directory collisionfrom_onnx called get_model_dir(onnx_path.stem, ...), so two ONNX files with the same filename (e.g. model.onnx) from different directories would map to the same cache directory and overwrite each other's build outputs.

  2. Config change not reflected in cachebuild_onnx_model had no cache_key support (unlike build_hf_model), so different build configs for the same ONNX file would collide on the same model.onnx artifact path.

Fix

src/winml/modelkit/models/auto.pyfrom_onnx

  • Replace get_model_dir(onnx_path.stem, ...) with get_model_dir(str(onnx_path.resolve()), ...) so the cache dir is unique per absolute file path.
  • Compute cache_key = get_cache_key(task_abbrev, config.generate_cache_key()) (mirrors from_pretrained) and pass it to build_onnx_model.

src/winml/modelkit/build/onnx.pybuild_onnx_model

  • Add cache_key: str | None = None parameter.
  • Add _name() helper (same pattern as build_hf_model) so all artifact paths are optionally prefixed — enabling multiple config variants to coexist in one directory.
  • Rebuild cleanup uses the cache_key-scoped glob pattern to avoid removing unrelated artifacts.

Tests

  • 5 new tests in test_auto_onnx.py: resolved path used for model dir, same-stem/different-path collision prevention, cache_key passed through.
  • 5 new tests in test_onnx.py: cache_key=None backward compat, prefixed final artifact, prefixed config path, reuse with prefixed path, unrelated artifact preservation on rebuild.

- from_onnx: replace onnx_path.stem with str(onnx_path.resolve()) for
  get_model_dir so two ONNX files with the same filename but different
  paths get distinct cache directories
- from_onnx: compute cache_key = get_cache_key(task_abbrev, config hash)
  (mirrors from_pretrained) and pass it to build_onnx_model so different
  build configs produce different artifact filenames in the same dir
- build_onnx_model: add cache_key param and _name() helper (same pattern
  as build_hf_model) so all artifact paths are optionally prefixed;
  rebuild cleanup uses cache_key-scoped glob to avoid removing unrelated
  artifacts from other config variants

Fixes #827
@chinazhangchao chinazhangchao requested a review from a team as a code owner June 10, 2026 03:18
@chinazhangchao chinazhangchao marked this pull request as draft June 10, 2026 03:23
…inazhangchao/fix-winmlautomodel-cache-dir-name
@chinazhangchao chinazhangchao marked this pull request as ready for review June 10, 2026 03:49
Comment thread src/winml/modelkit/models/auto.py Outdated
Comment thread src/winml/modelkit/build/onnx.py
Comment thread tests/unit/models/auto/test_auto_onnx.py Fixed
@chinazhangchao chinazhangchao requested a review from xieofxie June 11, 2026 04:04
def get_onnx_model_hash(model_path: str | Path) -> str:
"""Compute a content hash for an ONNX model and referenced external data."""
model_path = Path(model_path).resolve()
hash_obj = hashlib.sha256()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about use hash path + size + mtime? weight could be big

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: WinMLAutoModel.from_onnx uses incorrect cache dir/file name

3 participants