dumps OnnxDiscrepancyCheck result on disc so they can be picked up later and gathered for a dashboard by xadupre · Pull Request #2503 · microsoft/Olive

xadupre · 2026-06-08T17:06:23Z

Describe your changes

The pass OnnxDiscrepancyCheck is used to gather metrics on canary models. They need to be dumped to let the user see them in detail, the user can be a dashboard as well.

When --test is active, add_discrepancy_check_pass now injects max_mae: 0.1 and report_output_dir (set to the run config's output_dir) into the discrepancy check pass configuration alongside reference_model_path. Both report_output_dir and the CLI save_discrepancy_check_results helper normalize file paths to their parent directory to avoid writing artifacts under a path that looks like a file.

Fixed test regressions in test_workflow_run_command_with_test_override and test_workflow_run_command_with_test_reuses_test_output_dir:

Set mock_run.return_value = None so save_discrepancy_check_results exits early at its null-guard check instead of attempting to serialize a MagicMock via json.dumps.
Updated the assert_called_once_with expectation to include max_mae and report_output_dir in the expected discrepancy check pass config.

OnnxDiscrepancyCheck now raises a ValueError after persisting the report when any configured threshold is exceeded (numeric metric thresholds or generation token sequence minimum), failing the run instead of silently succeeding. This restores the documented "the pass fails if any configured threshold is exceeded" behavior for --test runs.

Replaced print statements in olive/cli/base.py with logger.debug calls to follow the project's logging conventions.

Checklist before requesting a review

Add unit tests for this change.
Make sure all tests can pass.
Update documents if necessary.
Lint and apply fixes to your code by running lintrunner -a
Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

(Optional) Issue link

…e/add_disc

xadupre · 2026-06-09T16:04:47Z

@copilot test\cli\test_cli.py:173: in test_workflow_run_command_with_test_override
cli_main(command_args)
olive\cli\launcher.py:79: in main
service.run()
olive\telemetry\telemetry_extensions.py:137: in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
olive\cli\run.py:96: in run
save_discrepancy_check_results(workflow_output, output_path)
olive\cli\base.py:110: in save_discrepancy_check_results
report_path.write_text(json.dumps(results, indent=2))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
C:\ToolCache\Python\3.12.10\x64\Lib\json_init_.py:238: in dumps
**kw).encode(obj)
^^^^^^^^^^^
C:\ToolCache\Python\3.12.10\x64\Lib\json\encoder.py:202: in encode
chunks = list(chunks)
^^^^^^^^^^^^
C:\ToolCache\Python\3.12.10\x64\Lib\json\encoder.py:439: in _iterencode
o = _default(o)
^^^^^^^^^^^
C:\ToolCache\Python\3.12.10\x64\Lib\json\encoder.py:180: in default
raise TypeError(f'Object of type {o.class.name} '
E TypeError: Object of type MagicMock is not JSON serializable

…tion error

Copilot · 2026-06-09T16:10:02Z

@copilot test\cli\test_cli.py:173: in test_workflow_run_command_with_test_override
cli_main(command_args)
olive\cli\launcher.py:79: in main
...

Fixed in the latest commit. The test was patching olive.workflows.run but the default MagicMock return value was truthy, so save_discrepancy_check_results didn't exit early and tried to serialize the mock via json.dumps. Added mock_run.return_value = None so the function returns immediately at the if not workflow_output guard.

xadupre · 2026-06-09T16:42:45Z

@copilot test/cli/test_cli.py:177: in test_workflow_run_command_with_test_override
mock_run.assert_called_once_with(
/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/unittest/mock.py:961: in assert_called_once_with
return self.assert_called_with(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/unittest/mock.py:949: in assert_called_with
raise AssertionError(_error_message()) from cause
E AssertionError: expected call not found.
E Expected: run({'input_model': {'type': 'HfModel', 'model_path': 'hf-internal-testing/tiny-random-LlamaForCausalLM', 'load_kwargs': {'attn_implementation': 'eager', 'trust_remote_code': False}, 'test_model_config': {'hidden_layers': 2}, 'test_model_path': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output/test_model'}, 'output_dir': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output', 'passes': {'discrepancy_check': {'type': 'OnnxDiscrepancyCheck', 'reference_model_path': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output/test_model'}}}, list_required_packages=False, package_config=None, tempdir=None)
E Actual: run({'input_model': {'type': 'HfModel', 'model_path': 'hf-internal-testing/tiny-random-LlamaForCausalLM', 'load_kwargs': {'attn_implementation': 'eager', 'trust_remote_code': False}, 'test_model_config': {'hidden_layers': 2}, 'test_model_path': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output/test_model'}, 'output_dir': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output', 'passes': {'discrepancy_check': {'type': 'OnnxDiscrepancyCheck', 'reference_model_path': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output/test_model', 'max_mae': 0.1, 'report_output_dir': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output'}}}, list_required_packages=False, tempdir=None, package_config=None)

…n discrepancy_check pass config

Copilot · 2026-06-09T16:46:27Z

@copilot test/cli/test_cli.py:177: in test_workflow_run_command_with_test_override
mock_run.assert_called_once_with(
/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/unittest/mock.py:961: ...

Fixed in the latest commit. The test assertion was missing max_mae: 0.1 and report_output_dir that add_discrepancy_check_pass now injects into the pass config.

xadupre · 2026-06-09T17:23:12Z

@copilot test/cli/test_cli.py:244: in test_workflow_run_command_with_test_reuses_test_output_dir
cli_main(["run", "--run-config", str(config_path), "--test"])
olive/cli/launcher.py:79: in main
service.run()
olive/telemetry/telemetry_extensions.py:137: in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
olive/cli/run.py:96: in run
save_discrepancy_check_results(workflow_output, output_path)
olive/cli/base.py:110: in save_discrepancy_check_results
report_path.write_text(json.dumps(results, indent=2))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/json/init.py:238: in dumps
**kw).encode(obj)
^^^^^^^^^^^
/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/json/encoder.py:202: in encode
chunks = list(chunks)
^^^^^^^^^^^^
/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/json/encoder.py:439: in _iterencode
o = _default(o)
^^^^^^^^^^^
/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/json/encoder.py:180: in default
raise TypeError(f'Object of type {o.class.name} '
E TypeError: Object of type MagicMock is not JSON serializable

…_test_reuses_test_output_dir

Copilot · 2026-06-09T17:26:00Z

@copilot test/cli/test_cli.py:244: in test_workflow_run_command_with_test_reuses_test_output_dir
cli_main(["run", "--run-config", str(config_path), "--test"])
olive/cli/launcher.py:79: in main
...

Fixed in the latest commit — added mock_run.return_value = None to test_workflow_run_command_with_test_reuses_test_output_dir for the same reason as the previous test.

Copilot

Pull request overview

This PR extends the OnnxDiscrepancyCheck pass and CLI --test flow to persist discrepancy metrics to disk so they can be inspected later (e.g., by a dashboard). It also updates CLI tests to reflect the injected discrepancy-check configuration and to avoid serializing mock objects.

Changes:

Add report_output_dir support and JSON result dumping for OnnxDiscrepancyCheck, plus propagation of results via model_attributes.
In --test mode, inject max_mae=0.1 and report_output_dir=<output_dir> into the discrepancy-check pass config and persist results into the run output folder.
Fix CLI unit tests to align with the new injected config and to avoid attempting to json.dumps a MagicMock.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File	Description
`test/cli/test_cli.py`	Updates CLI tests for `--test` to include new discrepancy-check config fields and avoid serializing mocked workflow output.
`olive/passes/onnx/discrepancy_check.py`	Adds report directory support, writes results JSON, stores results into `model_attributes`, and adds optional reference-model artifact export.
`olive/cli/run.py`	Persists discrepancy-check results to the CLI output directory after a `--test` run.
`olive/cli/base.py`	Injects discrepancy-check pass config during `--test` and adds a helper to save discrepancy results from workflow output into the CLI output directory.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

xadupre · 2026-06-11T17:11:49Z

@copilot replace print(f"Adding OnnxDiscrepancyCheck pass with reference_model_path={reference_model_path}") and print(f"OnnxDiscrepancyCheck results saved to {report_path}") by logging

Copilot · 2026-06-11T17:15:36Z

@copilot replace print(f"Adding OnnxDiscrepancyCheck pass with reference_model_path={reference_model_path}") and print(f"OnnxDiscrepancyCheck results saved to {report_path}") by logging

Done — both print calls in olive/cli/base.py are now logger.debug(...) calls (using %s formatting per the project's logging style).

xadupre added 6 commits June 8, 2026 16:09

save discrepancy results

57617dd

enforce

3720462

add threshold

6d3a3a0

log errors

2673bc4

dump

bb4f0ab

dump torch model

507f832

github-advanced-security AI found potential problems Jun 8, 2026

View reviewed changes

xadupre added 2 commits June 9, 2026 14:58

remove prints

ee42c65

Merge branch 'main' of https://github.com/microsoft/Olive into xadupr…

1fa0308

…e/add_disc

Copilot started work on behalf of xadupre June 9, 2026 16:05 View session

fix: set mock_run.return_value=None to avoid MagicMock JSON serializa…

322a773

…tion error

Copilot finished work on behalf of xadupre June 9, 2026 16:10

Copilot started work on behalf of xadupre June 9, 2026 16:42 View session

fix: update test assertion to include max_mae and report_output_dir i…

9639592

…n discrepancy_check pass config

Copilot finished work on behalf of xadupre June 9, 2026 16:46

Copilot started work on behalf of xadupre June 9, 2026 17:23 View session

fix: set mock_run.return_value=None in test_workflow_run_command_with…

d9a33d4

…_test_reuses_test_output_dir

Copilot finished work on behalf of xadupre June 9, 2026 17:26

Merge branch 'main' into xadupre/add_disc

1ab1501

xadupre marked this pull request as ready for review June 10, 2026 12:37

Copilot AI review requested due to automatic review settings June 10, 2026 12:37

Copilot started reviewing on behalf of xadupre June 10, 2026 12:37 View session

Copilot AI reviewed Jun 10, 2026

View reviewed changes

Comment thread olive/cli/base.py Outdated

Comment thread olive/cli/base.py Outdated

Comment thread olive/passes/onnx/discrepancy_check.py Outdated

Comment thread olive/passes/onnx/discrepancy_check.py Outdated

Comment thread olive/passes/onnx/discrepancy_check.py

Potential fix for pull request finding

cfa7028

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

xadupre and others added 3 commits June 11, 2026 14:50

Potential fix for pull request finding

bd7d0f6

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Potential fix for pull request finding

c3e6698

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Potential fix for pull request finding

c2dec3f

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot started work on behalf of xadupre June 11, 2026 13:03 View session

Fix OnnxDiscrepancyCheck to raise when thresholds are exceeded

fd8d452

Copilot finished work on behalf of xadupre June 11, 2026 13:06

fix

b9f76d2

Copilot started work on behalf of xadupre June 11, 2026 17:12 View session

Replace print statements with logger.debug in base.py

b000130

Copilot finished work on behalf of xadupre June 11, 2026 17:16

Conversation

xadupre commented Jun 8, 2026 • edited by Copilot AI Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Describe your changes

Checklist before requesting a review

(Optional) Issue link

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

xadupre commented Jun 9, 2026

Uh oh!

Copilot AI commented Jun 9, 2026

Uh oh!

xadupre commented Jun 9, 2026

Uh oh!

Copilot AI commented Jun 9, 2026

Uh oh!

xadupre commented Jun 9, 2026

Uh oh!

Copilot AI commented Jun 9, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

xadupre commented Jun 11, 2026

Uh oh!

Copilot AI commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

xadupre commented Jun 8, 2026 •

edited by Copilot AI

Loading