RFC: Estimated-performance regression tests by mkannwischer · Pull Request #454 · slothy-optimizer/slothy

mkannwischer · 2026-06-14T11:30:23Z

Our tests/naive suite only checks that SLOTHY runs, not that the result is any good, so a regression that silently degrades a schedule would pass CI. This RFC proposes pinning SLOTHY's own performance estimate (predicted cycles/stalls) for a given snippet: since that estimate is the optimum of a minimization, it's deterministic and safe to assert exactly, turning "SLOTHY now thinks it found worse code" into a test failure. To enable it, optimize()/optimize_loop() now expose the result on slothy.last_result.

Concretely, a new pytest suite under tests/pytest/ reads one case at a time. Each case is an assembly snippet plus a YAML sidecar that gives the expected estimate per target (arch derived from target) and how to invoke SLOTHY, so adding a case is just dropping two files.

I drafted this for discussion and would like to hear your opinions. Is this going to be useful in finding regressions in SLOTHY/ortools in the future? If so we can easily scale this up to the full instruction set supported.

@dop-amin @hanno-becker

Thread the Result object out of Heuristics.periodic (now a 5-tuple) so that optimize() and optimize_loop() can store SLOTHY's own performance estimate on slothy.last_result (cycles, stalls, codesize, codesize_with_bubbles). Previously the straightline optimize() path computed and then discarded the result.
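As a rough illustration of the pattern described in this commit message, the sketch below keeps the most recent estimate on the driver object instead of discarding it. `EstimateSketch` and `OptimizerSketch` are hypothetical stand-ins, not SLOTHY's actual classes; only the four field names and the `last_result` attribute come from the text above, and the "solver" is a placeholder.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EstimateSketch:
    """Hypothetical container for the four fields named in the commit message."""
    cycles: int
    stalls: int
    codesize: int
    codesize_with_bubbles: int


class OptimizerSketch:
    """Hypothetical stand-in for the SLOTHY driver object (not the real class)."""

    def __init__(self) -> None:
        # Nothing has been optimized yet, so there is no estimate to report.
        self.last_result: Optional[EstimateSketch] = None

    def _solve(self, instructions: List[str]) -> EstimateSketch:
        # Placeholder for the real solver: pretend one cycle per instruction.
        n = len(instructions)
        return EstimateSketch(cycles=n, stalls=0, codesize=n,
                              codesize_with_bubbles=n)

    def optimize(self, instructions: List[str]) -> None:
        # Store the estimate so callers (e.g. tests) can inspect it afterwards,
        # rather than computing it and dropping it.
        self.last_result = self._solve(instructions)


opt = OptimizerSketch()
opt.optimize(["add x0, x1, x2", "mul x3, x0, x0"])
print(opt.last_result)
```

A test can then read `opt.last_result` after the call without any change to the return value of `optimize()`.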

Add a pytest suite under tests/pytest that pins SLOTHY's estimated cycle/stall counts for a given input snippet. Each case is an assembly file plus a YAML sidecar describing how to invoke SLOTHY and the expected estimate per target; the architecture is derived from the target. Because the estimate is the optimum of a minimization, it is deterministic, so pinning it turns a change in what SLOTHY believes it found into a test failure. Seeded with aarch64_simple0 and aarch64_ldst. The legacy OptimizationRunner suite in tests/naive is unaffected and is still driven by test.py.
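The pinning check itself could look roughly like the sketch below. It assumes the YAML sidecar has already been parsed into a plain dict; the keys `expected`, `cycles` and `stalls`, and the target name used in the example, are hypothetical and need not match the schema in this PR.

```python
from typing import Dict, Tuple


def estimate_mismatches(case: Dict, target: str,
                        actual: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Return {metric: (expected, actual)} for every pinned metric that differs.

    `case` is a parsed sidecar (hypothetical layout):
        {"expected": {"<target>": {"cycles": ..., "stalls": ...}}}
    """
    expected = case["expected"][target]
    return {
        metric: (want, actual[metric])
        for metric, want in expected.items()
        if actual[metric] != want
    }


# Hypothetical case and target name, for illustration only.
case = {"expected": {"example_target": {"cycles": 12, "stalls": 0}}}

# An unchanged estimate yields no mismatches; a changed one is reported.
print(estimate_mismatches(case, "example_target", {"cycles": 12, "stalls": 0}))
print(estimate_mismatches(case, "example_target", {"cycles": 13, "stalls": 0}))
```

Inside a pytest test this would typically end in `assert not estimate_mismatches(...)`, so the failure message shows both the pinned and the newly reported value.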

dop-amin · 2026-06-14T18:20:23Z

+    _apply_config(slothy.config, cfg)
+
+    # On Apple M1, x18 is reserved by the platform ABI.
+    if "m1" in target_name:


It's outside the scope of this PR but do you agree that we should have a mechanism to override arch model defaults from the uArch model? If so, I'd open an issue for that.
Setting this explicitly all the time feels brittle.

dop-amin · 2026-06-14T18:31:04Z

+    slothy.load_source_from_file(str(source_path))
+
+    # variable_size lets SLOTHY minimize stalls; a case may override it.
+    cfg = {"variable_size": True}


Possibly have a default config as yaml as well to make it more transparent?
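One way this suggestion might look, sketched under assumptions: the defaults live in a single mapping (here a dict standing in for a parsed default YAML file) and each case only lists what it overrides. `merged_config` and `apply_config` are hypothetical helpers and are not the `_apply_config` from the diff; only the `variable_size` default is taken from the code under review.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional

# Stand-in for a parsed default-config YAML file; only this key is from the PR.
DEFAULT_CONFIG: Dict[str, Any] = {"variable_size": True}


def merged_config(case_overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Start from the shared defaults and let a case override individual keys."""
    cfg = dict(DEFAULT_CONFIG)
    cfg.update(case_overrides or {})
    return cfg


def apply_config(config: Any, cfg: Dict[str, Any]) -> None:
    """Set each key on the config object, rejecting names it does not have."""
    for key, value in cfg.items():
        if not hasattr(config, key):
            raise AttributeError(f"unknown config option: {key}")
        setattr(config, key, value)


# Hypothetical config object with a single option, for illustration only.
config = SimpleNamespace(variable_size=False)
apply_config(config, merged_config())
print(config.variable_size)
```

Rejecting unknown keys means a typo in a sidecar fails loudly instead of silently leaving a default in place.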

dop-amin · 2026-06-14T18:40:09Z

I think the feature itself will be useful and the implementation looks fine.

Kind of related to #277 and #226.
Do you think it would be worth thinking once more about a more general way to save the config alongside an optimization run?

We could also consider hooking this into tests/examples so that we don't have three separate places in which we keep track of "examples". Also, inside examples/tests it would be nice to move the config out of the Python script and into YAML files. This would make them much more readable and less convoluted.

mkannwischer added 3 commits June 14, 2026 19:24

Run pytest suite in CI

503de71

dop-amin reviewed Jun 14, 2026

mkannwischer wants to merge 3 commits into main from pytest-estimated-performance
