perf: self-describing label dtypes via best_int (widen instead of raise) by FBumann · Pull Request #38 · fluxopt/linopy

FBumann · 2026-06-30T17:58:44Z

Stacked on top of PyPSA#566 (base branch perf/int32).

Note

The following content was generated by AI.

Follow-up to the int32 default: instead of a fixed dtype guarded by a hard ValueError at the int32 ceiling, derive each label allocation's dtype from its known max value, floored at options["label_dtype"].

What changed

fitting_label_dtype(max_value) in common.py: narrowest int dtype holding max_value, never narrower than the configured label_dtype. The option becomes a floor.
Variable/constraint label allocation uses it, so models that fit int32 stay uniform int32; models past ~2.1 B labels widen to int64 automatically instead of raising.
The label cast-back paths no longer hardcode the default dtype (which would truncate widened int64 labels):
- Variable.ffill / bfill preserve the source label dtype directly (no extra compute).
- The float round-trip paths (Variable.sanitize, LinearExpression init/assign/combine, save_join) use astype_labels, which sizes the result to the actual max value.
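The flooring and widening rule described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the name `fitting_label_dtype_sketch`, its `floor` parameter (standing in for `options["label_dtype"]`), and the restriction to int32/int64 candidates are all assumptions.

```python
import numpy as np

# Candidate label dtypes, narrowest first (assumed: only int32 and int64).
_CANDIDATES = (np.dtype(np.int32), np.dtype(np.int64))


def fitting_label_dtype_sketch(max_value: int, floor=np.int32) -> np.dtype:
    """Return the narrowest candidate dtype that holds max_value,
    never narrower than floor (hypothetical stand-in for the option)."""
    floor = np.dtype(floor)
    for candidate in _CANDIDATES:
        if candidate.itemsize < floor.itemsize:
            continue  # respect the configured floor
        if max_value <= np.iinfo(candidate).max:
            return candidate
    raise OverflowError(f"{max_value} does not fit in int64")


small = fitting_label_dtype_sketch(1_000)                    # fits the int32 floor
wide = fitting_label_dtype_sketch(2**31)                     # one past the int32 max
floored = fitting_label_dtype_sketch(1_000, floor=np.int64)  # floor wins over fit
print(small, wide, floored)
```

Under these assumptions the option behaves as a floor: a small model with an int64 floor still gets int64, while a large model with an int32 floor is widened rather than rejected.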

Why
Per-allocation best_int is value-correct because the label counters are global and monotonic, so the end value of each allocation bounds every label in the group. The only real hazard was the ~8 sites that assumed "array dtype == configured default"; those are fixed here so a promoted int64 array survives ffill/sanitize/etc. without silent truncation.
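To make the truncation hazard concrete, here is a small NumPy sketch. The helper `astype_labels_sketch` is hypothetical and only mimics the described behaviour (size the result to the actual max value); the use of -1 as the fill value for NaN entries is an assumption, not something stated in this PR.

```python
import numpy as np

INT32_MAX = np.iinfo(np.int32).max

# A promoted label array: the second label lies beyond the int32 ceiling.
labels = np.array([7, INT32_MAX + 10], dtype=np.int64)

# Hardcoding the default dtype wraps the large label around without any error.
truncated = labels.astype(np.int32)


def astype_labels_sketch(values, floor=np.int32, fill=-1):
    """Cast a float round-tripped label array back to an integer dtype
    sized to its actual max value (hypothetical helper; fill is assumed)."""
    arr = np.asarray(values, dtype=np.float64)
    arr = np.where(np.isnan(arr), fill, arr)  # NaN -> fill value
    max_value = int(arr.max()) if arr.size else 0
    dtype = np.int64 if max_value > np.iinfo(floor).max else np.dtype(floor)
    return arr.astype(dtype)


# Float round trip, e.g. after an operation that introduced NaN for missing entries.
as_float = np.array([7.0, np.nan, float(INT32_MAX + 10)])
restored = astype_labels_sketch(as_float)
print(truncated, restored, restored.dtype)
```

The plain `astype(np.int32)` call turns the large label negative, whereas the value-sized cast keeps it intact in an int64 array.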

Non-goal: narrowing below the configured default (int8/int16 for tiny models). It saves nothing at solve time (scipy sparse is int32; concat promotes to the widest block) and would make dtypes non-uniform across groups. Flooring at the default keeps the common case predictable.
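The "concat promotes to the widest block" point can be checked directly in NumPy. The arrays below are made-up stand-ins for two label groups, one hypothetically narrowed to int8 and one at the int32 default.

```python
import numpy as np

narrow = np.arange(3, dtype=np.int8)        # hypothetical tiny group
default = np.arange(3, 6, dtype=np.int32)   # group at the configured default

# Concatenation promotes to the widest participating dtype.
combined = np.concatenate([narrow, default])
print(combined.dtype, combined.nbytes)
```

Any saving from the narrow block disappears as soon as the groups are combined, which is the reason given for flooring at the default.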

Tests: old overflow-guard tests replaced with widen-past-int32 tests (labels become int64, no raise); added coverage for fitting_label_dtype flooring/widening and for astype_labels not truncating values beyond the int32 ceiling. Full suite green locally (1857 passed), ruff + mypy clean.

- linopy/constants.py: Added DEFAULT_LABEL_DTYPE = np.int32.
- linopy/model.py: Variable and constraint label assignment now uses np.arange(..., dtype=DEFAULT_LABEL_DTYPE) with overflow guards that raise ValueError if labels exceed int32 max.
- linopy/expressions.py: _term coord assignment and all .astype(int) for vars arrays now use DEFAULT_LABEL_DTYPE (int32).
- linopy/common.py: fill_missing_coords uses np.arange(..., dtype=DEFAULT_LABEL_DTYPE). Polars schema inference now checks array.dtype.itemsize instead of the old OS/numpy-version hack.
- test/test_constraints.py: Updated 2 dtype assertions to use np.issubdtype instead of == int.
- test/test_dtypes.py (new): 7 tests covering int32 labels, expression vars, solve correctness, and overflow guards.
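For contrast with the widening behaviour, the fixed-dtype allocation with an overflow guard described in this commit message might look roughly like the sketch below. The function name `allocate_labels_guarded`, the exact guard condition, and the error text are assumptions for illustration only.

```python
import numpy as np

DEFAULT_LABEL_DTYPE = np.int32


def allocate_labels_guarded(start: int, size: int) -> np.ndarray:
    """Allocate labels start..start+size-1 in the fixed default dtype,
    raising instead of overflowing (hypothetical guard)."""
    end = start + size
    limit = np.iinfo(DEFAULT_LABEL_DTYPE).max
    if end - 1 > limit:  # largest label must fit the dtype
        raise ValueError(f"Label {end - 1} exceeds the int32 maximum {limit}.")
    return np.arange(start, end, dtype=DEFAULT_LABEL_DTYPE)


labels = allocate_labels_guarded(10, 5)
print(labels, labels.dtype)
```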

…k to int64 via astype(int), now use DEFAULT_LABEL_DTYPE. Also Variables.to_dataframe arange for map_labels.
- linopy/constraints.py: Constraints.to_dataframe arange for map_labels.
- linopy/common.py: save_join outer-join fallback was casting to int64.

…ords. Here's what changed:
- test_linear_expression_sum / test_linear_expression_sum_with_const: v.loc[:9].add(v.loc[10:], join="override") → v.loc[:9] + v.loc[10:].assign_coords(dim_2=v.loc[:9].coords["dim_2"])
- test_add_join_override → test_add_positional_assign_coords: uses v + disjoint.assign_coords(...)
- test_add_constant_join_override → test_add_constant_positional: now uses different coords [5,6,7] + assign_coords to make the test meaningful
- test_same_shape_add_join_override → test_same_shape_add_assign_coords: uses + c.to_linexpr().assign_coords(...)
- test_add_constant_override_positional → test_add_constant_positional_different_coords: expr + other.assign_coords(...)
- test_sub_constant_override → test_sub_constant_positional: expr - other.assign_coords(...)
- test_mul_constant_override_positional → test_mul_constant_positional: expr * other.assign_coords(...)
- test_div_constant_override_positional → test_div_constant_positional: expr / other.assign_coords(...)
- test_variable_mul_override → test_variable_mul_positional: a * other.assign_coords(...)
- test_variable_div_override → test_variable_div_positional: a / other.assign_coords(...)
- test_add_same_coords_all_joins: removed "override" from loop, added assign_coords variant
- test_add_scalar_with_explicit_join → test_add_scalar: simplified to expr + 10

- Move DEFAULT_LABEL_DTYPE from constants.py into options["label_dtype"]
- Widen OptionSettings types from int to Any
- Add validation: label_dtype only accepts np.int32 or np.int64
- Fix matrices.py empty clabels fallback to use configured dtype
- Fix f-string quoting and trailing spaces in overflow error messages
- Add -> None annotations and importorskip guard in test_dtypes.py
- Add tests for int64 override and invalid dtype rejection
- Add release notes entry

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
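The validation rule mentioned in this commit (label_dtype accepts only np.int32 or np.int64) could be sketched as below. The function name `validate_label_dtype` and the error message are hypothetical; how linopy wires this into its options object is not shown in this PR text.

```python
import numpy as np

# Assumed whitelist, taken from the commit message above.
_ALLOWED_LABEL_DTYPES = (np.int32, np.int64)


def validate_label_dtype(value):
    """Return value unchanged if it is an accepted label dtype,
    otherwise raise ValueError (hypothetical validator)."""
    if not any(value is allowed for allowed in _ALLOWED_LABEL_DTYPES):
        raise ValueError(
            f"label_dtype must be np.int32 or np.int64, got {value!r}"
        )
    return value


accepted = validate_label_dtype(np.int64)
print(accepted)
```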

Dimension coordinates (fill_missing_coords, _term coord) are small index arrays, not the large label/vars arrays that benefit from int32. xarray's index creation is slower with int32 than the default int64, causing a 13-38% build regression. Revert these to default int while keeping int32 for labels and vars where the memory savings matter.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

# Conflicts:
#   doc/release_notes.rst
#   linopy/common.py
#   linopy/config.py
#   linopy/matrices.py
#   linopy/model.py
#   linopy/variables.py
#   test/test_constraints.py

Derive each label allocation's int dtype from its known max value (`fitting_label_dtype`), floored at `options["label_dtype"]`. Models that fit the default keep a single predictable dtype (int32); models exceeding the int32 ceiling widen to a larger dtype instead of raising ValueError.

Update the label cast-back paths (ffill/bfill/sanitize, save_join, expression combines) to preserve the array's own width rather than hardcoding the default, so widened int64 labels are not silently truncated. ffill/bfill keep the source dtype directly; the float round-trip paths use `astype_labels`, which sizes the result to the actual max value.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

codspeed-hq · 2026-07-01T14:06:58Z

Merging this PR will improve performance by 25.46%

⚡ 77 improved benchmarks
✅ 96 untouched benchmarks
⏩ 845 skipped benchmarks¹

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
⚡	Memory	`test_to_lp[storage-n=250]`	1,319.2 KB	663 KB	+98.96%
⚡	Memory	`test_to_lp[storage-n=10]`	56.5 KB	30.3 KB	+86.29%
⚡	Memory	`test_to_lp[basic-n=250]`	2 MB	1.3 MB	+56.84%
⚡	Memory	`test_to_lp[kvl_cycles-severity=100]`	38.5 MB	25.6 MB	+49.99%
⚡	Memory	`test_to_lp[sparse_network-n=250]`	34.5 MB	23 MB	+49.99%
⚡	Memory	`test_to_lp[rolling-severity=100]`	45.8 MB	30.5 MB	+49.99%
⚡	Memory	`test_to_lp[kvl_cycles-severity=50]`	38.5 MB	25.6 MB	+49.99%
⚡	Memory	`test_to_lp[nodal_balance-severity=100]`	17.9 MB	11.9 MB	+49.98%
⚡	Memory	`test_to_lp[cumsum-severity=100]`	29.3 MB	19.5 MB	+49.98%
⚡	Memory	`test_to_lp[merge_balance-severity=100]`	17.6 MB	11.7 MB	+49.98%
⚡	Memory	`test_to_lp[nodal_balance-severity=50]`	9.2 MB	6.1 MB	+49.97%
⚡	Memory	`test_to_lp[merge_balance-severity=50]`	9 MB	6 MB	+49.97%
⚡	Memory	`test_to_lp[kvl_cycles-severity=0]`	38.6 MB	25.7 MB	+49.94%
⚡	Memory	`test_to_lp[rolling-severity=50]`	45.9 MB	30.6 MB	+49.93%
⚡	Memory	`test_to_lp[nodal_balance-severity=0]`	385.3 KB	258.8 KB	+48.91%
⚡	Memory	`test_to_lp[masked-n=100]`	238.4 KB	160.3 KB	+48.74%
⚡	Memory	`test_to_lp[milp-n=50]`	63.5 KB	43.9 KB	+44.45%
⚡	Memory	`test_to_lp[sparse_network-n=10]`	29.8 KB	21.2 KB	+40.5%
⚡	Memory	`test_to_lp[merge_balance-severity=0]`	367.8 KB	262.8 KB	+39.95%
⚡	Memory	`test_to_lp[cumsum-severity=50]`	9.8 MB	7.3 MB	+33.32%
...	...	...	...	...	...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
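The Efficiency column appears consistent with BASE / HEAD − 1. This is an inference from the displayed numbers, not a documented CodSpeed formula, and small deviations are expected because BASE and HEAD are shown rounded. The sketch below recomputes two rows and also shows the corresponding fraction of memory saved.

```python
def efficiency(base: float, head: float) -> float:
    """Assumed reading of the Efficiency column: BASE / HEAD - 1."""
    return base / head - 1.0


def reduction(base: float, head: float) -> float:
    """Fraction of the baseline memory that was saved."""
    return 1.0 - head / base


# (BASE, HEAD) pairs copied from the table; units cancel within each row.
rows = {
    "test_to_lp[storage-n=250]": (1319.2, 663.0),           # KB
    "test_to_lp[kvl_cycles-severity=100]": (38.5, 25.6),    # MB
}

for name, (base, head) in rows.items():
    print(f"{name}: efficiency={efficiency(base, head):+.2%}, "
          f"memory saved={reduction(base, head):.1%}")
```

Read this way, an efficiency near +100% corresponds to halving memory, while an efficiency near +50% corresponds to saving about one third.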


Comparing perf/label-best-int (d91aa29) with master (9a8ab88)

¹ 845 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, they can be archived to remove them from the performance reports.

FBumann · 2026-07-01T14:28:30Z

@codspeedbot explain why this is faster

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

codspeed-hq · 2026-07-01T14:29:04Z

@codspeedbot explain why this is faster

Here's why this PR shows up as faster.

It's a memory win, not a CPU win. All 77 improved benchmarks are in Memory mode — there are no wall-time or simulation regressions/improvements, and CodSpeed reports no environment differences between perf/label-best-int (7182e52) and master (9a8ab88). The headline "+25.53%" is the aggregate reduction in memory allocated during model building/export.

The root cause is the label dtype change. Variable and constraint labels are stored as dense integer arrays. Previously they defaulted to int64 (8 bytes/label); this stack floors them at int32 (4 bytes/label) and only widens when a model actually exceeds the int32 ceiling. Every label array — plus the intermediate arrays that carry labels through ffill/bfill, sanitize, expression init/combine, and the LP/solver export round-trips — is now half the size.

That explains the shape of the results:

Benchmarks whose memory is dominated by label arrays cluster right at ~+49.99%, e.g. test_to_lp[kvl_cycles-severity=100] 38.5 MB → 25.6 MB, sparse_network-n=250 34.5 MB → 23 MB, rolling-severity=100 45.8 MB → 30.5 MB. In these cases BASE is about 1.5× HEAD, i.e. roughly one third less memory rather than a full 2× drop: halving 8-byte labels to 4-byte labels halves the label arrays themselves, and the total presumably shrinks by less because non-label allocations are unchanged. The only row in the table close to a full 2× reduction is test_to_lp[storage-n=250] (1,319.2 KB → 663 KB, +98.96%).
Benchmarks where labels are only part of the footprint (float64 coefficients, solver matrices, other overhead sit alongside them) improve proportionally less, e.g. test_build[nodal_balance-severity=100] +10.5%, the test_op[...] arithmetic micro-benchmarks +8–11%.
The tiny-model cases (test_to_lp[basic-n=10] +13.5%) also improve only modestly, since fixed per-object overhead dominates when there are few labels.

So there's no algorithmic speedup here — the models build the same way, they just allocate roughly half as many bytes for label storage, and the biggest, label-heavy models benefit most.
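The byte arithmetic behind this explanation can be reproduced with plain NumPy arrays. The array sizes and the pairing of one float64 coefficient per label are made up for illustration and are not taken from the benchmarks.

```python
import numpy as np

n = 1_000_000  # arbitrary number of labels for illustration

labels64 = np.arange(n, dtype=np.int64)   # previous default: 8 bytes/label
labels32 = labels64.astype(np.int32)      # new floor: 4 bytes/label
coeffs = np.ones(n, dtype=np.float64)     # unchanged non-label data

# Label storage alone is halved.
label_ratio = labels64.nbytes / labels32.nbytes

# When labels share the footprint with float64 data, the total shrinks less.
total_before = labels64.nbytes + coeffs.nbytes
total_after = labels32.nbytes + coeffs.nbytes
total_ratio = total_before / total_after

print(label_ratio, total_before, total_after, round(total_ratio, 3))
```

With equal counts of labels and float64 values, the total goes from 16 bytes to 12 bytes per entry, so the overall ratio is about 1.33 even though the label arrays themselves are exactly half the size.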


FBumann and others added 12 commits February 1, 2026 19:29

Add memory benchmark

b5df113

bench: improve benchmark_lp_writer.py

d0a8c74

Add dtype tests

2f3e87e

Delete dev-scripts/benchmark_lp_writer.py

532126d

Merge remote-tracking branch 'origin/master' into perf/int32-resolve

52e3185

# Conflicts:
#   doc/release_notes.rst
#   linopy/common.py
#   linopy/config.py
#   linopy/matrices.py
#   linopy/model.py
#   linopy/variables.py
#   test/test_constraints.py

Merge master into perf/label-best-int

7182e52

FBumann changed the base branch from perf/int32 to master July 1, 2026 14:03

FBumann closed this Jul 1, 2026

FBumann reopened this Jul 1, 2026

ci: empty commit to re-trigger CodSpeed (baseline refresh)

d91aa29

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

FBumann wants to merge 13 commits into master from perf/label-best-int
