Skip to content

Use saturating arithmetic to avoid usize overflow panics (#1066, #1067, #1068)#1109

Open
patrickwehbe wants to merge 1 commit into
rust-itertools:masterfrom
patrickwehbe:fix-usize-max-overflow-panics
Open

Use saturating arithmetic to avoid usize overflow panics (#1066, #1067, #1068)#1109
patrickwehbe wants to merge 1 commit into
rust-itertools:masterfrom
patrickwehbe:fix-usize-max-overflow-panics

Conversation

@patrickwehbe

Copy link
Copy Markdown

Three Itertools methods can panic with an overflow in debug builds when
given very large inputs. Each one does an unchecked + 1 or * 2 on a
usize that can already be near the maximum. This switches them to
saturating arithmetic.

  • k_smallest_relaxed (issue Bug in k_smallest_relaxed? #1066): it allocates room for 2 * k items,
    which overflows once k is greater than usize::MAX / 2. Now it uses
    saturating_mul, so a huge k saturates to usize::MAX and the whole
    iterator is collected, the same result you already get from k_smallest.
  • peek_nth and peek_nth_mut (issue Peeking usize::MAX with peek_nth #1067): both compute n + 1, which
    overflows when n == usize::MAX. Now they use saturating_add, so
    peeking that far just returns None, which is what peeking past the end
    already does.
  • InterleaveShortest::size_hint (issue interleave_shortest's size_hint panics with large iterators. #1068): the lower bound is computed
    as combined_lower + 1, which overflows when combined_lower is
    usize::MAX. Now it uses saturating_add. The upper-bound branch a few
    lines down already uses checked_add, so this just brings the two into
    line.

I added a regression test for each in tests/test_std.rs that calls the
affected method with usize::MAX-ish input and checks it returns instead of
panicking.

The reporter (@Takashiidobe) suggested the saturating approach in #1066;
the same idea applies cleanly to all three.

Fixes #1066
Fixes #1067
Fixes #1068

Three methods could panic with an attempt-to-add/multiply-with-overflow
when given very large inputs, in debug builds:

- k_smallest_relaxed allocated room for `2 * k` items, which overflowed
  for k greater than usize::MAX / 2. Use saturating_mul so it saturates
  to usize::MAX and collects the whole iterator instead.
- peek_nth and peek_nth_mut computed `n + 1`, which overflowed for
  n == usize::MAX. Use saturating_add so peeking that far just returns
  None, matching the behaviour of peeking past the end.
- InterleaveShortest::size_hint computed `combined_lower + 1` for the
  lower bound, which overflowed when the combined lower bound was
  usize::MAX. Use saturating_add. The upper-bound branch right below
  already used checked_add, so this lines the two up.

Added regression tests for each in tests/test_std.rs.

Fixes rust-itertools#1066
Fixes rust-itertools#1067
Fixes rust-itertools#1068
@codecov

codecov Bot commented Jun 21, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.87%. Comparing base (6814180) to head (697856c).
⚠️ Report is 209 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1109      +/-   ##
==========================================
- Coverage   94.38%   93.87%   -0.51%     
==========================================
  Files          48       52       +4     
  Lines        6665     6682      +17     
==========================================
- Hits         6291     6273      -18     
- Misses        374      409      +35     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@phimuemue

phimuemue commented Jun 25, 2026

Copy link
Copy Markdown
Member

Hi there,
can you close this and split these into different PRs so that we review them separately?

I only had a look at the k_smallest_relaxed and I am hesitant to include it as proposed:

  • afaik k_smallest_relaxed benefits from the fact that it allocates 2*k items:
    • it can cheaply (read O(k)) populate the entries at indices k..2*k
    • it can cheaply (iirc estimated O(k)) rearrange the entries so that it can throw away the upper k entries
  • if the difference between the elements requested (k) and the entries reserved becomes smaller and smaller, this advantage vanishes.
    As an extreme case consider k==usize::MAX-1. The difference between k and "entries reserved" is 1, so we would have to rearrange after each and every item put into the buffer, which might lead to k^2 runtime.

If my assessment is correct, the better solution might be one of these in case k is too large:

  • Delegate to the classical k_smallest.
  • panic!, and document this.

Honestly, I would be ok with a documented panic, and see if we run into real-world scenarios where this is problematic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

interleave_shortest's size_hint panics with large iterators. Peeking usize::MAX with peek_nth Bug in k_smallest_relaxed?

2 participants