test: de-flake test_heatmap_reshape irregular-data case #707
`test_with_irregular_data` randomly dropped 30% of timestamps using the module-level `np.random` state. Under pytest-xdist the global RNG state at this test's execution depends on test ordering, so ~0.7% of runs dropped all four timestamps in the final hour, yielding 24 reshaped hours instead of 25 and failing the `== 25` assert. Use a local `np.random.default_rng(42)` and always retain the final timestamp, so the 25-hour span is deterministic regardless of test order. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
`tests/plotting/test_heatmap_reshape.py::test_with_irregular_data` is flaky (~0.7% of runs), failing with `assert 24 == 25`. It surfaced on PR #706's CI but is unrelated to that PR.
Root cause
The test drops 30% of timestamps using the module-level `np.random` state. Under `pytest-xdist`, the global RNG state when this test runs depends on which other tests ran first on the worker, so it is effectively random. When all four timestamps in the final hour (indices 96–99) happen to be dropped, `ffill` stops at hour 23, producing 24 reshaped hours instead of 25, and the hard `== 25` assert fails.
Exact failure probability:
`C(96,70)/C(100,70) ≈ 0.70%`. Not a dependency regression: `main` passed June 14 by luck.
Fix
Use a local `np.random.default_rng(42)` and always retain the final timestamp, so the 25-hour span is deterministic regardless of test order. Verified immune across 500 hostile global-RNG states (always 70 kept, max index 99).
🤖 Generated with Claude Code
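The reasoning above can be checked with a minimal sketch. This is not the test's actual code. It assumes 100 timestamps of which exactly 70 are kept, with indices 96–99 forming the final hour, as the description states. The helper `keep_indices` and the constant names are hypothetical, and the real test may select timestamps differently.

```python
import math

import numpy as np

N_TIMESTAMPS = 100  # assumed from the description: indices 0..99
N_KEEP = 70         # 30% of timestamps dropped

# Old behaviour: probability that a uniformly random 70-subset of 100
# indices contains none of the four final-hour indices (96..99).
p_fail = math.comb(96, 70) / math.comb(100, 70)
print(f"{p_fail:.2%}")  # prints 0.70%


def keep_indices(n=N_TIMESTAMPS, n_keep=N_KEEP, seed=42):
    """Hypothetical helper: choose n_keep sorted indices, always including n - 1."""
    rng = np.random.default_rng(seed)  # local generator, independent of np.random
    # Draw the other n_keep - 1 indices from 0..n-2 without replacement,
    # then add the final index unconditionally.
    others = rng.choice(n - 1, size=n_keep - 1, replace=False)
    return np.sort(np.append(others, n - 1))


# Disturb the global RNG in many different ways; the selection must not change.
baseline = keep_indices()
for hostile_seed in range(500):
    np.random.seed(hostile_seed)
    np.random.rand(hostile_seed % 7)  # advance the global state by a varying amount
    kept = keep_indices()
    assert len(kept) == N_KEEP and kept.max() == N_TIMESTAMPS - 1
    assert np.array_equal(kept, baseline)
```

Because the generator is created inside the helper, nothing another test does to `np.random` can reach it. Appending the last index explicitly, rather than relying on the seed to select it, keeps the 25-hour span intact even if the seed is later changed.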