Type stats follow renaming by marcotc · Pull Request #5805 · DataDog/dd-trace-rb

marcotc · 2026-05-22T05:30:06Z

What does this PR do?

Makes typing stats comparison follow renames and line moves for RBS declarations and steep:ignore comments.

Motivation:

Without this change, type reports currently show file renames as "introduced + cleared".

Untyped methods (+3 -3)
Partially typed methods (+5 -5)

Change log entry

None.

Additional Notes:

Tests were added for .github/scripts/typing_stats_compare.rb, for both old and new behavior.
To ensure test run, I added it to the same Raketask as our custom Rubocop cops (spec:custom_cop), but renamed the task to spec:tooling.
These tests do not run on spec:main.

How to test the change?

bundle exec rake spec:tooling

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 258e456f4c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-22T05:38:03Z

-      path: path.to_s,
-      line: ignore.line
-    }
+    ignored_source = source_lines[ignore.line - 1]&.strip


Distinguish block steep ignores by their ignored code

When the suppression is a block form (# steep:ignore:start), ignore.line points at the start comment itself, so ignored_source is also just that comment. Since the new comparison key intentionally ignores line numbers, two block suppressions with the same start comment in the same file become indistinguishable; removing one block and adding an unrelated block elsewhere in that file will cancel out and the typing-stats comment will hide the newly introduced suppression.

Useful? React with 👍 / 👎.

I agree, now that we're not comparing with line numbers anymore but the content of the ignored lines, we should probably use everything that is between the steep:ignore:start and steep:ignore:end (which would also correctly report any change between start and end). But maybe this should be for another PR

datadog-official · 2026-05-22T05:53:34Z

Tests

🎉 All green!

🧪 All tests passed
❄️ No new flaky tests detected

🎯 Code Coverage (details)
• Patch Coverage: 100.00%
• Overall Coverage: 97.07% (-0.03%)

_{This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 6331a4e | Docs | Datadog PR Page | Give us feedback!}

vpellan

Awesome work, thank you! I never really had time to go back to this tooling, but this should prevent a lot of noise due to renaming files or adding/removing lines. I still requested a change because as-is, this PR might not work as expected (see my comment in AST traversal)

vpellan · 2026-05-22T09:39:29Z

-      path: path.to_s,
-      line: ignore.line
-    }
+    ignored_source = source_lines[ignore.line - 1]&.strip


I agree, now that we're not comparing with line numbers anymore but the content of the ignored lines, we should probably use everything that is between the steep:ignore:start and steep:ignore:end (which would also correctly report any change between start and end). But maybe this should be for another PR

vpellan · 2026-05-22T09:51:27Z

         ::RBS::AST::Declarations::Class,
         ::RBS::AST::Declarations::Interface
-      ast_traversal(declaration.members, result)
+      ast_traversal(declaration.members, result, context + [declaration.name.to_s])


When you rename a file, you'll usually also rename the corresponding Module, including in the .rbs file, which means the context will be different. I don't see any operation on context in typing_stats_compare.rb so I assume it will report them as two different declaration and not fix the issue you're trying to solve here. We could probably normalize the file name, and when comparing the context, check every module name against their corresponding renamed one (although there's most likely some edge cases but it would work most of the time)

I'd also add a test for that (the current tests do not rename modules)

pr-commenter · 2026-05-27T21:51:08Z

Benchmarks

Benchmark execution time: 2026-05-27 21:51:05

Comparing candidate commit ba9e851 in PR branch typing-stats-deterministic-comparison with baseline commit f87530c in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 45 metrics, 1 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

🟩 = significantly better candidate vs. baseline
🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'

marcotc self-assigned this May 22, 2026

marcotc added the AI Generated Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos label May 22, 2026

marcotc force-pushed the typing-stats-deterministic-comparison branch 2 times, most recently from 3fd76e5 to 4d05ff2 Compare May 22, 2026 05:33

Type stats follow renaming

258e456

marcotc force-pushed the typing-stats-deterministic-comparison branch from 4d05ff2 to 258e456 Compare May 22, 2026 05:33

marcotc marked this pull request as ready for review May 22, 2026 05:35

marcotc requested a review from a team as a code owner May 22, 2026 05:35

chatgpt-codex-connector Bot reviewed May 22, 2026

View reviewed changes

vpellan requested changes May 22, 2026

View reviewed changes

Merge branch 'master' into typing-stats-deterministic-comparison

ba9e851

Fix for constant rename

6331a4e

Strech approved these changes Jun 1, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Type stats follow renaming#5805

Type stats follow renaming#5805
marcotc wants to merge 3 commits into
masterfrom
typing-stats-deterministic-comparison

marcotc commented May 22, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 22, 2026

Uh oh!

vpellan May 22, 2026

Uh oh!

datadog-official Bot commented May 22, 2026 •

edited by datadog-datadog-prod-us1-2 Bot

Loading

Uh oh!

vpellan left a comment

Uh oh!

vpellan May 22, 2026

Uh oh!

vpellan May 22, 2026

Uh oh!

pr-commenter Bot commented May 27, 2026

Explanation

More details about the CI and significant changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

marcotc commented May 22, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 22, 2026

Choose a reason for hiding this comment

Uh oh!

vpellan May 22, 2026

Choose a reason for hiding this comment

Uh oh!

datadog-official Bot commented May 22, 2026 • edited by datadog-datadog-prod-us1-2 Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vpellan left a comment

Choose a reason for hiding this comment

Uh oh!

vpellan May 22, 2026

Choose a reason for hiding this comment

Uh oh!

vpellan May 22, 2026

Choose a reason for hiding this comment

Uh oh!

pr-commenter Bot commented May 27, 2026

Benchmarks

Explanation

More details about the CI and significant changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

datadog-official Bot commented May 22, 2026 •

edited by datadog-datadog-prod-us1-2 Bot

Loading