perf: optimize DFA minimizer by reducing algorithmic complexity #166

Open
Tugamer89 wants to merge 1 commit into main from perf-optimize-dfa-minimizer-10479104078313539079

@Tugamer89

Owner

What

Optimized the DFA minimization algorithm in Minimizer.java by:

  1. Hoisting the dfa.getTransitionTable() call out of the state iteration loop in splitGroup.
  2. Adding an Arrays.fill(targets, -1) fast path in splitGroup so that states without any transitions skip the per-symbol loop over the alphabet.
  3. Simplifying buildMinimalDfa to iterate only over actual defined transitions (transitions.entrySet()) instead of checking every character in the alphabet via getDestination().
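
The first two changes can be sketched roughly as follows. This is a minimal illustration, not the code in Minimizer.java: the class name SplitGroupSketch, the table shape (Map<Integer, Map<Character, Integer>>), the groupOf array, and the String bucket key are all assumptions made for the example. The caller is assumed to fetch the transition table once and pass it in. Note that Arrays.fill is itself linear in the alphabet size, so the fast path saves the per-symbol map lookups rather than changing the asymptotic bound.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and data shapes are assumptions, not the project's API.
final class SplitGroupSketch {

    /** targets[i] is the group reached on alphabet.get(i), or -1 if there is no transition. */
    static int[] signature(Map<Character, Integer> row, List<Character> alphabet, int[] groupOf) {
        int[] targets = new int[alphabet.size()];
        Arrays.fill(targets, -1);
        if (row == null || row.isEmpty()) {
            return targets; // fast path: sink state, no per-symbol lookups needed
        }
        for (int i = 0; i < alphabet.size(); i++) {
            Integer dest = row.get(alphabet.get(i));
            if (dest != null) {
                targets[i] = groupOf[dest];
            }
        }
        return targets;
    }

    /** Splits one group into subgroups of states that share the same signature. */
    static List<List<Integer>> splitGroup(List<Integer> group,
                                          Map<Integer, Map<Character, Integer>> table,
                                          List<Character> alphabet,
                                          int[] groupOf) {
        // The table is a parameter: it is looked up once by the caller, not once per state.
        Map<String, List<Integer>> buckets = new LinkedHashMap<>();
        for (int state : group) {
            int[] sig = signature(table.get(state), alphabet, groupOf);
            buckets.computeIfAbsent(Arrays.toString(sig), k -> new ArrayList<>()).add(state);
        }
        return new ArrayList<>(buckets.values());
    }
}
```

Using a LinkedHashMap keeps the subgroups in first-seen order, which makes the result deterministic for testing.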

Why

  1. Previously, splitGroup fetched the transitionTable from the DFA once for every state in the group, which is redundant.
  2. Sink states (states with no outbound transitions) still caused the algorithm to iterate over the entire alphabet to assign -1, taking O(|Alphabet|) time per sink state.
  3. In buildMinimalDfa, looking up the destination for every character in the alphabet costs O(|States| * |Alphabet|) time, which is wasteful for sparse DFAs or regexes with large alphabets (e.g., wildcards). Iterating only over defined transitions reduces this to O(|Edges|).
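
The third point can be illustrated with a small sketch that rebuilds a transition table over groups by walking only the defined edges. The class BuildMinimalSketch, the nested-map table shape, and the groupOf array (original state to minimized state) are hypothetical and chosen for the example; the real buildMinimalDfa may differ, for instance by visiting only one representative per group.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and data shapes are assumptions, not the project's API.
final class BuildMinimalSketch {

    /** Maps every defined edge (from, symbol, to) onto (groupOf[from], symbol, groupOf[to]). */
    static Map<Integer, Map<Character, Integer>> buildMinimalTable(
            Map<Integer, Map<Character, Integer>> table, int[] groupOf) {
        Map<Integer, Map<Character, Integer>> minimal = new HashMap<>();
        for (Map.Entry<Integer, Map<Character, Integer>> row : table.entrySet()) {
            int fromGroup = groupOf[row.getKey()];
            Map<Character, Integer> out =
                    minimal.computeIfAbsent(fromGroup, g -> new HashMap<>());
            // Only defined transitions are visited: total work is O(|Edges|),
            // independent of the alphabet size.
            for (Map.Entry<Character, Integer> edge : row.getValue().entrySet()) {
                out.put(edge.getKey(), groupOf[edge.getValue()]);
            }
        }
        return minimal;
    }
}
```

Because equivalent states agree on every symbol's target group, writing the same (symbol, group) pair more than once is harmless in this sketch.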

Impact

Reduces the work done by the Minimizer, particularly for DFAs with sparse transitions, sink states, and large alphabets. buildMinimalDfa drops from O(|States| * |Alphabet|) to O(|Edges|), and splitGroup performs fewer redundant map lookups and iterations, which should make the compilation step faster.

Measurement

Run the test suite with mvn test -Dmaven.compiler.source=21 -Dmaven.compiler.target=21 -Dmaven.compiler.release=21 to verify that MinimizerTest and all other tests pass, which checks functional equivalence. The performance improvement can be measured by profiling the compilation pipeline on large inputs.
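
As a rough starting point for such a measurement, the sketch below compares a per-symbol alphabet scan with iteration over defined edges on a synthetic sparse table. Everything here is an assumption for illustration: the class MinimizerTimingSketch, the integer-coded symbols, and the chain-shaped table are not part of the project. System.nanoTime timings without warm-up are noisy, so a proper benchmark harness such as JMH would give more trustworthy numbers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch for a crude comparison; not the project's code.
final class MinimizerTimingSketch {

    /** Synthetic sparse table: each state has one edge, on symbol 0, to the next state. */
    static Map<Integer, Map<Integer, Integer>> chain(int states) {
        Map<Integer, Map<Integer, Integer>> table = new HashMap<>();
        for (int s = 0; s < states; s++) {
            Map<Integer, Integer> row = new HashMap<>();
            row.put(0, (s + 1) % states);
            table.put(s, row);
        }
        return table;
    }

    /** Old style: probe every symbol for every state, O(|States| * |Alphabet|). */
    static long countByAlphabetScan(Map<Integer, Map<Integer, Integer>> table, int alphabetSize) {
        long edges = 0;
        for (Map<Integer, Integer> row : table.values()) {
            for (int symbol = 0; symbol < alphabetSize; symbol++) {
                if (row.get(symbol) != null) {
                    edges++;
                }
            }
        }
        return edges;
    }

    /** New style: visit only defined transitions, O(|Edges|). */
    static long countByEdges(Map<Integer, Map<Integer, Integer>> table) {
        long edges = 0;
        for (Map<Integer, Integer> row : table.values()) {
            for (Map.Entry<Integer, Integer> ignored : row.entrySet()) {
                edges++;
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        Map<Integer, Map<Integer, Integer>> table = chain(2_000);
        long t0 = System.nanoTime();
        long scanned = countByAlphabetScan(table, 10_000);
        long t1 = System.nanoTime();
        long walked = countByEdges(table);
        long t2 = System.nanoTime();
        System.out.println("alphabet scan: " + scanned + " edges in " + (t1 - t0) / 1_000 + " us");
        System.out.println("edge walk:     " + walked + " edges in " + (t2 - t1) / 1_000 + " us");
    }
}
```

Both strategies must report the same edge count; only the amount of work needed to find those edges differs.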


PR created automatically by Jules for task 10479104078313539079 started by @Tugamer89

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
