perf: optimize DFA minimizer by reducing algorithmic complexity #166
Tugamer89 wants to merge 1 commit
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>

What
Optimized the DFA minimization algorithm in `Minimizer.java` by:
- Moving the `dfa.getTransitionTable()` call out of the state iteration loop in `splitGroup`.
- Adding an `Arrays.fill(targets, -1)` fast-path in `splitGroup` to skip iterating over the entire alphabet for states without any transitions.
- Changing `buildMinimalDfa` to iterate only over actual defined transitions (`transitions.entrySet()`) instead of checking every character in the alphabet via `getDestination()`.
Why
- `splitGroup` fetched the `transitionTable` from the DFA on every iteration of the group, which is redundant.
- For states without any transitions, `splitGroup` still walked the entire alphabet only to set every target to `-1`, taking `O(|Alphabet|)` time per sink state.
- In `buildMinimalDfa`, looking up the destination for every character in the alphabet takes `O(|States| * |Alphabet|)` time, which is highly inefficient for sparse DFAs or regexes with large alphabets (e.g., wildcards). Iterating only over defined transitions reduces this to `O(|Edges|)`.
Impact
Lowers the asymptotic cost of the Minimizer and its running time in practice, particularly for DFAs with sparse transitions, sink states, and large alphabets, by removing redundant map lookups and iterations. The compilation step should be faster as a result.
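As a rough illustration of the two `splitGroup` changes, the sketch below fetches the transition table once (it is passed in as a parameter instead of being re-read per state) and takes the `Arrays.fill(targets, -1)` path for states that have no outgoing transitions. This is not the code from `Minimizer.java`: the class name `SplitGroupSketch`, the method `targetsFor`, the `groupOf` array, and the `Map<Integer, Map<Character, Integer>>` table layout are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and data layout are assumptions, not the PR's code.
public class SplitGroupSketch {

    /**
     * For each state in the group, records which partition group is reached on
     * every alphabet symbol, using -1 where no transition is defined.
     *
     * @param table   transition table, fetched once by the caller (the hoisted call)
     * @param groupOf groupOf[s] is the current partition index of state s
     */
    static int[][] targetsFor(List<Integer> group,
                              Map<Integer, Map<Character, Integer>> table,
                              char[] alphabet,
                              int[] groupOf) {
        int[][] result = new int[group.size()][];
        for (int i = 0; i < group.size(); i++) {
            int[] targets = new int[alphabet.length];
            Map<Character, Integer> row = table.get(group.get(i));
            if (row == null || row.isEmpty()) {
                // Fast path: a state without transitions needs no per-symbol lookups.
                Arrays.fill(targets, -1);
            } else {
                for (int c = 0; c < alphabet.length; c++) {
                    Integer dest = row.get(alphabet[c]);
                    targets[c] = (dest == null) ? -1 : groupOf[dest];
                }
            }
            result[i] = targets;
        }
        return result;
    }

    public static void main(String[] args) {
        // State 0 goes to state 1 on 'a'; state 1 is a sink with no transitions.
        Map<Integer, Map<Character, Integer>> table =
                Map.of(0, Map.of('a', 1), 1, Map.of());
        int[][] targets = targetsFor(List.of(0, 1), table,
                new char[] {'a', 'b'}, new int[] {0, 1});
        System.out.println(Arrays.deepToString(targets));
    }
}
```

In this sketch the fast path avoids the per-symbol map lookups for sink states; the array fill itself still touches each slot once.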
Measurement
Run the test suite with `mvn test -Dmaven.compiler.source=21 -Dmaven.compiler.target=21 -Dmaven.compiler.release=21` to verify that `MinimizerTest` and all other tests pass, ensuring functional equivalence. The performance improvement can be measured by profiling the compilation pipeline on large inputs.

PR created automatically by Jules for task 10479104078313539079 started by @Tugamer89
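Alongside profiling, the `buildMinimalDfa` change can be sanity-checked by counting how many (state, symbol) pairs each strategy visits on a sparse DFA. The sketch below is only an illustration under assumptions: `BuildCountSketch`, `buildDense`, `buildSparse`, the `groupOf` array, and the nested-map table layout are hypothetical and do not come from `Minimizer.java`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and data layout are assumptions, not the PR's code.
public class BuildCountSketch {

    /** Dense strategy: probes every alphabet symbol for every state. */
    static Map<Integer, Map<Character, Integer>> buildDense(
            Map<Integer, Map<Character, Integer>> table,
            char[] alphabet, int[] groupOf, long[] counter) {
        Map<Integer, Map<Character, Integer>> out = new HashMap<>();
        for (Map.Entry<Integer, Map<Character, Integer>> e : table.entrySet()) {
            int from = groupOf[e.getKey()];
            for (char c : alphabet) {
                counter[0]++; // one lookup per (state, symbol) pair
                Integer dest = e.getValue().get(c);
                if (dest != null) {
                    out.computeIfAbsent(from, k -> new HashMap<>()).put(c, groupOf[dest]);
                }
            }
        }
        return out;
    }

    /** Sparse strategy: visits only the transitions that are actually defined. */
    static Map<Integer, Map<Character, Integer>> buildSparse(
            Map<Integer, Map<Character, Integer>> table,
            int[] groupOf, long[] counter) {
        Map<Integer, Map<Character, Integer>> out = new HashMap<>();
        for (Map.Entry<Integer, Map<Character, Integer>> e : table.entrySet()) {
            int from = groupOf[e.getKey()];
            for (Map.Entry<Character, Integer> t : e.getValue().entrySet()) {
                counter[0]++; // one visit per defined edge
                out.computeIfAbsent(from, k -> new HashMap<>())
                   .put(t.getKey(), groupOf[t.getValue()]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Three states, two edges, and a 256-symbol alphabet.
        Map<Integer, Map<Character, Integer>> table = new HashMap<>();
        table.put(0, Map.of('a', 1));
        table.put(1, Map.of('b', 2));
        table.put(2, Map.of());
        char[] alphabet = new char[256];
        for (int i = 0; i < alphabet.length; i++) {
            alphabet[i] = (char) i;
        }
        int[] groupOf = {0, 1, 2}; // every state is its own group here
        long[] dense = {0};
        long[] sparse = {0};
        boolean same = buildDense(table, alphabet, groupOf, dense)
                .equals(buildSparse(table, groupOf, sparse));
        System.out.println("same result: " + same);
        System.out.println("dense lookups: " + dense[0] + ", sparse visits: " + sparse[0]);
    }
}
```

Counting visits gives a deterministic proxy for the `O(|States| * |Alphabet|)` versus `O(|Edges|)` difference; wall-clock gains on real inputs would still need to be confirmed with a profiler.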