62 commits
495751c
Add generic trait for method parameters
alexvbrdn Jul 1, 2025
8abe573
WIP
alexvbrdn Jul 6, 2025
1e7ec95
WIP
alexvbrdn Jul 8, 2025
9cf30a6
WIP
alexvbrdn Jul 10, 2025
671f3a3
WIP
alexvbrdn Jul 10, 2025
75c06b3
add parallel intersection
alexvbrdn Jul 27, 2025
b37ef65
WIP
alexvbrdn Jul 28, 2025
3ea0dec
update readme
alexvbrdn Jul 29, 2025
a47c779
rename methods
alexvbrdn Aug 2, 2025
bcc9d7d
Update description
alexvbrdn Aug 2, 2025
8d5b66e
Update docs
alexvbrdn Aug 2, 2025
90c462b
update
alexvbrdn Aug 3, 2025
7884f73
update most descriptions
alexvbrdn Aug 3, 2025
eb79826
fix bench
alexvbrdn Aug 3, 2025
691a972
fix docs test
alexvbrdn Aug 3, 2025
4fe1d94
update docs
alexvbrdn Aug 3, 2025
a42c87a
Update README.md
alexvbrdn Aug 4, 2025
aec5c39
Update naming and docs
alexvbrdn Aug 4, 2025
19aef3f
improve test
alexvbrdn Aug 4, 2025
6878356
Fix bad repetition case
alexvbrdn Aug 6, 2025
29697f8
fix algo repeat
alexvbrdn Aug 8, 2025
e24624e
update serialization
alexvbrdn Aug 9, 2025
f874caa
remove some errors
alexvbrdn Aug 11, 2025
c2cc842
Change regex convertion algo
alexvbrdn Sep 16, 2025
135cca6
update tests
alexvbrdn Sep 16, 2025
c3d800a
fix clippy
alexvbrdn Sep 16, 2025
afbacc6
add test
alexvbrdn Sep 16, 2025
7b576f7
update readme
alexvbrdn Sep 16, 2025
0a0d91b
update readme
alexvbrdn Sep 16, 2025
05b6802
update method signature
alexvbrdn Sep 16, 2025
a2dc371
add concat all for regex
alexvbrdn Sep 17, 2025
7afac62
update docs
alexvbrdn Sep 19, 2025
2119ea6
update doc
alexvbrdn Sep 19, 2025
0315e7d
additional updates
alexvbrdn Sep 19, 2025
1fd6bfc
update readme
alexvbrdn Sep 19, 2025
499735d
update docs
alexvbrdn Sep 20, 2025
b67597d
update method signatures
alexvbrdn Sep 21, 2025
c243522
fix failed build
alexvbrdn Sep 21, 2025
02852f9
fix serialization
alexvbrdn Sep 24, 2025
9a1266f
Huge improvements in generate strings
alexvbrdn Oct 2, 2025
fb5eb1a
Fix bad implementation of to_embedding
alexvbrdn Oct 3, 2025
1e4980b
improve assert_not_timed_out clock cycle
alexvbrdn Oct 3, 2025
68e87c4
Parallelize state selection for elimination
alexvbrdn Oct 8, 2025
863fdce
Fix misuse of hashmap for determinize
alexvbrdn Oct 8, 2025
2dcf141
improve generate_strings + fix clippy issues
alexvbrdn Mar 17, 2026
06377a0
Fix generate_strings recomputing strings in offset
alexvbrdn Mar 18, 2026
cac1a5d
Add Hopcroft minimization algorithm
alexvbrdn Mar 21, 2026
284bad6
Update generate_strings
alexvbrdn Mar 21, 2026
aced1ab
Fix generate_strings
alexvbrdn Mar 21, 2026
7c10284
Fix buggy automaton to regex conversion
alexvbrdn Apr 18, 2026
0cc654c
remove serialization
alexvbrdn Apr 19, 2026
cc793ff
Make remove_dead_transitions public
alexvbrdn Apr 23, 2026
5856ebc
Rename method
alexvbrdn Apr 23, 2026
4b41ff3
Add new method to unaccept a state
alexvbrdn May 6, 2026
9bba5b4
Make methods public
alexvbrdn May 23, 2026
2529e58
A lot of small bug fixes
alexvbrdn Jun 3, 2026
d8d8fef
WIP: Improve testing and redesign lib
alexvbrdn Jun 4, 2026
d37ac9a
Big refactoring
alexvbrdn Jun 5, 2026
4c2f063
Update README.md
alexvbrdn Jun 5, 2026
eacd8a9
Remove minimize execution profile
alexvbrdn Jun 5, 2026
518ec3b
Some lib updates
alexvbrdn Jun 9, 2026
7fd08dc
Big refactoring
alexvbrdn Jun 14, 2026
35 changes: 26 additions & 9 deletions .github/workflows/rust.yml
@@ -2,23 +2,40 @@ name: Rust

on:
push:
branches: [ "main" ]
branches: ["main"]
pull_request:
branches: [ "main" ]
branches: ["main"]

env:
CARGO_TERM_COLOR: always
RUSTDOCFLAGS: -D warnings

jobs:
build:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v4
- name: Format
run: cargo fmt --all -- --check
- name: Build
run: cargo build --verbose
- name: Test & Lint
run: |
cargo test
cargo clippy -- -D warnings
- name: Test & Lint (no default features)
run: |
cargo test --no-default-features
cargo clippy --no-default-features -- -D warnings
- name: Docs
run: cargo doc --no-deps

audit:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v4
- name: Build
run: cargo build --all-features --verbose
- name: Test & Lint
run: |
cargo test --all-features
cargo clippy --all-features
- uses: actions/checkout@v4
- uses: rustsec/audit-check@v2
with:
token: ${{ secrets.GITHUB_TOKEN }}
5 changes: 4 additions & 1 deletion .gitignore
@@ -18,4 +18,7 @@ Cargo.lock
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
#.idea/

# cargo mutants output
mutants.out*/
105 changes: 105 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,105 @@
# Changelog

All notable changes to this crate are documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased] - upcoming 1.0.0

This is a major redesign of the public API around the `Term` enum (wrapping
either a `RegularExpression` or a `FastAutomaton`), which dispatches each
operation to the cheaper representation when possible. See the crate-level
docs for the architecture. Almost the entire public surface changed; highlights
below.

### Added
- New `Term` constructors and conversions: `new_empty`, `new_total`,
`new_empty_string`, `from_pattern`, `from_regex(RegularExpression)`,
`from_automaton(FastAutomaton)`, plus `From<RegularExpression>`,
`From<FastAutomaton>`, `FromStr`, `Display`, and `Default` (= `new_empty`).
- New `Term` operations: `concat`, `complement`, `determinize`, `minimize`,
`matches`, `is_deterministic`, `is_minimal`, `is_finite`, `to_pattern`, and
`iter_strings` (lazy `StringGenerator` iterator).
- `to_regex`/`to_automaton` now return `Cow` to avoid unnecessary cloning.
- `FastAutomaton` gained corresponding low-level constructors/operations
(`new_empty`, `new_total`, `new_empty_string`, `determinize`, `minimize`
using Hopcroft's algorithm, `is_minimal`, `unaccept`, `print_dot`,
`try_add_transition`) and inspection helpers (`states`, `direct_states`,
`transitions_from`, `transitions_to_vec`, `has_transition`, ...).
- New `EngineError` variants: `InvalidRepetitionBounds`,
`IncompatibleSpanningSet`, `DeterministicAutomatonRequired`; the enum is
now `#[non_exhaustive]`.
- `tracing` instrumentation on the core `Term`, `FastAutomaton`, and
`RegularExpression` operations (concat, union, intersection, difference,
complement, repeat, determinize, minimize, equivalence/subset checks,
string generation, conversions). No-op unless a `tracing` subscriber is
installed.
- Parallel (Rayon-backed) variants of union/intersection for >3 operands,
gated behind the default-on `parallel` feature, with sequential fallbacks
for `--no-default-features`.
- `cargo fmt --check`, `cargo clippy -- -D warnings`, `cargo doc --no-deps`
(with `RUSTDOCFLAGS=-D warnings`), and a dependency vulnerability audit
(`rustsec/audit-check`) to CI.
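
The `Cow`-returning conversions listed above can be pictured with a minimal sketch. It assumes a two-variant `Term` like the one the changelog describes; `Regex`, `Automaton`, and the placeholder conversion are hypothetical stand-ins rather than the crate's types, and the real methods may have a different signature (for example, they may return a `Result`).

```rust
use std::borrow::Cow;

// Hypothetical stand-ins for the crate's representations; not the real definitions.
#[derive(Clone, Debug, PartialEq)]
struct Regex(String);

#[derive(Clone, Debug, PartialEq)]
struct Automaton(String);

// Assumed shape: a term wraps exactly one of the two representations.
#[derive(Clone, Debug, PartialEq)]
enum Term {
    Regex(Regex),
    Automaton(Automaton),
}

impl Term {
    // Borrow when the term already holds a regex; allocate only when a
    // conversion is actually needed.
    fn to_regex(&self) -> Cow<'_, Regex> {
        match self {
            Term::Regex(r) => Cow::Borrowed(r),
            // Placeholder conversion: a real one would derive a regex from
            // the automaton instead of copying a string.
            Term::Automaton(a) => Cow::Owned(Regex(a.0.clone())),
        }
    }
}

fn main() {
    let already_regex = Term::Regex(Regex("a*".to_string()));
    let needs_conversion = Term::Automaton(Automaton("b+".to_string()));

    // No clone happens here: the caller gets a reference to the stored regex.
    assert!(matches!(already_regex.to_regex(), Cow::Borrowed(_)));
    // Here a new value is built, so the caller receives an owned regex.
    assert!(matches!(needs_conversion.to_regex(), Cow::Owned(_)));
}
```

Callers that only read the result can use the `Cow` through `Deref`; those that need ownership can call `into_owned()`, which clones only in the borrowed case.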

### Changed
- `ExecutionProfile` redesigned as an immutable, thread-local-aware config
built via the new `ExecutionProfileBuilder`, governing execution timeouts,
state-count limits, and an `implicit_determinization` toggle.
- `union`/`intersection`/`concat` now take
`impl IntoIterator<Item = impl Borrow<Term>>` instead of `&[Term]`, so
`&[a, b]`, `[&a, &b]`, and `Vec<Term>` all work without cloning.
- `repeat` now takes `impl RangeBounds<u32>` (e.g. `3..6`, `..=2`) instead of
explicit min/max parameters.
- `generate_strings` now takes `(limit, offset)` for pagination instead of a
single `count`.
- `is_empty`, `is_total`, and `is_empty_string` now return
`Result<bool, EngineError>` instead of `bool`.
- `are_equivalent`/`is_subset_of` renamed to `equivalent`/`subset`.
- `subtraction` renamed to `difference`, kept single-operand by design.
- Renamed `FastAutomaton::as_dot` to `to_dot` (old printing `to_dot` is now
`print_dot`), matching the crate's `to_*` convention for allocating
conversions (`to_pattern`, `to_regex`, `to_automaton`, `to_range`).
- Renamed `get_*` accessors to drop the `get_` prefix, per the Rust API
Guidelines' C-GETTER convention: `Term::get_length` to `length`,
`Term::get_cardinality` to `cardinality`, `FastAutomaton::get_length` to
`length`, `FastAutomaton::get_cardinality` to `cardinality`,
`FastAutomaton::get_number_of_states` to `number_of_states`,
`FastAutomaton::get_condition` to `condition`,
`FastAutomaton::get_start_state` to `start_state`,
`FastAutomaton::get_accept_states` to `accept_states`,
`FastAutomaton::get_spanning_set` to `spanning_set`,
`FastAutomaton::get_live_states` to `live_states`,
`FastAutomaton::get_spanning_bases` to `spanning_bases`,
`RegularExpression::get_length` to `length`,
`RegularExpression::get_cardinality` to `cardinality`,
`SpanningSet::get_spanning_ranges` to `spanning_ranges`,
`SpanningSet::get_number_of_spanning_ranges` to
`number_of_spanning_ranges`, `SpanningSet::get_spanning_range` to
`spanning_range`, `SpanningSet::get_rest` to `rest`,
`Condition::get_cardinality` to `cardinality`,
`Condition::get_binary_representation` to `binary_representation`,
`ConditionConverter::get_from_spanning_set`/`get_to_spanning_set` to
`from_spanning_set`/`to_spanning_set`.
- Edition bumped to 2024 and `Cargo.toml` metadata (`description`,
`categories`) updated.
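
The two signature changes above (`impl IntoIterator<Item = impl Borrow<Term>>` and `impl RangeBounds<u32>`) can be sketched on a toy type. `Lang`, `union_all`, and `repeat_bounds` below are hypothetical names used for illustration only; they show how such bounds accept several call shapes and are not the crate's implementation.

```rust
use std::borrow::Borrow;
use std::collections::BTreeSet;
use std::ops::{Bound, RangeBounds};

// Hypothetical stand-in for `Term`: a finite set of strings.
#[derive(Clone, Debug, PartialEq)]
struct Lang(BTreeSet<String>);

impl Lang {
    fn of(words: &[&str]) -> Self {
        Lang(words.iter().map(|w| w.to_string()).collect())
    }

    // Accepts owned values or references, mirroring the bound in the changelog.
    fn union_all(terms: impl IntoIterator<Item = impl Borrow<Lang>>) -> Lang {
        let mut out = BTreeSet::new();
        for t in terms {
            out.extend(t.borrow().0.iter().cloned());
        }
        Lang(out)
    }
}

// Normalizes any `u32` range into (min, optional inclusive max).
// Returns `None` for empty or overflowing ranges (an assumption made for
// this sketch, not the crate's documented behavior).
fn repeat_bounds(range: impl RangeBounds<u32>) -> Option<(u32, Option<u32>)> {
    let min = match range.start_bound() {
        Bound::Included(&n) => n,
        Bound::Excluded(&n) => n.checked_add(1)?,
        Bound::Unbounded => 0,
    };
    let max = match range.end_bound() {
        Bound::Included(&n) => Some(n),
        Bound::Excluded(&n) => Some(n.checked_sub(1)?),
        Bound::Unbounded => None,
    };
    match max {
        Some(m) if m < min => None,
        _ => Some((min, max)),
    }
}

fn main() {
    let a = Lang::of(&["x"]);
    let b = Lang::of(&["y"]);
    let expected = Lang::of(&["x", "y"]);

    // A borrowed slice, an array of references, and an owned Vec all work.
    assert_eq!(Lang::union_all(&[a.clone(), b.clone()]), expected);
    assert_eq!(Lang::union_all([&a, &b]), expected);
    assert_eq!(Lang::union_all(vec![a, b]), expected);

    assert_eq!(repeat_bounds(3..6), Some((3, Some(5))));
    assert_eq!(repeat_bounds(..=2), Some((0, Some(2))));
    assert_eq!(repeat_bounds(2..), Some((2, None)));
    assert_eq!(repeat_bounds(5..5), None);
}
```

Accepting `Borrow<T>` items lets callers pass references without cloning, and `RangeBounds<u32>` lets one parameter cover bounded, half-open, and unbounded repetition counts.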

### Removed
- The `serde` feature and all serialization, FAIR (base85) encoding,
encryption, and compression support (`serde`, `ciborium`, `z85`,
`aes-gcm-siv`, `sha2`, `flate2` dependencies).
- `Term::get_details` and the `Details` type.
- The `tokenizer` module.
- Unused `log`, `rand`, and `lazy_static` dependencies, and the `regex`
crate dependency (now dev-only, used by integration tests).
- `EngineError` variants `AutomatonShouldBeDeterministic`, `TooMuchTerms`,
`ConditionIndexOutOfBound`, `TokenError`, and the `is_server_error` method.
- The `max_number_of_terms` execution-profile limit (no longer enforced).

## Earlier releases

Releases prior to 1.0.0 (`v0.1.0` through `v0.3.1`) predate this changelog;
see the [GitHub tags](https://github.com/RegexSolver/regexsolver/tags) and
commit history for details.

[Unreleased]: https://github.com/RegexSolver/regexsolver/compare/v0.3.1...HEAD
48 changes: 16 additions & 32 deletions Cargo.toml
@@ -1,50 +1,34 @@
[package]
name = "regexsolver"
version = "0.3.1"
edition = "2021"
version = "1.0.0"
edition = "2024"
authors = ["Alexandre van Beurden"]
repository = "https://github.com/RegexSolver/regexsolver"
license = "MIT"
keywords = ["automaton", "intersection", "union", "difference", "regex"]
description = "Manipulate regex and automaton as if they were sets."
categories = ["text-processing", "mathematics", "algorithms"]
description = "High-performance Rust library for building, combining, and analyzing regular expressions and finite automata"
readme = "README.md"

[dependencies]
serde = { version = "1.0", features = ["derive"], optional = true }
ciborium = { version = "0.2.2", optional = true }
z85 = { version = "3.0.5", optional = true }
aes-gcm-siv = { version = "0.11.1", optional = true }
sha2 = { version = "0.10.8", optional = true }
flate2 = { version = "1.0.30", features = [
"zlib-ng",
], default-features = false, optional = true }
tracing = "0.1"
nohash-hasher = "0.2"
ahash = "0.8.11"
log = "0.4.21"
rand = "0.8.5"
lazy_static = "1.4.0"
regex = "1.10.3"
regex-syntax = "0.8.5"
regex-charclass = { version = "1.0.3" }
rayon = { version = "1.10.0", optional = true }
bit-set = "0.8.0"
indexmap = "2.13.0"

[features]
default = ["parallel"]
parallel = ["dep:rayon"]

[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }
env_logger = "0.11.3"
serde_json = "1.0.114"


[features]
default = ["serde"]
serde = [
"regex-charclass/serde",
"dep:serde",
"dep:ciborium",
"dep:z85",
"dep:aes-gcm-siv",
"dep:sha2",
"dep:flate2",
]
proptest = "1"
regex = "1.10.3"

[[bench]]
name = "my_benchmark"
harness = false
name = "operations"
harness = false