Add a mathematical constraint system by kripken · Pull Request #8816 · WebAssembly/binaryen

kripken · 2026-06-08T23:12:29Z

This allows defining constraints like { x >= 0 && x <= 100 } and then checking whether they
imply that something else is true or false, like { x >= 0 && x <= 100 } => { x < 9999 }
(an example of a valid inference).

This is the minimal first part of such a system, focusing on ==, !=, and very simple
solving. Putting up for design feedback before I work in depth on the rest.
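
To make the "==, !=, and very simple solving" scope concrete, here is a minimal sketch of a three-valued implication check over constants. The names `Op`, `Fact`, `Result`, and `proves` are hypothetical and chosen only for this illustration; the PR's own types differ (for example, it reuses `Abstract::Op`).

```cpp
#include <cstdint>

// Hypothetical names for illustration only; not the PR's actual API.
enum class Op { Eq, Ne };
enum class Result { True, False, Unknown };

// A fact about a single variable x: "x == value" or "x != value".
struct Fact {
  Op op;
  int64_t value;
};

// Does `known` imply `query`? Both are facts about the same variable.
Result proves(const Fact& known, const Fact& query) {
  if (known.op == Op::Eq) {
    // x is fully determined, so any query can be decided.
    bool holds = (query.op == Op::Eq) ? (known.value == query.value)
                                      : (known.value != query.value);
    return holds ? Result::True : Result::False;
  }
  // From here on, known is "x != c".
  if (query.value == known.value) {
    // x != c proves x != c and refutes x == c.
    return query.op == Op::Ne ? Result::True : Result::False;
  }
  // x != c says nothing about any other constant.
  return Result::Unknown;
}
```

For instance, under these assumptions `proves({Op::Eq, 10}, {Op::Ne, 0})` is `True`, while `proves({Op::Ne, 0}, {Op::Eq, 5})` is `Unknown`.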

Next steps are to add >=, < etc., and to add a pass that uses this in a control-flow
aware way, that is, the goal is to optimize things like

if (x > 10) {
   assert(x > 0); // this can be removed
}

This is important to remove userspace bounds checks for Kotlin (and likely Java).

inplace_vector part here is from #8814 (will rebase once it lands).

tlively

I highly recommend explicitly framing the constraint space as a lattice:

Both and_ and fuzzyOr are effectively merging constraints. You want both (but especially fuzzyOr) to have all the properties of a lattice join operator: monotonicity, associativity, commutativity, idempotency, etc. You also want fuzzyOr to be as precise as possible; it has to lose some precision sometimes, but you only want it to lose as much precision as necessary given the representation of constraints. So you want it to be a least upper bound, i.e. a join.
Making the constraint space a lattice will give you all the nice properties you want for using it in a program analysis: order-independence, guaranteed convergence, etc. It also reduces all the novelty and complexity to just generating the constraints in the first place; getting to the fixed point after that is just the classic worklist + graph traversal pattern.
Making the constraint space a lattice will let you test it in the lattice fuzzer, which can do a better job than just unit tests alone of making sure it has all the properties we want, including that we do not unnecessarily lose precision in the merge operation.
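
The join properties listed above can be checked mechanically. The following is a rough sketch of such a check over a handful of sample elements, using a simple interval join as the operation under test. It is not Binaryen's lattice fuzzer; `Interval`, `join`, and `joinLawsHold` are hypothetical names used only here.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// A closed integer interval [lo, hi], used only as a sample element type.
using Interval = std::pair<int, int>;

// Join as least upper bound: the smallest interval containing both inputs.
Interval join(const Interval& a, const Interval& b) {
  return {std::min(a.first, b.first), std::max(a.second, b.second)};
}

// Check the algebraic laws a join must satisfy on all sample combinations.
bool joinLawsHold(const std::vector<Interval>& samples) {
  for (const auto& a : samples) {
    if (join(a, a) != a) {
      return false; // idempotency
    }
    for (const auto& b : samples) {
      if (join(a, b) != join(b, a)) {
        return false; // commutativity
      }
      for (const auto& c : samples) {
        if (join(join(a, b), c) != join(a, join(b, c))) {
          return false; // associativity
        }
      }
    }
  }
  return true;
}
```

A real fuzzer would generate the samples randomly and would also check that the join is an upper bound of both inputs; this sketch only covers the three algebraic laws.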

tlively · 2026-06-08T23:28:49Z

+
+// We limit constraints to a low number to ensure good performance even with
+// simple brute-force solving.
+// TODO: use a generic constraint solver..?


I did have that POC for pulling in Z3. In the limit I guess that's what we'd want. 5c2bbb7

tlively · 2026-06-08T23:32:15Z

+  //   { this } => { condition }
+  //
+  // https://en.wikipedia.org/wiki/Material_conditional#Truth_table
+  Result check(const Constraint& condition) const;


Perhaps proves or implies?

Hmm, yeah. Another option is eval as @MaxGraey suggests?

Thinking more on this, I think that x.implies(y) is not quite right, as it reads as 'x implies y' i.e. x is not checking if it implies y, but sounds like a new constraint, that makes x somehow imply y. Ditto for x.proves(y).

So checkImplies might work, but is longer than check? I am leaning towards check or eval.

I would strongly prefer some variant of proves or implies. I don't think those names sound like they're adding new constraints, but something like checkImplication would also be fine with me. eval does not suggest the correct operation to me.

(FWIW, the tests use "proves" in their comments)

Ok, I don't feel strongly. Renamed to proves.

MaxGraey · 2026-06-09T07:41:13Z

That's awesome!

Have you considered more academic and conventional naming for lattice-like stuff?

Value -> Term
Result -> KnownTruth
ConstraintSet -> Conjunction

check(conj) -> eval(conj)
and_(conj) -> meet(conj) / meetWith(conj)
fuzzyOr(conj) -> join(conj) / joinWith(conj)

or something like this?

kripken · 2026-06-09T16:36:29Z

@tlively Definitely making this a lattice would have benefits, but it would add overhead and complexity, I worry. Specifically, having a limited capacity (number of constraints in a set), as in the current design, is really nice for efficiency, but makes it not a lattice. Here is a concrete example. For a lattice we need this absorption law: (a ^ b) v b == b. Take

a = { x >= 10 && x <= 20 }  ;; span of numbers: 10, 11, .., to 20
b = { x & 1 }               ;; all odd numbers

a ^ b should be the set of odd numbers in that range, i.e., 11, 13, .., 19. However, that can't be written if the capacity is 2. So a ^ b loses something. That doesn't mean it isn't useful! We can define a ^ b to contain any 2 of the 3 constraints being combined (this can prove fewer things, but more than nothing). E.g. a ^ b = a (just ignore b). But then

(a ^ b) v b == a v b != b

which breaks the absorption rule.

(This is sort of parallel to the issue with multiple constants in possible-constants - we only support one constant, not an arbitrary number. An arbitrary number is necessary for all the nice mathematical properties we want, but the overhead isn't worth it in GUFA.)
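
The counterexample above can be reproduced with a small model. This is only a sketch under simplifying assumptions: constraints are opaque string labels, the bounded meet keeps the left side and drops whatever does not fit, and the join keeps only constraints present on both sides. `ConstraintSet`, `Capacity`, `boundedMeet`, `join`, and `absorptionHolds` are hypothetical names, not code from this PR.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical model: a conjunction is a set of opaque constraint labels.
using ConstraintSet = std::set<std::string>;

constexpr std::size_t Capacity = 2;

// Bounded meet (AND): keep everything in `a`, then add constraints from `b`
// only while there is room. Anything that does not fit is dropped.
ConstraintSet boundedMeet(ConstraintSet a, const ConstraintSet& b) {
  for (const auto& c : b) {
    if (a.size() < Capacity) {
      a.insert(c);
    }
  }
  return a;
}

// Simplified join (OR): keep only the constraints that both sides guarantee.
ConstraintSet join(const ConstraintSet& a, const ConstraintSet& b) {
  ConstraintSet result;
  for (const auto& c : a) {
    if (b.count(c)) {
      result.insert(c);
    }
  }
  return result;
}

// Absorption requires (a ^ b) v b == b.
bool absorptionHolds(const ConstraintSet& a, const ConstraintSet& b) {
  return join(boundedMeet(a, b), b) == b;
}
```

With a = { "x >= 10", "x <= 20" } and b = { "x & 1" }, the bounded meet returns a unchanged, the join with b is then empty, and `absorptionHolds(a, b)` is false. With a single constraint in a there is room for b, and absorption holds.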

kripken · 2026-06-09T16:39:45Z

@MaxGraey

Value -> Term

Good idea, I think that makes sense.

Result -> KnownTruth

I think this is clear enough already, and shorter?

ConstraintSet -> Conjunction

I left this intentionally vague as this may expand in the future. A set of constraints is, atm, a conjunction, but if we find a nice way to allow OR and not just AND, we should add it. The idea is, conceptually, a set of constraints that can prove things.

MaxGraey · 2026-06-09T16:58:49Z

Btw, Binaryen already has infrastructure for basic semilattices and full lattices: https://github.com/WebAssembly/binaryen/blob/main/src/analysis/lattice.h and https://github.com/WebAssembly/binaryen/blob/main/src/analysis/lattices/abstraction.h. So how about this?

class LowerBound : Lattice { ... }
class UpperBound : FullLattice { ... }
class RangeBound : FullLattice { ... }

tlively · 2026-06-10T00:14:11Z

No, that's exactly correct. See the parenthetical note I added to my previous comment in an edit. And it's mostly fine that it's a semilattice because the generic worklist algorithm that propagates information to find a fixed point only does joins. The only catch is that the transfer function will use boundedMeet, which will not be monotonic :( AFAICT, this means that we might not get order-independence after :(((( But the factoring of the code will still be much nicer IMO :)

kripken · 2026-06-10T00:27:10Z

Ok, good, then we are on the same page - this is not a lattice, so we lose all the nice properties that a lattice normally has.

That leaves the code factoring as a possible benefit. But when I ran Gemini on this, I didn't see a code benefit either - mostly a bunch of new boilerplate to fit into the Lattice framework. Unless you have a way to do this without boilerplate that actually reduces code rather than adds?

tlively · 2026-06-10T00:37:27Z

#8821 and #8824 show the generic lattices we could add. Obviously the code is more complex if you count the heavily-templated lattice implementations, but I don't think that's the right way to look at it. Even supposing that we never reuse the lattices for anything else (although we could!), factoring the constraint system into composed lattices makes it much easier to focus on the interesting things and abstract away all the complexity around managing our knowledge of independent constraints. It also makes the code much more unit-testable and fuzzable.

kripken · 2026-06-10T01:58:57Z

#8821 and #8824 show the generic lattices we could add. Obviously the code is more complex if you count the heavily-templated lattice implementations, but I don't think that's the right way to look at it.

I agree. But, ignoring the generic template code, is it actually shorter than my current code? I'd like to look at that diff if you have it.

MaxGraey · 2026-06-10T07:29:56Z

I thought a bit more about how this could work. Here's a rough sketch:

Term =
  | Bottom
  | Interval(min, max) (* it would be more generic to have separate lower/upper/range bounds instead *)
  | Top

Single predicates x > a, x < a are trivial

Let's check how we can represent compound (range-like) predicates:

x > a && x < b -> (a, b) -> Interval(a, b)
x < a && x < b -> x < min(a, b) or Interval(-inf, min(a,b))
x > a && x > b -> x > max(a, b) or Interval(max(a,b), +inf)
x > a || x < b when a > b -> (-inf, b) U (a, +inf) -> Powerset of Intervals (but this is rare in code)
x > a || x < b when a < b -> Top (example: x > 0 || x < 10, which makes no sense)

The powerset of intervals is quite tricky, and I recommend skipping it for now. It can lead to infinite growth.

// or perhaps use std::variant + std::get_if?
enum class TermKind { Bottom, Interval, Top };
struct Bound { 
    int64_t value; // it can be abstract value as well
    bool negInf, posInf; 
    // bool inclusive;  // open / closed ?
};

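struct Term {
    TermKind kind;
    Bound lower, upper;
    static Term top();
    static Term bottom();
    static Term interval(Bound lo, Bound hi);
    // Kind queries used by meet/join below.
    bool isTop() const { return kind == TermKind::Top; }
    bool isBottom() const { return kind == TermKind::Bottom; }
};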

struct IntervalLattice final : FullLattice<Term> {
    Term top() const override;
    Term bottom() const override;

    bool  leq(const Term& a, const Term& b) const override;
    Term join(const Term& a, const Term& b) const override;
    Term meet(const Term& a, const Term& b) const override;
};

struct IntervalLatticeSolver {
    Term eval(const Term& current, const Predicate& pred) const; // or refine
};

Term IntervalLattice::meet(const Term& a, const Term& b) const {
    // Bottom absorbs everything; Top is the identity for meet.
    if (a.isBottom() || b.isBottom()) return bottom();
    if (a.isTop()) return b;
    if (b.isTop()) return a;
    auto lo = max(a.lower, b.lower);
    auto hi = min(a.upper, b.upper);
    if (lo > hi) {
        return bottom();
    }
    return Term::interval(lo, hi);
}

Term IntervalLattice::join(const Term& a, const Term& b) const {
    if (a.isTop() || b.isTop()) return top();
    if (a.isBottom()) return b;
    if (b.isBottom()) return a;
    auto lo = min(a.lower, b.lower);
    auto hi = max(a.upper, b.upper);
    return Term::interval(lo, hi);
}

bool IntervalLattice::leq(const Term& a, const Term& b) const {
    return subset(a, b);
}

Term IntervalLatticeSolver::eval(const Term& current, const Predicate& pred) const {
   ...
}
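
For readers who want to experiment with the idea, here is a compact, self-contained variant of the interval sketch. It is a simplification under stated assumptions: bounds are closed, the extreme `int64_t` values stand in for -inf and +inf, and it does not use Binaryen's `FullLattice` interface. `Range` and the free functions are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical element type. The default value is Top: any value is possible.
struct Range {
  bool empty = false; // true means Bottom: no value satisfies the constraints
  int64_t lo = std::numeric_limits<int64_t>::min();
  int64_t hi = std::numeric_limits<int64_t>::max();
};

bool operator==(const Range& a, const Range& b) {
  if (a.empty || b.empty) {
    return a.empty == b.empty; // all empty ranges are the same Bottom
  }
  return a.lo == b.lo && a.hi == b.hi;
}

Range top() { return Range{}; }

Range bottom() {
  Range r;
  r.empty = true;
  return r;
}

Range interval(int64_t lo, int64_t hi) {
  Range r;
  r.empty = lo > hi; // an inverted interval contains nothing
  r.lo = lo;
  r.hi = hi;
  return r;
}

// Meet (AND): intersect the ranges. Top is the identity, Bottom absorbs.
Range meet(const Range& a, const Range& b) {
  if (a.empty || b.empty) {
    return bottom();
  }
  return interval(std::max(a.lo, b.lo), std::min(a.hi, b.hi));
}

// Join (OR): the smallest single range covering both. Bottom is the identity.
Range join(const Range& a, const Range& b) {
  if (a.empty) {
    return b;
  }
  if (b.empty) {
    return a;
  }
  return interval(std::min(a.lo, b.lo), std::max(a.hi, b.hi));
}

// Ordering: a <= b when every value allowed by a is also allowed by b.
bool leq(const Range& a, const Range& b) {
  if (a.empty) {
    return true;
  }
  if (b.empty) {
    return false;
  }
  return b.lo <= a.lo && a.hi <= b.hi;
}
```

Because Top is represented as the full range, `meet(top(), r)` returns `r` without any special case.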

kripken · 2026-06-10T14:13:25Z

@MaxGraey Intervals or ranges can work that way, yes. But as I wrote above, I think we should support more than that, things like !=, == (for e.g. null checks) and things like subtyping. We don't need to support all the possible things or all their combinations, but it is useful to support the common ones, and not very hard.

MaxGraey · 2026-06-10T14:27:21Z

!=, == are better handled by a separate lattice solver that also covers subtyping/equality for heap types (ref.eq, ref.is_null). wasm doesn't have ref.ne, so we need i32.ne/i32.eq anyway, even for pure heap type relational analysis. wdyt?

kripken · 2026-06-10T15:46:52Z

!=, == are better handled by a separate lattice solver that also covers subtyping/equality for heap types (ref.eq, ref.is_null).

As mentioned above, using separate lattices adds overhead (either a product lattice, or multiple passes one for each).

Also, these things can interact: imagine a range x >= 0 && x < 10 plus x == 10, which a general mathematical constraint system can in theory handle.

I suggest we start with the approach I have here. As I said above, not a lot more code remains past this initial PR, for us to get to useful optimizations.

And we can always reconsider and replace it all with a lattice later if it gets messy - the pass that will use this will not depend on the details of the constraint solving. Here is draft and unpolished code for that pass, hopefully enough to see that this will be used very simply:

https://github.com/kripken/binaryen/blob/constraint/src/passes/RangeAnalysis.cpp

tlively · 2026-06-10T17:33:50Z

#8821 and #8824 show the generic lattices we could add. Obviously the code is more complex if you count the heavily-templated lattice implementations, but I don't think that's the right way to look at it.

I agree. But, ignoring the generic template code, is it actually shorter than my current code? I'd like to look at that diff if you have it.

I don't have a full diff (this PR would have to define Constraint as a lattice, and I haven't done that), but for instance you would be able to entirely remove AndedConstraintSet, including fuzzyOr and its // TODO smarts, as well as and_, which no longer would burden its users with having to avoid adding constraints to a full set. Instead you would use BoundedConjunction<Constraint, 3>.

Beyond the complexity wins, this would make it trivial to experiment with different points in the performance/precision trade off space, e.g. by using BoundedConjunction<Constraint, 1> or Conjunction<Constraint> instead. In the future it would be easy to add new kinds of independent constraints, e.g. subtyping: BoundedConjunction<OneOf<Constraint, SubtypeConstraint>, 3>.
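
To illustrate the shape of such a utility, here is a rough sketch of a capacity-bounded conjunction whose capacity is a template parameter. This is a guess at the general idea only; the actual `BoundedConjunction` in the draft PRs may differ in both interface and behavior, and `meetWith`, `full`, and `constraints` are hypothetical member names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch, not the implementation from the draft PRs.
template<typename T, std::size_t N> class BoundedConjunction {
  std::vector<T> items;

public:
  // AND in one more constraint. If the set is already full the constraint is
  // dropped, which is sound (we simply know less) but loses precision.
  // Returns whether the constraint was kept.
  bool meetWith(const T& constraint) {
    if (items.size() >= N) {
      return false;
    }
    items.push_back(constraint);
    return true;
  }

  std::size_t size() const { return items.size(); }
  bool full() const { return items.size() == N; }
  const std::vector<T>& constraints() const { return items; }
};
```

With this shape, trying a different precision/performance point is a one-token change, e.g. `BoundedConjunction<int, 1>` instead of `BoundedConjunction<int, 3>`, and callers never need to check for fullness before adding.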

kripken · 2026-06-10T18:46:01Z

@tlively

I don't have a full diff

I really think it would be valuable to see that diff. When I tried to produce it, as I said above, I didn't like the result due to boilerplate. But maybe I was doing it wrong. This is your idea, so you will be able to implement it best, and then we can evaluate it - does that make sense?

If you don't have time, another option is to land this, and consider your idea later. It will be a drop-in replacement for the code in this PR, as used by the future pass (as can be seen in the draft version of that pass), so landing this is not locking us into anything.

Beyond the complexity wins, this would make it trivial to experiment with different points in the performance/precision trade off space,

I don't follow that. In the current PR it is also trivial to adjust the constant or even make it unbounded.

tlively · 2026-06-10T20:14:30Z

Sure, I can implement Constraint as a lattice.

tlively · 2026-06-10T22:51:51Z

Ok, the draft PR adding a general Constraint lattice (which I ended up calling Bound) is #8827. I unfortunately got it through my skull that there's no way to support EQ and NE constraints and maintain a lattice, though. The draft PR that adds the final utility that approximates the one in this PR is #8828.

tlively · 2026-06-10T23:35:28Z

FWIW, I'm thoroughly convinced that there's no need to use a proper lattice here. I'd even be fine not using a join semilattice if we really wanted to support EQ and NE constraints. The main thing that would make me happy would be if we could compose this utility out of general parts, e.g. separating BoundedConjunction and its join/meet logic from Constraint/Bound and its join/meet logic. Being able to read, understand, and test each part in isolation is the benefit I'm after.

kripken · 2026-06-12T17:54:56Z

Ok, thanks for that code, that definitely helps me understand your point of view.

Yeah, it's unfortunate we can't use a proper lattice here. But limited capacity is just necessary, and goes against the point of a lattice.

And I think we do need !=, == here. Supporting simple math like that will get us null checks and other stuff very easily.

So what is left is your point about

Being able to read, understand, and test each part in isolation is the benefit I'm after.

I agree to that in general. But does this PR not do that already? Look at how short and sweet constraint.h is. It defines a Term, a Constraint, and an eval operation. No more.

constraint.cpp is, similarly, very clear and obvious: evalPair evals a pair, just implementing the math basically. Which part is difficult to follow or in need of improvement?

Put another way, the current code in this PR is really, a minimal mathematical constraint system. It just does that, and in a simple and modular way.

I am not opposed to a larger amount of code, as in your PRs, if there is a benefit. I still don't see what it is, though? The key eval/implies code is not improved, in particular.

And, look: in the end, a mathematical constraint system must do math. Math is not perfectly separable into composed lattices. If we want to do x >= 0 && x != 0 => x > 0 then we need to handle interactions between >=, !=. We can definitely do that in a nice way using modular code, but lattices don't help afaict (other things in math might, though - though I doubt we will need to get into such complexity).
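
As a small illustration of that kind of interaction, the sketch below tightens a lower bound using a != fact, so that x >= 0 && x != 0 proves x > 0. It is a hypothetical example written for this discussion, not code from the PR; `Facts`, `lowerBound`, and `provesGreaterThan` are made-up names.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical facts about one integer variable x.
struct Facts {
  std::optional<int64_t> atLeast;  // x >= *atLeast, if present
  std::optional<int64_t> notEqual; // x != *notEqual, if present
};

// Returns the tightest lower bound implied by the facts, if there is one.
// When x >= c and x != c both hold, the bound tightens to x >= c + 1.
std::optional<int64_t> lowerBound(const Facts& facts) {
  if (!facts.atLeast) {
    return std::nullopt;
  }
  int64_t bound = *facts.atLeast;
  if (facts.notEqual && *facts.notEqual == bound &&
      bound != std::numeric_limits<int64_t>::max()) {
    bound++; // guarded above so the increment cannot overflow
  }
  return bound;
}

// Do the facts prove x > value?
bool provesGreaterThan(const Facts& facts, int64_t value) {
  auto bound = lowerBound(facts);
  return bound && *bound > value;
}
```

The != fact only helps when it sits exactly on the bound; x >= 0 && x != 5 still only gives x >= 0.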

MaxGraey · 2026-06-12T18:08:52Z

The main benefit of the lattice idiom is that it provides a well-defined merge/join operation for dataflow analysis, guarantees convergence, and makes composition easier. It also gives a common API and semantically defined behaviour that can be reused for other lattices. The upstream lattices folder already implements an inverted lattice, which can be used to invert conditions such as if (!(x >= 5 && x <= 10)). That's a good example of how lattices can be composed and stacked. It also seems we already have machinery for abstract evaluation of such compositions of sub-lattices: https://github.com/WebAssembly/binaryen/blob/main/src/analysis/lattices/abstraction.h#L34

MaxGraey · 2026-06-12T18:16:01Z

@juj btw it would be interesting to hear what you think about all this

tlively · 2026-06-12T18:42:24Z

+  if (auto* aConstant = std::get_if<Literal>(&a.term)) {
+    if (auto* bConstant = std::get_if<Literal>(&b.term)) {


Please use helper functions heavily here. If each function just does some case analysis and forwards each case to a helper function, then there's much less state to keep in mind as we read through the code. In contrast, right now I have to remember the state of multiple outer ifs and switches to know what case the code is supposed to be handling.

Done, avoided nesting and added a helper. Code is much simpler now.
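
A rough sketch of the structure being asked for: the entry point only performs case analysis on the two terms and forwards each case to a helper that needs no outer context. The types here are hypothetical stand-ins (`Constant` in place of `Literal`, plus a made-up `Local`), and the helper names are not the PR's.

```cpp
#include <cstdint>
#include <variant>

// Hypothetical stand-ins for the PR's types.
struct Constant {
  int64_t value;
};
struct Local {
  uint32_t index;
};
using Term = std::variant<Constant, Local>;
enum class Result { True, False, Unknown };

// Each helper handles exactly one case.
Result equalConstants(const Constant& a, const Constant& b) {
  return a.value == b.value ? Result::True : Result::False;
}

Result equalLocals(const Local& a, const Local& b) {
  // A local always equals itself; two different locals are unrelated.
  return a.index == b.index ? Result::True : Result::Unknown;
}

// The entry point only does case analysis and forwards to a helper.
Result provesEqual(const Term& a, const Term& b) {
  auto* aConstant = std::get_if<Constant>(&a);
  auto* bConstant = std::get_if<Constant>(&b);
  if (aConstant && bConstant) {
    return equalConstants(*aConstant, *bConstant);
  }
  auto* aLocal = std::get_if<Local>(&a);
  auto* bLocal = std::get_if<Local>(&b);
  if (aLocal && bLocal) {
    return equalLocals(*aLocal, *bLocal);
  }
  // Mixed constant/local: nothing is known without more facts.
  return Result::Unknown;
}
```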

tlively · 2026-06-12T18:55:13Z

+  // If this is already implied by current constraints, then it is redundant.
+  // E.g. if we are { x = 10 } and other is { x >= 0 } then all we need is
+  // { x >= 0 } as the result of the OR.
+  if (eval(other) == True) {
+    *this = other;
+    return;
+  }
+  if (other.eval(*this) == True) {
+    return;
+  }


This doesn't handle the case where the constraints can be relaxed in both directions separately. For example:

{ x >= 2 /\ x <= 4 } \/ { x >= 1 /\ x <= 3 }

This should give { x >= 1 /\ x <= 4 }, but right now it just gives up.

This might be included in the // TODO smarts below, but I think it's important to see the full complexity here so we can get it factored as nicely as possible.

What do you mean by "see the full complexity"? I'm not sure what you are asking this PR to do aside from have the existing TODO.

I'm suggesting we resolve the TODO :)

But I guess I can just say in advance that the simplest way to do this will be in terms of a fuzzyOr operation on a pair of Constraints, so I guess we don't need to do it now.
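
A sketch of what a pairwise fuzzyOr on single bounds could look like, under the assumption that constraints are simple x >= c or x <= c bounds on the same variable. `Op`, `Bound`, and this `fuzzyOr` signature are hypothetical and only illustrate the relaxation in the example above.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical single-variable bound: x >= value or x <= value.
enum class Op { Ge, Le };
struct Bound {
  Op op;
  int64_t value;
};

// OR of two bounds, relaxed to the tightest single bound implied by both.
// Returns nullopt when no single bound of this form is implied.
std::optional<Bound> fuzzyOr(const Bound& a, const Bound& b) {
  if (a.op != b.op) {
    return std::nullopt; // e.g. x >= 2 or x <= 3 gives no single bound
  }
  if (a.op == Op::Ge) {
    // The weaker (smaller) lower bound holds on both sides of the OR.
    return Bound{Op::Ge, std::min(a.value, b.value)};
  }
  // The weaker (larger) upper bound holds on both sides of the OR.
  return Bound{Op::Le, std::max(a.value, b.value)};
}
```

Applied pairwise to { x >= 2 /\ x <= 4 } \/ { x >= 1 /\ x <= 3 }, the lower bounds relax to x >= 1 and the upper bounds to x <= 4.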

tlively · 2026-06-12T19:00:17Z

+// A constraint: some operation and some value, like "is equal to 17" or "is
+// less than local 6".
+struct Constraint {
+  Abstract::Op op = Abstract::Invalid;


I don't think there's much value in making invalid constraints representable (nor in reusing the Abstract enum). How about using a new enum that can be specific to this use case that does not have an Invalid member?

Reusing Abstract is useful because we have code to parse IR into it. E.g. we need to parse AddInt32 into Abstract::Add (the next PR does this).

Without this reuse, we'd need to duplicate that code, or add a mapping of Abstract into a new enum.

I think this is exactly what Abstract is meant for: an abstract operation, without the details of a Type. This is precisely the right level of abstraction for a mathematical constraint system, mirroring the == etc. notation in math.

Makes sense to reuse the parsing code 👍

We can still avoid adding Abstract::Invalid, though.

tlively · 2026-06-12T19:07:00Z

+  //   { this } => { condition }
+  //
+  // https://en.wikipedia.org/wiki/Material_conditional#Truth_table
+  Result check(const Constraint& condition) const;


I would strongly prefer some variant of proves or implies. I don't think those names sound like they're adding new constraints, but something like checkImplication would also be fine with me. eval does not suggest the correct operation to me.

tlively · 2026-06-12T19:07:51Z

+  Result eval(const Constraint& condition) const;
+
+  // Check an entire other set.
+  Result eval(const AndedConstraintSet& other) const {


Let's put this implementation in the .cpp file as well.

tlively · 2026-06-12T19:12:05Z

+  // Add a constraint to the set, ANDed with the others. The caller must make
+  // sure not to add too many (i.e. it is invalid to call this when full()).
+  void and_(const Constraint& c) {


It would be nice to avoid burdening the caller with making this decision. We could just arbitrarily drop extra constraints, or allow the user to supply a heuristic for determining which to keep. The end result will be the same if we remove this burden from the caller, since the caller will otherwise have to make the same decisions.

Interesting, yeah, maybe that is better. Should it then be fuzzyAnd..?

Yeah. Maybe "bounded" or "approximate" instead of "fuzzy?" Or alternatively we could use "join" and "meet" language to differentiate these operations from the precise logical connectives.

tlively · 2026-06-12T20:58:20Z

+    }
+  }
+
+  return Unknown;


Suggested change

-  return Unknown;
+  // TODO: handle >, >=, <, and <=
+  return Unknown;

tlively · 2026-06-12T21:00:18Z

+    if (currResult == Unknown) {
+      // If something is unknown, it all is.
+      return Unknown;
+    }


IIUC, proving one of the other constraints False can take precedence over Unknown.
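
One way to express that precedence, as a sketch: when combining per-constraint results for an ANDed set, return False as soon as any constraint is refuted, and fall back to Unknown only if nothing was refuted. `combineAnd` is a hypothetical helper, not a function in this PR.

```cpp
#include <vector>

enum class Result { True, False, Unknown };

// Combine the results of proving each constraint in an ANDed set.
// A single False refutes the whole conjunction, even if others are Unknown.
Result combineAnd(const std::vector<Result>& results) {
  bool sawUnknown = false;
  for (auto result : results) {
    if (result == Result::False) {
      return Result::False;
    }
    if (result == Result::Unknown) {
      sawUnknown = true;
    }
  }
  return sawUnknown ? Result::Unknown : Result::True;
}
```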

tlively · 2026-06-12T21:06:34Z

+  // If this is already implied by current constraints, then it is redundant.
+  // E.g. if we are { x = 10 } and other is { x >= 0 } then all we need is
+  // { x >= 0 } as the result of the OR.
+  if (eval(other) == True) {
+    *this = other;
+    return;
+  }
+  if (other.eval(*this) == True) {
+    return;
+  }


I'm suggesting we resolve the TODO :)

But I guess I can just say in advance that the simplest way to do this will be in terms of a fuzzyOr operation on a pair of Constraints, so I guess we don't need to do it now.

tlively · 2026-06-12T21:07:24Z

+// A constraint: some operation and some value, like "is equal to 17" or "is
+// less than local 6".
+struct Constraint {
+  Abstract::Op op = Abstract::Invalid;


We can still avoid adding Abstract::Invalid, though.

kripken added 17 commits June 8, 2026 14:07

go

0fc9e55

go

5ad1b75

go

8181363

go

d0ad2f4

go

c5b7d1d

go

0e35b2b

go

60d50b4

feedback

626b5d7

feedback

5896406

fix

cf29b58

clean

cdaff6b

Merge remote-tracking branch 'myself/inplace' into constraint.by.itself

c02ba2e

clean

94d2161

clean

735d7ea

const

6e80fde

undo CFP change

cf7fcc6

Merge branch 'inplace' into constraint.by.itself

ac02454

kripken requested a review from a team as a code owner June 8, 2026 23:12

kripken requested review from stevenfontanella and removed request for a team June 8, 2026 23:12

kripken added 4 commits June 8, 2026 16:14

tidy

0930461

add.assert

7b7d2ac

fix.comment

679bd24

Merge remote-tracking branch 'origin/main' into constraint.by.itself

920e7a9

tlively reviewed Jun 8, 2026

value => term

44ad794

kripken mentioned this pull request Jun 9, 2026

Add a OneOf lattice #8821

Draft

kripken added 2 commits June 10, 2026 09:49

check => eval

2f0bdd7

fix new clang-18 warning

7f77016

tlively reviewed Jun 12, 2026

kripken added 9 commits June 12, 2026 12:45

refactor to use more helpers and less nesting

ea3db17

format

060d8aa

simpl

088a425

form

9267485

simpl

3965c27

remove

280e7b3

move code as requested

26dfc30

eval => proves

c1dacec

form

3952dfc

tlively reviewed Jun 12, 2026
