SMTChecker: Inline let expressions in solvers' models #15336

blishko · 2024-08-15T13:40:24Z

Models computed by CHC solvers (especially Z3) can contain SMT-LIB's let expressions. These are used to name a subterm and use the name instead of repeating the subterm possibly many times.

The easiest (but not necessary the best) way to deal with let terms is to inline (expand) them. In our current experiments inlining did not pose any problem. If, in the future, this would turn out to cause problems, we can devise a way to process let terms without expanding the term to its full tree-like representation.

This is part of the preparation to use Z3 as an external solver. It allows us to properly process models returned by Z3.
It has been separated from #15252.

ekpyron · 2024-08-15T18:44:48Z

We potentially also have quantifiers and thus otherwise bound variables in the smt expressions, right?
Could stuff like

  let x = 0 in (forall x . x < 0)

(I'm not fluent in writing smtlib, so however you'd actually write this properly)

happen?

If so, the current code looks like it'd still just substitute x, even if it's bound in a subexpression of the let-body. (So you'd get something like forall x . 0 < 0 or even forall 0 . 0 < 0, which is either no longer equivalent or even invalid...)
Will this only ever be run on smt expressions that don't contain other binders like quantifiers? Or do we have some unique-name property that will prevent a nested binder from using the same name that was previously bound by a let?

If this can happen, then I guess, the let-body (the part in which the let-bound things occur I mean by that) would need to be checked for other binders of the let-bound names and they'd need to be prevented from being replaced in further subexpressions of such nested binders or such...

But yeah, I haven't thought about what the expressions could look like that this runs on, so maybe this will just never happen and it's fine - but for arbitrary smtlib2 expressions I'd guess this could be an issue...
But scopedParser.toSMTUtilExpression that is run on the result of all this in the end at least does have quantifier cases :-).

ekpyron · 2024-08-15T19:20:35Z

If I'm right, maybe something along the lines of this is safer?

void inlineLetExpressions(SMTLib2Expression& _expr, std::unordered_map<std::string, std::reference_wrapper<SMTLib2Expression>> const& _substitutions = {})
{
	if (isAtom(_expr))
	{
		if (auto* bound = util::valueOrNullptr(_substitutions, asAtom(_expr)))
			_expr = *bound;
		return;
	}
	auto& subexprs = asSubExpressions(_expr);
	solAssert(!subexprs.empty());
	auto const& first = subexprs.at(0);
	if (isAtom(first))
	{
		if (asAtom(first) == "let")
		{
			solAssert(subexprs.size() == 3);
			solAssert(!isAtom(subexprs[1]));
			auto& bindingExpressions = asSubExpressions(subexprs[1]);
			auto newSubstitutions = _substitutions;
			for (auto& binding: bindingExpressions)
			{
				solAssert(!isAtom(binding));
				auto& bindingPair = asSubExpressions(binding);
				solAssert(bindingPair.size() == 2);
				solAssert(isAtom(bindingPair.at(0)));
				inlineLetExpressions(bindingPair.at(1), _substitutions);
				if (
					auto [it, newlyInserted] = newSubstitutions.emplace(asAtom(bindingPair.at(0)), bindingPair.at(1));
					!newlyInserted
				)
					// overwrite existing bindings
					it->second = bindingPair.at(1);
			}
			
			inlineLetExpressions(subexprs.at(2), newSubstitutions);

			// update the expression
			auto tmp = std::move(subexprs.at(2));
			_expr = std::move(tmp);
			return;
		}
		else if (asAtom(first) == "forall" || asAtom(first) == "exists") // any other binders?
		{
			auto newSubstitutions = _substitutions;
			smtAssert(subexprs.size() == 3);
			for (auto const& sortedVar: asSubExpressions(subexprs.at(1)))
				// remove substitutions for the quantifier-bound names
				newSubstitutions.erase(asAtom(asSubExpressions(sortedVar).at(0)));
			inlineLetExpressions(subexprs.at(2), newSubstitutions);
			return;
		}
	}

	// not a let expression, just process all arguments recursively
	for (auto& subexpr: subexprs)
		inlineLetExpressions(subexpr, _substitutions);
}

?

(I've rather blindly written that - the snippet stores references to SMTLib2Expressions and I haven't fully thought through whether what they reference stays alive long enough - and not sure how expensive it'd be to copy instead)

ekpyron · 2024-08-15T19:25:24Z

libsmtutil/CHCSmtLib2Interface.cpp

+	auto const& first = subexprs.at(0);
+	if (isAtom(first) && asAtom(first) == "let")
+	{
+		solAssert(subexprs.size() >= 3);


Suggested change

solAssert(subexprs.size() >= 3);

solAssert(subexprs.size() == 3);

or why can the rest be ignored, if there can be any more subexpressions?

Fixed to assert exactly 3 subexpressions.

blishko · 2024-08-16T07:02:22Z

You are right in the sense that the expression could contain quantifiers and theoretically it could bind the same variable.
In such cases, it would indeed not work correctly.
I was counting on the solver being sane and not doing that.
In general, AFAIK, solvers behave reasonably and use unique names when introducing let terms.

I will check your suggestion.

ekpyron · 2024-08-16T14:29:31Z

You are right in the sense that the expression could contain quantifiers and theoretically it could bind the same variable. In such cases, it would indeed not work correctly. I was counting on the solver being sane and not doing that. In general, AFAIK, solvers behave reasonably and use unique names when introducing let terms.

I will check your suggestion.

Yeah, ok - this is part of parsing the solver-generated counterexamples, right? It stands to reason that they internally use unique names and then also serialize them sanely, yeah. So if my suggestion doesn't work due to some life-time issue (or you think it copying the substitutions more often is too expensive for the kinds of expressions we expect here), I'd be fine with just documenting it - but that would be nice at least then, just to make sure that nobody ever tries to use this on code for which we couldn't assume sane/unique names. Conversely, if we assume unique names anyways, I guess that'd also go for the same let-name not being "re-bound", which could simplify the code a bit further... unfortunately, asserting against non-unique names in binders there is almost as much of a complication as treating them properly :-).

But yeah, your call, I'd approve this as is, if this part is noted in a comment or such.

Models computed by CHC solvers (especially Z3) can contain SMT-LIB's let expressions. These are used to name a subterm and use the name instead of repeating the subterm possibly many times. The easiest (but not necessary the best) way to deal with let terms is to inline (expand) them. In our current experiments inlining did not pose any problem. If, in the future, this would turn out to cause problems, we can devise a way to process let terms without expanding the term to its full tree-like representation. This is part of the preparation to use Z3 as an external solver. It allows us to properly process models returned by Z3.

blishko · 2024-08-19T09:18:36Z

@ekpyron, I added a case to handle quantifiers based on your suggestion, but I kept my original data structure.
I think your approach would work, but I wanted to avoid all the copying (and for very deep nested lets, keeping all copies around might consume unexpectedly large amounts of memory).
I can use my data structure by explicitly rebinding names of quantified variables to the variable itself.
This way, if a let contains a quantifier with the same variable name as the let expression, inside the quatified expression the variable would stay as is, and not be expanded according to its let definition.

blishko · 2024-08-19T09:20:03Z

And yes, this is only for parsing the solver-generated counterexamples. I could even hide the method in the cpp file and not expose it, if that is preferable.

blishko force-pushed the smt-lets branch from 6d6a1f8 to 29c6f3b Compare August 15, 2024 14:40

ekpyron added the 🟡 PR review label label Aug 15, 2024

ekpyron reviewed Aug 15, 2024

View reviewed changes

blishko force-pushed the smt-lets branch from 29c6f3b to 9be8d3d Compare August 19, 2024 08:22

blishko force-pushed the smt-lets branch from 9be8d3d to b172de5 Compare August 19, 2024 08:48

ekpyron approved these changes Aug 19, 2024

View reviewed changes

blishko merged commit 98b22f8 into develop Aug 19, 2024
72 checks passed

blishko deleted the smt-lets branch August 19, 2024 12:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SMTChecker: Inline let expressions in solvers' models #15336

SMTChecker: Inline let expressions in solvers' models #15336

blishko commented Aug 15, 2024 •

edited

Loading

ekpyron commented Aug 15, 2024 •

edited

Loading

ekpyron commented Aug 15, 2024 •

edited

Loading

ekpyron Aug 15, 2024

blishko Aug 19, 2024

blishko commented Aug 16, 2024

ekpyron commented Aug 16, 2024

blishko commented Aug 19, 2024

blishko commented Aug 19, 2024

	solAssert(subexprs.size() >= 3);
	solAssert(subexprs.size() == 3);

SMTChecker: Inline let expressions in solvers' models #15336

SMTChecker: Inline let expressions in solvers' models #15336

Conversation

blishko commented Aug 15, 2024 • edited Loading

ekpyron commented Aug 15, 2024 • edited Loading

ekpyron commented Aug 15, 2024 • edited Loading

ekpyron Aug 15, 2024

Choose a reason for hiding this comment

blishko Aug 19, 2024

Choose a reason for hiding this comment

blishko commented Aug 16, 2024

ekpyron commented Aug 16, 2024

blishko commented Aug 19, 2024

blishko commented Aug 19, 2024

blishko commented Aug 15, 2024 •

edited

Loading

ekpyron commented Aug 15, 2024 •

edited

Loading

ekpyron commented Aug 15, 2024 •

edited

Loading