Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Counterexamples As Assumptions #5013

Merged
merged 86 commits into from
Apr 9, 2024

Conversation

Dargones
Copy link
Collaborator

@Dargones Dargones commented Jan 25, 2024

Problem

Right now Dafny reports counterexamples using special syntax that may be difficult to understand. For example, consider the following Dafny program that defines a standard List datatype with a function View that maps the list to a sequence of integers:

datatype Node = Cons(next:Node, value:int) | Nil {
  function View():seq<int> 
    ensures Cons? ==> |View()| > 0 && View()[0] == value && View()[1..] == next.View()
  {
    if Nil? then [] else [value] + next.View()
  }
}

Suppose we were to (incorrectly) assert that the list cannot correspond to the sequence of integers [1, 2, 3], like so:

method m(list:Node) {
  assert list.View() != [1, 2, 3];
}

Currently, Dafny would return the following counterexample:

At "method m(list:Node) {" (file.dfy:20):
    list:Problem.Node = Cons(next := @0, value := 1)
    @0:Problem.Node = Cons(next := @2, value := 2)
    @2:Problem.Node = Cons(next := @4, value := 3)
    @4:Problem.Node = Nil

This counterexample is confusing because it does not explain what the meaning of @0, @1, etc. is. The notation is different from the Dafny syntax, and it also does not capture some of the information that is actually contained within the counterexample because there is no way to express this information using this custom syntax. In particular, the counterexample constrains the value returned by calling View() on the list variable. This constrain might be redundant in this particular example but in general, we would want to capture all the information contained in the counterexample.

Solution

This PR redesigns the counterexample generation functionality so that counterexamples are represented internally and can be printed as Dafny assumptions. For example, for the program above, the counterexample will now look like this:

assume Node.Cons(Node.Cons(Node.Cons(Node.Nil, 3), 2), 1) == list 
    && Node.Cons(Node.Cons(Node.Cons(Node.Nil, 3), 2), 1).View.requires() 
    && [1, 2, 3] == Node.Cons(Node.Cons(Node.Cons(Node.Nil, 3), 2), 1).View() 
    && Node.Cons(Node.Cons(Node.Nil, 3), 2).View.requires() 
    && [2, 3] == Node.Cons(Node.Cons(Node.Nil, 3), 2).View() 
    && Node.Cons(Node.Nil, 3).View.requires() 
    && [3] == Node.Cons(Node.Nil, 3).View() 
    && Node.Nil.View.requires() 
    && [] == Node.Nil.View();

While admittedly more verbose because it explores the return values of functions, this counterexample precisely constrains the argument that leads to the assertion violation. In particular, if you were to insert this assumption into the code and revert the assertion, Dafny would verify that this counterexample is correct.

At a high level, this makes the following changes:

  1. Represent counterexamples as Dafny statements that the user can insert directly into their code to debug the failing assertion.
  2. Make sure the counterexamples are wellformed, i.e. the constraints are ordered in such a way that the resulting assumption verifies.
  3. Support counterexample constraints over function return values (as an example of something that can really only be done using native Dafny syntax)
  4. Automatically fix counterexamples that are internally inconsistent, when such an inconsistency can be easily detected. For instance, a counterexample will never describe negative indices of an array or sequence, call a set empty if it contains elements, etc. All of these are possible in the counterexample model returned by Boogie but we prune such inconsistencies before reporting the counterexample, when possible.

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

@Dargones
Copy link
Collaborator Author

You are right that in this particular case, the output of View() is fully determined by the contents of the list. There are two reasons to show it as part of the counterexample, I think. First, the user might find its difficult to compute the return value themselves. Second, if View were an uninterpreted function, or if its specification were not strong enough, we would want the counterexample to constrain the return value. And because one cannot determine how strong a specification is at compile time, we have to constrain return values at all times, including in this case. In practice, the counterexample will only constrain return values that are specifically mentioned in the Z3 counterexample model, so the number of such constraints is bounded.

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Mar 19, 2024

Second, if View ... its specification were not strong enough, we would want the counterexample to constrain the return value. And because one cannot determine how strong a specification is at compile time, we have to constrain return values at all times, including in this case.

I see. If the function has a body and is revealed in the test, doesn't that imply that it is fully specified, and no fresh information is provided by showing its return values?

@Dargones
Copy link
Collaborator Author

Dargones commented Mar 19, 2024

Ah, I see. Yes, I think the only situation in which you can drop a function constraint for certain is if (i) the function has a body, and (ii) all function arguments are fully constrained, including the receiver. Both are true for this example but making this check would be non-trivial in general.

@keyboardDrummer
Copy link
Member

It seems like there's two ways of using the unknown-model. One is to generate a test and then run the program, to check whether this model is a counter-example, and the other is to generate an assumption that's placed at the start of the operation under test.

Are you enabling users to use both approaches when asking the Dafny CLI for a counter-example? How does a user get the test generating behavior?

@Dargones
Copy link
Collaborator Author

Dargones commented Mar 20, 2024

Edit: right now, one can only get a counterexample in the form of assumptions. To enable test generation behavior, the counterexample model would need to be cross-referenced with the source code, so this is something that could be left as a future PR.

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Mar 21, 2024

Edit: right now, one can only get a counterexample in the form of assumptions. To enable test generation behavior, the counterexample model would need to be cross-referenced with the source code, so this is something that could be left as a future PR.

I think that would be valuable, because if it can be generated, and it does cause an assertion to fail at runtime, it provides a guaranteed counter-example, instead of a counter-example guess (the model produced after an unknown result). It seems like the only way we're aware of to determine if a counter-example guess is correct.

@Dargones
Copy link
Collaborator Author

Have just pushed a commit that adds a warning to notify the user that the counterexample may not be valid due to the solver returning UNKOWN. I would prefer to leave counterexample execution for future work, since it will be a substantial change/addition to the PR. Executing a counterexample is, I think, indeed the only certain way to determine that the counterexample is valid but it is also not clear how applicable such a feature would be (a lot of Dafny code is not intended for execution and code that is intended for execution can fail verification for reasons that would not come up at runtime).

Note that Danfy cannot guarantee that the counterexample
it reports provably violates the assertion or that the assumptions are not
mutually inconsistent (see [^smt-encoding]), so this output should be
expected manually and treated as a hint.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected => inspected

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Mar 22, 2024

Have just pushed a commit that adds a warning to notify the user that the counterexample may not be valid due to the solver returning UNKOWN. I would prefer to leave counterexample execution for future work, since it will be a substantial change/addition to the PR. Executing a counterexample is, I think, indeed the only certain way to determine that the counterexample is valid but it is also not clear how applicable such a feature would be (a lot of Dafny code is not intended for execution and code that is intended for execution can fail verification for reasons that would not come up at runtime).

Sounds good, thanks. I'm not sure what the best terminology is for these potential counterexamples, but we can iterate on that.

where to write the counterexample, as well as the
`-proverOpt:O:model_compress=false` and
`-proverOpt:O:model.completion=true` options.
* `-extractCounterexample` - if verification fails, report a possible
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

subjective nitpick: possible => potential

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@atomb atomb added this pull request to the merge queue Apr 8, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Apr 8, 2024
@atomb atomb added this pull request to the merge queue Apr 8, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Apr 8, 2024
@Dargones Dargones mentioned this pull request Apr 9, 2024
@atomb atomb enabled auto-merge (squash) April 9, 2024 15:28
@atomb atomb merged commit 281ed82 into dafny-lang:master Apr 9, 2024
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants