Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update --enforce-determinism #3052

Merged
merged 12 commits into from
Nov 22, 2022

Conversation

keyboardDrummer
Copy link
Member

@keyboardDrummer keyboardDrummer commented Nov 14, 2022

Changes

  • --enforce-determinism enforces assigning variables with an auto-init type as well.
  • Replace the --strict-definite-assignment option with the inverse --relax-definite-assignment, which can only be used without --enforce-determinism. Why?
    • We expect changing the default leads to more Dafny programs using the same setting, since many programs use --enforce-determinism and that requires strict definite assignment rules.
    • There seem to be more use-cases in which --strict-definite-assignment is useful, see the discussion for more information.

Testing

  • Remove usage of --strict-definite-assignment in tests
  • Have a test use both --enforce-determinism and --relax-definite-assignment
  • Have a test use only --relax-definite-assignment

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

…nit type as well. Remove --strict-definite-assignment option
Copy link
Member

@cpitclaudel cpitclaudel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change to enforce-determinism looks good, but why remove strict-definite-assignment? Does this mean that a customer cannot reproduce the full range of /definiteAssignment:<n> anymore?

@keyboardDrummer
Copy link
Member Author

keyboardDrummer commented Nov 14, 2022

The change to enforce-determinism looks good, but why remove strict-definite-assignment? Does this mean that a customer cannot reproduce the full range of /definiteAssignment:<n> anymore?

That's correct, 0 (no check) is already no longer available and 2 wouldn't be available after this PR. Do you think there's a use-case for 0 or 2?

@cpitclaudel
Copy link
Member

definiteAssignment:2 avoids a lot of mistakes without significantly weakening the language; for beginners that's the value I would use by default. A common case is something like this:

function Sum(s: seq<nat>): (sum: nat)
{
  if s == [] then 0 else s[0] + Sum(s[1..])
} by method {
  for i := 0 to |s|
    invariant sum + Sum(s[i..]) == Sum(s)
  {
    sum := sum + s[i];
  }
}

… where the missing initialization of sum causes an error saying the invariant may not hold on entry. I make this mistake all the time, and the definiteAssignment:2 flag has a much better error message:

Error: variable 'sum', which is subject to definite-assignment rules, might be uninitialized here

@keyboardDrummer
Copy link
Member Author

keyboardDrummer commented Nov 15, 2022

definiteAssignment:2 avoids a lot of mistakes without significantly weakening the language; for beginners that's the value I would use by default. A common case is something like this:

function Sum(s: seq<nat>): (sum: nat)
{
  if s == [] then 0 else s[0] + Sum(s[1..])
} by method {
  for i := 0 to |s|
    invariant sum + Sum(s[i..]) == Sum(s)
  {
    sum := sum + s[i];
  }
}

… where the missing initialization of sum causes an error saying the invariant may not hold on entry. I make this mistake all the time, and the definiteAssignment:2 flag has a much better error message:

Error: variable 'sum', which is subject to definite-assignment rules, might be uninitialized here

Thanks! I forgot that the uninitialised value is non-deterministic. Easy access counter-examples would make this problem relatively trivial to debug though.

Do you have a similar example where definiteAssignment:2 would be too strict? I have trouble imagining a scenario where the non-deterministic value would be useful.

@cpitclaudel
Copy link
Member

Do you have a similar example where definiteAssignment:2 would be too strict?

Not really (it's easy in general to fix the errors that definiteAssignment:2 creates). It would be interesting to see how much code already respects this flag.

@keyboardDrummer keyboardDrummer changed the title Update --enforce-determism Update --enforce-determinism Nov 15, 2022
@cpitclaudel
Copy link
Member

Following our chat, :2 rejects this, which may be useful:

method M() {
  var v := new int[15];
}

Copy link
Member

@cpitclaudel cpitclaudel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks reasonable, but if we go this route won't we break large amounts of code? At the very least we'd have to rewrite the manual (e.g. http:https://dafny.org/dafny/DafnyRef/DafnyRef.html#sec-definite-assignment), and there's also the worry that this doesn't match Rustan's book. Maybe @RustanLeino can comment?

@RustanLeino
Copy link
Collaborator

As a baseline, the language always requires:

  • A compiled local variable, out-parameter, field, or array element whose type does not support auto-initialization is subject to definite-assignment rules
  • A ghost local variable, out-parameter, field, or array element whose type is possibly empty is subject to definite-assignment rules
  • A compiled yield-parameter must be of an auto-init type, and such a parameter is not subject to definite-assignment rules
  • A ghost yield-parameter must be of a nonempty type, and such a parameter is not subject to definite-assignment rules

Note that an assignment with a right-hand side * (e.g., x := *;) is not considered to be a definite assignment. (We could change this, if this were a problem or if it would give us an opportunity to do something else interesting.)

Auto-initialization for compiled variables and the equivalent for ghost variables is nondeterministic. The language only promises to pick a value in the variable's type.

Here are some additional things that some users may want:

  • A0: Enforce definite-assignment rules for every compiled local variable, out-parameter, field, and array element
  • A1: Enforce definite-assignment rules for every ghost local variable, out-parameter, field, and array element
  • A2: Enforce that the compiled portion of every statement is deterministic
  • A3: Enforce that the ghost portion of every statement is deterministic

Note that A0 and A1 still allow nondeterministic statements, such as if-case statements.

A2 implies A0 and implies that an iterator cannot declare any compiled yield-parameters.

A3 implies A0, A1, and A2 (since compiled variables can be used in ghost statements) and implies that an iterator cannot declare any yields parameters.

A2 (and thus also A3) has the property that values seen during execution will be the same in every run of the program.

Here are some combinations of these options, in terms of /definiteAssignment and what I'd suggest based on the names of flags in this PR.

/definiteAssignment checking my suggestion
0 not even baseline not supported
1 baseline default
2 baseline + A0 + A1 not supported
3 baseline + A0 + A1 + A2 + A3 --enforce-determinism --strict-definite-assignment
baseline + A0 --strict-definite-assignment
baseline + A0 + A2 --enforce-determinism

I agree we should drop /definiteAssignment:0. The only reason I'm suggesting the combination of new flags is to have something that's backward with /definiteAssignment:3. My guess is that such users would be happy with just --enforce-determinism. Indeed, I don't see why anyone would want A1.

@keyboardDrummer
Copy link
Member Author

What surprises me about your suggestion @RustanLeino is that combining --enforce-determinism and --strict-definite-assignment will introduce more checks than the sum of the checks you get when using those two options individually, since A3 is added.

Why do you think A0 is more useful than A1 ? My impression is that A0 and A1 both protect against thinking a variable is assigned a particular default value while it is really assigned an arbitrary value, and then getting a surprising error message, as is the case in @cpitclaudel 's example (#3052 (comment)).

Also, what's the use-case of A3? Why would I want my ghost code to be deterministic?

Also, do you think it would make sense to change the default of A0? I'm suggesting to include A0 by default, and to allow turning it off using --relax-definite-assignment, because I think this will lead to more consistent Dafny code in general since A0 is also included when using --enforce-determinism, and I think this default accommodates more use-cases.

@RustanLeino
Copy link
Collaborator

RustanLeino commented Nov 18, 2022

@keyboardDrummer You're right that my suggestion that both switches together give something more than the union of the switches individually is odd. I hadn't noticed that myself.

I propose we first decide which modes we want to support and then figure out names of switches for them.

  • I'm guessing that no one is interested in the unsound "not even baseline" mode. So, let's not support it.
  • Given the implications among the various baseline additions above, the possible modes are:
    • baseline
    • baseline + A0
    • baseline + A1
    • baseline + A1 + A0
    • baseline + A2 + A0
    • baseline + A2 + A1 + A0
    • baseline + A3 + A2 + A1 + A0

The argument for "just baseline" is that Dafny will still verify that all executions meet their specifications. So, the language doesn't need any further restrictions.

There are two arguments for A0. One is that a program may contain compiled variables that were not included in specifications. This can happen if it doesn't seem important enough to write specifications for that variable or if writing such specifications would involve so much effort that the engineer would rather resort to testing. In this case, a user may be surprised at the observed value at run time. The other argument for A0 is that it may be a cheap and easy-to-understand way to catch some errors. Case in point: @cpitclaudel's example above, where some postcondition of your program does not hold, yet everything "looks" okay to you, because you're not thinking about the missing initialization.

Regarding A1, the first argument for A0 does not apply, since the variable is ghost. However, the second argument for A0 still applies to A1.

The argument for "not A2, not A3" is that it allows program to omit arbitrary decisions. For example, you can write

if
case x <= y => return y;
case y <= x => return x;

without having to decide which of the two alternatives should be executed when x and y are equal. (In this example, the statement has the same effect whatever choice is made for x == y. However, the detection mechanism for A2 and A3 is either to exclude certain syntactic constructs or to verify that the guards in this example are non-overlapping--it is too difficult to verify that two statements do the same thing.)

A2 is all about getting repeatable results. If the language had concurrency, this determinacy would matter less.

A3 says ghost code gives deterministic results. This seems misguided to me. In fact, in proofs are often more beautifully written if you don't have to make certain choices. So, the only reason I can think of to keep A3 is that this is what we happened to do with /definiteAssignment:3, which is currently being used.

@RustanLeino
Copy link
Collaborator

From your comments and the above, I could imagine

mode flags
baseline --relax-definite-assignment
baseline + A0 possible default
baseline + A1 + A0 possible default
baseline + A2 + A0 possibly --enforce-determinism
baseline + A2 + A1 + A0 possibly --enforce-determinism
baseline + A3 + A2 + A1 + A0 --strict-enforce-determinism

with the idea that you cannot combine these flags.

Since it's nice to have more flexibility in proofs and ghost things, I'm not so fond of either A3 or A1. But one could argue that it's more consistent to bundle A0+A1 together.

If we do elect to change the language to make either A0 or A0+A1 the default, then we could also discuss repurposing the * RHS of assignments. Currently, * does not count as a definite assignment, but we could change this. This would allow you to express "I'm happy with an arbitrary value and I request of the compiler to fill one in for me". In essence, this would give the flexibility of the "just baseline" option, but catching simple mistakes by default. If we made this choice, we would enforce that * could only be used for auto-init types (or, for ghost assignments, nonempty types).

@keyboardDrummer
Copy link
Member Author

keyboardDrummer commented Nov 19, 2022

There are two arguments for A0

I feel that these two arguments are different scenario's in which the same confusion occurs. The confusion is that a programmer expects that a variable which has not been assigned but can be read, contains a particular default value. This expectation leads to a bug if there is no specification or test, and an unexpected verification error if there is a spec.

I think A1 protects against the same confusion but then in ghost code (a third scenario), so I would argue A0 and A1 should be bundled.

Currently, * does not count as a definite assignment, but we could change this.

Allowing := * with the A0 and A1 but not the A2 check seems strictly better. Is there a reason not to do that? If we have that, I would even propose marking --relax-definite-assignment as deprecated.

Is your reason for --strict-enforce-determinism only to be able to mimick /definiteAssignment:3 ? I think we can also not have it at first and only add it if a user requests it, at which point we can investigate the use-case.


I believe this PR currently has:

checks mode
baseline --relax-definite-assignment
baseline + A1 + A0 default
baseline + A3 + A2 + A1 + A0 --enforce-determinism

I suggest that we merge this PR as is, and cut an issue to remove A3 from --enforce-determinism and another one to allow := * with A0 and A1.

@RustanLeino
Copy link
Collaborator

Okay, sounds good. Let's cut this PR as is (with the small correction I remarked on in my PR review). Then, in separate PRs, let's

  • change the definite-assignment behavior of := *
  • mark --relax-definite-assignment as deprecated
  • make "baseline + A0 + A1" the default
  • drop A3 from --enforce-determinism

@keyboardDrummer keyboardDrummer dismissed cpitclaudel’s stale review November 22, 2022 10:29

Clement is on holiday and Rustan's review can cover for his

@keyboardDrummer keyboardDrummer merged commit 00ba953 into dafny-lang:master Nov 22, 2022
@keyboardDrummer keyboardDrummer deleted the enforceDeterminism2 branch November 23, 2022 11:25
@keyboardDrummer
Copy link
Member Author

@RustanLeino I'm not sure what additional check A3 adds. The following is currently allowed with /definiteAssignment:3

method Foo() {
    ghost var z: int :| true;
}

Is A3 already not a part of /definiteAssignment:3 or is there a particular Dafny construct that it's disallowing?

keyboardDrummer added a commit that referenced this pull request Feb 27, 2023
Fixes #3640

This PR fixes some issues with definite assignment and determinacy.

Recently, we changed the default rules for definite assignment in Dafny
(PR #3052). The main change was
to insist that all local variables and out-parameters be definitely
assigned before they are used. This was followed up by PR
#3311, which started considering
`:= *` assignments as fulfilling definite-assignment requirements. PR
#3311 also fixed a point where #3052 had gone too far; with PR #3311, it
is now again possible to allocate an array of an auto-init type without
having to explicitly assign the array elements.

There was one more point where PR #3052 had gone too far (reported in
#3640), and the present PR fixes that point. The issue regards fields of
a class. The present PR once again allows fields to be auto-initialized
(provided their types allow that).

This PR also makes `/definiteAssignment:4` available in the legacy CLI.
(Previously, it had been added internally, but not exposed in the legacy
CLI.)

Finally, this PR reports an error in the `--enforce-determinism` mode
for any class without a `constructor`. This was a latent omission in the
previous checking. Together with once again allowing auto-initialization
of fields, this change puts

``` dafny
class Cell { // constructor-less class not allowed with --enforce-determinism
  var data: int
}
method NondeterministicReturn() returns (r: int) {
  var c := new Cell;
  return c.data;
}
```

on the same footing as

``` dafny
class Cell {
  var data: int
  constructor () { // with --enforce-determinism, this constructor must initialize .data    
  }
}
method NondeterministicReturn() returns (r: int) {
  var c := new Cell();
  return c.data;
}
```

I argue this PR fixes the new definite-assignment mode in an important
way and therefore should be included in Dafny 4.0.

The documentation (see, e.g.,
#3584) may need to be updated.

<small>By submitting this pull request, I confirm that my contribution
is made under the terms of the [MIT
license](https://github.com/dafny-lang/dafny/blob/master/LICENSE.txt).</small>

---------

Co-authored-by: Remy Willems <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants