Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support for disjunctive (“or”) patterns #2448

Merged
merged 19 commits into from
Jul 22, 2022

Conversation

cpitclaudel
Copy link
Member

  • Source/Dafny/AST/DafnyAst.cs (DisjunctivePattern):
    New subclass of ExtendedPattern.
  • Source/Dafny/AST/Printer.cs (PrintExtendedPattern):
    Add support for disjunctive patterns.
  • Source/Dafny/Cloner.cs (CloneExtendedPattern):
    Refactor, add support for disjunctive patterns.
  • Source/Dafny/Dafny.atg
    (ExtendedPattern): Rename to SingleExtendedPattern.
    (ExtendedPattern): New production for |-separated patterns.
  • Source/Dafny/Resolver.cs
    (FreshTempVarName): Refactor, remove obsolete comment.
  • Source/Dafny/Resolver.cs
    (RBranch, RBranchStmt): Reject disjunctive patterns.
    (RemoveDisjunctivePatterns): New function to detect, report, and remove
    nested disjunctive patterns and variables in disjunctive patterns.
    (FlattenDisjunctivePatterns): New function to convert a single
    DisjunctivePattern into one extended pattern per alternative.
    (FlattenNestedMatchCaseExpr): New wrapper around
    FlattenDisjunctivePatterns for NestedMatchCaseExpr.
    (CompileNestedMatchExpr): Use it.
    (FlattenNestedMatchCaseStmt): New wrappers around
    FlattenDisjunctivePatterns for NestedMatchCaseStmt.
    (CompileNestedMatchStmt): Use it.
    (CheckLinearExtendedPattern): Check the branches of each disjunctive pattern
    separately.

Fixes #2220

* `Source/Dafny/AST/DafnyAst.cs` (`DisjunctivePattern`):
  New subclass of `ExtendedPattern`.
* `Source/Dafny/AST/Printer.cs` (`PrintExtendedPattern`):
  Add support for disjunctive patterns.
* `Source/Dafny/Cloner.cs` (`CloneExtendedPattern`):
  Refactor, add support for disjunctive patterns.
* `Source/Dafny/Dafny.atg`
  (`ExtendedPattern`): Rename to `SingleExtendedPattern`.
  (`ExtendedPattern`): New production for `|`-separated patterns.
* `Source/Dafny/Resolver.cs`
  (`FreshTempVarName`): Refactor, remove obsolete comment.
* `Source/Dafny/Resolver.cs`
  (`RBranch`, `RBranchStmt`): Reject disjunctive patterns.
  (`RemoveDisjunctivePatterns`): New function to detect, report, and remove
  nested disjunctive patterns and variables in disjunctive patterns.
  (`FlattenDisjunctivePatterns`): New function to convert a single
  `DisjunctivePattern` into one extended pattern per alternative.
  (`FlattenNestedMatchCaseExpr`): New wrapper around
  `FlattenDisjunctivePatterns` for `NestedMatchCaseExpr`.
  (`CompileNestedMatchExpr`): Use it.
  (`FlattenNestedMatchCaseStmt`): New wrappers around
  `FlattenDisjunctivePatterns` for `NestedMatchCaseStmt`.
  (`CompileNestedMatchStmt`): Use it.
  (`CheckLinearExtendedPattern`): Check the branches of each disjunctive pattern
  separately.

Closes #2220.
@@ -2870,6 +2870,14 @@ public class Printer {
wr.Write(")");
}
break;
case DisjunctivePattern dp:
var psep = "";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we have a difference in style preference. I always write names out in full, so here it would be pattern or disjunctivePattern and patternSeparator. Maybe we can agree on that we should optimise for reading? Would that change your preference?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I find the shorter names more readable:

(-b + sqr(b*b - 4*a*c)) / (2 * a)
vs
(-linearFactor +- squareRoot(linearFactor * linearFactor - 4 * quadraticFactor * constantFactor)) / (2 * quadraticFactor)

In the snippet you highlighted (where there's a single variable) I don't have strong preferences; I just followed the style of the code right above.

For externally visible parameters (like function arguments or class fields) I prefer long names.

Copy link
Member

@keyboardDrummer keyboardDrummer Jul 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your example is more fair if you also include the legend for the small variable names, and add line breaks:

var a = quadraticFactor;
var b = linearFactor;
var c = constantFactor;
return (-b + sqr(b*b - 4*a*c)) / (2 * a)

vs

return 
    (-linearFactor + squareRoot(linearFactor * linearFactor - 
                                4 * quadraticFactor * constantFactor)) 
  / (2 * quadraticFactor)

private IEnumerable<NestedMatchCaseStmt> FlattenNestedMatchCaseStmt(NestedMatchCaseStmt c) {
var cloner = new Cloner();
foreach (var pat in FlattenDisjunctivePatterns(c.Pat)) {
yield return new NestedMatchCaseStmt(c.Tok, pat, c.Body.ConvertAll(cloner.CloneStmt), cloner.CloneAttributes(c.Attributes));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to clone here instead of use the old body and attributes? Will you be using the old cases?

Copy link
Member Author

@cpitclaudel cpitclaudel Jul 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the old body and attributes are used in the linearity checker.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But still, what happens if you don't clone them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most likely Dafny will crash. My answer above wasn't complete: notice how I don't clone pat, because this one isn't being shared. But I do need to clone c.Body, because that's the same body being copied into each newly created NestedMatchCaseStmt. Basically this code goes from:

case A | B | C => body

to

case A => Clone(body)
case B => Clone(body)
case C => Clone(body)

If we don't clone the body we'll have multiple places in the AST sharing the same reference to the body (and typechecking it multiple times, among other things)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It made sense, but I was wrong :/ The crash I had observed was not due to not cloning, and in fact not cloning is wrong here, because of #2334 (the key point being that the branches are already partly resolved, and hence cloning them isn't safe). Thanks for flagging this.

@@ -12510,12 +12545,13 @@ private class RBranchExpr : RBranch {
Console.WriteLine("DEBUG: CompileNestedMatchExpr for match at line {0}", e.tok.line);
}

MatchTempInfo mti = new MatchTempInfo(e.tok, e.Cases.Count, opts.codeContext, DafnyOptions.O.MatchCompilerDebug);
var cases = e.Cases.SelectMany(FlattenNestedMatchCaseExpr).ToList();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the right time for flattening the disjunctive patterns? Can't that be done before resolution as a separate rewriter?

Copy link
Member Author

@cpitclaudel cpitclaudel Jul 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly, but I'm not sure it's what we want: I wrote it this way to give other code access to the disjunctive patterns. I think of this flattening as an implementation detail of the pattern compiler.

Copy link
Member

@keyboardDrummer keyboardDrummer Jul 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote it this way to give other code access to the disjunctive patterns.

Didn't we want to use many small lowering transformations when possible, to make each individual transformation simpler? I think it would be a big benefit if we can avoid adding code to Resolver.cs.

Also, in what scenario do you envision someone needs access to disjunctive patterns? And if they need it, why couldn't they sit before the lowering transformation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't we want to use many small lowering transformations when possible, to make each individual transformation simpler?

Indeed, and we have that here, right? There's just a handful of calls in the resolver to new code that pre-processes the pattern.

I think it would be a big benefit if we can avoid adding code to Resolver.cs.

I can move it to a new file, but all of the other pattern matching compilation code is there, so… ?

Originally, the pattern matching compiler was implemented all in the parser. It was moved to the resolver, and in fact it should really be moved to the compiler. In fact, in the long run, disjunctive patterns should be supported at any level of nesting, so the nice separation we have right now won't survive. But for now it's good to have the separated implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, and we have that here, right? There's just a handful of calls in the resolver to new code that pre-processes the pattern.
I can move it to a new file, but all of the other pattern matching compilation code is there, so… ?

I meant moving the code to a separate file and not calling it from Resolver.cs, except for possibly adding a DisjunctCaseRewriter to the rewriters list. Would that be possible?

In fact, in the long run, disjunctive patterns should be supported at any level of nesting, so the nice separation we have right now won't survive. But for now it's good to have the separated implementation.

I don't know the pattern matching transformations well enough to comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant moving the code to a separate file and not calling it from Resolver.cs, except for possibly adding a DisjunctCaseRewriter to the rewriters list. Would that be possible?

Possibly? I think it might work, because this pass doesn't use type information. The problem is that pattern matching compilation happens in the middle of resolution, and it uses some type info.

Also, do we have the architecture in place for writing such a pass? I need something that traverses all statements and expressions in a program. Do we have a visitor that does that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Editing to add this following our conversation: it would be nice to add such a visitor, but one issue with rewriting this systematically would be that clients would lose access to the unresolved AST containing the full match structure.

@@ -12501,6 +12494,48 @@ private class RBranchExpr : RBranch {
}
}

private ExtendedPattern RemoveDisjunctivePatterns(ExtendedPattern pat, bool allowVariables) {
Copy link
Member

@keyboardDrummer keyboardDrummer Jul 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name only captures half of the behavior, so maybe replace Disjunctive with Illegal. And maybe replace Remove with Handle since you are both replacing patterns and emitting an error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point

@keyboardDrummer
Copy link
Member

Cool!

case LitPattern:
return pat;
case IdPattern p:
if (p.Arguments == null && !p.Id.StartsWith("_") && !allowVariables) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the check !p.Id.StartsWith("_") for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wildcards (_ patterns) are implemented as _-prefixed variables, and we don't want to complain about those.

Copy link
Member

@keyboardDrummer keyboardDrummer Jul 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the documentation:

WildIdent = NoUSIdent | "_"
Identifier, disallowing leading underscores, except the “wildcard” identifier _. When _ appears it is replaced by a unique generated identifier distinct from user identifiers. This wildcard has several uses in the language, but it is not used as part of expressions.

Shouldn't it be p.Id == "_" then?
If the point of a wildcard is to effectively not capture the variable, then why would we permit an identifier after the underscore?

Also, shouldn't there be a function Identifier.IsWildcard(p.id) to make this more explicit?

Also, is wildcard a good name? Hole seems more fitting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be p.Id == "_" then? If the point of a wildcard is to effectively not capture the variable, then why would we permit an identifier after the underscore?

No: internally, "_" is treated as a variable and gets renamed into a unique variable name starting with an underscore, so by the time we get here the holes are named something like _v64.

A future refactoring should probably just add a WildcardPattern (or HolePattern, indeed; I think it's a better fit)

Also, shouldn't there be a function Identifier.IsWildcard(p.id) to make this more explicit?

OK, I will do that. Even better, I can make it a property of the identifier pattern.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@cpitclaudel cpitclaudel enabled auto-merge (squash) July 21, 2022 21:06
@cpitclaudel
Copy link
Member Author

It turns out that I do need resolution information, and hence this transformation can't be implemented as a pre-resolve pass. The is the relevant case, for which I've added a test just now:

module Constants {
  const ONE := 1
  const TWO := 2

  method M(i: int) {
    match i
      case ONE | TWO => return; // `ONE` and `TWO` are not variables here
      case _ => // Not redundant
  }
}

@cpitclaudel cpitclaudel enabled auto-merge (squash) July 22, 2022 07:07
@cpitclaudel cpitclaudel merged commit a47f0c0 into master Jul 22, 2022
@cpitclaudel cpitclaudel deleted the cpitclaudel_deep-or-patterns branch July 22, 2022 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support or patterns in matches
2 participants