Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring needed: Replace "Tok" with a format #3290

Open
MikaelMayer opened this issue Dec 29, 2022 · 0 comments
Open

Refactoring needed: Replace "Tok" with a format #3290

MikaelMayer opened this issue Dec 29, 2022 · 0 comments
Labels
kind: language development speed Slows down development of Dafny the language, flaky tests

Comments

@MikaelMayer
Copy link
Member

MikaelMayer commented Dec 29, 2022

Context

Currently, every AST node stores a "Tok", which is used to report errors.
We recently introduced "RangeToken" on nodes so that we would get also the first and last token of every node, and that RangeToken is assigned in the parser

Some nodes produced by the resolver are actually clones of existing nodes, or entirely new nodes, both of which only reuse "Tok" and not "RangeToken"
This is why, for these nodes, there remains an old computation for RangeToken which tries to infer the Range from sub-nodes's Tok, format tokens, and range tokens:
https://github.com/dafny-lang/dafny/blob/master/Source/DafnyCore/AST/DafnyAst.cs#L89

Problems:

  • The manual computation of RangeToken is not reliable. RangeTokens should be computed on the pre-resolved AST (which they are in Parser.cs), but the manual computation usually is executed post-resolution on resolved nodes, which sometimes creates weird ranges that heuristics can't rule out. For example, there are instances where the computed range takes into account tokens that should be marked as implicit and which are not, which makes unreasonably large ranges feat: More hovering information on the LSP #3282 (comment) (EDIT: This bug was fixed in this particular PR, but avoiding manually computing a RangeToken would have solved the problem as well, and who knows what other problems are lurking in the background because of similar cases) (EDIT: Indeed, found another bug feat: More hovering information on the LSP #3282 (comment))
  • We recently introduced "OwnedTokens", which should entirely replace "FormatTokens", but because of this old computation, we have still to keep "FormatTokens". OwnedTokens is never called except for the formatter. which works on pre-resolution phase.
  • Moreover, because of this old computation, there is a debate on whether "RangeToken" should be made a method instead of a property. This debate shouldn't exist, because this old computation should never need to be executed normally.
  • We can't however go in every place where tokens are transferred from one node to a new node and also transfer the RangeToken, it would be extremely hard to find all such cases, but also to maintain.

Suggested solution

Since we use "Tok" in nodes only for reporting, I suggest we do the following:

  • We replace "Tok" in every constructor by a new "Format" field, stored in INode
  • We remove the old computation altogether, so that RangeToken can only be assigned (in the Parser), no debate on whether it should be a property or a method.
  • class Format contains Tok, the RangeToken, its quick shortcuts StartToken and EndToken, but not the OwnedTokens that is lazily computed on demand and requires access to the children of the node. OwnedTokens stays in INode. However, its cache OwnedTokensCache can surely be stored in the Format
  • We remove the field FormatTokens on expressions and all the arguments that were used to depend on it.
  • For compatibility:
    • Tok => Format.Tok
    • Assigning RangeToken on a node actually assigns Format.RangeToken, so that it can be transferred from one node to the other.
    • FormatTokens => OwnedTokens on expressions

This solution, although laborious to implement (only refactorings), will ensure that:

  • We don't miss any transfer of RangeToken from a node to the other
  • We avoid recomputing RangeToken
  • Maintenance is easy, adding a new node will require providing a Format anyway.
  • We can safely remove FormatTokens
  • No more debate on property vs. method for RangeToken
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: language development speed Slows down development of Dafny the language, flaky tests
Projects
None yet
Development

No branches or pull requests

2 participants