Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fully underline parse nodes in diagnostics. #3442

Merged
merged 13 commits into from
Dec 13, 2023

Conversation

domisterwoozy
Copy link
Contributor

Another incremental change to diagnostic formatting. I simply recurse over all the tokens in the subtree of a parse node and construct a DiagnosticLocation that covers all of the tokens.

I believe it's nicer for the user to be directed at the entire chunk of source where the error is occurring rather then just pointing at the bracketing/terminator tokens, but let me know if you all agree.

Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zygoloid What do you think of this? I think I'm hesitant on this due to several cases:

  • For missing return statements, it tries to highlight the full function (including the signature).
    • e.g., "ERROR: Missing return at end of function with declared return type."
  • For cases where an open paren is the target, it highlights everything up to and including the open paren.
    • But not including the close paren.
    • e.g., "ERROR: addr self method cannot be invoked on a value."
  • For something like &foo where & is invalid, it points at the entire expression instead of &.
    • e.g., "ERROR: Cannot take the address of non-reference expression."
    • Similar can be seen of addr, etc.
  • In "Cannot implicitly convert from i32 to f64, the + seems slightly more clear than highlighting all of 12 + 3.4 (this is also a case where more clear associations of what's what would be useful).

Some cases seem more likely improvements, such as the highlight when a returned expression is of the incorrect type, or "Cannot initialize tuple of N element(s) from tuple with M element(s)."

Other cases look mostly like we're just not associating the right places. e.g., saying something can't be indexed seems like it should highlight the [, not necessarily the full a[b] expression. Right now it gets the latter because of the association with ]; with this change, associating with [ instead would result in a[ being highlighted.

This also makes me think about more complex cases. e.g., if we had a.b.c.d.e[x] where e cannot be indexed, I believe this would minimally highlight a.b.c.d.e[.

I think we'd want to keep iterating on highlight ranges, so it's not clear to me we should take this step. But this is where maybe zygoloid will have different thoughts about the parse tree structure associations for diagnostics.

Copy link
Contributor

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zygoloid What do you think of this?

I think some of these changes are improvements, others are regressions. But I think we do want functionality along these lines.

I think when we pass a location to a diagnostic builder, we should distinguish between (at least) two cases:

  1. Passing a Parse::NodeId and wanting that subtree to be highlighted
  2. Passing a Parse::NodeId and wanting just the corresponding token to be highlighted.

For example, we could change NodeLocationTranslator to take a new class NodeLocation as its location type, where NodeLocation can either be constructed from a NodeId (to highlight the range) or from

auto TokenLoc(Parse::NodeId node_id) -> NodeLocation;

(to point the caret at the token location for the parse node).

Then we'd need to go through the diagnostic changes here and switch to using emitter.Emit(TokenLoc(parse_node), MyDiag); for the ones we don't like.

I think highlighting the whole parse subtree is doing the right thing most of the time here, though, so that's probably the better default.

This also makes me think about more complex cases. e.g., if we had a.b.c.d.e[x] where e cannot be indexed, I believe this would minimally highlight a.b.c.d.e[.

Yeah, a few of our diagnostics have bad locations due to the structure of the parse tree. I think in this case using the TokenLoc of the [ is probably good enough for now, but ideally I think I'd want

  a.b.c.d.e[x]
  ~~~~~~~~~^

... for which I think we want to pass a highlighted range to the diagnostic infrastructure separately from the diagnostic location.

I think we'd want to keep iterating on highlight ranges, so it's not clear to me we should take this step. But this is where maybe zygoloid will have different thoughts about the parse tree structure associations for diagnostics.

I think this is an iterative step forward, especially if we add control over whether to highlight the subtree or just point at the one node, and gives us some useful functionality for adding more general highlighting.

toolchain/parse/tree_node_location_translator.h Outdated Show resolved Hide resolved
@domisterwoozy
Copy link
Contributor Author

I think when we pass a location to a diagnostic builder, we should distinguish between (at least) two cases:

  1. Passing a Parse::NodeId and wanting that subtree to be highlighted
  2. Passing a Parse::NodeId and wanting just the corresponding token to be highlighted.

For example, we could change NodeLocationTranslator to take a new class NodeLocation as its location type, where NodeLocation can either be constructed from a NodeId (to highlight the range) or from

auto TokenLoc(Parse::NodeId node_id) -> NodeLocation;

(to point the caret at the token location for the parse node).

Then we'd need to go through the diagnostic changes here and switch to using emitter.Emit(TokenLoc(parse_node), MyDiag); for the ones we don't like.

Let me know if I implemented this correctly. My question now is should we go through and add TokenOnly where we deem fit now or do that in a follow up? If we want to do it now can we enumerate each case where we only want the token? I can start with each instance in @jonmeow's comment and we can go from there?

@zygoloid
Copy link
Contributor

zygoloid commented Dec 2, 2023

Let me know if I implemented this correctly.

Yes, looks great, thanks!

My question now is should we go through and add TokenOnly where we deem fit now or do that in a follow up? If we want to do it now can we enumerate each case where we only want the token? I can start with each instance in @jonmeow's comment and we can go from there?

Normally I'd be in favor of doing this incrementally, but I think it'll be a lot easier to see where we want to make this change if we do it now, because we can see the effect of the combined patch on the test suite. So let's start by fixing the things that @jonmeow pointed out (plus optionally anything else that stands out to you), then I can do another pass over the diagnostic changes and point out any other ones where a point diagnostic would be better.

Copy link
Contributor

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think there's only one class of highlighting here that still looks a bit off to me: diagnostics for a function call are highlighting from the start of the callee to the (, inclusive. I'm not sure what's best here -- I'd be happy with highlighting until the ) or just pointing at the ( token, but switching that case to TokenOnly is probably simplest.

// CHECK:STDERR: c.F();
// CHECK:STDERR: ^
// CHECK:STDERR: ^~~~
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is highlighting the ( but not the ), which seems a bit surprising. I think any of these would be OK:

 c.F();
 ^~~

 c.F();
 ^~~~~

 c.F();
    ^

(Same for other calls in this file.)

Copy link
Contributor

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a significant improvement. Can you resolve the merge conflicts?

@zygoloid zygoloid added this pull request to the merge queue Dec 8, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 8, 2023
@domisterwoozy
Copy link
Contributor Author

Sorry about the broken tests, I believe it should be good now.

@zygoloid zygoloid added this pull request to the merge queue Dec 13, 2023
Merged via the queue into carbon-language:trunk with commit 6419568 Dec 13, 2023
6 checks passed
github-merge-queue bot pushed a commit that referenced this pull request Dec 14, 2023
Builds upon @domisterwoozy 's excellent #3442 . Removes the need to
store the first node of a declaration in the declaration state stack.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants