Support semantic highlighting #18

hackwaly · 2016-06-28T03:42:25Z

Like WebStorm and VS do: for example, showing whether a symbol is a type, a parameter, a namespace, or unresolved.

This is hard to do with TextMate-based grammars. Since we already support diagnostics, why not support semantic highlighting?

dbaeumer · 2016-06-29T10:16:28Z

Fully agree and we already discussed it since we would like to have it in VS Code as well.

daviwil · 2016-06-29T11:18:54Z

I'd love to have this for the PowerShell extension also. We provide the ability to create "dynamic keywords" for the purpose of writing domain-specific languages. Semantic highlighting would allow us to colorize those keywords in VS Code even though they aren't part of the PowerShell language spec.

/cc @BrucePay

smarr · 2016-07-07T13:55:40Z

I was looking for that as well.

What exactly is the meaning of the current 'highlighting' for read/write/text (see document highlights)?

I implemented that part but didn't see any form of visual feedback in the editor.

cdietrich · 2016-07-07T14:01:07Z

DocumentHighlight is for "mark occurrences"

smarr · 2016-07-07T14:10:39Z

@cdietrich sorry for hijacking the topic, but this is very unclear to me. How are document highlights used to realize a "mark occurrences" feature? How does the read/write/text distinction fit in, and why does it only expect a single result in return? For mark occurrences, I'd expect to be able to return a collection, no?

Would be great if that could be clarified also in protocol.md.

cdietrich · 2016-07-07T14:20:28Z

@smarr Yes, you are right, this makes no sense. I assume the return type should be an array.
If I have a look at VS Code, I can find:

export interface DocumentHighlightProvider {
    provideDocumentHighlights(model: editor.IReadOnlyModel, position: Position, token: CancellationToken): DocumentHighlight[] | Thenable<DocumentHighlight[]>;
}

So this looks like a bug in the protocol.
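
To make the array-shaped result concrete, here is a minimal sketch of what a "mark occurrences" answer could look like. The types are declared locally so the block is self-contained; they mirror the protocol's DocumentHighlight shape and kind values (Text = 1, Read = 2, Write = 3). The helper `findOccurrences` and its naive "followed by `=`" write detection are hypothetical and only for illustration; a real server would resolve the symbol semantically.

```typescript
// Local stand-ins for the protocol types, so the sketch is self-contained.
interface Position { line: number; character: number; }
interface Range { start: Position; end: Position; }

// Kind values as defined by the protocol.
const DocumentHighlightKind = { Text: 1, Read: 2, Write: 3 } as const;
type DocumentHighlightKind =
  typeof DocumentHighlightKind[keyof typeof DocumentHighlightKind];

interface DocumentHighlight { range: Range; kind?: DocumentHighlightKind; }

// Hypothetical helper: collect every whole-word occurrence of `name`.
// An occurrence followed by `=` (but not `==`) is reported as a write,
// everything else as a read. This is a textual approximation only.
function findOccurrences(lines: string[], name: string): DocumentHighlight[] {
  const result: DocumentHighlight[] = [];
  if (name.length === 0) {
    return result; // avoid matching the empty string forever
  }
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  lines.forEach((text, line) => {
    const pattern = new RegExp(`\\b${escaped}\\b`, "g");
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      const start = match.index;
      const end = start + name.length;
      const isWrite = /^\s*=(?!=)/.test(text.slice(end));
      result.push({
        range: {
          start: { line, character: start },
          end: { line, character: end },
        },
        kind: isWrite ? DocumentHighlightKind.Write : DocumentHighlightKind.Read,
      });
    }
  });
  return result;
}

// One request position, many highlights in the response.
const highlights = findOccurrences(["x = 1", "print(x)"], "x");
```

The point of the sketch is the return type: a single request yields a collection, and each entry carries its own read/write/text kind.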

smarr · 2016-07-07T14:25:54Z

@cdietrich thanks, I'll open a separate issue.

svenefftinge · 2016-11-15T15:23:49Z

Please find a proposal in PR #124

jpike88 · 2018-05-11T11:44:50Z

I’m guessing this is dead... anyone got any updates?

svenefftinge · 2018-05-11T19:09:44Z

I am working on a new proposal for semantic coloring.

axelson · 2020-06-22T16:51:47Z

As you can see with the new proposed protocol there is now a format and capabilities to indicate what requests the client will send.

Is there a link where the updated protocol can be reviewed? Is it still primarily in the TypeScript server?

dbaeumer · 2020-06-22T18:19:29Z

@axelson the proposed version is here in terms of implementation. I am in the process of writing the markdown. https://github.com/microsoft/vscode-languageserver-node/blob/master/protocol/src/common/protocol.semanticTokens.proposed.ts#L1

dbaeumer · 2020-06-23T15:02:55Z

And here is a first version of the spec. No word polish and no spell check :-)

https://microsoft.github.io/language-server-protocol/specifications/specification-3-16/#textDocument_semanticTokens

kjeremy · 2020-06-23T15:14:44Z

@dbaeumer When will we need to move our LSP server implementation over to this one from the current semantic tokens implementation supported in vscode?

dbaeumer · 2020-06-24T09:11:56Z

@kjeremy Are you relying on the implementation in the next version of the LSP libs?

kjeremy · 2020-06-24T12:31:02Z

@dbaeumer Our server-side implementation is in Rust and based on https://github.com/gluon-lang/lsp-types, which will need to be updated. The client side opts in via https://github.com/rust-analyzer/rust-analyzer/blob/master/editors/code/src/client.ts#L151

ghost · 2020-06-24T13:28:07Z

@dbaeumer I really think a remark that clients are expected to cache locally with ranges would be quite important, unless you want to make deltas de facto required for servers. Right now, I can't see such a remark; it should probably go into "Implementation considerations." (Unless you don't think deltas should be optional, but you did sound like you wanted to keep them in as an optional feature.)

If you don't write that in, some clients will just not do such caching and then servers will be required to implement deltas to guarantee good performance or users will blame it on them, making deltas basically a must.

Edit: also, right now the general concepts part is mixed with the pretty specific integer encoding in one block, I find that this is suboptimal for readability.

Suggestion in detail to split up text:

I think the General Concepts section would benefit from having the integer encoding part split out into its own section. In detail, I am suggesting to split at the following start/end points:

Split start at:

On the capability level types and modifiers are defined using strings. However the real encoding happens using numbers to save bandwidth. [SPLIT HERE]

I would start to cut out parts for a new section after this, named Integer Encoding for Tokens, which contains the text up to BEFORE this paragraph:

Split end at:

[SPLIT HERE] The protocol defines an additional token format capability to allow future extensions of the format. The only format that is currently specified is relative expressing that the tokens are described using relative positions.

So the new General Concepts starting section would go like:

[... General Concepts as before ...]
On the capability level types and modifiers are defined using strings. However the real encoding happens using numbers to save bandwidth, see Integer Encoding (link to section).
The protocol also defines an additional token format capability to allow future extensions of the format. The only format that is currently specified is relative expressing that the tokens are described using relative positions.
[... General Concepts ending parts as before...]

... with the Integer Encoding for Tokens section separated out to below based on above cut out starting with:

The server therefore needs to let the client know which numbers it is using for which types and modifiers. They do so using a legend, which is defined as follows: [...remaining Integer Encoding concepts up to split end...]

This new section could be additionally prefixed with an introducing sentence like: This section describes the integer encoding used for data transfer.
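
As an illustration of the integer encoding being discussed, the following is a minimal sketch of how a legend maps the string names to numbers. The `SemanticTokensLegend` shape follows the 3.16 draft; the concrete legend contents and the helper names `encodeType`, `encodeModifiers`, and `decodeModifiers` are assumptions made up for this example.

```typescript
// Shape of the legend the server announces to the client.
interface SemanticTokensLegend {
  tokenTypes: string[];
  tokenModifiers: string[];
}

// Example legend; each server chooses its own entries and ordering.
const legend: SemanticTokensLegend = {
  tokenTypes: ["namespace", "type", "parameter", "variable"],
  tokenModifiers: ["declaration", "readonly", "static"],
};

// A token type travels as its index into legend.tokenTypes.
function encodeType(l: SemanticTokensLegend, type: string): number {
  const index = l.tokenTypes.indexOf(type);
  if (index < 0) {
    throw new Error(`token type not in legend: ${type}`);
  }
  return index;
}

// Modifiers travel as a bit set: bit n is set when
// legend.tokenModifiers[n] applies to the token.
function encodeModifiers(l: SemanticTokensLegend, modifiers: string[]): number {
  let bits = 0;
  for (const modifier of modifiers) {
    const index = l.tokenModifiers.indexOf(modifier);
    if (index < 0) {
      throw new Error(`token modifier not in legend: ${modifier}`);
    }
    bits |= 1 << index;
  }
  return bits;
}

// The client reverses the mapping with the same legend.
function decodeModifiers(l: SemanticTokensLegend, bits: number): string[] {
  return l.tokenModifiers.filter((_, index) => (bits & (1 << index)) !== 0);
}

const typeIndex = encodeType(legend, "parameter");
const modifierBits = encodeModifiers(legend, ["declaration", "static"]);
const roundTrip = decodeModifiers(legend, modifierBits);
```

Because both sides only exchange indices and bit sets, the legend has to be agreed on once (during capability registration) and must not change afterwards without re-registering.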

kjeremy · 2020-07-24T21:01:39Z

@dbaeumer

I'm trying to figure out the delta behavior and have some questions:

Does the server fill in the resultId for the range request? That seems odd to me (I'm not sure what I would do with that information).
For the delta I am holding onto all the tokens for the file and computing a diff against that and returning it like how SemanticTokensBuilder works. Does the resultId returned from the delta request represent the state of ALL the tokens in the file?

I guess ultimately I'm trying to figure out what I need to hold onto to compute the deltas. I am assuming that each delta request really asks for the delta between "now" and the previous delta or full request. It does not appear to be spec'd that way, however, and it could be that a client asks for a delta between "now" and many revisions ago.

puremourning · 2020-08-08T13:28:42Z

@dbaeumer i'm reading through the initial draft spec (thanks for writing it up!). I have a handful of comments/questions/clarifications. What's the best way to provide feedback? would you like comments on a PR or something like that?

dbaeumer · 2020-08-19T09:33:28Z

@puremourning I think the comments are best provided as a PR.

dbaeumer · 2020-08-19T09:37:34Z

@kjeremy you are correct a delta for range makes no sense. I am pretty sure VS Code will basically drop a previous result ID.

The delta should always be reported against the last result, independent of whether that was a full or a delta response. So in an easy implementation the id is simply incremented. I have clarified this in the spec.
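
The "always diff against the last result, increment the id" approach can be sketched as follows. This is an assumption-laden illustration, not the reference implementation: the `TokenCache` class is hypothetical, it keeps state for a single document, and it produces at most one edit by trimming the common prefix and suffix of the two integer arrays. The `SemanticTokens`, `SemanticTokensDelta`, and `SemanticTokensEdit` shapes follow the 3.16 draft.

```typescript
interface SemanticTokens { resultId?: string; data: number[]; }
interface SemanticTokensEdit { start: number; deleteCount: number; data?: number[]; }
interface SemanticTokensDelta { resultId?: string; edits: SemanticTokensEdit[]; }

// Hypothetical per-document cache: remembers only the most recent result.
class TokenCache {
  private counter = 0;
  private lastId: string | undefined;
  private lastData: number[] = [];

  // Answer for a full request.
  full(data: number[]): SemanticTokens {
    return { resultId: this.remember(data), data };
  }

  // Answer for a delta request. If the client refers to a result we no
  // longer hold, fall back to a full answer.
  delta(previousResultId: string, data: number[]): SemanticTokens | SemanticTokensDelta {
    if (previousResultId !== this.lastId) {
      return this.full(data);
    }
    const old = this.lastData;
    const max = Math.min(old.length, data.length);

    // Length of the unchanged prefix.
    let prefix = 0;
    while (prefix < max && old[prefix] === data[prefix]) {
      prefix++;
    }
    // Length of the unchanged suffix, not overlapping the prefix.
    let suffix = 0;
    while (
      suffix < max - prefix &&
      old[old.length - 1 - suffix] === data[data.length - 1 - suffix]
    ) {
      suffix++;
    }

    const deleteCount = old.length - prefix - suffix;
    const inserted = data.slice(prefix, data.length - suffix);
    const edits: SemanticTokensEdit[] =
      deleteCount === 0 && inserted.length === 0
        ? []
        : [{ start: prefix, deleteCount, data: inserted }];
    return { resultId: this.remember(data), edits };
  }

  // Store the new state and hand out the next id.
  private remember(data: number[]): string {
    this.lastData = data.slice();
    this.lastId = String(++this.counter);
    return this.lastId;
  }
}

const cache = new TokenCache();
const first = cache.full([0, 0, 3, 1, 0, 1, 2, 4, 2, 0]);
const second = cache.delta("1", [0, 0, 3, 1, 0, 1, 2, 5, 2, 0]);
const stale = cache.delta("1", [0, 0, 3, 1, 0]);
```

With this design the server only ever needs to hold the tokens of the last response per document; a request that names an older result id simply receives a full answer.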

dbaeumer · 2020-08-19T09:50:43Z

@etc0de thanks for the suggestions. I made them in the 3.16 version of the spec.

MarFren · 2020-08-27T19:55:35Z

Regarding #18 (comment)

The idea is that the client pulls for semantic tokens for all visible files. The ones not having the focus should be pulled with less frequency. Assuming that the server implements the delta mechanism, a pull for a file that has no semantic color changes will not produce any additional payload. This model was chosen to not sync any UI state. Otherwise the server might do an expensive impact analysis about which files need to refresh semantic tokens although none of them are visible in the UI.

If I understand it correctly, the basic idea is to let the client pull semantic tokens for all open files on a regular basis: the one having the focus with high frequency, the other files with low frequency (but automatically). That would allow for updating the semantic highlighting in all open files, even if file A has the focus and a change there induces a change in highlighting in another file B. Am I correct? If so, the current reference implementation in VS Code Insiders (1.49.0-insiders) does not implement automated pulls of semantic tokens for open files, right? Wouldn't it be good to describe the expected pull behavior in the 3.16 spec of LSP?

kaby76 · 2020-09-01T01:13:43Z

Now that my LSP server is more stable, I decided to update the extension for VSCode and give the semantic highlighting feature a spin. On the server, I decided to first try out SemanticTokensOptions { range = false, full = true}. However, I see the following:

[Error - 8:40:58 PM] Request textDocument/semanticTokens failed.
  Message: No method by the name 'textDocument/semanticTokens' is found.
  Code: -32601

Well, yeah, that's right. There is no "textDocument/semanticTokens" message, because the spec only mentions "textDocument/semanticTokens/range", "textDocument/semanticTokens/full", and "textDocument/semanticTokens/full/delta", but no "textDocument/semanticTokens". Is this spec up to date? I can debug VS Code, figure out what it is expecting, and program to the implementation, but it would be nice to know what's going on.
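
For reference, the relationship between the server options and the three request names in the 3.16 draft can be sketched like this. The `SemanticTokensOptions` shape below is reduced to the two fields discussed here, and `supportedMethods` is a hypothetical helper, not part of any library.

```typescript
// Reduced form of the server options from the 3.16 draft.
interface SemanticTokensOptions {
  range?: boolean | {};
  full?: boolean | { delta?: boolean };
}

type SemanticTokensMethod =
  | "textDocument/semanticTokens/full"
  | "textDocument/semanticTokens/full/delta"
  | "textDocument/semanticTokens/range";

// Hypothetical helper: list the requests a client may send for the
// given server options. Note there is no bare "textDocument/semanticTokens".
function supportedMethods(options: SemanticTokensOptions): SemanticTokensMethod[] {
  const methods: SemanticTokensMethod[] = [];
  if (options.full) {
    methods.push("textDocument/semanticTokens/full");
    // Deltas are only available when `full` is an object with delta: true.
    if (typeof options.full === "object" && options.full.delta === true) {
      methods.push("textDocument/semanticTokens/full/delta");
    }
  }
  if (options.range) {
    methods.push("textDocument/semanticTokens/range");
  }
  return methods;
}

const fullOnly = supportedMethods({ range: false, full: true });
const everything = supportedMethods({ range: true, full: { delta: true } });
```

A server registered with `range = false, full = true` should therefore only ever see the `/full` request from a client that follows the draft.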

DanTup · 2020-09-01T13:20:52Z

@kaby76 I think the request is constructed by the vscode-languageclient package so the issue might be that the extension is not using the latest version of that package (7.0.0-next.9 is the one I've been using to test in the Dart extension and it seems to work well - though I've only implemented /full so far).

dbaeumer · 2020-09-01T14:19:04Z

@MarFren adding this as a recommendation to the spec makes sense. A PR is welcome. However the LSP spec never enforces this. If a client decides to not do this it should still be fine.

DanTup · 2020-09-01T14:42:51Z

@dbaeumer in the spec for semanticToken/range it says the result is:

result: SemanticTokens | null where SemanticTokensDelta

It seems like the line is incomplete (or the last part shouldn't be there).

If the return value is SemanticTokens, is the lineDelta from the range that was requested, or from the start of the document? (I couldn't find this mentioned anywhere).

kaby76 · 2020-09-02T01:28:32Z

@DanTup That worked. I had to also set up a few other dependencies

    "vscode-jsonrpc": "^6.0.0-next.5",
    "vscode-languageclient": "^7.0.0-next.9",
    "vscode-languageserver": "^7.0.0-next.7",
    "vscode-languageserver-protocol": "^3.16.0-next.7",
    "vscode-languageserver-types": "^3.16.0-next.3"

use the "Insiders" VSCode, and change an import because LanguageClient was moved around:

import * as vscodelc from 'vscode-languageclient/node';

But "textDocument/semanticTokens/full" is starting to work. Computing the start line/col deltas is a little challenging. And VS Code does not behave the same as the LSP client in VS2019 with edits.

DanTup · 2020-09-02T08:47:14Z

I had to make the /node import changes, though I don't think I needed to use insiders (nor add the other dependencies).

Computing the start line/col deltas is a little challenging

I've been refactoring my server work to collect the tokens using absolute data initially (line/cols, enum types) and then at the end do the conversion to the LSP format. This made it much simpler than when I was also doing things like splitting up multiline/nested tokens at the same time. The final conversion is now relatively simple:

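// Encoded output: five integers per token.
final encodedTokens = <int>[];

// Position of the previously emitted token; the first token is relative to 0:0.
var lastLine = 0;
var lastColumn = 0;

// Tokens must be in document order (line, then column) before computing deltas.
_tokens.sort(
  (t1, t2) => t1.line == t2.line
      ? t1.column.compareTo(t2.column)
      : t1.line.compareTo(t2.line),
);

for (final token in _tokens) {
  final relativeLine = token.line - lastLine;
  // The column is relative to the previous token only when both share a line.
  final relativeColumn = relativeLine == 0
      ? token.column - lastColumn
      : token.column;

  encodedTokens.addAll([
    relativeLine,
    relativeColumn,
    token.length,
    semanticTokenLegend.indexForType(token.type),
    semanticTokenLegend.bitmaskForModifiers(token.modifiers) ?? 0,
  ]);

  lastLine = token.line;
  lastColumn = token.column;
}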
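
For the opposite direction, here is a minimal sketch of how a client could turn the relative encoding back into absolute positions. It mirrors the encoding loop above; the `AbsoluteToken` interface and the `decodeTokens` function are names invented for this example, and the type and modifier fields are left as raw numbers rather than being resolved through a legend.

```typescript
// Hypothetical shape for a decoded token with absolute positions.
interface AbsoluteToken {
  line: number;
  character: number;
  length: number;
  tokenType: number;      // index into the legend's token types
  tokenModifiers: number; // bit set over the legend's token modifiers
}

// Decode the flat integer array (five integers per token) into absolute tokens.
function decodeTokens(data: number[]): AbsoluteToken[] {
  if (data.length % 5 !== 0) {
    throw new Error("token data must contain a multiple of five integers");
  }
  const tokens: AbsoluteToken[] = [];
  let line = 0;
  let character = 0;
  for (let i = 0; i < data.length; i += 5) {
    const deltaLine = data[i];
    const deltaStart = data[i + 1];
    line += deltaLine;
    // The start is relative to the previous token only on the same line;
    // on a new line it is relative to column 0.
    character = deltaLine === 0 ? character + deltaStart : deltaStart;
    tokens.push({
      line,
      character,
      length: data[i + 2],
      tokenType: data[i + 3],
      tokenModifiers: data[i + 4],
    });
  }
  return tokens;
}

// Three tokens: two on line 2, one on line 5.
const decoded = decodeTokens([2, 5, 3, 0, 3, 0, 5, 4, 1, 0, 3, 2, 7, 2, 0]);
```

Decoding makes the symptom mentioned in this thread easy to see: if the server sends absolute values instead of deltas, every token after the first lands at the wrong position.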

kaby76 · 2020-09-02T11:42:50Z

@DanTup Yes, my code looks more or less just like your code after I noticed initially that only the first symbol was being colored. I tried out some hardwired values, then understood what had to be done (i.e., sort + compute diffs).

dbaeumer · 2020-11-03T21:02:15Z

I will close the issue since semantic coloring (SC) is now part of the upcoming 3.16 spec.

dbaeumer added this to the 3.0 milestone Jun 29, 2016

This was referenced Jul 7, 2016

Document Highlight spec unclear and possibly wrong #32

Closed

Syntax highlight #33

Closed

daviwil mentioned this issue Jul 18, 2016

Color coding problem when using [ValidateScript({Test-Path "${_}`:\" -PathType Container})] PowerShell/vscode-powershell#225

Closed

RLovelett mentioned this issue Aug 3, 2016

SourceKit could provide more refined code highlighting RLovelett/vscode-swift#9

Open

hackwaly mentioned this issue Aug 8, 2016

Lexer based syntax grammar & semantic highlighting hackwaly/vscode-ocaml#44

Closed

mickaelistria mentioned this issue Nov 9, 2016

Token based Highlighting vs. Self Containing Extensions #120

Closed

cdietrich mentioned this issue Feb 20, 2017

[Language Server] Syntax Highlighting Support eclipse/xtext-core#283

Closed

dbaeumer modified the milestones: Backlog, 4.0 Apr 12, 2017

dbaeumer added the feature-request Request for new features or functionality label Apr 12, 2017

octref mentioned this issue Dec 18, 2017

Feature request: add syntax highlighting for component options vuejs/vetur#595

Closed

3 tasks

HighCommander4 mentioned this issue Feb 7, 2018

Upstreaming LSP protocol extensions? jacobdufault/cquery#431

Open

DJMcNab mentioned this issue May 11, 2018

Inconsistent TS syntax highlighting microsoft/vscode#49674

Closed

numirias mentioned this issue May 22, 2018

Make semantic highlighting available via LSP numirias/semshi#6

Open

wingrunr21 mentioned this issue Jun 6, 2018

Groupings rubyide/vscode-ruby#345

Closed

This was referenced Jun 20, 2018

Semantic Coloring eclipse-theia/theia#1845

Closed

Semantic highlighting protocol extension microsoft/vscode-languageserver-node#368

Closed

Levertion mentioned this issue Jul 1, 2018

Highlight Levertion/mcfunction-langserver#34

Closed

kittaakos mentioned this issue Jul 4, 2018

Semantic highlighting protocol extension #513

Closed

kieferrm mentioned this issue Jun 22, 2020

Iteration Plan for June 2020 microsoft/vscode#100100

Closed

57 tasks

larshp mentioned this issue Jul 24, 2020

highlight to calling methods and "me" larshp/vscode-abap#20

Open

This was referenced Aug 6, 2020

Proposal of the semantic highlighting protocol extension microsoft/vscode-languageserver-node#367

Closed

Define a protocol for syntax tokens #1063

Open

dbaeumer added a commit that referenced this issue Aug 19, 2020

Make suggested changes from: #18 (comment)

084f5f2

nomasprime mentioned this issue Sep 17, 2020

Relationship with LSP (especially coc.nvim) nvim-treesitter/nvim-treesitter#484

Closed

apexskier mentioned this issue Sep 18, 2020

syntax highlighting issue apexskier/nova-typescript#68

Closed

dbaeumer closed this as completed Nov 3, 2020

vscodebot bot locked and limited conversation to collaborators Dec 18, 2020
