Add support for the proposed Semantic Highlighting Protocol #954

jackguo380 · 2020-01-16T21:45:58Z

The initial implementation of support for the semantic highlighting protocol proposed by the vscode language server maintainers is mostly done. As far as I know, clangd version 9 and eclipse.jdt.ls currently both have working implementations of this protocol.

This PR adds a new setting, g:LanguageClient_semanticHighlightMaps, to configure semantic highlighting. Like serverCommands, it is a per-language map; each entry holds mappings from semantic/TextMate scopes to vim highlight groups.

Here are a couple of sample configurations and screenshots to go along.

Using clangd 9 for C++

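" Make sure the dictionary exists before assigning a per-language entry.
let g:LanguageClient_semanticHighlightMaps = get(g:, 'LanguageClient_semanticHighlightMaps', {})
let g:LanguageClient_semanticHighlightMaps['cpp'] = [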
            \ {'Function': ['entity.name.function.cpp']},
            \ {'Function': ['entity.name.function.method.cpp']},
            \ {'CppNamespace': ['entity.name.namespace.cpp']},
            \ {'CppEnumConstant': ['variable.other.enummember.cpp']},
            \ {'CppMemberVariable': ['variable.other.field.cpp']},
            \ {'Type': ['entity.name.type.class.cpp']},
            \ {'Type': ['entity.name.type.enum.cpp']},
            \ {'Type': ['entity.name.type.template.cpp']},
            \ ]

hi! CppEnumConstant ctermfg=Magenta guifg=#AD7FA8 cterm=none gui=none
hi! CppNamespace ctermfg=Yellow guifg=#BBBB00 cterm=none gui=none
hi! CppMemberVariable ctermfg=White guifg=White

Using eclipse.jdt.ls for Java

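" Make sure the dictionary exists before assigning a per-language entry.
let g:LanguageClient_semanticHighlightMaps = get(g:, 'LanguageClient_semanticHighlightMaps', {})
let g:LanguageClient_semanticHighlightMaps['java'] = [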
            \ {"JavaStaticMemberFunction": ['storage.modifier.static.java', 'entity.name.function.java', '**']},
            \ {"JavaMemberVariable": ['meta.definition.variable.java', 'meta.class.body.java', 'meta.class.java', '**']},
            \ {"Function": ['entity.name.function.java', '**']},
            \ {"Function": ['*', 'entity.name.function.java', '**']},
            \ {"Type": ['entity.name.type.class.java', '**']},
            \ {"Type": ['*', 'entity.name.type.class.java', '**']},
            \ ]

hi! JavaStaticMemberFunction ctermfg=Green cterm=none guifg=Green gui=none
hi! JavaMemberVariable ctermfg=White cterm=italic guifg=White gui=italic

TODO:

Support for vim 8.1 with +textprop
Disable feature when the version of vim or neovim does not support the needed APIs
More tests?

The PR is broken down into 2 commits since I had to upgrade the lsp-types crate to get the types needed for semantic highlighting.

Let me know what you think and if there are any improvements/fixes needed!

Related PRs/Issues:
#383
microsoft/vscode-languageserver-node#367

Also see this document for exactly how the protocol works:
https://github.com/microsoft/vscode-languageserver-node/blob/5a9b33c23de84c3341e011e79221795a8059375b/protocol/src/protocol.semanticHighlighting.proposed.md

YaLTeR · 2020-01-19T03:48:35Z

Oh, this is so cool! Can't wait for this to get merged.

A number of these scopes like entity.name.function.java seem to follow the pattern of entity.name.function.something, maybe add a couple of default rules mapping all of these to a reasonable Function? And the same for any other common scopes.

jackguo380 · 2020-01-19T07:49:10Z

Thanks @YaLTeR

I was considering adding default values, but I don't see any good way to do it since the semantic scopes are kind of language server specific. I'm worried that language servers are going to use different scope names or map the scopes differently, so the defaults may look good on one server but awful on another.

I'll probably just add them into the wiki page for each server once this gets merged.

autozimu

Awesome work!

Just left some comments here.

autozimu · 2020-01-21T02:34:02Z

doc/LanguageClient.txt

+
+The maps associate the highlight group with a semantic scope matching pattern.
+Any symbols that have a scope that matches the pattern will be highlighted
+wtih that highlight group.


wtih => with

autozimu · 2020-01-21T02:39:06Z

autoload/LanguageClient.vim

@@ -1375,6 +1385,67 @@ function! LanguageClient_contextMenu() abort
    return LanguageClient_handleContextMenuItem(l:options[l:selection - 1])
 endfunction

+function! LanguageClient_semanticScopes(...) abort


Let's rename this a bit to ..._show/print to make the intention clearer.

autozimu · 2020-01-21T02:39:46Z

autoload/LanguageClient.vim

+    return LanguageClient#Call('languageClient/showSemanticHighlightSymbols', l:params, l:Callback)
+endfunction
+
+function! LanguageClient_showCursorSemanticHighlightSymbols() abort


Suggest ..._showSemanticHighlightUnderCursor

autozimu · 2020-01-21T02:41:47Z

doc/LanguageClient.txt

+In order for a pattern to match a semantic scope the lists must fully match
+in length and list items.
+>
+     ['x', 'y', 'z'] == ['x', 'y', 'z']


Let's replace the == with matches to make it more clear.

autozimu · 2020-01-21T02:47:33Z

src/types.rs

+    assert_eq!(do_matches(&t3), (false, false, true, true));
+    assert_eq!(do_matches(&t4), (false, true, false, true));
+}
+


Can you add unit tests covering case of **?

jackguo380 · 2020-01-21

When a ** is used, it is converted into the corresponding SemanticHighlightMatcher, so the string ** never appears in the matcher itself. See buildSemanticHighlightMatchers in language_server_protocol.rs for the construction.

E.g. an array ['a', 'b', 'c', '**'] would construct the ArrayStart matcher.

This is kind of limiting right now since the ** can only be used at the beginning/end of the array (although it should be sufficient for most use cases). I don't see any good way to evaluate arbitrary patterns of ** without an overly complex implementation or hacks like concatenating the items and regexing. Any suggestions?
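As a rough illustration of the matching described above, here is a minimal Rust sketch. It is not the PR's code: the enum ScopeMatcher, its variant names, and the helper functions are hypothetical, and only the ideas of exact list matching, a single-item * wildcard, and converting a leading or trailing ** into a dedicated matcher are taken from the discussion.

```rust
/// Hypothetical matcher shapes (names are assumptions, not the PR's types).
#[derive(Debug, PartialEq)]
enum ScopeMatcher {
    /// No `**`: the scope list must equal the pattern item for item.
    Exact(Vec<String>),
    /// Trailing `**`: the scope list must start with these items.
    Prefix(Vec<String>),
    /// Leading `**`: the scope list must end with these items.
    Suffix(Vec<String>),
}

fn to_owned_items(items: &[&str]) -> Vec<String> {
    items.iter().map(|s| s.to_string()).collect()
}

/// Convert a user pattern into a matcher; `**` is consumed here and never
/// stored in the matcher itself.
fn build_matcher(pattern: &[&str]) -> ScopeMatcher {
    match pattern {
        [init @ .., "**"] => ScopeMatcher::Prefix(to_owned_items(init)),
        ["**", rest @ ..] => ScopeMatcher::Suffix(to_owned_items(rest)),
        _ => ScopeMatcher::Exact(to_owned_items(pattern)),
    }
}

/// Same length, and every item equal or matched by the `*` wildcard.
fn items_match(pattern: &[String], scopes: &[&str]) -> bool {
    pattern.len() == scopes.len()
        && pattern.iter().zip(scopes).all(|(p, s)| p == "*" || p == s)
}

fn matches(matcher: &ScopeMatcher, scopes: &[&str]) -> bool {
    match matcher {
        ScopeMatcher::Exact(p) => items_match(p, scopes),
        ScopeMatcher::Prefix(p) => {
            scopes.len() >= p.len() && items_match(p, &scopes[..p.len()])
        }
        ScopeMatcher::Suffix(p) => {
            scopes.len() >= p.len() && items_match(p, &scopes[scopes.len() - p.len()..])
        }
    }
}
```

In this sketch a ** in the middle of a pattern, or at both ends, is not handled, which mirrors the limitation raised in the comment.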

autozimu · 2020-01-21T02:52:10Z

doc/LanguageClient.txt

+>
+    let g:LanguageClient_semanticHighlightMaps = {
+        \ 'java': [
+        \   {"Function": ['entity.name.function.java', '**']},


Do you think it would be more intuitive for the map to go from semantic groups to highlight groups?

Another related question, what is the motivation of supporting two styles of this setting? Should one suffice?

jackguo380 · 2020-01-21

The main reason for the reversed map is that vimscript doesn't support arrays as dictionary keys.

Originally I just had a single map, but I realized afterward that there's often a need to map multiple scopes to the same highlight group. I guess I could simplify the documentation and only mention the second style but still allow the first if it ever gets used.

YaLTeR · 2020-01-21T03:57:49Z

Another idea is to represent semantic scopes as a string with some separator, e.g. storage.modifier.static.java/entity.name.function.java/**. That would allow using them as keys. Although idk if there are any banned characters from the semantic scope names that would be usable as separators.

autozimu · 2020-01-21T04:33:44Z

Let's reduce the number of supported styles here. Supporting more styles only complicates the implementation for the maintainer and confuses end users.

@YaLTeR clever idea! Agreed, the tricky part is the choice of separator. As different languages have different rules, I don't know whether any separator would work for all languages.

autozimu · 2020-01-28

Continuing with the approach presented by @YaLTeR: to solve the ambiguity of separators, a buffer-local variable with a default can be introduced, say b:LanguageClient_semanticScopeSeparator with a default of /.

E.g., a user might have a config like

let g:LanguageClient_semanticHighlightMaps['java'] = {
        \ 'storage.modifier.static.java/entity.name.function.java/.*': 'Function',
        \ }

In a different filetype where / might be part of the scopes themselves,

autocmd BufNewFile,BufRead *.elisp let b:LanguageClient_semanticScopeSeparator = '|'
let g:LanguageClient_semanticHighlightMaps['elisp'] = {
        \ 'storage.modifier.static|name.function|.*': 'Function',
        \ }

We would construct the scope string by concatenating the scopes with this separator. The actual highlight group is then looked up through this map.

IMO, this way is more intuitive and solves the original issue that ** can only appear at the beginning or end.

YaLTeR · 2020-01-21T03:53:51Z

doc/LanguageClient.txt

+                 \ {"JavaStaticMemberFunction": ['storage.modifier.static.java', 'entity.name.function.java', '**']},
+                 \ {"JavaMemberVariable": ['meta.definition.variable.java', 'meta.class.body.java', 'meta.class.java', '**']},
+                 \ {"Function": ['entity.name.function.java', '**']},
+                 \ {"Function": ['*', 'entity.name.function.java', '**']},


Could these two lines be replaced with one {"Function": ['**', 'entity.name.function.java', '**']}?

YaLTeR · 2020-01-21T03:55:37Z

doc/LanguageClient.txt

+>
+     let g:LanguageClient_semanticHighlightMaps = {}
+     let g:LanguageClient_semanticHighlightMaps['java'] = [
+                 \ {"JavaStaticMemberFunction": ['storage.modifier.static.java', 'entity.name.function.java', '**']},


So this list is parsed in order top to bottom until a matching semantic scope is found? What about the first case, in which order is that parsed?

jackguo380 · 2020-01-21

I'll add a note about the order. It is from top to bottom.
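To make the top-to-bottom order concrete, here is a small Rust sketch of a first-match-wins lookup. The function first_match is hypothetical, and plain prefix matching stands in for the real pattern syntax; the only point taken from the thread is that rules are tried in order and the first hit decides the highlight group.

```rust
/// Return the highlight group of the first rule whose pattern is a prefix of
/// the scope list. Rules after the first match are never consulted.
/// (Hypothetical helper; prefix matching is a simplification.)
fn first_match<'a>(rules: &[(&[&str], &'a str)], scopes: &[&str]) -> Option<&'a str> {
    rules
        .iter()
        .find(|(pattern, _)| scopes.starts_with(pattern))
        .map(|(_, group)| *group)
}
```

With a specific rule such as ['meta.definition.variable.java', 'meta.class.body.java'] listed before a general one such as ['meta.definition.variable.java'], a member variable gets the specific group; swapping the two rules makes the general one win, so more specific patterns belong nearer the top.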

autozimu · 2020-01-21T04:21:18Z

src/language_server_protocol.rs

+                    mapping_pairs
+                        .extend(mapping.into_iter().map(|(hl_group, val)| (val, hl_group)));
+                }
+                Value::Array(values) => {


Let's start with supporting one style first and begin from there.

With either one style or two, serde would be able to deserialize the JSON string into the variable without manual unwrapping and mapping. The added benefit is that serde would be able to tell where and why deserialization failed if there is an error.

autozimu · 2020-01-21T04:42:49Z

src/language_server_protocol.rs

+        Ok(())
+    }
+
+    fn buildSemanticHighlightMatchers(


This can probably be extracted to a utility function.

autozimu · 2020-01-21T04:46:54Z

src/language_server_protocol.rs

            InitializeParams {
+                client_info: Some(ClientInfo {
+                    name: "LanguageClient_neovim".into(),


Let's use "LanguageClient-neovim" instead.

autozimu · 2020-01-21T04:49:31Z

src/language_server_protocol.rs

@@ -1577,7 +1803,11 @@ impl LanguageClient {
                    tab_size,
                    insert_spaces,
                    properties: HashMap::new(),
+                    trim_trailing_whitespace: None,
+                    insert_final_newline: None,
+                    trim_final_newlines: None,


Let's use default to fill the properties we don't care about.
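One way to read this suggestion is Rust's struct update syntax. The sketch below uses a hypothetical stand-in struct with the field names from the diff; it assumes the real lsp-types struct implements Default, which is not confirmed in this thread.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the formatting options struct in the diff; the
/// real type comes from the lsp-types crate and may differ.
#[derive(Debug, Default, PartialEq)]
struct FormattingOptions {
    tab_size: u64,
    insert_spaces: bool,
    properties: HashMap<String, String>,
    trim_trailing_whitespace: Option<bool>,
    insert_final_newline: Option<bool>,
    trim_final_newlines: Option<bool>,
}

fn formatting_options(tab_size: u64, insert_spaces: bool) -> FormattingOptions {
    FormattingOptions {
        tab_size,
        insert_spaces,
        // Every field not named above takes its Default value
        // (an empty map and None here).
        ..Default::default()
    }
}
```

The benefit is that fields added by a later lsp-types upgrade would not need to be listed by hand, provided they have sensible defaults.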

autozimu · 2020-01-21T04:54:54Z

src/language_server_protocol.rs

+                            use std::cmp::Ordering;
+
+                            match existing_hl.line.cmp(&new_hl.line) {
+                                Ordering::Less => {


Instead of comparing each highlight group, can we just delete all highlights on screen and add all newer visible highlights, similar to how diagnostics are highlighted? In my opinion, the implementation would be much simpler.

jackguo380 · 2020-01-21

The main issue is that the protocol supports incremental highlighting. If that case is not handled, only the changed highlighting may end up applied.

Currently eclipse.jdt.ls sends a single line of highlighting if the line itself was changed; if a line was added/removed, the entire document's worth of highlighting is sent.

From this comment (prabirshrestha/vim-lsp#633 (comment)) it looks like clangd 10 has similar behavior, except it can also send half the document's worth of highlighting when lines are added/removed.

I'll continue looking at simplifying this logic a bit. Might try a region based approach rather than the current line by line one.
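For readers unfamiliar with the incremental case, here is a simplified Rust sketch of merging a partial update into stored per-line state. It is not the algorithm in this PR: Token, LineUpdate, and apply_update are hypothetical names, tokens are assumed to be already decoded, and a line sent without tokens is assumed to mean that the line has no highlights.

```rust
use std::collections::BTreeMap;

/// Hypothetical decoded token: start column, length, and an index into the
/// server's scope table.
#[derive(Debug, Clone, PartialEq)]
struct Token {
    character: u32,
    length: u32,
    scope: u32,
}

/// One entry of a highlighting notification (assumed shape).
struct LineUpdate {
    line: u32,
    tokens: Option<Vec<Token>>,
}

/// Merge an update into the stored state. Lines not mentioned in the update
/// keep their existing highlights, which is why clearing everything first
/// would lose highlighting when the server only sends changed lines.
fn apply_update(state: &mut BTreeMap<u32, Vec<Token>>, update: Vec<LineUpdate>) {
    for LineUpdate { line, tokens } in update {
        match tokens {
            Some(tokens) if !tokens.is_empty() => {
                // Replace whatever was stored for this line.
                state.insert(line, tokens);
            }
            _ => {
                // Assumption: missing or empty tokens clear the line.
                state.remove(&line);
            }
        }
    }
}
```

This sketch does not shift stored lines when lines are inserted or deleted; it relies on the server resending the affected range in that case, as described above for eclipse.jdt.ls.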

jackguo380 · 2020-02-05T10:07:34Z

@autozimu @YaLTeR Thanks for all the review comments. I just pushed a fairly major change to use regexes rather than '*' and '**'; please have a look.

In summary, the configuration of LanguageClient_semanticHighlightMaps now looks like this:

let g:LanguageClient_semanticHighlightMaps['java'] = {
        \ '^storage.modifier.static.java:entity.name.function.java': 'JavaStaticMemberFunction',
        \ '^meta.definition.variable.java:meta.class.body.java:meta.class.java': 'JavaMemberVariable',
        \ '^entity.name.function.java': 'Function',
        \ '^[^:]*entity.name.function.java': 'Function',
        \ '^[^:]*entity.name.type.class.java': 'Type',
        \ }

The keys of the map are vim regexes, which are matched against the semantic scopes concatenated together using the string LanguageClient_semanticScopeSeparator, which is : by default.

E.g.

['invalid.deprecated.java', 'meta.class.java', 'source.java']
# becomes
'invalid.deprecated.java:meta.class.java:source.java'
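The concatenation step can be sketched in Rust as follows. Both functions are hypothetical helpers, not the PR's code; separator_is_safe only illustrates the earlier concern that a separator might also occur inside a scope name.

```rust
/// Join a symbol's scopes into the single string that the regex keys are
/// matched against. ":" is the default separator stated above.
fn join_scopes(scopes: &[&str], separator: &str) -> String {
    scopes.join(separator)
}

/// Returns true when no scope contains the separator, so the joined string
/// can be read back unambiguously. (Hypothetical check.)
fn separator_is_safe(scopes: &[&str], separator: &str) -> bool {
    scopes.iter().all(|scope| !scope.contains(separator))
}
```

The regex matching itself is left out on purpose, since the comment states that the patterns are evaluated on the vim side.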

I figured this is the simplest to implement, and most vim users are competent enough in regex to use it easily. A minor implementation detail is that we have to evaluate the regexes in vim due to differences in syntax between vim's and rust's regexes.

autozimu · 2020-02-06T00:33:42Z

Awesome! Will take another look when I get a chance.

autozimu · 2020-02-14T06:26:58Z

Thank you very much for the contribution!

jackguo380 requested a review from autozimu January 16, 2020 21:57

jackguo380 mentioned this pull request Jan 17, 2020

Support for clangd jackguo380/vim-lsp-cxx-highlight#14

Closed

jackguo380 mentioned this pull request Jan 20, 2020

tokens in SemanticHighlightingInformation should be optional gluon-lang/lsp-types#133

Closed

autozimu reviewed Jan 21, 2020

YaLTeR reviewed Jan 21, 2020

autozimu reviewed Jan 21, 2020

jackguo380 force-pushed the semantic_highlight branch from 5df7f1f to 787ae93 Compare January 21, 2020 04:21

autozimu reviewed Jan 21, 2020

jackguo380 force-pushed the semantic_highlight branch 2 times, most recently from 9dc369d to dc627f2 Compare January 21, 2020 21:53

jackguo380 added 12 commits February 5, 2020 01:50

Upgrade lsp-types

ee149c8

Implement Semantic Highlighting

665a462

Fix missing abort in print_semantic_scopes

fdbd3f1

Fix formatting for semantic highlighting code

32a4df7

doc and vimscript fixes for code review

541f087

Make Java semantic highlight example more general

2c330de

Simplify parsing of semanticHighlightMaps

b6a67c2

Revise incremental update algorithm, formatting fixes

24e9ee4

Move buildSemanticHighlightMatchers to utils, add test

9dc0d75

Explain order of pattern matching in docs

1b0190c

Upgrade LSP types to 0.70.0, final fixes to incremental highlighting

721bfec

Implement regex based matching of semantic scopes

002a020

jackguo380 force-pushed the semantic_highlight branch from dc627f2 to 002a020 Compare February 5, 2020 09:50

autozimu merged commit 7c741d0 into autozimu:next Feb 14, 2020

Shatur mentioned this pull request Mar 22, 2020

Add preview for snippets hrsh7th/vim-compete#2

Open

Shatur mentioned this pull request Mar 24, 2020

Protocol extensions hrsh7th/vim-lamp#15

Open

purpleP mentioned this pull request Apr 20, 2020

Optionally allow extra client capabilities (semanticHighlight) neoclide/coc.nvim#1671

Closed

theimpostor mentioned this pull request Apr 21, 2020

LanguageClient-neovim + clangd support jackguo380/vim-lsp-cxx-highlight#23

Closed

martskins mentioned this pull request Jul 7, 2020

Feature Request: Semantic syntax highlighting #383

Closed
