Further semantic tokens improvements #1641

0dinD · 2021-01-10T19:33:09Z

Fixes redhat-developer/vscode-java#1753.
Fixes part of redhat-developer/vscode-java#1384.

New token types

~~javaModule~~ namespace (for module names inside module-info.java)
~~javaClassLiteralKeyword~~ keyword (Example: class in String.class)
- This is especially helpful in vscode-java, since the TextMate grammar doesn't assign a scope for this.
~~javadocBlockTag~~ keyword (@author, @param, @return etc.)
~~javadocInlineTag~~ keyword (@code, @link, @value etc.)

Edit: the new token types were changed to more generic token types from LSP, see discussion with Eskibear.

New token modifiers

documentation (Applied to all tokens inside of Javadoc comments)
- Apart from the new Javadoc tag tokens, tokens for parameters, types, methods etc. are now provided when referenced by @param, @link etc. in a Javadoc comment.

I refactored the semantic tokens visitor a little bit by adding more helper methods, which clean up the visitation logic and prevent code duplication. This also helps make the visitation logic easier to read and write, which has allowed me to spot and fix a few minor bugs in the previous logic. Notably, type arguments of method and constructor invocations weren't taken care of, so they were missing the typeArgument modifier.

A note about the tests: I mostly just added test cases for the new token types and modifiers, as well as the bugs I fixed. But in order to add a module-info.java to the test project, I had to make some changes. Mainly, you'll notice I converted the project from a simple Eclipse project to a Maven project. This was done in order to prevent the extension from freaking out about conflicting packages when there's an Eclipse project in the test projects folder that has a module-info.java on it's classpath. Looking at the only other test project with a module-info.java, I found out that by only having a pom.xml, and not a .classpath, the errors no longer appear. So in order to attach the test library for the semantic tokens project without creating a .classpath, I added a method in SemanticTokensCommandTest.

More details

This is what happened when adding the module-info.java:

There are thousands of problems like this:

And lots of errors like these in the log:

!ENTRY org.eclipse.jdt.core 4 4 2021-01-08 15:35:58.592
!MESSAGE Failed to find package fragment root for projects/maven/semantic-tokens/src/main/java (not open) [in org.eclipse.jdt.ls.tests]
!STACK 1
Java Model Exception: Java Model Status [projects/maven/semantic-tokens/src/main/java [in org.eclipse.jdt.ls.tests] does not exist]
  at org.eclipse.jdt.internal.core.JavaElement.newNotPresentException(JavaElement.java:573)
  at org.eclipse.jdt.internal.core.PackageFragmentRoot.getUnderlyingResource(PackageFragmentRoot.java:761)
  at org.eclipse.jdt.internal.core.SearchableEnvironment.findModuleContext(SearchableEnvironment.java:1047)
  at org.eclipse.jdt.internal.core.SearchableEnvironment.findType(SearchableEnvironment.java:480)
  at org.eclipse.jdt.internal.core.CancelableNameEnvironment.findType(CancelableNameEnvironment.java:63)
  at org.eclipse.jdt.internal.compiler.lookup.LookupEnvironment.createPlainPackage(LookupEnvironment.java:1135)
  at org.eclipse.jdt.internal.compiler.lookup.CompilationUnitScope.buildTypeBindings(CompilationUnitScope.java:135)
  at org.eclipse.jdt.internal.compiler.lookup.LookupEnvironment.buildTypeBindings(LookupEnvironment.java:475)
  at org.eclipse.jdt.internal.compiler.Compiler.internalBeginToCompile(Compiler.java:855)
  at org.eclipse.jdt.internal.compiler.Compiler.beginToCompile(Compiler.java:394)
  at org.eclipse.jdt.core.dom.CompilationUnitResolver.resolve(CompilationUnitResolver.java:1170)
  at org.eclipse.jdt.core.dom.CompilationUnitResolver.resolve(CompilationUnitResolver.java:712)
  at org.eclipse.jdt.core.dom.ASTParser.internalCreateAST(ASTParser.java:1205)
  at org.eclipse.jdt.core.dom.ASTParser.createAST(ASTParser.java:820)
  at org.eclipse.jdt.core.manipulation.CoreASTProvider$1.run(CoreASTProvider.java:272)
  at org.eclipse.core.runtime.SafeRunner.run(SafeRunner.java:45)
  at org.eclipse.jdt.core.manipulation.CoreASTProvider.createAST(CoreASTProvider.java:264)
  at org.eclipse.jdt.core.manipulation.CoreASTProvider.getAST(CoreASTProvider.java:197)
  at org.eclipse.jdt.ls.core.internal.handlers.CodeActionHandler.getASTRoot(CodeActionHandler.java:246)
  at org.eclipse.jdt.ls.core.internal.handlers.CodeActionHandler.getCodeActionCommands(CodeActionHandler.java:88)
  at org.eclipse.jdt.ls.core.internal.handlers.JDTLanguageServer.lambda$14(JDTLanguageServer.java:640)
  at org.eclipse.jdt.ls.core.internal.BaseJDTLanguageServer.lambda$0(BaseJDTLanguageServer.java:75)
  at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642)
  at java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479)
  at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
  at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
  at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
  at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
  at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)

If anyone has a better solution for working around what I assume might be a bug with this extension (or an upstream library), please tell me about it. Note that a similar issue will happen if you open a module-info.java not inside an Eclipse project, but only for the currently opened files in the editor. It will also disappear after restarting VS Code, so it's not as much of an issue as the other one. It can be reproduced with this module-info.java as well.

The last change I made to the tests was this one, which was necessary because I had to change the compiler compliance level to 11 now that the test project is modular. The change from class to namespace is because of the fact that in Java 9+ (due to the module system), the AST won't contain a recovery type binding for NonExistentClass, since the java package isn't accessible. Not sure if we should work around this in some way, or even what the expected result should be in this case.

If anyone's interested in why ModuleDeclaration#getJavadoc() always returns null (which is why I needed this workaround), it seems to be unimplemented in the ASTConverter of JDT Core. If support for resolving the Javadoc of a module declaration is added in the future, we can remove the workaround.

Eskibear

Thanks! This PR fixes a lot of bugs.

Things look good:

support for module descriptor
fix for tags in comments like @link
cover edge case for modifer typeArgument
refactoring of the visitor, which looks easier to understand.

Things I don't like:

new token types for different keywords.

Eskibear · 2021-01-20T04:31:07Z

org.eclipse.jdt.ls.core/src/org/eclipse/jdt/ls/core/internal/semantictokens/TokenModifier.java

@@ -45,7 +45,8 @@
 	GENERIC("generic"),
 	TYPE_ARGUMENT("typeArgument"),
 	DECLARATION("declaration"),
-	IMPORT_DECLARATION("importDeclaration");
+	IMPORT_DECLARATION("importDeclaration"),
+	DOCUMENTATION("documentation");


documentation is one of the standard modifiers, which should be placed above.

So is deprecated and declaration. I didn't group the tokens based on whether they are standard in LSP, just Java modifiers (as in, keyword-based) vs "semantic" modifiers. Should I re-group them based on whether they are standard LSP tokens or custom jdt.ls tokens?

Go ahead to re-group them. It's more clear, and later we might directly reference standard modifiers from LSP4J, and leave those Java specific ones here.

Eskibear · 2021-01-20T05:08:13Z

org.eclipse.jdt.ls.core/src/org/eclipse/jdt/ls/core/internal/semantictokens/TokenType.java

+	// For example, "module" typically has the same meaning as "namespace"
+	// in other languages (Rust, for example). But in Java, modules are a
+	// completely separate concept, packages being the "namespace" tokens.
+	MODULE("javaModule"),


"module" is fine with me, as this is a Java language server.

After thinking for a while, or we can directly re-use namespace for modules. I don't see any case in Java to distinguish packages and modules, as they don't occur in the same file... both are some kind of namespace.

"module" is fine with me, as this is a Java language server.

Well, the problem is that the token types from the jdt.ls will coexist with token types from other language servers. As a theme author, you won't necessarily know what language you're styling the tokens for (that's why LSP defines "standard", generic token types). So someone who doesn't know about Java modules might see that the token type module is available, and style it based on the expectation that it has the same meaning as namespace. By prefixing it with java, it becomes clear that this is a Java-specific concept, and is semantically different from, let's say, modules in Rust. I don't think it matters that much in this particular case, but out of principle I made the decision to prefix it. Here's a quote from aeschli about this from an earlier discussion:

If a custom token type or modifier is very language specific, it makes sense to prefix it that way: javaMember (extends member). But I don't think javaMember is a good example of a very 'java' specific concept.

This is also how the C# extension handles many language-specific token types, see their semanticTokenTypes contribution. Additionally, they even declare a module token type, which could potentially cause confusion with Java modules if they were not named javaModule. Although since the token type has no description I'm not entirely sure what it's used for anyway, especially since I'm no C# expert.

After thinking for a while, or we can directly re-use namespace for modules. I don't see any case in Java to distinguish packages and modules, as they don't occur in the same file... both are some kind of namespace.

They can appear in the same file, since there are directives in module-info which refer to both module names and package names. Just see the test case for modules and you'll see what I mean. Personally I might style packages and modules the same, but they are semantically different and I don't like the idea of taking away customization abilities from theme authors without good reason.

If you want me to rename javaModule to module or just namespace I'm personally fine with that, but I think you should consider the above first.

Eskibear · 2021-01-20T05:23:14Z

org.eclipse.jdt.ls.core/src/org/eclipse/jdt/ls/core/internal/semantictokens/TokenType.java

+	// in other languages (Rust, for example). But in Java, modules are a
+	// completely separate concept, packages being the "namespace" tokens.
+	MODULE("javaModule"),
+	TYPE_LITERAL_KEYWORD("javaClassLiteralKeyword"), // genericName taken from: https://docs.oracle.com/javase/specs/jls/se11/html/jls-15.html#jls-15.8.2


If I understand correctly, it's "class" part of "String.class". To highlight a "variable"-like "keyword", here:

adds a new token type

to avoid regression, clients are required to explicitly declare new types.
It makes not too much sense to me. Suggest to reuse "keyword" token.

Yes, you've understood that correctly.

to avoid regression, clients are required to explicitly declare new types.
It makes not too much sense to me. Suggest to reuse "keyword" token.

It's not a regression because the class postfix keyword (which technically isn't really a keyword, so if you have a better name please suggest one) was not previously part of semantic highlighting. In fact it wasn't even part of the VS Code Java TextMate grammar, which is the main reason I wanted to add it via semantic highlighting. The only reason I scope-map it in redhat-developer/vscode-java#1760 is so that existing themes will see it as a keyword, which does actually change the colorization, but for the better in my opinion (previously it was treated as a field/property, from the TextMate grammar, but that doesn't make sense to me).

Using the keyword token is fine, but as I've said previously I don't know if I like the idea of making token types generalized when they can be specific. As far as the syntax highlighting in VS Code goes, I've already supertyped javaClassLiteralKeyword to keyword in redhat-developer/vscode-java#1760.

In some cases I do agree that being overly specific about token types is unnecessary, like differentiating between class, interface, enum keyword etc. But the class literal keyword is very different from, for example, the type declaration keywords or modifier keywords, because it's a postfix, which makes it seem like it's almost a field.

Eskibear · 2021-01-20T05:25:23Z

org.eclipse.jdt.ls.core/src/org/eclipse/jdt/ls/core/internal/semantictokens/TokenType.java

+	// completely separate concept, packages being the "namespace" tokens.
+	MODULE("javaModule"),
+	TYPE_LITERAL_KEYWORD("javaClassLiteralKeyword"), // genericName taken from: https://docs.oracle.com/javase/specs/jls/se11/html/jls-15.html#jls-15.8.2
+	JAVADOC_TAG("javadocBlockTag"), // genericName taken from: https://docs.oracle.com/en/java/javase/11/docs/specs/doc-comment-spec.html#block-tags


Suggest to reuse "keyword" token.

Again, this takes away the capability for theme authors to style block tags differently from inline tags, which I don't really like (and in this case I would personally like to style inline tags differently).

Eskibear · 2021-01-20T05:26:56Z

org.eclipse.jdt.ls.core/src/org/eclipse/jdt/ls/core/internal/semantictokens/TokenType.java

+	MODULE("javaModule"),
+	TYPE_LITERAL_KEYWORD("javaClassLiteralKeyword"), // genericName taken from: https://docs.oracle.com/javase/specs/jls/se11/html/jls-15.html#jls-15.8.2
+	JAVADOC_TAG("javadocBlockTag"), // genericName taken from: https://docs.oracle.com/en/java/javase/11/docs/specs/doc-comment-spec.html#block-tags
+	JAVADOC_NESTED_TAG("javadocInlineTag"); // genericName taken from: https://docs.oracle.com/en/java/javase/11/docs/specs/doc-comment-spec.html#inline-tags


Suggest to reuse "keyword" token.

Takes away capabilities from theme authors, see my other comments.

Eskibear · 2021-01-20T13:24:59Z

I agree that on theme author's perspective, it's always good that LS expose more details (e.g granularity of token types, a generic keyword or a bunch of different keyword token types), then you'll be able to provide more flexibility and customization.

The question is how "specific" we are going to be. Taking keywords for example, we could invent specific token types for every keyword, e.g. "newKeyword", "forKeyword", "whileKeyword"...etc, then users could use whatever color customization they want for each keyword if they want. We need balance between generic and specific, and I do not see common requirements to add above types for the moment.

I do want to get this PR merged for those fixes and improvements, and for simplicity I suggest that:

In this PR we don't invent new token types, just using namespace for modules, and keyword for the others.
If you do want to have those new specific types, let's open a issue for open discussion first.

Signed-off-by: 0dinD <[email protected]>

0dinD · 2021-01-20T14:50:07Z

The question is how "specific" we are going to be. Taking keywords for example, we could invent specific token types for every keyword, e.g. "newKeyword", "forKeyword", "whileKeyword"...etc, then users could use whatever color customization they want for each keyword if they want. We need balance between generic and specific, and I do not see common requirements to add above types for the moment.

I think "newKeyword", "forKeyword", "whileKeyword" etc. is a bit too specific, but my point about the class literal "keyword" is that it's different from other keywords (being a postfix).

But I agree that we should take all of that in a separate discussion, so I've committed a change to revert the new tokens to the more generic LSP tokens, losing some information.

This was referenced Jan 10, 2021

Module descriptor syntax coloring should be improved redhat-developer/vscode-java#1384

Closed

Declare semantic token modifiers redhat-developer/vscode-java#1760

Merged

testforstephen requested a review from Eskibear January 12, 2021 07:13

Eskibear requested changes Jan 20, 2021

View reviewed changes

Further semantic tokens improvements

617a7ce

Signed-off-by: 0dinD <[email protected]>

0dinD force-pushed the further-semantic-tokens-improvements branch from a9f0cba to 617a7ce Compare January 20, 2021 14:44

Eskibear approved these changes Jan 21, 2021

View reviewed changes

Eskibear merged commit d02206b into eclipse-jdtls:master Jan 25, 2021

0dinD mentioned this pull request Jan 26, 2021

@link highlighting broken when linking to class redhat-developer/vscode-java#1753

Closed

rgrunber added this to the Early February 2021 milestone Feb 10, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Further semantic tokens improvements #1641

Further semantic tokens improvements #1641

0dinD commented Jan 10, 2021 •

edited

Loading

Eskibear left a comment

Eskibear Jan 20, 2021

0dinD Jan 20, 2021

Eskibear Jan 20, 2021

Eskibear Jan 20, 2021

Eskibear Jan 20, 2021

0dinD Jan 20, 2021

Eskibear Jan 20, 2021

0dinD Jan 20, 2021 •

edited

Loading

Eskibear Jan 20, 2021

0dinD Jan 20, 2021

Eskibear Jan 20, 2021

0dinD Jan 20, 2021

Eskibear commented Jan 20, 2021

0dinD commented Jan 20, 2021

Further semantic tokens improvements #1641

Further semantic tokens improvements #1641

Conversation

0dinD commented Jan 10, 2021 • edited Loading

New token types

New token modifiers

Eskibear left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

0dinD Jan 20, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Eskibear commented Jan 20, 2021

0dinD commented Jan 20, 2021

0dinD commented Jan 10, 2021 •

edited

Loading

0dinD Jan 20, 2021 •

edited

Loading