-
Notifications
You must be signed in to change notification settings - Fork 408
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Further semantic tokens improvements #1641
Further semantic tokens improvements #1641
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This PR fixes a lot of bugs.
Things look good:
- support for module descriptor
- fix for tags in comments like
@link
- cover edge case for modifer
typeArgument
- refactoring of the visitor, which looks easier to understand.
Things I don't like:
- new token types for different keywords.
@@ -45,7 +45,8 @@ | |||
GENERIC("generic"), | |||
TYPE_ARGUMENT("typeArgument"), | |||
DECLARATION("declaration"), | |||
IMPORT_DECLARATION("importDeclaration"); | |||
IMPORT_DECLARATION("importDeclaration"), | |||
DOCUMENTATION("documentation"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
documentation
is one of the standard modifiers, which should be placed above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So is deprecated
and declaration
. I didn't group the tokens based on whether they are standard in LSP, just Java modifiers (as in, keyword-based) vs "semantic" modifiers. Should I re-group them based on whether they are standard LSP tokens or custom jdt.ls tokens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go ahead to re-group them. It's more clear, and later we might directly reference standard modifiers from LSP4J, and leave those Java specific ones here.
// For example, "module" typically has the same meaning as "namespace" | ||
// in other languages (Rust, for example). But in Java, modules are a | ||
// completely separate concept, packages being the "namespace" tokens. | ||
MODULE("javaModule"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"module" is fine with me, as this is a Java language server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking for a while, or we can directly re-use namespace
for modules. I don't see any case in Java to distinguish packages and modules, as they don't occur in the same file... both are some kind of namespace
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"module" is fine with me, as this is a Java language server.
Well, the problem is that the token types from the jdt.ls will coexist with token types from other language servers. As a theme author, you won't necessarily know what language you're styling the tokens for (that's why LSP defines "standard", generic token types). So someone who doesn't know about Java modules might see that the token type module
is available, and style it based on the expectation that it has the same meaning as namespace
. By prefixing it with java
, it becomes clear that this is a Java-specific concept, and is semantically different from, let's say, modules in Rust. I don't think it matters that much in this particular case, but out of principle I made the decision to prefix it. Here's a quote from aeschli about this from an earlier discussion:
If a custom token type or modifier is very language specific, it makes sense to prefix it that way:
javaMember
(extends member). But I don't thinkjavaMember
is a good example of a very 'java' specific concept.
This is also how the C# extension handles many language-specific token types, see their semanticTokenTypes
contribution. Additionally, they even declare a module
token type, which could potentially cause confusion with Java modules if they were not named javaModule
. Although since the token type has no description I'm not entirely sure what it's used for anyway, especially since I'm no C# expert.
After thinking for a while, or we can directly re-use
namespace
for modules. I don't see any case in Java to distinguish packages and modules, as they don't occur in the same file... both are some kind ofnamespace
.
They can appear in the same file, since there are directives in module-info which refer to both module names and package names. Just see the test case for modules and you'll see what I mean. Personally I might style packages and modules the same, but they are semantically different and I don't like the idea of taking away customization abilities from theme authors without good reason.
If you want me to rename javaModule
to module
or just namespace
I'm personally fine with that, but I think you should consider the above first.
// in other languages (Rust, for example). But in Java, modules are a | ||
// completely separate concept, packages being the "namespace" tokens. | ||
MODULE("javaModule"), | ||
TYPE_LITERAL_KEYWORD("javaClassLiteralKeyword"), // genericName taken from: https://docs.oracle.com/javase/specs/jls/se11/html/jls-15.html#jls-15.8.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, it's "class" part of "String.class". To highlight a "variable"-like "keyword", here:
- adds a new token type
- to avoid regression, clients are required to explicitly declare new types.
It makes not too much sense to me. Suggest to reuse "keyword" token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you've understood that correctly.
to avoid regression, clients are required to explicitly declare new types.
It makes not too much sense to me. Suggest to reuse "keyword" token.
It's not a regression because the class
postfix keyword (which technically isn't really a keyword, so if you have a better name please suggest one) was not previously part of semantic highlighting. In fact it wasn't even part of the VS Code Java TextMate grammar, which is the main reason I wanted to add it via semantic highlighting. The only reason I scope-map it in redhat-developer/vscode-java#1760 is so that existing themes will see it as a keyword, which does actually change the colorization, but for the better in my opinion (previously it was treated as a field/property, from the TextMate grammar, but that doesn't make sense to me).
Using the keyword
token is fine, but as I've said previously I don't know if I like the idea of making token types generalized when they can be specific. As far as the syntax highlighting in VS Code goes, I've already supertyped javaClassLiteralKeyword
to keyword
in redhat-developer/vscode-java#1760.
In some cases I do agree that being overly specific about token types is unnecessary, like differentiating between class, interface, enum keyword etc. But the class literal keyword is very different from, for example, the type declaration keywords or modifier keywords, because it's a postfix, which makes it seem like it's almost a field.
// completely separate concept, packages being the "namespace" tokens. | ||
MODULE("javaModule"), | ||
TYPE_LITERAL_KEYWORD("javaClassLiteralKeyword"), // genericName taken from: https://docs.oracle.com/javase/specs/jls/se11/html/jls-15.html#jls-15.8.2 | ||
JAVADOC_TAG("javadocBlockTag"), // genericName taken from: https://docs.oracle.com/en/java/javase/11/docs/specs/doc-comment-spec.html#block-tags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest to reuse "keyword" token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this takes away the capability for theme authors to style block tags differently from inline tags, which I don't really like (and in this case I would personally like to style inline tags differently).
MODULE("javaModule"), | ||
TYPE_LITERAL_KEYWORD("javaClassLiteralKeyword"), // genericName taken from: https://docs.oracle.com/javase/specs/jls/se11/html/jls-15.html#jls-15.8.2 | ||
JAVADOC_TAG("javadocBlockTag"), // genericName taken from: https://docs.oracle.com/en/java/javase/11/docs/specs/doc-comment-spec.html#block-tags | ||
JAVADOC_NESTED_TAG("javadocInlineTag"); // genericName taken from: https://docs.oracle.com/en/java/javase/11/docs/specs/doc-comment-spec.html#inline-tags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest to reuse "keyword" token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Takes away capabilities from theme authors, see my other comments.
I agree that on theme author's perspective, it's always good that LS expose more details (e.g granularity of token types, a generic The question is how "specific" we are going to be. Taking keywords for example, we could invent specific token types for every keyword, e.g. "newKeyword", "forKeyword", "whileKeyword"...etc, then users could use whatever color customization they want for each keyword if they want. We need balance between generic and specific, and I do not see common requirements to add above types for the moment. I do want to get this PR merged for those fixes and improvements, and for simplicity I suggest that:
|
Signed-off-by: 0dinD <[email protected]>
a9f0cba
to
617a7ce
Compare
I think "newKeyword", "forKeyword", "whileKeyword" etc. is a bit too specific, but my point about the class literal "keyword" is that it's different from other keywords (being a postfix). But I agree that we should take all of that in a separate discussion, so I've committed a change to revert the new tokens to the more generic LSP tokens, losing some information. |
Fixes redhat-developer/vscode-java#1753.
Fixes part of redhat-developer/vscode-java#1384.
New token types
javaModule
namespace
(for module names inside module-info.java)javaClassLiteralKeyword
keyword
(Example:class
inString.class
)javadocBlockTag
keyword
(@author
,@param
,@return
etc.)javadocInlineTag
keyword
(@code
,@link
,@value
etc.)Edit: the new token types were changed to more generic token types from LSP, see discussion with Eskibear.
New token modifiers
documentation
(Applied to all tokens inside of Javadoc comments)@param
,@link
etc. in a Javadoc comment.I refactored the semantic tokens visitor a little bit by adding more helper methods, which clean up the visitation logic and prevent code duplication. This also helps make the visitation logic easier to read and write, which has allowed me to spot and fix a few minor bugs in the previous logic. Notably, type arguments of method and constructor invocations weren't taken care of, so they were missing the
typeArgument
modifier.A note about the tests: I mostly just added test cases for the new token types and modifiers, as well as the bugs I fixed. But in order to add a
module-info.java
to the test project, I had to make some changes. Mainly, you'll notice I converted the project from a simple Eclipse project to a Maven project. This was done in order to prevent the extension from freaking out about conflicting packages when there's an Eclipse project in the test projects folder that has amodule-info.java
on it's classpath. Looking at the only other test project with amodule-info.java
, I found out that by only having apom.xml
, and not a.classpath
, the errors no longer appear. So in order to attach the test library for the semantic tokens project without creating a.classpath
, I added a method inSemanticTokensCommandTest
.More details
This is what happened when adding the
module-info.java
:There are thousands of problems like this:
And lots of errors like these in the log:
If anyone has a better solution for working around what I assume might be a bug with this extension (or an upstream library), please tell me about it. Note that a similar issue will happen if you open a
module-info.java
not inside an Eclipse project, but only for the currently opened files in the editor. It will also disappear after restarting VS Code, so it's not as much of an issue as the other one. It can be reproduced with thismodule-info.java
as well.The last change I made to the tests was this one, which was necessary because I had to change the compiler compliance level to 11 now that the test project is modular. The change from
class
tonamespace
is because of the fact that in Java 9+ (due to the module system), the AST won't contain a recovery type binding forNonExistentClass
, since thejava
package isn't accessible. Not sure if we should work around this in some way, or even what the expected result should be in this case.If anyone's interested in why
ModuleDeclaration#getJavadoc()
always returns null (which is why I needed this workaround), it seems to be unimplemented in the ASTConverter of JDT Core. If support for resolving the Javadoc of a module declaration is added in the future, we can remove the workaround.