-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Import the semantic highlighter from typescript-vscode-sh-plugin #39119
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable. The style of the code is kind of strange in places, but might be OK for now. The only change I'd make is to invert the if
so that the original classifier is the fallback default.
Thanks folks - this has been updated with all the feedback |
Re: naming, this is tricky - this isn't an LSP standard or anything, and it's vscode specific. Some options for name differentiation:
|
Another option for a name to differentiate:
|
We could assume that the format will make it into the LSP spec, and call the format "lsp"? |
I suppose let's just use the year. We can |
…nstead of numbers
Co-authored-by: Nathan Shively-Sanders <[email protected]>
Co-authored-by: Daniel Rosenwasser <[email protected]>
Co-authored-by: Daniel Rosenwasser <[email protected]>
OK, this is rebased, updated to use 2020 everywhere and should be good to go - @sheetalkamat any chance I could get another look over? |
|
||
function classifySymbol(symbol: Symbol, meaning: SemanticMeaning): TokenType | undefined { | ||
const flags = symbol.getFlags(); | ||
if (flags & SymbolFlags.Class) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need early return if
if ((flags & SymbolFlags.Classifiable) === SymbolFlags.None)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could, but I'd need to make a new SymbolFlag for it as there are things like functions and variable declarations in this classifier also, do you have an opinion either way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No but on that note. Is this more additive compared to our original classifier? If so why not use this all the time and filter out things in original format mode ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The results sent back are different, and conform to a WIP LSP spec. The current one is used by VS, so we basically need to support both. I think this test shows the differences in the results quite well:
var c = classification;
const c = classification("original");
verify.syntacticClassificationsAre(
c.comment(firstCommentText),
c.keyword("function"), c.identifier("myFunction"), c.punctuation("("), c.comment("/* x */"), c.parameterName("x"), c.punctuation(":"), c.keyword("any"), c.punctuation(")"), c.punctuation("{"),
c.keyword("var"), c.identifier("y"), c.operator("="), c.identifier("x"), c.operator("?"), c.identifier("x"), c.operator("++"), c.operator(":"), c.operator("++"), c.identifier("x"), c.punctuation(";"),
c.punctuation("}"),
c.comment("// end of file"));
c.comment("// end of file"));
const c2 = classification("2020");
verify.semanticClassificationsAre("2020",
c2.semanticToken("function.declaration", "myFunction"),
c2.semanticToken("parameter.declaration", "x"),
c2.semanticToken("variable.declaration.local", "y"),
c2.semanticToken("parameter", "x"),
c2.semanticToken("parameter", "x"),
c2.semanticToken("parameter", "x"),
);
The first format (original) includes everything form comments to braces, the second (LSP) is quite an explicit a subset of those results
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if thats the case isnt it better to classify things in one place as function. variable, comments etc and filter out and swithc things depending on format ?
Eg. returned "function.declaration" becomes identifier in original.
punctuation, comment is ignored in 2020 etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That not an unreasonable idea! However the current implementation is a scanner, but I need to use an AST to get resolved information for things which are outside the scope of the current file. They do end up with quite different semantics given that one works entirely via an AST because it's a semantic vs syntactic.
The subsequent PR from this orta@58e830c#diff-830691d11bb3d3f0f7ca90ef0ee364afR266 speeds up the processing but requires the AST ahead of time, and while it could be implemented on top of the current classifier scanner - there's quite a mis-match between what they're doing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Poke on ^ - is that reasonable?
@@ -5557,7 +5571,7 @@ declare namespace ts { | |||
} | |||
interface ClassifiedSpan { | |||
textSpan: TextSpan; | |||
classificationType: ClassificationTypeNames; | |||
classificationType: ClassificationTypeNames | number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is going to be breaking change.. Everyone who uses this type (even if they arent using the new classification) are going to have to handle this and that doesnt seem necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this have to be number and not string just like before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this has to be number then we need overloads that return the new type with number (should definitely use enum instead as what does number mean to API user ?) only if format is specified
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point, I can think I can make this not a breaking change with some overloads and a separate type. The number makes sense in context (one goal of the new system is to be much less chattier, so it sends something like 6701 instead of class.declaration
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a stab at this in orta#4 but haven't got it compiling yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely needs enum in that case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling it an enum
instead of a number
is tricky because it represents an encoded mix of both ts.classifier.v2020.TokenType
and ts.classifier.v2020.TokenModifier
. It's not a straight forward 1 to 1 map.
I could make a new type like type ClassifierTokenTypeSpan = number
which describes what it does?
WIP - don't provide a breaking change
Co-authored-by: Sheetal Nandi <[email protected]>
Alright, all feedback in both PRs have been applied, it's up to date with master and shouldn't be a breaking change for API consumers 👍🏻 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this seems fine. I have a small nitpick and one perf-related question in the comments.
src/services/classifier2020.ts
Outdated
return (isQualifiedName(node.parent) && node.parent.right === node) || (isPropertyAccessExpression(node.parent) && node.parent.name === node); | ||
} | ||
|
||
const tokenFromDeclarationMapping: { [name: string]: TokenType } = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a Map
(nee. ESMap
internally)? If it's an object and the property accesses on the use sites above are possible inline cache misses, it can result in deoptimizations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks 0 I've used a new Map
as I found a few of those in the codebase and I assume this is fine?
@@ -5568,6 +5584,10 @@ declare namespace ts { | |||
textSpan: TextSpan; | |||
classificationType: ClassificationTypeNames; | |||
} | |||
interface ClassifiedSpan2020 { | |||
textSpan: TextSpan; | |||
classificationType: number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should definitely be not number but enum.. How does API user know what number means which classification otherwise?
Part 1 of 2 - Re: #38435