Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import the semantic highlighter from typescript-vscode-sh-plugin #39119

Merged
merged 21 commits into from
Sep 11, 2020

Conversation

orta
Copy link
Contributor

@orta orta commented Jun 17, 2020

Part 1 of 2 - Re: #38435

  • Migrates the codebase over, almost entirely free of semantic changes
  • Adds all their test
  • Ensures all our older semantic tests also run on the new output

Copy link
Member

@sandersn sandersn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable. The style of the code is kind of strange in places, but might be OK for now. The only change I'd make is to invert the if so that the original classifier is the fallback default.

src/services/classifierVscode.ts Outdated Show resolved Hide resolved
src/services/services.ts Outdated Show resolved Hide resolved
src/harness/fourslashImpl.ts Outdated Show resolved Hide resolved
src/harness/fourslashImpl.ts Outdated Show resolved Hide resolved
src/compiler/types.ts Outdated Show resolved Hide resolved
tests/baselines/reference/api/tsserverlibrary.d.ts Outdated Show resolved Hide resolved
tests/baselines/reference/api/tsserverlibrary.d.ts Outdated Show resolved Hide resolved
src/services/services.ts Outdated Show resolved Hide resolved
orta added a commit to orta/TypeScript that referenced this pull request Jun 22, 2020
@orta
Copy link
Contributor Author

orta commented Jun 22, 2020

Thanks folks - this has been updated with all the feedback

@orta orta requested a review from sheetalkamat June 22, 2020 22:18
src/services/classifier2020.ts Outdated Show resolved Hide resolved
src/services/classifier2020.ts Outdated Show resolved Hide resolved
src/services/classifier2020.ts Outdated Show resolved Hide resolved
src/services/classifier2020.ts Outdated Show resolved Hide resolved
src/services/classifier2020.ts Outdated Show resolved Hide resolved
src/services/classifier2020.ts Outdated Show resolved Hide resolved
src/services/classifier2020.ts Outdated Show resolved Hide resolved
@orta
Copy link
Contributor Author

orta commented Jun 23, 2020

Re: naming, this is tricky - this isn't an LSP standard or anything, and it's vscode specific.

Some options for name differentiation:

  • tokenAndModifiers
  • semanticTokens

@orta
Copy link
Contributor Author

orta commented Jun 25, 2020

Another option for a name to differentiate:

  • refined (there are less options in the new one)

orta added a commit to orta/TypeScript that referenced this pull request Jun 29, 2020
orta added a commit to orta/TypeScript that referenced this pull request Jun 29, 2020
@orta
Copy link
Contributor Author

orta commented Jul 1, 2020

We could assume that the format will make it into the LSP spec, and call the format "lsp"?

@DanielRosenwasser
Copy link
Member

I suppose let's just use the year. We can @deprecate it if we come up with a better name.

orta added a commit to orta/TypeScript that referenced this pull request Jul 6, 2020
@orta
Copy link
Contributor Author

orta commented Jul 7, 2020

OK, this is rebased, updated to use 2020 everywhere and should be good to go - @sheetalkamat any chance I could get another look over?

src/services/classifier2020.ts Show resolved Hide resolved

function classifySymbol(symbol: Symbol, meaning: SemanticMeaning): TokenType | undefined {
const flags = symbol.getFlags();
if (flags & SymbolFlags.Class) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need early return if

 if ((flags & SymbolFlags.Classifiable) === SymbolFlags.None) 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could, but I'd need to make a new SymbolFlag for it as there are things like functions and variable declarations in this classifier also, do you have an opinion either way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No but on that note. Is this more additive compared to our original classifier? If so why not use this all the time and filter out things in original format mode ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The results sent back are different, and conform to a WIP LSP spec. The current one is used by VS, so we basically need to support both. I think this test shows the differences in the results quite well:

var c = classification;
const c = classification("original");
verify.syntacticClassificationsAre(
     c.comment(firstCommentText),
     c.keyword("function"), c.identifier("myFunction"), c.punctuation("("), c.comment("/* x */"), c.parameterName("x"), c.punctuation(":"), c.keyword("any"), c.punctuation(")"), c.punctuation("{"),
     c.keyword("var"), c.identifier("y"), c.operator("="), c.identifier("x"), c.operator("?"), c.identifier("x"), c.operator("++"), c.operator(":"), c.operator("++"), c.identifier("x"), c.punctuation(";"),
     c.punctuation("}"),
     c.comment("// end of file")); 
     c.comment("// end of file"));

const c2 = classification("2020");
 verify.semanticClassificationsAre("2020",
     c2.semanticToken("function.declaration", "myFunction"), 
     c2.semanticToken("parameter.declaration", "x"), 
     c2.semanticToken("variable.declaration.local", "y"), 
     c2.semanticToken("parameter", "x"), 
     c2.semanticToken("parameter", "x"), 
     c2.semanticToken("parameter", "x"), 
 );

The first format (original) includes everything form comments to braces, the second (LSP) is quite an explicit a subset of those results

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if thats the case isnt it better to classify things in one place as function. variable, comments etc and filter out and swithc things depending on format ?

Eg. returned "function.declaration" becomes identifier in original.
punctuation, comment is ignored in 2020 etc

Copy link
Contributor Author

@orta orta Jul 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That not an unreasonable idea! However the current implementation is a scanner, but I need to use an AST to get resolved information for things which are outside the scope of the current file. They do end up with quite different semantics given that one works entirely via an AST because it's a semantic vs syntactic.

The subsequent PR from this orta@58e830c#diff-830691d11bb3d3f0f7ca90ef0ee364afR266 speeds up the processing but requires the AST ahead of time, and while it could be implemented on top of the current classifier scanner - there's quite a mis-match between what they're doing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poke on ^ - is that reasonable?

src/services/classifier2020.ts Outdated Show resolved Hide resolved
src/services/classifier2020.ts Outdated Show resolved Hide resolved
@@ -5557,7 +5571,7 @@ declare namespace ts {
}
interface ClassifiedSpan {
textSpan: TextSpan;
classificationType: ClassificationTypeNames;
classificationType: ClassificationTypeNames | number;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is going to be breaking change.. Everyone who uses this type (even if they arent using the new classification) are going to have to handle this and that doesnt seem necessary

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this have to be number and not string just like before?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this has to be number then we need overloads that return the new type with number (should definitely use enum instead as what does number mean to API user ?) only if format is specified

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point, I can think I can make this not a breaking change with some overloads and a separate type. The number makes sense in context (one goal of the new system is to be much less chattier, so it sends something like 6701 instead of class.declaration)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a stab at this in orta#4 but haven't got it compiling yet

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely needs enum in that case

Copy link
Contributor Author

@orta orta Sep 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling it an enum instead of a number is tricky because it represents an encoded mix of both ts.classifier.v2020.TokenType and ts.classifier.v2020.TokenModifier. It's not a straight forward 1 to 1 map.

I could make a new type like type ClassifierTokenTypeSpan = number which describes what it does?

src/services/classifier2020.ts Outdated Show resolved Hide resolved
src/services/types.ts Outdated Show resolved Hide resolved
@typescript-bot typescript-bot added the For Uncommitted Bug PR for untriaged, rejected, closed or missing bug label Sep 4, 2020
@orta
Copy link
Contributor Author

orta commented Sep 9, 2020

Alright, all feedback in both PRs have been applied, it's up to date with master and shouldn't be a breaking change for API consumers 👍🏻

Copy link
Member

@rbuckton rbuckton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this seems fine. I have a small nitpick and one perf-related question in the comments.

src/services/classifier2020.ts Outdated Show resolved Hide resolved
return (isQualifiedName(node.parent) && node.parent.right === node) || (isPropertyAccessExpression(node.parent) && node.parent.name === node);
}

const tokenFromDeclarationMapping: { [name: string]: TokenType } = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a Map (nee. ESMap internally)? If it's an object and the property accesses on the use sites above are possible inline cache misses, it can result in deoptimizations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 0 I've used a new Map as I found a few of those in the codebase and I assume this is fine?

@orta orta merged commit db5368d into microsoft:master Sep 11, 2020
@@ -5568,6 +5584,10 @@ declare namespace ts {
textSpan: TextSpan;
classificationType: ClassificationTypeNames;
}
interface ClassifiedSpan2020 {
textSpan: TextSpan;
classificationType: number;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should definitely be not number but enum.. How does API user know what number means which classification otherwise?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Author: Team For Uncommitted Bug PR for untriaged, rejected, closed or missing bug
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

6 participants