Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Module conversion tracking issue / notes #49332

Closed
5 tasks
jakebailey opened this issue May 31, 2022 · 29 comments · Fixed by #51387
Closed
5 tasks

Module conversion tracking issue / notes #49332

jakebailey opened this issue May 31, 2022 · 29 comments · Fixed by #51387
Assignees
Labels
Discussion Issues which may not have code impact Fix Available A PR has been opened for this issue

Comments

@jakebailey
Copy link
Member

jakebailey commented May 31, 2022

I'm using this thread to dump my brain and to note problems as I solve them (or to discuss).

Issues of note:

2022-05-31

Changes since the previous effort

  • The "typeformer" (the code that does the transform) now lives at https://github.com/jakebailey/typeformer, and has been rewritten using ts-morph to preserve comments and formatting.
    • The old typeformer use TS's own transform and emitter, but the emitter is not careful enough to preform TS-to-TS transformations (I have a list of bugs I've discovered, which don't matter for JS emit)
  • My WIP changes are semi-automatically pushed to a stack of PRs on my fork, starting here.
    • It's "best" to look at the difference between the "indent" commit and the last commit, e.g. this diff (but note that this link may not be up to date, is is as of now).
    • Please don't comment on these PRs; the script I run to create this stack is not very smart and will force-push if I insert a new commit in the middle. I wish I could just use gerrit, but...
  • I have removed the esbuild step from the previous effort; not because we don't plan to use it, but because it's simplest to start by using raw modules and then tweaking the output later. If we're going to be modules, I'd hope we can be run through a bundler successfully.
  • The faked namespace modules (barrel modules that at runtime behave very similarly to the old namespaces) have been moved to a dedicated folder in each project, rather than being dumped at the top level.
  • Use webworker typings for webServer.ts #46944 has to be reverted; this code doesn't seem to compile properly when actually built.

typeformer bugs

  • The typeformer's inlineImports step converts globalThis.assert into assert, because it thinks it's a global function that should be settable. Technically true, but if we wanted that, we probably wouldn't have written globalThis explicitly. This is just a bug when I rewrote the step.
  • Sometimes #region comments are dropped. Likely a bug that can be worked around like other ts-morph oddities to do with comment preservation (which it is almost always good at, but sometimes not).
  • Sometimes comments get duplicated, e.g. // DestructuringAssignmentType in src/compiler/types.ts.
  • The output path is determined by the project name; this was changed in my iteration of the typeformer, but should be hand-tweaked back to the right path, e.g. the debug code (dbg.ts) that is dynamically loaded in, or loggedIO.

Unsolved problems

  • My version of the typeformer's "inlineImports" pass is very aggressive, importing everything from their original declaration if possible. This is closest to how we would write the code by hand (if we had done so from the start), but the circularity of the codebase has broken things.
    • e.g., debug.ts imports semver.ts, but the Semver class creates a static property zero, which calls the Semver constructor, which then contains Debug.assert calls.
    • This only happened because the top of debug.ts contains import { Semver } from './semver, whereas before, the code didn't do this but instead referred to the namespace, so never actually caused that file to be evaluated.
    • It's totally possible that we'd hit this circularity issue today, except that most code is not at the top level and avoids the problem entirely, or has worked "magically" thanks to the order we happen to merge namespaces in.
    • We could just import these from the namespaces (like Module conversion #46567 does), which I think would fix this, but this is really unnatural code structure and auto-imports will not function properly.
  • The System interface (aka ts.sys) is set (ts.sys = ...) in order to support the browser, but in the module version, modules do import { sys } from '...'), which makes it very difficult to do like this. What can do do here?
  • Additionally, the System interface has getExecutingFilePath (among other functions), which only makes sense for TS packaged as a self contained file.

TODOs

  • Figure out what to do about the API; the past effort talks about API extractor, which is the right thing to do, but we need to make sure that we produce an output that describes the API as namespaces, not as reexported modules. Re-export class from namespace #4529 makes this hard.
  • We need to create the entrypoints for the new exported API; right now it just outputs as outDir, but I don't think we intend to just allow everyone to import our internals directly. outFile made a file that did the right thing before, and after we just need to write something like tsc.js which exports * from _namespaces/ts.ts or something.
    • See also exportAsModule, which hacks together a module.exports = ... line rather than emitting it via export.
  • Deprecations were previously done via namespace merging (i.e. deprecating "Map" in favor of "ESMap"); these are very messy when not coming up with a single file.
  • registerRefactor uses side effects of file loading to register refactorings. This technically works OK after modification if the fake namespace modules still import these files, but it'd be better to refactor this to work like the transforms (which are just imported in other files instead).
  • Loads of type names are duplicated; this previously didn't matter because you'd just fully qualify names for another namespace, but now that things are directly imported, you end up with these random fully-qualified names which would be better done differently. For example, fakeHosts.ts implements certain interfaces, but does so with classes named the same as the interface. Maybe just do FakeSomeInterface?

Other stuff

@jakebailey jakebailey added the Discussion Issues which may not have code impact label May 31, 2022
@jakebailey jakebailey self-assigned this May 31, 2022
@jakebailey jakebailey mentioned this issue May 31, 2022
19 tasks
@weswigham
Copy link
Member

My version of the typeformer's "inlineImports" pass is very aggressive, importing everything from their original declaration if possible. This is closest to how we would write the code by hand (if we had done so from the start), but the circularity of the codebase has broken things.
e.g., debug.ts imports semver.ts, but the Semver class creates a static property zero, which calls the Semver constructor, which then contains Debug.assert calls.
This only happened because the top of debug.ts contains import { Semver } from './semver, whereas before, the code didn't do this but instead referred to the namespace, so never actually caused that file to be evaluated.
It's totally possible that we'd hit this circularity issue today, except that most code is not at the top level and avoids the problem entirely, or has worked "magically" thanks to the order we happen to merge namespaces in.
We could just import these from the namespaces (like #46567 does), which I think would fix this, but this is really unnatural code structure and auto-imports will not function properly.

I'd fixed this by hand a few times in my version - I usually swap deep imports for barrel file imports and reorder the barrel imports to match our tsconfig order, since that represents the canonical execution order we desire. Sometimes some barrel imports need to be replaced with deep imports to break circularities, but it wasn't too many, iirc.

This:

e.g., debug.ts imports semver.ts, but the Semver class creates a static property zero, which calls the Semver constructor, which then contains Debug.assert calls.

? I don't see a Semver reference in debug.ts locally, and semver comes after debug in our tsconfig, so semver's never available during debug's setup (not that it seems to need to be?).

or has worked "magically" thanks to the order we happen to merge namespaces in.

Nothing too magic about it. The tsconfig order is curated to be a correct execution order - so long as barrel files reproduce the same order (either automatically or by hand), we should be equivalent, at least.

The System interface (aka ts.sys) is set (ts.sys = ...) in order to support the browser, but in the module version, modules do import { sys } from '...'), which makes it very difficult to do like this. What can do do here?

We have a setSys function in sys already as part of my work years ago - we should swap external stuff over to using that if it isn't already.

Additionally, the System interface has getExecutingFilePath (among other functions), which only makes sense for TS packaged as a self contained file.

It still makes sense for a tree of files, imo - intuitively I feel like it should be the entrypoint file, or whatever was initially executed by node. (be it tsc.js or tsserver.js or bin/whatever)

@jakebailey
Copy link
Member Author

? I don't see a Semver reference in debug.ts locally, and semver comes after debug in our tsconfig, so semver's never available during debug's setup (not that it seems to need to be?).

I meant to say Version, which is defined in semver.ts. It's referenced many times. (I was confusing it with the Semver package everyone else uses.)

I'd fixed this by hand a few times in my version - I usually swap deep imports for barrel file imports and reorder the barrel imports to match our tsconfig order, since that represents the canonical execution order we desire. Sometimes some barrel imports need to be replaced with deep imports to break circularities, but it wasn't too many, iirc.

Nothing too magic about it. The tsconfig order is curated to be a correct execution order - so long as barrel files reproduce the same order (either automatically or by hand), we should be equivalent, at least.

My overall goal was to end up with a codebase that felt like one that had been written with modules; this means eliminating these barrel imports from files. Otherwise, auto-imports will not function properly because auto-imports will prefer importing directly from the declaring modules. It also means not requiring that a given file import each file in a specific order either, given everyone (including our own import organizer) will not preserve this order.

Nothing too magic about it. The tsconfig order is curated to be a correct execution order - so long as barrel files reproduce the same order (either automatically or by hand), we should be equivalent, at least.

This is going to be rough to maintain in the long term, because I don't think the average person knows that this ordering has any impact. I sure didn't; I've never used a project that listed its files explicitly, only globs. I would hope that we can stop having so few files just to not have to add more, and instead break the code apart a bit. Especially if perf improves because we're not running every file in a closure where each access to the old namespace was actually fully-qualified under the hood.

We have a setSys function in sys already as part of my work years ago - we should swap external stuff over to using that if it isn't already.

My understanding had been that if one has a import { sys } from "..." that you're stuck with whatever value existed at import time, but searching it now, I guess export let ... is some sort of magical "live" view, which does fix this. That's good.

It still makes sense for a tree of files, imo - intuitively I feel like it should be the entrypoint file, or whatever was initially executed by node. (be it tsc.js or tsserver.js or bin/whatever)

Yep, this is more of a plumbing problem. I just wanted to note it so I could delete my random TODOs.

@weswigham
Copy link
Member

weswigham commented Jun 1, 2022

This is going to be rough to maintain in the long term

I mean, it already is, it's one of the hidden costs if using namespaces. It just only comes up when someone finds they want to adjust the order (because, eg, they want to start using utilities somewhere new or something). Likewise, it'll only come up with modules when someone wants to import from a new module that's already in use elsewhere (and that module does static initialization), which, in practice, is pretty rare.

@weswigham
Copy link
Member

this means eliminating these barrel imports from files

So this is a style argument, really. Imo, you're supposed to import from the barrel files for everything other than your immediately adjacent files (and we've talked about our autoimport behavior re: this before, and the only way we'll get better at it is honestly using it ourselves and seeing the upsides and downsides of each approach). Still, our first concern should probably be retaining a functional execution order (T.T) and not a specific style - that can come later.

@jakebailey
Copy link
Member Author

My next step is to revert my changes that made imports really aggressive in favor of importing from the barrels, just to confirm that things run again (and to double check the state of auto-imports), so that should hopefully get me back to a working codebase.

I was really hopeful that we could write things like everyone else, but it's a shame about the execution ordering.

@jakebailey
Copy link
Member Author

2022-08-29

I took a few months off of this to work on other issues, but have recently picked this back up again with the intent to make this breaking packaging change for 5.0 (it seems like a good idea to not do this on a minor release).

Last week, I hit a major milestone and got the test suite to run; all tests pass except the API tests (which don't pass because I haven't started on fixing the d.ts files yet).

  • The entire build is just generateLibs and tsc -b ./src, which is a great simplification from our current state. I expect this to only expand for things like api-extractor and potentially a bundler if we still need to support a single-file for some outputs. The build time is roughly the same as main for a clean build of every project, but a good deal faster than main when making an incremental change to checker.ts.
  • Getting tests running required a reordering of things in the namespaces, as well as moving some helper code around such that tests do not import other tests.
  • The tests only work via the serial test runner, i.e. what's built into mocha. Our own custom parallel runner does not work. I am investigating this, however, mocha has a built-in parallel test runner. Our own was built before it didn't, so I'm going to see if we can drop our own special runner and just stick to upstream.
  • I had to bring the module transform back to importing things from the faked namespaces (as noted in my previous comment); this is not my preference, but our code is currently too circular / execution order dependent to use direct imports.
    • It's likely that we can just start with these namespace imports first, then fix the circularities over time. There's a tool that the VS Code team used to analyze their codebase for these sorts of issues, and would likely work on our own once it uses imports.
  • getExecutingFilePath is still replaced with a dummy implementation; reasonably I expect this API to change to something more like getTypeScriptLibRoot or similar such that the path to our lib folder can be provided rather than being deduced relative to tsc.js or something.

@jakebailey
Copy link
Member Author

I posted the above early in the day; I did get the parallel test suite working. Mocha's parallel runner unfortunately can't be used with our repo in its current state as it expects individual test files, which it then runs in parallel (never any actual tests within a file in parallel).

We do our own thing and end up with effectively one file. Potentially, we can switch to something "more normal", somehow, but it's no longer a blocker. If we were using something like jest or ava, they could potentially run tests within a file in parallel, but that's a much larger change.

@jakebailey
Copy link
Member Author

2022-08-31

I have the parallel test runner working; it came down to the fact that the entrypoint I was using for the test runner was the namespace file, which broke because runner.ts exported a mutable variable to another file, but runner.ts hadn't yet "returned" to the namespace object's require('./runner.ts') to add the binding to its own exports, so it appeared to be undefined. This wouldn't happen if we used direct imports, but making the entrypoint be runner.js was enough.

While working on this, I prototyped a Jest runner that could run our compiler tests instead of the relatively large bit of code we wrote to make Mocha work; I think that's a good idea for the future. I only tried it because I was frustrated trying to get our own parallel runner to work (so was looking for any alternative), but I'll defer that to another day now that I have all tests running besides the API tests.

Next up is api-extractor, and getting some proper entrypoints for things like tsc.js, tsserver.js, typescript.js, etc, which we should now be able to build unconditionally in build mode as they will be small.

@jakebailey
Copy link
Member Author

jakebailey commented Sep 2, 2022

2022-09-02

I have the entrypoints set up and working for the libraries/executables, and a basic api-extractor config for one of them. Unfortunately, things seem to fail on that front. I've reported microsoft/rushstack#3612. But, I've noticed that our d.ts output is a little suboptimal; the case that we are currently breaking on is an import that shouldn't even be emitted as it can be safely imported. I'm sure we've gotten someone reporting that issue before.

We'll have to eventually call api-extractor programatically; one transformation we will need to do again is to convert our const enum to enum; I doubt api-extractor does this for us.

@jakebailey
Copy link
Member Author

But, I've noticed that our d.ts output is a little suboptimal; the case that we are currently breaking on is an import that shouldn't even be emitted as it can be safely imported. I'm sure we've gotten someone reporting that issue before.

This turned out to be part of the bug; hand-fixing these in the d.ts file allowed api-extractor to run to completion, although the output did not seem to exclude @internal declarations. I'll have to come up with some sort of reproducer for this.

@jakebailey
Copy link
Member Author

2022-09-12

The aforementioned import() thing has to do with #44044 and #49730; the cases that crash api-extractor are ones where we don't emit an import statement because the original file didn't have one, so emit an import expression instead. api-extractor shouldn't be crashing for these cases, but I'm going to try and explicitly annotate the offending pieces of code and see how far I get.

@jakebailey
Copy link
Member Author

jakebailey commented Sep 14, 2022

2022-09-14

I've gotten things far along enough to run perf benchmarks, and I'm overwealmingly happy to share the results.

This difference is very likely due to the fact that we are going through our export helpers, which are like our old namespace accesses, but extra slow. Compare that to ESM, which is faster as it doesn't use those import helpers, but since we import things like helper functions (isIdentifier, createParser, etc), we see a performance boost as we are no longer accessing via an object for each invocation, instead using a local binding created by the import statement.

Now, for esbuild:

  • Running esbuild on our CJS output, we see about the same performance loss as individual CJS files (maybe slightly worse). Not surprising, our import helpers are still present, on top of esbuild's.
  • Running esbuild directly on our source files (no tsc emit), we see an incredible 7-25% performance boost! 🎉 🎉 Looking at the output file, esbuild was able to lift all modules to the top level, meaning all accesses to functions in other modules are effectively local; no overhead. What's even more incredible is that this is without lowering const/let to var, something that esbuild currently does not support. (Could we go even faster?)
  • Running esbuild on our ESM output, we see about a 6-13% performance boost. This is somewhat surprising, as this is a case where we were using var instead of let/const, so I would have expected it to be even faster than the previous case.

Previously, we had intended to try and ship as individual files, a breaking change for 5.0. Since we're not likely to switch away from CJS, the performance penalty there is pretty unfortunate. But, if we just bundle our output, we would be producing a package that's still compatible with the previous one, and turns out to be 7-25% faster, and has an extremely fast build that we can make use of during development.


At this point, most things are working; I have some remaining tasks:

  • api-extractor; the bug I'm hitting needs to be minimized and reported. Declarations that are marked as @internal aren't stripped from the namespace objects produced for our namespace-barrel modules. Likely an oversight, as export * support is new.
  • Random loose end endpoints need to be handled, like the typings installer, cancellation token, etc.
  • dynamicImportCompat needs to be checked (and maybe rethought).
  • buildProtocol needs to be rethought (do we need to keep providing protocol.d.ts?)
  • createPlaygroundBuild needs to be checked to make sure it works; this was tricker when I was planning to output individual modules, but probably no longer a problem if we are bundling.
  • The build needs to be massively cleaned up. Things are a lot simpler, but I haven't been deleting anything from the gulpfile quite yet.
  • Whatever else comes up.

@jakebailey
Copy link
Member Author

jakebailey commented Sep 19, 2022

2022-09-19

On #50811, I tested what it'd be like to downlevel let/const to var on esbuild's output. Since it doesn't support that itself, I just had our build run @babel/plugin-transform-block-scoping on the output.

The additional performance boost was as expected, roughly matching the performance difference I noted when trying to switch our target up past ES5 on #50022, or about 3%. Most of that difference comes from improving parsing performance by 5-9%, which indicates that it's doing a lot in the TDZ. The other parts of the compiler don't improve as greatly, but do generally get a few percent faster.

Running babel on the output of esbuild works, but is definitely a lot slower than esbuild itself. I don't think we need to downlevel everything to ES5 (evanw/esbuild#297), so just --supported=let-and-const:false support in esbuild would be very helpful.

Benchmark results, bundled before and after let/const downleveling:

Metric 50811 50811 Delta Best Worst
Angular - node (v14.15.1, x64)
Memory used 328,472k
±0.01%
328,287k
±0.00%
-185k
-0.06%
328,244k 328,312k
Parse Time 2.06s
±0.52%
1.93s
±0.41%
-0.13s
-6.36%
1.91s 1.95s
Bind Time 0.70s
±0.57%
0.69s
±0.71%
-0.01s
-0.72%
0.69s 0.71s
Check Time 5.35s
±0.39%
5.28s
±0.39%
-0.07s
-1.31%
5.25s 5.33s
Emit Time 5.20s
±0.82%
5.15s
±0.56%
-0.06s
-1.08%
5.07s 5.21s
Total Time 13.31s
±0.37%
13.05s
±0.26%
-0.25s
-1.92%
12.98s 13.15s
Compiler-Unions - node (v14.15.1, x64)
Memory used 182,299k
±0.44%
182,374k
±0.39%
+75k
+0.04%
181,976k 185,246k
Parse Time 0.89s
±0.65%
0.81s
±0.37%
-0.08s
-8.91%
0.80s 0.81s
Bind Time 0.46s
±0.87%
0.45s
±0.50%
-0.01s
-2.18%
0.44s 0.45s
Check Time 6.22s
±0.67%
6.16s
±0.50%
-0.07s
-1.06%
6.10s 6.22s
Emit Time 1.99s
±0.98%
1.96s
±0.76%
-0.03s
-1.31%
1.94s 2.01s
Total Time 9.56s
±0.60%
9.38s
±0.32%
-0.18s
-1.87%
9.30s 9.43s
Monaco - node (v14.15.1, x64)
Memory used 315,704k
±0.01%
315,437k
±0.01%
-267k
-0.08%
315,401k 315,488k
Parse Time 1.56s
±0.48%
1.47s
±0.35%
-0.09s
-5.52%
1.46s 1.48s
Bind Time 0.63s
±0.94%
0.62s
±0.55%
-0.01s
-1.74%
0.62s 0.63s
Check Time 5.14s
±0.44%
5.12s
±0.50%
-0.02s
-0.29%
5.06s 5.18s
Emit Time 2.79s
±0.71%
2.74s
±0.42%
-0.05s
-1.72%
2.71s 2.77s
Total Time 10.12s
±0.29%
9.96s
±0.32%
-0.16s
-1.60%
9.89s 10.04s
TFS - node (v14.15.1, x64)
Memory used 280,802k
±0.01%
278,434k
±0.01%
-2,367k
-0.84%
278,374k 278,495k
Parse Time 1.32s
±0.76%
1.26s
±1.47%
-0.06s
-4.91%
1.23s 1.31s
Bind Time 0.58s
±0.63%
0.59s
±0.62%
+0.00s
+0.17%
0.58s 0.59s
Check Time 5.03s
±0.43%
4.97s
±0.46%
-0.05s
-1.07%
4.92s 5.02s
Emit Time 2.95s
±0.56%
2.71s
±0.54%
-0.24s
-8.07%
2.69s 2.75s
Total Time 9.89s
±0.28%
9.53s
±0.30%
-0.36s
-3.60%
9.49s 9.60s
material-ui - node (v14.15.1, x64)
Memory used 427,373k
±0.00%
427,047k
±0.00%
-326k
-0.08%
427,006k 427,094k
Parse Time 1.87s
±0.46%
1.77s
±0.42%
-0.10s
-5.25%
1.76s 1.79s
Bind Time 0.53s
±0.65%
0.53s
±0.71%
-0.00s
-0.38%
0.52s 0.53s
Check Time 12.00s
±0.48%
11.91s
±0.35%
-0.09s
-0.74%
11.81s 11.99s
Emit Time 0.00s
±0.00%
0.00s
±0.00%
0.00s
NaN%
0.00s 0.00s
Total Time 14.40s
±0.40%
14.21s
±0.32%
-0.19s
-1.30%
14.10s 14.31s
xstate - node (v14.15.1, x64)
Memory used 517,236k
±0.01%
516,843k
±0.00%
-394k
-0.08%
516,784k 516,900k
Parse Time 2.68s
±0.57%
2.53s
±0.67%
-0.15s
-5.63%
2.50s 2.57s
Bind Time 0.85s
±0.86%
0.84s
±0.68%
-0.01s
-0.94%
0.83s 0.85s
Check Time 1.47s
±0.50%
1.46s
±0.52%
-0.00s
-0.27%
1.45s 1.48s
Emit Time 0.07s
±4.13%
0.07s
±3.23%
-0.00s
-4.17%
0.06s 0.07s
Total Time 5.08s
±0.37%
4.91s
±0.51%
-0.17s
-3.39%
4.87s 4.98s

System

Machine Namets-ci-ubuntu
Platformlinux 4.4.0-210-generic
Architecturex64
Available Memory16 GB
CPUs4 × Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz

Hosts

  • node (v14.15.1, x64)

Scenarios

  • Angular - node (v14.15.1, x64)
  • Compiler-Unions - node (v14.15.1, x64)
  • Monaco - node (v14.15.1, x64)
  • TFS - node (v14.15.1, x64)
  • material-ui - node (v14.15.1, x64)
  • xstate - node (v14.15.1, x64)

Summary

Benchmark Name Iterations
Current 50811 10
Baseline 50811 10

Also of note, I'm going to have to change our bundled output format to an IIFE; typescript uses a single file for both node and the browser, conditionally using module.exports. This isn't something that esbuild can do (without eval or new Function), but it looks like I can just use an IIFE with a footer and tack on some extra code at the bottom to set things up (per evanw/esbuild#507 (comment); TS doesn't need full UMD support, but it's close). This also gives me the opportunity to eliminate dynamicImportCompat, I think, if I can just inject (id) => import(id) directly into the output.

@jakebailey
Copy link
Member Author

2022-09-29

At this point, everything is mostly done. I ended up writing my own small dts bundler tool that can cover our codebase specifically; we have some unique requirements, namely that we define our own Node, Symbol, Map, etc. types, which most rollup tools try and rename. The limited bundler produces namespaces that look pretty similar to our current d.ts files and makes use of the fact that we don't have any name duplication through our exports, something that is effectively already required by the fact that we do export * and that errors if you don't resolve the conflict.

At this point, the only remaining tasks are:

  • Backporting changes so the stack isn't so big, including build changes and other removals, such as...
  • Changes related to removing typescriptServices.js (Can we stop shipping typescriptServices.js? #50758); this isn't strictly required but simplifies my build a little bit.
  • Fixing up createPlaygroundBuild, or remove it entirely.
  • Get a plan in place to correctly ignore the final commit in git blame using .git-blame-ignore; this will hopefully make this change hidden from GitHub blames and potentially GitLens or other tools.
  • Testing the developer experience when it comes to auto-importing from the new namespace barrel modules; I might have to move them to another location so they are chosen as the "right" place to import helpers from.

@jakebailey
Copy link
Member Author

jakebailey commented Oct 11, 2022

2022-10-11

Down to basically odds and ends.

  • CodeQL appears to get stuck (https://github.com/microsoft/TypeScript/actions/runs/3229429756/jobs/5286722214); probably something to do with how circular the codebase is. I plan to try and run the same "layer violation" software that the VS Code team used to fixup their codebase, after the fact. It'd be nice to switch to direct, non-cyclical imports.
  • I have sent PRs for monaco to fix things up in prep for 5.0: Use typescript.js as source for typescriptServices monaco-editor#3344, Remove regex replacements from import-typescript monaco-editor#3356, which are already applied in https://github.com/microsoft/TypeScript-Make-Monaco-Builds and running for all playground builds. These changes allow us to remove typescriptServices.js, as well as no longer worry about our code changes accidentally breaking the source-level hacks applied in monaco.
    • We have a copy of their import script in our own repo as createPlaygroundBuild, but it turns out that monaco's build has changed enough to the point that this script no longer works; monaco needs tsWorker.js, a file that only their build can produce as it includes code from their codebase along with ours.
  • I have automated .git-blame-ignore-revs to ignore the "conversion step" commits that do the bulk of the work and make all of the noise. When we merge the module branch into main, we'll have to use a merge commit rather than a squash to make this work. Ignoring a squash commit itself will cause git blames to look very odd as there is a lot of new code that I do want git blame to work on.
  • Auto-imports appear to work.

We should be good to dogfood this.

Also, since this effectively requires a whole new build, I've taken this opportunity to make a number of major build changes to speed up our build performance, remove dependencies, replace dependencies, and so on; changes that are hard to make during our normal dev cycle without causing churn. The module conversion itself is a huge speedbump to anyone bisecting around as it is.

@jakebailey
Copy link
Member Author

jakebailey commented Oct 19, 2022

2022-10-19

Things are looking good, still at odds and ends. The dogfooding has netted some useful feedback, some issues I've fixed like tsserver needing to be a UMD-module for web, and so on.

  • The big news this week is that I've been able to shave an additional 5.7 MB off our package (TS 4.8 is 63 MB, post-modules we will be at 35.6 MB, ~43% smaller). This comes from a seemingly unrelated change; we have a compiler-debug.js that's built locally which adds in some helpers to format the code flow graph in debugging. This was not included into the compiler as there's no real need to ship this to end users. However, the way that it worked was to require it, then call an init function with the ts namespace. By moving that code directly into our Debug namespace (~300 lines), we no longer need to pass in the ts namespace, which enables bundlers to actually tree shake it. Since the Debug namespace is used in every bundle, this means that both tsc and typingsInstaller shrunk, giving that 5.7 MB number. Notably, this doesn't affect typescript.js, tsserver.js, or tsserverlibrary.js, as those support the plugin API, which has to pass in the ts namespace just like the debug helper did. So, those ones are effectively unable to tree shake, which is unfortunate. But in any case, it's a win.
  • I am no longer planning on handling let/const in any way; I considered sending a PR to esbuild that does the "easy" examples, but I believe this is actually a bad idea because in the future we'll likely be emitting ESM and we don't want to emit something like export var such that consumers only see mutable bindings.
  • All of the changes to monaco have been merged, so our playground will continue to work as expected, and in a way that lets us do things like minification in the future without trouble.
  • All of our pipelines are updated in prep for the new task runner.

After some design meeting bikeshedding:

  • We're going to target ES2018; I picked this as it was the ES version we list for Node 10, and it doesn't seem like we'd gain much from newer syntax besides nullish colalescing, which only appeared in Node 14. (I'd rather not leave Node 12 people in the dust quite yet.)
    • Related to this, we are also all okay with dropping ts.Map, ts.Set, etc; with our target no longer being ES5, we can just start using the real types. In a previous version of TS, I deleted our shims entirely, and nobody has been bitten yet, so this seems safe plus/minus the API break.
  • We're not going to do anything automated to make imports sorted / have a max line limit. All of the different things we think we could do here are undesirable in various ways; we can't make each import its own line (the files would be huge), we can't use a formatter to reflow them as that would cause major merge conflicts, and so on. This is easy to address in the future, and the reality is that it's rare to add a new import to our worst files like the checker.

Remaining work:

@jakebailey
Copy link
Member Author

jakebailey commented Oct 25, 2022

2022-10-25

Probably the last update before we make the switch in 5.0.

First off, more performance wins; tsc.js is now 30% faster to start. Measured using hyperfine:

Benchmark 1: tsc main
  Time (mean ± σ):     115.3 ms ±   1.1 ms    [User: 103.3 ms, System: 11.5 ms]
  Range (min … max):   112.5 ms … 123.9 ms    1000 runs

Benchmark 2: tsc module-ified
  Time (mean ± σ):      89.2 ms ±   1.1 ms    [User: 78.3 ms, System: 10.1 ms]
  Range (min … max):    86.8 ms … 101.4 ms    1000 runs

Summary
  'tsc module-ified' ran
    1.29 ± 0.02 times faster than 'tsc main'
Expand me for tsserver.js, typescript.js, tsserverlibrary.js
Benchmark 1: tsserver main
  Time (mean ± σ):     161.0 ms ±   1.6 ms    [User: 147.1 ms, System: 13.7 ms]
  Range (min … max):   157.4 ms … 170.8 ms    1000 runs

Benchmark 2: tsserver module-ified
  Time (mean ± σ):     151.9 ms ±   1.4 ms    [User: 137.6 ms, System: 15.0 ms]
  Range (min … max):   148.4 ms … 161.9 ms    1000 runs

Summary
  'tsserver module-ified' ran
    1.06 ± 0.01 times faster than 'tsserver main'
Benchmark 1: typescript main
  Time (mean ± σ):     146.3 ms ±   1.7 ms    [User: 132.4 ms, System: 14.2 ms]
  Range (min … max):   142.9 ms … 165.6 ms    1000 runs

Benchmark 2: typescript module-ified
  Time (mean ± σ):     132.6 ms ±   1.5 ms    [User: 120.0 ms, System: 13.4 ms]
  Range (min … max):   129.6 ms … 147.3 ms    1000 runs

Summary
  'typescript module-ified' ran
    1.10 ± 0.02 times faster than 'typescript main'
Benchmark 1: tsserverlibrary main
  Time (mean ± σ):     153.0 ms ±   1.3 ms    [User: 139.2 ms, System: 14.1 ms]
  Range (min … max):   149.7 ms … 161.4 ms    1000 runs

Benchmark 2: tsserverlibrary module-ified
  Time (mean ± σ):     143.1 ms ±   1.5 ms    [User: 127.7 ms, System: 16.0 ms]
  Range (min … max):   139.5 ms … 158.4 ms    1000 runs

Summary
  'tsserverlibrary module-ified' ran
    1.07 ± 0.01 times faster than 'tsserverlibrary main'

The script for this is located here: https://gist.github.com/jakebailey/362aedfb19bff6086585509e223b7a64

I have filed bugs for the problems cross-team, linked in the previous post.

These are two other bugs in TypeScript that I've experienced during the development of the conversion:

The former is an annoyance; the latter is a half-blocker towards converting the Debug namespace into a module (which will allow bundlers to directly call our debug helpers).

I also have a big writeup which will go into the final PR's description + be converted into a blog post.

See you all then.

@Jack-Works
Copy link
Contributor

image

still have some require calls. any plan to move it to ts.sys?

@Jack-Works
Copy link
Contributor

I prototyped a Jest runner that could run our compiler tests

please consider jest's snapshot test! manually update some test is pain

@jakebailey
Copy link
Member Author

still have some require calls. any plan to move it to ts.sys?

I'm not sure what you're asking; the output format is still something like UMD so the same files serve Node and the browser, so those can't really go away. I don't have a good long term plan for eliminating them given our synchronous requirements.

please consider jest's snapshot test! manually update some test is pain

This turned out to be a failure; custom runners are not in a good place, and we'd have to completely overhaul how we do tests. Maybe one day, but right now things are still fine. Post modules, my build is a little faster at accepting baselines.

@Jack-Works
Copy link
Contributor

we don't want to emit something like export var such that consumers only see mutable bindings.

they're immutable even if they're let/var anyway

@typescript-bot typescript-bot added the Fix Available A PR has been opened for this issue label Nov 2, 2022
@jakebailey
Copy link
Member Author

The PR has been posted! #51387

@kungfooman
Copy link

This wiki page is still references `protocol.d.ts": https://github.com/Microsoft/TypeScript/wiki/Standalone-Server-%28tsserver%29

Should it just point to here then? https://github.com/microsoft/TypeScript/blob/main/src/server/protocol.ts

@jakebailey
Copy link
Member Author

It should probably just say "check the ts.server.protocol" namespace, reflecting the suggestion in my PR. Thanks for finding that.

@robpalme
Copy link

robpalme commented Feb 6, 2023

Hello @jakebailey

The TDZ performance finding quoted here (the 5% — 9% parser win) is being quoted as one of the reasons to perform TDZ optimisation in ESBuild. Meaning ESBuild is replacing const and let with var to improve performance with no way to opt-out.

We all want faster code and so this optimization may well be justified but I'd like to dig in a bit more before concluding. Because excess downlevelling hampers debugging, as you have previously recognized. And when we abandon using modern features in production via tooling that downlevels, it reduces the incentive for engines to optimize those features - a negative feedback loop.

Specifically I'd like to understand if there is a missing V8 optimisation here. So two questions:

  • Have you replicated this performance delta on any other engine?
  • Do you have an isolated/minimal repro that could be provided to V8 team as part of a performance bug report?

@jakebailey
Copy link
Member Author

jakebailey commented Feb 6, 2023

Meaning ESBuild is replacing const and let with var to improve performance with no way to opt-out.

Having read that linked thread, I honestly don't know how esbuild could stop using var without some major performance changes in v8 and such; the var + lazy init function pattern is explicitly one where if let were used, the TDZ checks would affect basically all uses.

because excess downlevelling hampers debugging, as you have previously recognized.

This is true, but in this particular case, switching to var does not really hurt debugging so long as the vars that are switched don't generate any other additional code such as the extra closures needed for for loop block scoping. But, it did catch at least one bug.

Have you replicated this performance delta on any other engine?

The benchmark suite that tests TypeScript largely runs tsc, and tsc only runs in node, so I have not tested anything other than v8, no. Perhaps recent Bun changes mean I can do a drop-in test and see what happens. (I'm pretty sure Deno is v8 so I wouldn't test that.)

Do you have an isolated/minimal repro that could be provided to V8 team as part of a performance bug report?

I don't; what I really need is some sort of tool that acts as a "TDZ blame" and can tell me where all of my TDZ time is being spent, but no such tool exists. The parser is very large and besides simply running our code through a transform which changes all const/let to var, it's hard for me to know what I can do to actually find the problem than to sit down and just replace a bunch and see what happens.

@jakebailey
Copy link
Member Author

it's hard for me to know what I can do to actually find the problem than to sit down and just replace a bunch and see what happens.

That being said, I could likely do a differential profile between the two and look there.

@jakebailey
Copy link
Member Author

@robpalme Just to show the current state, #52656 (comment) is what it'd be like if we downleveled all let/const to var; the performance difference is really stunning.

I still need to go try this on another engine, but the delta shows that there really is a lot of perf sitting on the table.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Discussion Issues which may not have code impact Fix Available A PR has been opened for this issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants