Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify worker module map transmission w/ small perf benefit. #8237

Merged
merged 6 commits into from
Mar 29, 2019

Conversation

scotthovestadt
Copy link
Contributor

@scotthovestadt scotthovestadt commented Mar 29, 2019

Summary

This PR unifies the way module maps are passed to the worker. Previously, we did it one way for watch mode and a different way for non-watch mode because our watch mode way was a lot slower.

I fixed that slowness for watch mode and realized while doing some performance and memory profiling that the watch mode way is now actually faster on a few levels:

  1. It's straight-up faster to transmit it to the process because the module map is significantly smaller than the whole haste map you have to deserialize if you get at it via the file.
  2. If you load the whole haste map and want to discard half of it, suddenly there is a bunch of stuff that will need to be GC'd in the future. This happens in the worker because it only wants the module map but has to deserialize the whole file.
  3. Not requiring the haste map be written to disk at this point opens up further optimizations in the future.

Here's a benchmark of running yarn jest packages/expect, meant to profile starting up some workers and running a couple tests. Each profile was run 10 times after 3 warm ups.

master

Time (mean ± σ): 3.902 s ± 0.120 s [User: 21.570 s, System: 5.105 s]
Range (min … max): 3.682 s … 4.084 s 10 run

this branch

Time (mean ± σ): 3.522 s ± 0.175 s [User: 19.722 s, System: 4.777 s]
Range (min … max): 3.356 s … 3.897 s 10 runs

It's faster. It's less code with a unified code path. It opens up more optimizations in the future.

Test plan

  1. All tests pass.
  2. Benchmarks show better performance in all situations.

Copy link
Collaborator

@thymikee thymikee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@thymikee
Copy link
Collaborator

Linter says that we can actually remove even more :D

@@ -36,6 +35,30 @@ export default class ModuleMap {
private readonly _raw: RawModuleMap;
private json: SerializableModuleMap | undefined;

private static mapToArrayRecursive(
Copy link
Contributor Author

@scotthovestadt scotthovestadt Mar 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're probably wondering: what is this?

Turns out the ModuleMap wasn't being serialized in watch mode correctly but tests didn't catch it because we had 2 code paths.

1 code path = bug caught and fixed.

@@ -65,43 +65,6 @@ test('injects the serializable module map into each worker in watch mode', () =>
});
});

test('does not inject the serializable module map in serial mode', () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty sure this code path no longer exists with my changes, please sanity check my assumption

@codecov-io
Copy link

Codecov Report

Merging #8237 into master will decrease coverage by 0.04%.
The diff coverage is 20%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8237      +/-   ##
==========================================
- Coverage   62.33%   62.28%   -0.05%     
==========================================
  Files         265      265              
  Lines       10553    10556       +3     
  Branches     2565     2563       -2     
==========================================
- Hits         6578     6575       -3     
- Misses       3387     3393       +6     
  Partials      588      588
Impacted Files Coverage Δ
packages/jest-haste-map/src/ModuleMap.ts 56.25% <0%> (-8.04%) ⬇️
packages/jest-runner/src/testWorker.ts 0% <0%> (ø) ⬆️
packages/jest-runner/src/index.ts 65.95% <100%> (-2.05%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e08be02...8d3a28f. Read the comment docs.

Copy link
Member

@SimenB SimenB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Less code and better perf is an awesome outcome! 😀

@scotthovestadt
Copy link
Contributor Author

@cpojer Would prefer you review before merge.

@scotthovestadt scotthovestadt requested a review from cpojer March 29, 2019 17:16
@SimenB SimenB requested a review from rubennorte March 29, 2019 17:19
@cpojer
Copy link
Member

cpojer commented Mar 29, 2019

The reason that we save the map to a file and read it from the worker is that I've found that on large module maps it used to take an insane amount of time to share it through workers. Doing it with two different implementations seemed like a good trade off. Before merging this, can you confirm this is actually faster on www and that it keeps scaling?

@scotthovestadt
Copy link
Contributor Author

@cpojer

Benchmarked against 11 test files (all of which are just a single test with a single assertion) on WWW. Ran old vs. new 10 times each to get a decent profile. --skipFilter on and cache warm before running each benchmark.

This benchmarks 1.5% faster on average, which means that just this part is actually significantly faster since it's just a small part of the overall picture. 🥇

Other thoughts:

  • The benchmark doesn't even take into account the benefits of not needing to GC a lot of data loaded and discarded immediately, so the performance profile is slightly better than it seems.
  • Performance benefit is secondary. I'll take it with equal performance because this opens up future optimization opportunities and unifies code paths.

@cpojer
Copy link
Member

cpojer commented Mar 29, 2019

Oh wow, that's cool. Well, if you are confident, ship it.

@scotthovestadt
Copy link
Contributor Author

@cpojer I just ran it 2 more times to be sure. It's definitely a statistically significant improvement in terms of performance. Merging!

@scotthovestadt scotthovestadt merged commit 9c9555f into jestjs:master Mar 29, 2019
private static mapFromArrayRecursive(
arr: ReadonlyArray<[string, unknown]>,
): Map<string, unknown> {
if (arr[0] && Array.isArray(arr[1])) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you check for arr[1]? Is there some constraint that there are always at least 2 items in arr when going recursive?
Checking for arr[0][1] would make more sense.

@SimenB
Copy link
Member

SimenB commented Apr 23, 2021

The changes in jest-runner here caused the regression reported in #9021. Regression does not exist when running serially, so there's something about how the data is passed to the worker which regressed.

@scotthovestadt you wouldn't by any chance be able to take a look?

@SimenB
Copy link
Member

SimenB commented Apr 23, 2021

FWIW, my local diff (partial revert of this PR) which fixes the error (but based on the rest of the discussion here, introduces a perf regression)

diff --git i/packages/jest-runner/src/index.ts w/packages/jest-runner/src/index.ts
index b58d881416..0f1a0a9846 100644
--- i/packages/jest-runner/src/index.ts
+++ w/packages/jest-runner/src/index.ts
@@ -155,12 +155,14 @@ export default class TestRunner {
     onFailure?: OnTestFailure,
   ) {
     const resolvers: Map<string, SerializableResolver> = new Map();
-    for (const test of tests) {
-      if (!resolvers.has(test.context.config.name)) {
-        resolvers.set(test.context.config.name, {
-          config: test.context.config,
-          serializableModuleMap: test.context.moduleMap.toJSON(),
-        });
+    if (watcher.isWatchMode()) {
+      for (const test of tests) {
+        if (!resolvers.has(test.context.config.name)) {
+          resolvers.set(test.context.config.name, {
+            config: test.context.config,
+            serializableModuleMap: test.context.moduleMap.toJSON(),
+          });
+        }
       }
     }
 
diff --git i/packages/jest-runner/src/testWorker.ts w/packages/jest-runner/src/testWorker.ts
index 919eeccd9c..091fe52094 100644
--- i/packages/jest-runner/src/testWorker.ts
+++ w/packages/jest-runner/src/testWorker.ts
@@ -58,24 +58,32 @@ const formatError = (error: string | ErrorWithCode): SerializableError => {
 };
 
 const resolvers = new Map<string, Resolver>();
-const getResolver = (config: Config.ProjectConfig) => {
-  const resolver = resolvers.get(config.name);
-  if (!resolver) {
-    throw new Error('Cannot find resolver for: ' + config.name);
+const getResolver = (config: Config.ProjectConfig, moduleMap?: ModuleMap) => {
+  const name = config.name;
+  if (moduleMap || !resolvers.has(name)) {
+    resolvers.set(
+      name,
+      Runtime.createResolver(
+        config,
+        moduleMap || Runtime.createHasteMap(config).readModuleMap(),
+      ),
+    );
   }
-  return resolver;
+  return resolvers.get(name)!;
 };
 
 export function setup(setupData: {
   serializableResolvers: Array<SerializableResolver>;
 }): void {
-  // Module maps that will be needed for the test runs are passed.
+  // Setup data is only used in watch mode to pass the latest version of all
+  // module maps that will be used during the test runs. Otherwise, module maps
+  // are loaded from disk as needed.
   for (const {
     config,
     serializableModuleMap,
   } of setupData.serializableResolvers) {
     const moduleMap = ModuleMap.fromJSON(serializableModuleMap);
-    resolvers.set(config.name, Runtime.createResolver(config, moduleMap));
+    getResolver(config, moduleMap);
   }
 }

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using StackOverflow or our discord channel for questions.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 24, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants