Skip to content
This repository was archived by the owner on Feb 25, 2025. It is now read-only.

clang-tidy: added the ability to shard jobs #37265

Merged
merged 17 commits into from
Nov 7, 2022

Conversation

gaaclarke
Copy link
Member

@gaaclarke gaaclarke commented Nov 2, 2022

This will allow bots to split up clang tidy work across the 4 instances of clang tidy so shared classes like everything in fml don't get linted 4 times.

This is accomplished by adding the flags: --shard-id 0 --shard-variants="foo,bar".

issue flutter/flutter#114632

Pre-launch Checklist

  • I read the Contributor Guide and followed the process outlined there for submitting PRs.
  • I read the Tree Hygiene wiki page, which explains my responsibilities.
  • I read and followed the Flutter Style Guide and the C++, Objective-C, Java style guides.
  • I listed at least one issue that this PR fixes in the description above.
  • I added new tests to check the change I am making or feature I am adding, or Hixie said the PR is test-exempt. See testing the engine for instructions on writing and running engine tests.
  • I updated/added relevant documentation (doc comments with ///).
  • I signed the CLA.
  • All existing and new tests are passing.

If you need help, consider asking for advice on the #hackers-new channel on Discord.

@gaaclarke gaaclarke force-pushed the clang-tidy-shard branch 2 times, most recently from 2711cfb to be68606 Compare November 3, 2022 00:18
@gaaclarke gaaclarke marked this pull request as ready for review November 3, 2022 17:59
@gaaclarke gaaclarke requested review from jmagman and zanderso November 3, 2022 17:59
Copy link
Member

@jmagman jmagman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How exactly will shard-variants work? For example will Mac iOS be shard 0 and have to build the host as well in order to get the difference between the files? And the Mac host will be shard 1 and also build iOS? And then Linux Android is shard 0 and builds the Linux host and Linux host is shard 1 and also builds Android? And build (headers), or just gn?

@@ -112,9 +123,16 @@ class ClangTidy {
final List<dynamic> buildCommandsData = jsonDecode(
options.buildCommandsPath.readAsStringSync(),
) as List<dynamic>;
final List<Command> changedFileBuildCommands = await getLintCommandsForChangedFiles(
final List<List<dynamic>> shardBuildCommandsData = options
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the rest of this file does this, but nit don't use dynamic in null safe code, use Object? or Object if you know it won't be null.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -171,23 +189,82 @@ class ClangTidy {
return repo.changedFiles;
}

Iterable<T> _takeShard<T>(Iterable<T> values, int id, int shardCount) sync*{
int count = 0;
for(final T val in values) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for(final T val in values) {
for (final T val in values) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -171,23 +189,82 @@ class ClangTidy {
return repo.changedFiles;
}

Iterable<T> _takeShard<T>(Iterable<T> values, int id, int shardCount) sync*{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess dart format isn't running on this package?

Suggested change
Iterable<T> _takeShard<T>(Iterable<T> values, int id, int shardCount) sync*{
Iterable<T> _takeShard<T>(Iterable<T> values, int id, int shardCount) sync* {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with the framework repo, dartfmt isn't run in the engine repo.

bool allSetsContain(Command command) {
for (final List<Command> set in sets) {
final Iterable<String> filePaths = set.map((Command e) => e.filePath);
if (!filePaths.contains(command.filePath)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the set math be easier making the parameter literally a set Iterable<Set<Command>>?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like can we get rid of some of this code by taking advantage of Set? I don't think so. We'd have to add hashing ability to Command and make clear what it means to look up a Command by file path. So in that sense it will make the code more complicated. It might make things a bit faster however.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, moved to Set<String> to speed up intersection calculation.

.map((dynamic data) =>
Command.fromMap(data as Map<String, dynamic>))
.toList());
final Iterable<_SetStatusCommand> intersectionResults =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method could use some comments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

intersection
.sort((Command x, Command y) => x.filePath.compareTo(y.filePath));
totalCommands.addAll(
_takeShard(intersection, shardId!, 1 + shardBuildCommands.length));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should shardId param be nullable if it's going to get ! here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value isn't used when sharedBuildCommandsData.isEmpty (ie when sharding isn't used).

@gaaclarke
Copy link
Member Author

How exactly will shard-variants work? For example will Mac iOS be shard 0 and have to build the host as well in order to get the difference between the files? And the Mac host will be shard 1 and also build iOS? And then Linux Android is shard 0 and builds the Linux host and Linux host is shard 1 and also builds Android? And build (headers), or just gn?

In order to shard with another variant you have to have its compile_commands.json. That's generated by running gn.
In other words, you don't actually have to build the target, just create the build files.

We have 2 options:

  1. Shard between [iOS, mac] and [Android, linux]
  2. Shard between [iOS, mac, Android, linux]

The second option would be the best, but I'm not sure that it is possible to make gn generate the build files for the platform you are not on, so the first option will be easier to implement.

@gaaclarke gaaclarke requested a review from jmagman November 3, 2022 21:15
Copy link
Member

@zanderso zanderso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a sense for how many minutes this would improve the Mac Host clang-tidy build?

https://ci.chromium.org/p/flutter/builders/prod/Mac%20Host%20clang-tidy

@@ -171,23 +189,86 @@ class ClangTidy {
return repo.changedFiles;
}

Iterable<T> _takeShard<T>(Iterable<T> values, int id, int shardCount) sync* {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code from here down that is reasoning about sets of commands could be made a bit more idiomatic, and maybe more readable as well, by encapsulating it in a new class (Commands, CommandSet, or CommandShard, or something) that can live in lib/src/command.dart.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Respectfully, I don't think that will help make the code easier to read or maintain. The small codebase already has poor oop design already and it actually made my job harder. In objective terms adding more classes will result in more lines of code, more segmentation of the code, and more concepts you need to know to modify the code.

As an example, there is no reason for CommandSet to exist when Set<Command> (or List<Command>) already encapsulates the idea perfectly. People can come into the codebase and know what that is immediately, not so with CommandSet. I also can't understand from what the difference between a CommandSet and a CommandShard is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example, there is no reason for CommandSet to exist when Set<Command> (or List<Command>) already encapsulates the idea perfectly.

The code in getLintCommandsForFiles is not operating on the Set, List, etc. in its full generality. Instead it's doing something very specific: trying to filter unwanted items out of a big set. The specific internal data structures that are used to accomplish that goal are an implementation detail. This code would be easier to read and maintain if the the implementation details were encapsulated somehow. It doesn't necessarily have to be a class. It could be more helper methods, for example. I don't have a strong opinion there. However, I think even with comments, the code in getLintCommandsForFiles could be improved on, and that is overall what I'm asking for.

I also can't understand from what the difference between a CommandSet and a CommandShard is.

These were just naming suggestions for a single new class.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I think the documentation could be better and will help with that. I've updated the docstring in getLintCommandsForFiles and added one for _takeShard.

} else {
totalCommands.addAll(buildCommandsData.map((Object? data) => Command.fromMap((data as Map<String, Object?>?)!)));
}
Stream<Command> filterCommands() async* {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't add new uses of Streams / async*.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you referring to the guidelines for editing code in the framework? That doesn't apply here, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dart conventions from the framework repo do apply here, yes. But independent of that, Streams and async* make code very difficult to debug, so should be avoided in general.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I think the framework repo recommendation lists alternatives to streams that don't exist here. Nevertheless, I removed the Stream and the async*.

@gaaclarke
Copy link
Member Author

gaaclarke commented Nov 4, 2022

Do you have a sense for how many minutes this would improve the Mac Host clang-tidy build?

https://ci.chromium.org/p/flutter/builders/prod/Mac%20Host%20clang-tidy

I didn't count the union, just eyeballed it. Napkin math: maybe there are 100 files overlapped. I measured clang-tidy to take 16s the other day on my computer. So for sharding between 2 computers: (100/2 * 16s * (1m / 60s)) = 13 minutes per run.

My computer is faster and the virtual machine is probably under more load than mine and technically the tasks are parallelized so your milage may vary.

Edit: also subtract the extra time needed to run GN an extra time which isn't that long, a couple of seconds? a minute max?

@zanderso
Copy link
Member

zanderso commented Nov 4, 2022

Do you have a sense for how many minutes this would improve the Mac Host clang-tidy build?
https://ci.chromium.org/p/flutter/builders/prod/Mac%20Host%20clang-tidy

I didn't count the union, just eyeballed it. Napkin math: maybe there are 100 files overlapped. I measured clang-tidy to take 16s the other day on my computer. So for sharding between 2 computers: (100/2 * 16s * (1m / 60s)) = 13 minutes per run.

My computer is faster and the virtual machine is probably under more load than mine and technically the tasks are parallelized so your milage may vary.

Edit: also subtract the extra time needed to run GN an extra time which isn't that long, a couple of seconds? a minute max?

Have you done a local run with the sharding flags? You could also hardcode some values in this PR to try them on the presubmit check.

@gaaclarke
Copy link
Member Author

Have you done a local run with the sharding flags? You could also hardcode some values in this PR to try them on the presubmit check.

I didn't, that would take over an hour to run locally and wouldn't give us good results compared to the bots. I can try to hack the bots to do it in CI. It's a bit tricky since I'll need to add an extra GN step depending on what platform is executing. Sounds like you want those numbers before approving though?

@gaaclarke
Copy link
Member Author

gaaclarke commented Nov 4, 2022

Okay, I've pushed up a hack that may work for the mac host tidy. The previous run was 65minutes, now I'm running it as shard 0 with the ios_debug shard-variant.

Note: this will probably break the other linters. Only the mac host tidy matters.

@gaaclarke
Copy link
Member Author

Heres the overhead from the extra GN call: 5s

Generating GN files in: out/ios_debug
Generating Xcode projects took 322ms
Generating compile_commands took 76ms
Done. Made 609 targets from 260 files in 4985ms

@gaaclarke
Copy link
Member Author

gaaclarke commented Nov 4, 2022

Calculating the intersection took longer than expected. It didn't get measured but it might be worth using a Set like jenn thought.

@gaaclarke
Copy link
Member Author

Sharding removed 226 files from the mac host clang-tidy job. (836->610)

@gaaclarke
Copy link
Member Author

gaaclarke commented Nov 4, 2022

@zanderso this run was 56m, so from 65m that's a 9m reduction. Hey my napkin math wasn't that far off.

edit: fixed numbers, we might be able to shave off another 2 minutes using hashing when doing the set math.

@gaaclarke gaaclarke requested a review from zanderso November 4, 2022 19:42
@gaaclarke
Copy link
Member Author

Remember too from my post mortem that the sharding isn't just about saving time either, it's about being in a position where clang-tidy fix patches don't overlap as well.

engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Nov 9, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Nov 9, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Nov 9, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Nov 9, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Nov 9, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Nov 9, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Nov 9, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Nov 9, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Nov 9, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Nov 9, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Nov 9, 2022
naudzghebre pushed a commit to naudzghebre/engine that referenced this pull request Nov 9, 2022
* clang-tidy: added the ability to shard jobs

* added test

* jenn feedback

* hack ci to run as a shard to measure the time

* tweak

* fix hack

* zach feedback

* zach feedback 2

* removed stray async

* moved to using sets for lookups

* fixed typo in docstring

* Revert "fix hack"

This reverts commit 06a61a6.

Revert "tweak"

This reverts commit e7c58b1.

Revert "hack ci to run as a shard to measure the time"

This reverts commit e458963.

* removed calls to map

* turned the ci hack back on

* Revert "turned the ci hack back on"

This reverts commit 0d53794.

* removed sync*
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Nov 9, 2022
zanderso pushed a commit to flutter/flutter that referenced this pull request Nov 9, 2022
…115008)

* de0b58e82 clang-tidy: added the ability to shard jobs (flutter/engine#37265)

* 8aefb8b61 Roll Dart SDK from d97d5ad98893 to b22b36c0f52e (1 revision) (flutter/engine#37400)

* 2f043090f Roll Skia from aef6d301c0b5 to 0c8127b3dd7d (6 revisions) (flutter/engine#37402)

* dc7cb20ad Fix boolean json property. (flutter/engine#37403)

* 512fa40b6 Adding release_build:true property to windows builders. (flutter/engine#37397)

* 0af87071f Add test for image readback (flutter/engine#37401)

* c6f26e035 Update display_list_image_filter_unittests to be permit Skia roll (flutter/engine#37327)

* 4e9e97b4b Roll Skia from 0c8127b3dd7d to 8b19fb0f57d4 (1 revision) (flutter/engine#37409)

* 42c2940ff Pass the correct name for gclient variables in ci.yaml (flutter/engine#37429)

* 2e2a2fd97 Revert "Roll Skia from 0c8127b3dd7d to 8b19fb0f57d4 (1 revision) (#37409)" (flutter/engine#37430)

* 02cb78905 Roll Fuchsia Mac SDK from sa5bVGimNo3JwLV27... to d4l6A1aPw6Z0YjxmA... (flutter/engine#37432)

* c24ae1872 [web] Use v8BreakIterator where possible (flutter/engine#37317)

* 306b0fe9c Add rects to accumulator rather than bounds (flutter/engine#37435)

* c76035429 Roll Dart SDK from b22b36c0f52e to c15cdb978761 (2 revisions) (flutter/engine#37437)
schwa423 pushed a commit to schwa423/engine that referenced this pull request Nov 16, 2022
* clang-tidy: added the ability to shard jobs

* added test

* jenn feedback

* hack ci to run as a shard to measure the time

* tweak

* fix hack

* zach feedback

* zach feedback 2

* removed stray async

* moved to using sets for lookups

* fixed typo in docstring

* Revert "fix hack"

This reverts commit 06a61a6.

Revert "tweak"

This reverts commit e7c58b1.

Revert "hack ci to run as a shard to measure the time"

This reverts commit e458963.

* removed calls to map

* turned the ci hack back on

* Revert "turned the ci hack back on"

This reverts commit 0d53794.

* removed sync*
godofredoc added a commit that referenced this pull request Nov 16, 2022
* clang-tidy: added the ability to shard jobs (#37265)

* clang-tidy: added the ability to shard jobs

* added test

* jenn feedback

* hack ci to run as a shard to measure the time

* tweak

* fix hack

* zach feedback

* zach feedback 2

* removed stray async

* moved to using sets for lookups

* fixed typo in docstring

* Revert "fix hack"

This reverts commit 06a61a6.

Revert "tweak"

This reverts commit e7c58b1.

Revert "hack ci to run as a shard to measure the time"

This reverts commit e458963.

* removed calls to map

* turned the ci hack back on

* Revert "turned the ci hack back on"

This reverts commit 0d53794.

* removed sync*

* Clang-tidy: Fixed math on shard-id validator. (#37433)

Clang-tidy: Fixed math on shard-id validator.

* Felt analyze (#37481)

* Adding `felt analyze` command that CI will run.

* Remove some copypasta'd stuff.

* Also remove code path from felt.dart that forces a rebuild if it doesn't
detect the host_debug_unopt directory.

* More cleanup of felt.bat for CI.

* Fix typo in felt.bat.

* Run pub get before building host.dart. (#37502)

* Run pub get before building host.dart.

* We should call `pub get` for `web_ui` in the launcher script because
felt itself needs it. However, we should let felt invoke `pub get` on
`web_engine_tester` only as needed, not in the launcher script.

* Skip the skwasm unit test suite on Safari since it is flaky. (#37602)

* Skip the skwasm unit test suite on Safari since it is flaky.

* Add TODO.

* Remove felt snapshotting behavior. (#37639)

* Remove felt snapshotting behavior.

* Use `dart run`.

* Combine results of all the test batches. (#37610)

* Combine results of all the test batches.

* Skip regressions

* Use bool instead

* remove unused var

* skip fragment_program_test

* Also skip GL context lost test

* Transparent background test fails on Firefox and Safari

* Skip other test in safari

* Skip text test on firefox

Co-authored-by: gaaclarke <[email protected]>
Co-authored-by: Jackson Gardner <[email protected]>
Co-authored-by: Harry Terkelsen <[email protected]>
shogohida pushed a commit to shogohida/flutter that referenced this pull request Dec 7, 2022
…lutter#115008)

* de0b58e82 clang-tidy: added the ability to shard jobs (flutter/engine#37265)

* 8aefb8b61 Roll Dart SDK from d97d5ad98893 to b22b36c0f52e (1 revision) (flutter/engine#37400)

* 2f043090f Roll Skia from aef6d301c0b5 to 0c8127b3dd7d (6 revisions) (flutter/engine#37402)

* dc7cb20ad Fix boolean json property. (flutter/engine#37403)

* 512fa40b6 Adding release_build:true property to windows builders. (flutter/engine#37397)

* 0af87071f Add test for image readback (flutter/engine#37401)

* c6f26e035 Update display_list_image_filter_unittests to be permit Skia roll (flutter/engine#37327)

* 4e9e97b4b Roll Skia from 0c8127b3dd7d to 8b19fb0f57d4 (1 revision) (flutter/engine#37409)

* 42c2940ff Pass the correct name for gclient variables in ci.yaml (flutter/engine#37429)

* 2e2a2fd97 Revert "Roll Skia from 0c8127b3dd7d to 8b19fb0f57d4 (1 revision) (flutter#37409)" (flutter/engine#37430)

* 02cb78905 Roll Fuchsia Mac SDK from sa5bVGimNo3JwLV27... to d4l6A1aPw6Z0YjxmA... (flutter/engine#37432)

* c24ae1872 [web] Use v8BreakIterator where possible (flutter/engine#37317)

* 306b0fe9c Add rects to accumulator rather than bounds (flutter/engine#37435)

* c76035429 Roll Dart SDK from b22b36c0f52e to c15cdb978761 (2 revisions) (flutter/engine#37437)
gspencergoog pushed a commit to gspencergoog/flutter that referenced this pull request Jan 19, 2023
…lutter#115008)

* de0b58e82 clang-tidy: added the ability to shard jobs (flutter/engine#37265)

* 8aefb8b61 Roll Dart SDK from d97d5ad98893 to b22b36c0f52e (1 revision) (flutter/engine#37400)

* 2f043090f Roll Skia from aef6d301c0b5 to 0c8127b3dd7d (6 revisions) (flutter/engine#37402)

* dc7cb20ad Fix boolean json property. (flutter/engine#37403)

* 512fa40b6 Adding release_build:true property to windows builders. (flutter/engine#37397)

* 0af87071f Add test for image readback (flutter/engine#37401)

* c6f26e035 Update display_list_image_filter_unittests to be permit Skia roll (flutter/engine#37327)

* 4e9e97b4b Roll Skia from 0c8127b3dd7d to 8b19fb0f57d4 (1 revision) (flutter/engine#37409)

* 42c2940ff Pass the correct name for gclient variables in ci.yaml (flutter/engine#37429)

* 2e2a2fd97 Revert "Roll Skia from 0c8127b3dd7d to 8b19fb0f57d4 (1 revision) (flutter#37409)" (flutter/engine#37430)

* 02cb78905 Roll Fuchsia Mac SDK from sa5bVGimNo3JwLV27... to d4l6A1aPw6Z0YjxmA... (flutter/engine#37432)

* c24ae1872 [web] Use v8BreakIterator where possible (flutter/engine#37317)

* 306b0fe9c Add rects to accumulator rather than bounds (flutter/engine#37435)

* c76035429 Roll Dart SDK from b22b36c0f52e to c15cdb978761 (2 revisions) (flutter/engine#37437)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
autosubmit Merge PR when tree becomes green via auto submit App
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants