-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Guarantee classes are packed with their inner synthetic lambdas during sharding #16369
Conversation
@mauriciogg i'm still able to repro this using this repo: Error in bazel-out/darwin_arm64-dbg/bin/apps/app/_dx/bin_devDebug/shard9.jar.dex.zip:com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda2.class.dex:
Attempt at compiling intermediate artifact without its context
Error in bazel-out/darwin_arm64-dbg/bin/apps/app/_dx/bin_devDebug/shard9.jar.dex.zip:com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda0.class.dex:
Attempt at compiling intermediate artifact without its context
Error in bazel-out/darwin_arm64-dbg/bin/apps/app/_dx/bin_devDebug/shard9.jar.dex.zip:com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda10.class.dex:
Attempt at compiling intermediate artifact without its context
Error in bazel-out/darwin_arm64-dbg/bin/apps/app/_dx/bin_devDebug/shard9.jar.dex.zip:com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda1.class.dex:
Attempt at compiling intermediate artifact without its context
Merge failed: Compilation failed to complete, origin: bazel-out/darwin_arm64-dbg/bin/apps/app/_dx/bin_devDebug/shard9.jar.dex.zip:com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda1.class.dex
Exception in thread "main" com.android.tools.r8.CompilationFailedException: Merge failed: Compilation failed to complete, origin: bazel-out/darwin_arm64-dbg/bin/apps/app/_dx/bin_devDebug/shard9.jar.dex.zip:com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda1.class.dex
at com.google.devtools.build.android.r8.DexFileMerger.main(DexFileMerger.java:389) |
are all GalleryService$$ExternalSyntheticLambda in the same shard that gallery service? If so it might be that lambdas itself need to appear in order as well and my code does just sorts alphabetically which misses the ordering of lambdas. |
It looks like they are being split between two dex files. ➜ zipinfo bazel-out/darwin_arm64-dbg/bin/apps/app/_dx/bin_devDebug/shard9.jar.dex.zip | grep "GalleryService"
-rw---- 1.0 fat 1268 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda0.class.dex
-rw---- 1.0 fat 1284 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda1.class.dex
-rw---- 1.0 fat 1308 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda10.class.dex
-rw---- 1.0 fat 1392 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda2.class.dex
-rw---- 1.0 fat 1332 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda3.class.dex While the main class is in: ➜ zipinfo bazel-out/darwin_arm64-dbg/bin/apps/app/_dx/bin_devDebug/shard10.jar.dex.zip | grep "GalleryService"
-rw---- 1.0 fat 1344 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda4.class.dex
-rw---- 1.0 fat 1272 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda5.class.dex
-rw---- 1.0 fat 1288 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda6.class.dex
-rw---- 1.0 fat 1396 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda7.class.dex
-rw---- 1.0 fat 1400 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda8.class.dex
-rw---- 1.0 fat 1448 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService$$ExternalSyntheticLambda9.class.dex
-rw---- 1.0 fat 8956 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService.class.dex
-rw---- 1.0 fat 2584 bx stor 69-Dec-31 16:00 com/lyft/android/camera/gallery/GalleryService_Factory.class.dex |
If a desugared lambda is packed in a different shard than the one from its original class D8 will fail with the following message: Attempt at compiling intermediate artifact without its context Changed sharding code so that it treats a class and an inner class as a single unit.
882e691
to
d2f03bc
Compare
Thanks all, let us know when this is good to review. I'll probably need to get this reviewed by a few folks and run global presubmit tests internally, so it might take a little bit to get this imported. |
@Bencodes I can't repo: I don't see how it could be sharded that way. Looks like dexes are being sort in reverse order. The main class being in shard10 while the inner classes being in shard9. Should be the other way:
My code (AFAICT) did not change the iteration order. Can you share the ordering of the dexes building without my code? |
@ahumesky I think the tests are failing in CI because they need the r8 jar being pulled from CI, but instead the bazel binary (version 5.3.1) uses the one from the sdk, which happens to be a very old one. Is there a way to disable those tests for the time being? |
Some follow up here - there is some misleading branching of login inside of Bazel that might trip people up. If you set I re-applied the latest patches that you pushed yesterday which seems to have broken
|
@Bencodes oh yeah you have to remove dex_shards otherwise it does something completely different. thats the only thing I can think that could be causing those issues |
Yeah we have both commits. |
ugh android_remote_tools.WORKSPACE probably needs to be updated. Although not sure I can do that myself. |
Actually hold up. I think it might have been user error on my part. The |
Alright @mauriciogg confirmed that this PR does in fact work for us now. |
@Bencodes thanks! |
@@ -47,7 +48,10 @@ java_library( | |||
|
|||
java_library( | |||
name = "testdata", | |||
srcs = glob(["testdata/**/*.java"]), | |||
srcs = glob( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might need to run buildifier and google-java-format against the changes in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code will get formatted automatically on import I think, so I don't think we have to manually invoke the formatter
Regarding android_remote_tools.WORKSPACE, it doesn't look like it needs to be modified, but if it does, you can make changes to it, but we'll need to make a new release of the remote tools and then update Bazel's dependency on it. It's usually not that much work. For getting the tests to pass on CI, I think you can use the r8 jar from google's maven repo here: Line 417 in 9ff9041
by changing this dependency:
to |
I ended up adding that dependency on the test build file src/test/java/com/google/devtools/build/android/dexer/BUILD because I didnt want to include a test dependency in non test code, but doing it this way the gmaven jar has to always come up first in the classpath which is somewhat brittle- so let me know if you want me to move it into the r8 BUILD file. |
Yeah good point about test code vs non-test code. Luckily |
458afff
to
d2f03bc
Compare
6393df1
to
1c3f1bb
Compare
Done |
I ran this against our internal global presubmit tests, and I'm getting back about 900 test targets failing with
I'll look into what's going on. It would be a little surprising if there were classes with so many inner classes that they don't fit into a shard though. |
So the root cause of the vast majority of the failures is from a generated file for protos, and has over 5,000 inner classes, with enough methods that it's bigger than what 1 shard can contain. Not totally sure offhand what can be done about this case. I'll check with the D8 team. There are 2 other root causes for the rest of the failures, they're probably similar situations. |
I havent confirmed this, but I think the error only happens with classes generated from desugaring lambdas, so in theory those are the only ones needed to be packed with the outer class. If thats, the case we could be smarted here and place sort lambdas right after the outer class and allow splits on other inner classes. |
@mauriciogg were you able to test if including just the classes from desugared lambdas in a shard worked? I can re-run the internal tests with a prototype. I also asked the D8 team if that would work. |
Works in our codebase: I changed
Also the comparison function changed to sort lambdas first as well:
I need to fix the tests and then I'll update this PR accordingly, but I'd also like to wait for the official D8 team response just to make sure we are outputting what d8 expects. |
@ahumesky updated to do the split only on lambdas |
We faced this when we migrated to v6 and we were able to workaround it temporarily by downgrading R8 to 3.0.64. This is the commit that changed the behavior in R8 if that helps https://r8-review.googlesource.com/c/r8/+/60228 |
@mauriciogg thanks, i started rerunning the tests, should have results by tomorrow |
I ran this latest version against our internal tests, and everything passed, however in talking with the D8 team, they brought up that there are other kinds of synthetic classes that could be generated, so this only happened to work for now. They suggested to go back to grouping inner classes with their enclosing class, and simply removing the check that they'll all fit into a single dex. With multidex, it shouldn't matter for the dexmerger. I reverted the imported change to group by inner/enclosing class, and took out the check, and I started rerunning all the tests against that. |
We seem to have hit the issue with one of those other kinds of synthetic classes. With Bazel 6.0.0, this PR applied on top and when not setting
Using I don't know what these |
We ran into this issue with another kind of synthetic classes,
Looking forward to a generic fix. Until then it is too risky for us to start using Bazel 6. |
A generic fix would be to just group all inner classes. Looks like support for this already exists in mobile-install https://github.com/bazelbuild/rules_android/blob/pre-alpha/src/tools/ak/dex/dex.go#L265 |
I ran our internal tests against this, and ran into a lot of failures -- looking through the code it was because this bit of code: somehow became just
leading to a number of dexfiles going over the 64k limit because not all the dexes were counted. That took a while to figure out. Not sure what happened there... In any event, I've reimported the change with the correct code, and adjusted it to group by inner class again, and started rerunning all our tests. I'll hopefully have results on Monday. We should be able to get this reviewed and imported in time for 6.1.0. |
Looking at the results, and there are many failures like:
I didn't update the error message -- it's really "and all of its inner classes". I looked into one of the apps with the failure, and looking at all the classes it looked like it didn't have any classes with 64k+ inner classes, so this will need a little more investigating. |
It looks like we have some classes that have enough inner classes to exceed the dex limits for a single dex, so it doesn't look like grouping by just inner classes will work. With help from the D8 folks, I reworked the change so that it groups only classes that are truly synthetic classes, which have an annotation on them from R8. Since I can't readily repro the original problem, it would be very helpful if those of you who are running into this can try out the updated PR: |
@ahumesky thanks for the update. With that patch we're still seeing the issue with proto
Edit: I believe there's an error around |
Thanks for trying this and for the detailed report -- I think the problem is that the code doesn't actually make sure that the outer class is with the synthetic classes. The outer class + its synthetic inner classes have to fit into one shard, but the "regular" inner classes can go into any shard. Because the outer class is being put in with the "regular" classes, it could potentially be placed into the next shard. So I think this bit:
needs to be:
For completeness, I've uploaded a new patch with the above to https://bazel-review.googlesource.com/c/bazel/+/211231 But all the above said, after talking through all of this with the D8 team, they're a bit worried about how tightly coupled this all is to implementation details of D8 (the annotation, the annotation class name, the naming conventions), not to mention it's not really great for performance to have to parse the dex files. (The code is also getting pretty complicated.) The D8 team proposes:
The D8 team would handle 1 and 2, and the bazel team and/or community could do 3 and 4. This avoids coupling to D8 implementation details and having to parse dex files and hopefully reduces the code complexity a bit. |
This should be fixed in Bazel 6.3.0: |
If a desugared lambda is packed in a different shard than the one from its original class D8 will fail with the following message:
Attempt at compiling intermediate artifact without its context
Changed sharding code so that it treats a class and synthetic lambda classes as a single unit.
Fixes #16368