perf: reduce memory usage of bes-uploader #20579
Labels
P2
We'll consider working on this in future. (Assignee optional)
team-Performance
Issues for Performance teams
type: bug
Description of the bug:
we've been able to improve the overall performance of our builds by improving the throughput of `bes-uploader`. when events are not processed quickly, the `eventQueue` and `ackQueue` may grow faster than they're cleared, meaning these queues consume more and more of the heap. sometimes this causes bazel to OOM, but in other cases the build is slowed down because of extra competition for memory.

we've seen two reasons for `bes-uploader` processing events slowly:
- `bes-uploader` does too much work, see "DigestUtils: avoid throwing on invalid digest function name" #20574 and "ByteStreamBuildEventArtifactUploader: skip reading metadata for files that won't be uploaded" #20575

there are things we can do to address both of these, but it would be nice if the `bes-uploader` wasn't able to cause the rest of the build to perform poorly, even if it can't clear the events quickly.

note that `bes_upload_mode=fully_async` does not help because the events still need to be stored in memory.

some ideas:
- `SendRegularBuildCommand` by using `PathConverter` ASAP. if paths were converted when an event is pushed to the queue, then `PathConverter` instances could be collected immediately. some `PathConverter` instances for our monorepo are 22 MB (see screenshot below)
- `eventQueue`
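
to make the `PathConverter` idea concrete, here is a minimal sketch of the difference between converting lazily (at send time) and eagerly (at enqueue time). this is not Bazel's code: `EagerPathConversionSketch`, `LazyEvent`, `EagerEvent`, and `newConverter` are hypothetical names, the converter is modeled as a plain `Function<String, String>`, and the `bytestream://example/` prefix is made up for the demo.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Function;

/** Sketch only: all names are hypothetical stand-ins, not Bazel's actual classes. */
class EagerPathConversionSketch {

  /** Lazy variant: the queued item keeps the converter reachable until it is sent. */
  static final class LazyEvent {
    final String localPath;
    final Function<String, String> converter; // stand-in for a large PathConverter

    LazyEvent(String localPath, Function<String, String> converter) {
      this.localPath = localPath;
      this.converter = converter;
    }

    String serialize() {
      // Conversion happens only when the event is finally sent.
      return converter.apply(localPath);
    }
  }

  /** Eager variant: the path is converted at construction; only the result is stored. */
  static final class EagerEvent {
    final String uri;

    EagerEvent(String localPath, Function<String, String> converter) {
      // The converter is used here and not retained by the event.
      this.uri = converter.apply(localPath);
    }

    String serialize() {
      return uri;
    }
  }

  /** Builds a toy converter that prefixes local paths (assumed behavior for the demo). */
  static Function<String, String> newConverter(String prefix) {
    return path -> prefix + path;
  }

  public static void main(String[] args) {
    Queue<EagerEvent> eventQueue = new ArrayDeque<>();
    Function<String, String> converter = newConverter("bytestream://example/");

    eventQueue.add(new EagerEvent("bazel-out/foo.log", converter));

    // Nothing in the queue references the converter any more, so it can be
    // garbage collected even if the queue drains slowly.
    converter = null;

    System.out.println(eventQueue.peek().serialize());
  }
}
```

with the lazy variant, every queued event pins its converter on the heap for as long as the event waits in the queue. with the eager variant, the queue holds only the converted strings, at the cost of doing the conversion work on the thread that enqueues the event.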
Which category does this issue belong to?
Performance
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
No response
Which operating system are you running Bazel on?
macos, linux
What is the output of `bazel info release`?
6.4.0
If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.
No response
What's the output of `git remote get-url origin; git rev-parse master; git rev-parse HEAD`?
No response
Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response