Pipelines that depend on RBE remote cache are failing #1055
Comments
Thank you @jayconrod. I'm looking into this now.
Hi Jay, this is a load-shedding issue on the RBE side. Our on-call is currently looking into it. Feel free to ping me internally if you need additional details, but hopefully we will have this resolved shortly. Are you still seeing the issue, or did it just happen once?
Thanks for looking into this! I'm still seeing the issue (as of ~20 minutes ago). I first noticed it when tests on a PR failed. They were still failing after a restart. Same deal with the nightly tests on the main branch.
We see this externally a lot as well. I filed a ticket: https://issuetracker.google.com/u/1/issues/171446864
Note that Envoy was hitting this as well, so it's another project where we can check whether this has been fixed: https://dev.azure.com/cncf/envoy/_build/results?buildId=55198&view=logs&jobId=fa3d3e18-6969-5713-c3e7-3581195704fd&j=fa3d3e18-6969-5713-c3e7-3581195704fd&t=ba370bef-ca49-52f5-0c93-7c0c0f27c465
rules_go is no longer seeing any issues. BuildKite is passing again.
To close the loop here: this was triggered by a server change that has mostly been rolled back. Unfortunately a handful of clusters won't receive the update until next week. We've also root-caused and fixed the source of the failure so subsequent server releases won't cause issues.
Apologies if this isn't the right repo; let me know if I should file this somewhere else.
A couple of examples:
master: https://buildkite.com/bazel/rules-go-golang/builds/2488
master: https://buildkite.com/bazel/bazel-remote-cache/builds/1321
Errors look like: