Bazel tests are flaky #968

Closed
johanbrandhorst opened this issue Jul 8, 2019 · 14 comments

@johanbrandhorst
Collaborator

The Bazel tests fail with connection refused errors every now and then. It seems we might need to wait a little longer for the test server to start.

@johanbrandhorst
Collaborator Author

Happened again today, but it seems to be Bazel consuming too many resources and the host killing the process? Not quite sure.

https://circleci.com/gh/grpc-ecosystem/grpc-gateway/3484

@achew22
Collaborator

achew22 commented Aug 5, 2019

My vote is that we activate the Bazel team's CI and deactivate the CircleCI-based one. Step 1 in that process is #983.

@srenatus
Contributor

Cool. So, in buildkite.com/bazel, the tests fail consistently. Like this one: https://buildkite.com/bazel/grpc-ecosystem-grpc-gateway/builds/104#8f57d1c1-3082-4a3d-8ad3-f1c35f0bf987

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //examples/integration:go_default_test
-----------------------------------------------------------------------------
E0911 09:09:51.099595      14 main.go:73] Failed to listen and serve: listen tcp :8080: bind: address already in use
cannot run gateway service: listen tcp :8080: bind: address already in use

(from the uploaded test logs in buildkite artifacts)

👀 I guess it's either running tests in parallel that shouldn't be, or... something else is bound to 8080. Either way, making the tests use a random available port (i.e. binding to :0, then retrieving the port that was used) could be an alternative 🤔
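
A rough sketch of the ":0" idea in Go: ask the kernel for any free port, then read the assigned port back from the listener. `listenOnFreePort` and the `/healthz` handler are hypothetical names used only for illustration; the real gateway tests would need their own wiring to pass the port along.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
)

// listenOnFreePort binds to port 0 on loopback, which lets the kernel pick
// any free TCP port, and returns the listener together with that port.
// Hypothetical helper, not part of grpc-gateway.
func listenOnFreePort() (net.Listener, int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, 0, err
	}
	return ln, ln.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	ln, port, err := listenOnFreePort()
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	// Serve on the already-bound listener instead of a hard-coded ":8080".
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	go http.Serve(ln, mux)

	resp, err := http.Get(fmt.Sprintf("http://127.0.0.1:%d/healthz", port))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

Because the listener is bound before the port number is handed out, there is no window in which another process or test can take the same port.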

@johanbrandhorst
Collaborator Author

TIL. Sounds like unintentional parallelism to me. Unfortunately I understand nothing of how this works, so I defer to @achew22.

@srenatus
Contributor

Well, I'm sure our tests would be more robust if they could run in parallel. 😃 But yeah, that's the harder route for sure.

@drigz
Contributor

drigz commented Sep 27, 2019

FYI, if you add tags = ["exclusive"] to a Bazel test, it will not be run in parallel with other tests, so you can use this for tests that bind to a fixed port.

@stale

stale bot commented Nov 26, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Nov 26, 2019
@johanbrandhorst
Collaborator Author

Flipping stale bot giving me lip

@stale stale bot removed the wontfix label Nov 27, 2019
@johanbrandhorst
Collaborator Author

@stale

stale bot commented Mar 22, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Mar 22, 2020
@johanbrandhorst
Collaborator Author

Sigh

@stale stale bot removed the wontfix label Mar 22, 2020
drigz added a commit to drigz/grpc-gateway that referenced this issue Mar 23, 2020
This aims to reduce/eliminate the CircleCI flakes described in grpc-ecosystem#968.
It's based on the approach in
https://github.com/angular/angular-cli/blob/master/.circleci/bazel.rc
johanbrandhorst pushed a commit that referenced this issue Mar 23, 2020
This aims to reduce/eliminate the CircleCI flakes described in #968.
It's based on the approach in
https://github.com/angular/angular-cli/blob/master/.circleci/bazel.rc
@adasari
Contributor

adasari commented Apr 29, 2020

@johanbrandhorst there are still tests that use (secondary) ports. listener.Close() is not synchronous and never guarantees that the port is available for immediate use in another test. Using unique ports fixes the problem.
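
For plain HTTP test servers, one possible way to get a unique port per test is the standard library's `net/http/httptest`, which binds each server to its own free loopback port. This is a sketch under that assumption, not the fix that was applied in the repository; `newEchoServer` and `fetch` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newEchoServer starts a throwaway HTTP server that always replies with
// body. httptest.NewServer picks a free loopback port for each server.
// Hypothetical helper for illustration only.
func newEchoServer(body string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, body)
	}))
}

// fetch performs a GET request and returns the response body as a string.
func fetch(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	// Two servers alive at the same time never share an address, so no
	// test has to wait for another test's port to be released.
	a := newEchoServer("a")
	defer a.Close()
	b := newEchoServer("b")
	defer b.Close()

	fmt.Println("server a:", a.URL)
	fmt.Println("server b:", b.URL)

	got, err := fetch(b.URL)
	if err != nil {
		panic(err)
	}
	fmt.Println("reply from b:", got)
}
```

Since no test depends on a port that an earlier test has just closed, the timing of listener.Close() stops mattering.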

@johanbrandhorst
Collaborator Author

Thanks for pointing that out, it could be worth doing something about, but this issue is specifically around the bazel tests, which are failing independently of the other tests (which run the same tests) due to what looks like resource constraints a lot of the time.

@johanbrandhorst
Collaborator Author

Might be fixed by caching the bazel artifacts and using explicit limits
