[grid] Fix flaky event bus tests by dedicated threading, reverting the polling loop logic and increasing poll timeout #9383
Description
The event bus tests failed intermittently when run with the following command:
bazel test --cache_test_results=no --runs_per_test=20 //java/server/test/org/openqa/selenium/events:EventBusTest --test_filter=org.openqa.selenium.events.EventBusTest#
Motivation and Context
The failures followed one pattern: the countdown latch would intermittently keep waiting and never receive the expected messages, even after the wait times were increased.
This was primarily due to errors:
This was observed when the TCP transport layer was used to create the socket and the poll wait time was set to 0, i.e. to return immediately. Once this was raised to a higher value, the tests were no longer flaky, but the run time was very slow because a scheduled thread ran and waited for a bit on each run.
Reverted the polling loop to the version at:
selenium/java/server/src/org/openqa/selenium/events/zeromq/UnboundZmqEventBus.java
Line 118 in de8579b
Additionally, the poller is now closed during resource cleanup, and a separate dedicated thread was added to publish messages.
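For reviewers, the overall shape of the change can be sketched roughly as below. This is not the code in UnboundZmqEventBus.java: it uses a plain `BlockingQueue` as a stand-in for the ZeroMQ socket and poller, and the class name `PollingBusSketch`, the `POLL_TIMEOUT_MS` value, and the `demo()` helper are all hypothetical. It only illustrates the three ideas described above: a polling loop with a non-zero timeout, a dedicated single thread for publishing, and stopping the poller during cleanup.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical stand-in for an event bus; NOT Selenium's UnboundZmqEventBus.
final class PollingBusSketch implements AutoCloseable {
  // Assumed value: non-zero so the loop blocks briefly instead of returning immediately.
  private static final long POLL_TIMEOUT_MS = 150;

  // Stand-in for the socket that the real bus polls.
  private final BlockingQueue<String> transport = new LinkedBlockingQueue<>();
  private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();
  private final AtomicBoolean running = new AtomicBoolean(true);
  // Dedicated single thread used only for publishing messages.
  private final ExecutorService publisher = Executors.newSingleThreadExecutor();
  private final Thread pollThread;

  PollingBusSketch() {
    pollThread = new Thread(this::pollLoop, "bus-poller");
    pollThread.setDaemon(true);
    pollThread.start();
  }

  private void pollLoop() {
    while (running.get()) {
      try {
        // Wait up to POLL_TIMEOUT_MS for a message; null means the wait timed out.
        String message = transport.poll(POLL_TIMEOUT_MS, TimeUnit.MILLISECONDS);
        if (message != null) {
          listeners.forEach(listener -> listener.accept(message));
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
  }

  void addListener(Consumer<String> listener) {
    listeners.add(listener);
  }

  void fire(String message) {
    // Hand the message to the publisher thread rather than sending from the caller.
    publisher.execute(() -> transport.add(message));
  }

  @Override
  public void close() {
    // Cleanup: stop the polling loop and the publisher thread.
    running.set(false);
    publisher.shutdown();
    pollThread.interrupt();
    try {
      publisher.awaitTermination(1, TimeUnit.SECONDS);
      pollThread.join(1000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  // Mirrors the test pattern: a latch that must see every fired message.
  static boolean demo() {
    CountDownLatch latch = new CountDownLatch(3);
    try (PollingBusSketch bus = new PollingBusSketch()) {
      bus.addListener(message -> latch.countDown());
      bus.fire("a");
      bus.fire("b");
      bus.fire("c");
      return latch.await(2, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

In this sketch the listener is registered before any message is fired, so the latch in `demo()` should reach zero well within its two-second wait. The real bus has to deal with socket connection timing as well, which this stand-in does not model.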
Types of changes
Checklist