
Mitigate potential overloads in Round Robin load balancing in the event of node failure #676

Open
havaker opened this issue Mar 23, 2023 · 0 comments

Contributor

havaker commented Mar 23, 2023

In the default load balancing policy, round robin can overload nodes when a node fails. Under the usual round robin order, if a node such as A fails, every plan that would have started at A falls through to the next node in the sequence (in this case, B). B then takes on all of A's requests in addition to its own, while the other nodes see no extra load, so B may become overloaded.

A potential solution is to shuffle the chosen nodes within each group of a load balancing plan, which would distribute the failed node's load more evenly among the remaining nodes. Currently, random shuffling is implemented only for replica selection in scylla::transport::load_balancing::DefaultPolicy. Shuffling all the nodes in the later stages of constructing a load balancing plan was considered but deemed too costly, which is why round robin is used there (#612 (comment)).
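
To make the difference concrete, here is a minimal simulation sketch. It is not the driver's code: `round_robin_plan`, `shuffled_plan`, `served_counts`, and the `XorShift64` generator are hypothetical names made up for illustration. It assumes each request gets a plan (an ordered list of node indices) and is served by the first node in the plan that is not down.

```rust
/// Hypothetical model, not taken from the driver: a plan is an ordered list
/// of node indices, and a request is served by the first live node in it.

/// Plan for request `i` under plain round robin: rotate the fixed node order.
fn round_robin_plan(num_nodes: usize, i: usize) -> Vec<usize> {
    (0..num_nodes).map(|k| (i + k) % num_nodes).collect()
}

/// Small xorshift64 generator so the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Plan for one request when the whole node list is shuffled (Fisher-Yates).
fn shuffled_plan(num_nodes: usize, rng: &mut XorShift64) -> Vec<usize> {
    let mut plan: Vec<usize> = (0..num_nodes).collect();
    for i in (1..num_nodes).rev() {
        let j = (rng.next() % (i as u64 + 1)) as usize;
        plan.swap(i, j);
    }
    plan
}

/// Counts how many requests each node serves when node `down` has failed.
fn served_counts(
    plans: impl Iterator<Item = Vec<usize>>,
    num_nodes: usize,
    down: usize,
) -> Vec<usize> {
    let mut counts = vec![0; num_nodes];
    for plan in plans {
        // Skip the failed node and use the next node in the plan.
        if let Some(&node) = plan.iter().find(|&&n| n != down) {
            counts[node] += 1;
        }
    }
    counts
}
```

With 4 nodes, 4000 requests, and node 0 down, `served_counts((0..4000).map(|i| round_robin_plan(4, i)), 4, 0)` gives `[0, 2000, 1000, 1000]`: node 1 absorbs all of node 0's traffic. Feeding plans from `shuffled_plan` instead should spread node 0's share roughly evenly over the three live nodes, at the cost of a full shuffle per plan, which is the overhead mentioned above.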

@havaker havaker changed the title Mitigate overloads in Round Robin load balancing in the event of node failure Mitigate potential overloads in Round Robin load balancing in the event of node failure Mar 23, 2023
@piodul piodul added this to the 1.1.0 milestone Mar 28, 2023
@Lorak-mmk Lorak-mmk self-assigned this Nov 15, 2023
@Lorak-mmk Lorak-mmk removed their assignment Jul 30, 2024