Create naive isolation group matching loadbalancer #6570
Conversation
taskListType: taskListType,
}
wI := lb.weightCache.Get(taskListKey)
if wI == nil {
-	return lb.fallbackLoadBalancer.PickReadPartition(domainID, taskList, taskListType, forwardedFrom)
+	return lb.fallbackLoadBalancer.PickReadPartition(taskListType, req, isolationGroup)
multiLoadBalancer has the common checks and must be the only one used directly as a wrapper going forward. It also has a defaultLoadBalancer. So we may simplify weightedLoadBalancer by getting rid of the fallback logic and returning a special error to indicate that fallback to the default lb is needed.
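Roughly, the suggested shape would be something like the sketch below; every name here (errFallback, the interfaces, the stand-in fields) is illustrative rather than the actual code:

package matching

import "errors"

// errFallback is an illustrative sentinel error; the real name and
// method signatures in the PR may differ.
var errFallback = errors.New("fall back to default load balancer")

// loadBalancer stands in for the actual load balancer interface.
type loadBalancer interface {
	PickReadPartition(taskListType int, isolationGroup string) (string, error)
}

type weightedLoadBalancer struct {
	hasWeights func() bool // stand-in for the weight cache lookup
}

func (lb *weightedLoadBalancer) PickReadPartition(taskListType int, isolationGroup string) (string, error) {
	if !lb.hasWeights() {
		// No weight data yet: signal the wrapper instead of embedding
		// a fallbackLoadBalancer here.
		return "", errFallback
	}
	return "weighted-partition", nil
}

type multiLoadBalancer struct {
	weighted  loadBalancer
	defaultLB loadBalancer
}

func (lb *multiLoadBalancer) PickReadPartition(taskListType int, isolationGroup string) (string, error) {
	p, err := lb.weighted.PickReadPartition(taskListType, isolationGroup)
	if errors.Is(err, errFallback) {
		// Common fallback handling lives in one place, the wrapper.
		return lb.defaultLB.PickReadPartition(taskListType, isolationGroup)
	}
	return p, err
}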
Agreed. Going to address in followup refactor.
}

func (i *isolationLoadBalancer) pickBetween(partitions map[int]any) int {
	// Could alternatively use backlog weights to make a smarter choice
yes! this would be a great enhancement
Discussed offline, will add in followup.
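For reference, a backlog-weighted pickBetween could look roughly like the sketch below; the weights map and where it comes from are assumptions, not part of this PR:

package matching

import "math/rand"

// pickWeighted sketches the discussed enhancement: choose a partition
// with probability proportional to its backlog weight, rather than
// uniformly at random. The weights map is an assumed input.
func pickWeighted(weights map[int]float64) int {
	var total float64
	for _, w := range weights {
		total += w
	}
	// Standard roulette-wheel selection over the summed weights.
	r := rand.Float64() * total
	for p, w := range weights {
		r -= w
		if r <= 0 {
			return p
		}
	}
	for p := range weights { // guard against floating-point rounding
		return p
	}
	return 0
}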
This is an intentionally naive implementation of a load balancer that assigns isolation groups to specific partitions and routes pollers/tasks to those partitions based on their isolation group. Any time there are multiple options one is picked randomly.
The goal of this implementation is to benchmark how significantly isolation-group-based routing improves task latencies and isolation group containment for non-extremely-skewed scenarios. It additionally provides a baseline to see how much a more sophisticated solution (storing the isolation group assignment with the partitions and dynamically rebalancing them) might improve these metrics.
This isn't intended to be used in production as-is.
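At a sketch level, the routing described above could be structured as below. pickBetween's signature comes from the diff, but the striding group-to-partition assignment and all other names here are illustrative assumptions, not necessarily what the diff does:

package matching

import "math/rand"

// isolationLoadBalancer sketches the naive scheme: each isolation group
// is statically assigned a subset of partitions, and pollers/tasks for
// that group are routed only to those partitions. Assumes numPartitions
// is at least len(groups) so every group gets a partition.
type isolationLoadBalancer struct {
	numPartitions int
	groups        []string // all known isolation groups, in a fixed order
}

// partitionsForGroup statically maps an isolation group to its assigned
// partitions by striding the group's index across the partition space.
func (i *isolationLoadBalancer) partitionsForGroup(group string) map[int]any {
	idx := 0
	for j, g := range i.groups {
		if g == group {
			idx = j
			break
		}
	}
	partitions := make(map[int]any)
	for p := idx; p < i.numPartitions; p += len(i.groups) {
		partitions[p] = struct{}{}
	}
	return partitions
}

// pickBetween picks uniformly at random whenever several partitions
// qualify, matching the "one is picked randomly" behaviour above.
func (i *isolationLoadBalancer) pickBetween(partitions map[int]any) int {
	n := rand.Intn(len(partitions))
	for p := range partitions {
		if n == 0 {
			return p
		}
		n--
	}
	return 0 // unreachable for a non-empty map
}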
What changed?
Why?
How did you test it?
Potential risks
Release notes
Documentation Changes