Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create naive isolation group matching loadbalancer #6570

Merged
merged 1 commit into from
Dec 19, 2024

Conversation

natemort
Copy link
Member

This is an intentionally naive implementation of a load balancer that assigns isolation groups to specific partitions and routes pollers/tasks to those partitions based on their isolation group. Any time there are multiple options one is picked randomly.

The goal of this implementation is to benchmark how significantly isolation-group-based routing improves task latencies and isolation group containment for non-extremely-skewed scenarios. It additionally provides a baseline to see how much a more sophisticated solution (storing the isolation group assignment with the partitions and dynamically rebalancing them) might improve these metrics.

This isn't intended to be used in production as-is.

What changed?

  • Added a benchmarking-only implementation
  • Refactoring related to matching client load balancers.

Why?

  • Allows for analyzing the impact of isolation group to partition assignment.
  • Refactoring allows for implementing more sophisticated solutions in the future.

How did you test it?

  • Unit tests.

Potential risks

Release notes

Documentation Changes

taskListType: taskListType,
}
wI := lb.weightCache.Get(taskListKey)
if wI == nil {
return lb.fallbackLoadBalancer.PickReadPartition(domainID, taskList, taskListType, forwardedFrom)
return lb.fallbackLoadBalancer.PickReadPartition(taskListType, req, isolationGroup)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

multiLoadBalancer has the common checks and must be the only one used directly as wrapper going forward. It also has a defaultLoadBalancer. So we may simplify weightedLoadBalancer by getting rid of fallback logic and returning a special error to indicate fallback to default lb is needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Going to address in followup refactor.

}

func (i *isolationLoadBalancer) pickBetween(partitions map[int]any) int {
// Could alternatively use backlog weights to make a smarter choice
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! this would be great enhancement

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline, will add in followup.

This is an intentionally naive implementation of a load balancer that assigns isolation groups to specific partitions and routes pollers/tasks to those partitions based on their isolation group. Any time there are multiple options one is picked randomly.

The goal of this implementation is to benchmark how significantly isolation-group-based routing improves task latencies and isolation group containment for non-extremely-skewed scenarios. It additionally provides a baseline to see how much a more sophisticated solution (storing the isolation group assignment with the partitions and dynamically rebalancing them) might improve these metrics.

This isn't intended to be used in production as-is.
@natemort natemort merged commit b22774a into cadence-workflow:master Dec 19, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants