-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce method invocation of reservoir sampling #11257
Reduce method invocation of reservoir sampling #11257
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be useful to add a benchmark for this code if possible to show the difference of the new implementation, maybe here: https://github.com/apache/druid/tree/master/benchmarks/src/test/java/org/apache/druid/server/coordinator
final BalancerStrategy strategy = params.getBalancerStrategy(); | ||
final int maxIterations = 2 * maxSegmentsToMove; | ||
final int maxToLoad = params.getCoordinatorDynamicConfig().getMaxSegmentsInNodeLoadingQueue(); | ||
int moved = 0, unmoved = 0; | ||
|
||
Iterator<BalancerSegmentHolder> segmetnsToMove = strategy.pickSegmentsToMove( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo segmetnsToMove
-> segmentsToMove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
static List<BalancerSegmentHolder> getRandomBalancerSegmentHolders( | ||
final List<ServerHolder> serverHolders, | ||
Set<String> broadcastDatasources, | ||
int k |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it make sense to retain the percentOfSegmentsToConsider
functionality to allow short-circuiting iterateAllSegments across all servers? We update the docs https://github.com/apache/druid/blob/master/docs/configuration/index.md#dynamic-configuration accordingly either way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to me to retain the percentOfSegmentsToConsider
and add an another dynamic parameter useBatchedSegmentSampler
to enable the new improvement. In this way, the changing will be less aggressive.
if (numSoFar < k) { | ||
holders.add(new BalancerSegmentHolder(server.getServer(), segment)); | ||
numSoFar++; | ||
continue; | ||
} | ||
int randNum = ThreadLocalRandom.current().nextInt(numSoFar + 1); | ||
if (randNum < k) { | ||
holders.set(randNum, new BalancerSegmentHolder(server.getServer(), segment)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it would be worth adding some code comments here to describe that this algorithm has 2 phases, where it picks the first k
segments from the servers, in order, then iterates through the server list randomly replacing these picked segments with decreasing frequency the more segments have been iterated.
Im also somewhat curious/nervous about random this pick method is compared to the previous one, though I'm not entirely sure how it would be measured.
I think we should add a new coordinator dynamic config property, https://github.com/apache/druid/blob/master/server/src/main/java/org/apache/druid/server/coordinator/CoordinatorDynamicConfig.java, to allow falling back to the old algorithm in case there are any unpredictable strange behavior caused by this new algorithm, maybe useBatchedSegmentSampler
or .. i'm not sure, naming is hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Random picking here is a standard implementation of a reservoir sampling algorithm. It makes all segments have the same probability to be selected. The original code also uses this trick, so no need to worry :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Echoing what @clintropolis suggested, it would be useful to have a new dynamic config to toggle between the old and new implementations. We could keep the new config for a couple of releases which would be enough time for operators to test it out and later on we could do away with the config property.
Could we also consider including percentOfSegmentsToConsider
as part of this new approach? This is a useful knob especially for larger clusters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to me to retain the percentOfSegmentsToConsider
and add an another dynamic parameter useBatchedSegmentSampler
to enable the new improvement. In this way, the changing will be less aggressive.
@@ -71,13 +71,60 @@ | |||
* @return {@link BalancerSegmentHolder} containing segment to move and server it currently resides on, or null if | |||
* there are no segments to pick from (i. e. all provided serverHolders are empty). | |||
*/ | |||
@Deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't really seem deprecated so much as no longer called at all, maybe we should remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
The new algorithm implementation looks good to me. But it drops the |
It would be useful to incorporate |
I haven't read the code yet, just reading comments and responding to my tag from @a2l007 ... I do think it is very helpful to keep the Also this patch and the issue it is resolving would help with the underlying performance issues of the design here. My patch to short circuit was more of a bandaid. This seems like a design improvement that visits the root issue of this method being invoked more than needed! So maybe my tuning config is less helpful once this PR is in? Or maybe it is still useful and both together would be great? |
Hi @capistrant, thanks for your comment. It's great to know that we both have tried to solve this issue. Ideally, this patch should be able to solve this issue fundamentally while it makes sense to me to evolve gradually in engineering. So, I agree to retain the And it'll be great if you can help to review this PR! |
Here's the benchmark data(or see JMH result graph)
|
6f2ad91
to
3b14ef2
Compare
3b14ef2
to
b02a9e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yuanlihan whoa, the benchmark result looks great 👍 The latest change LGTM.
|
||
@Override | ||
public BalancerSegmentHolder next() | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
if (!hasNext()) {
throw new NoSuchElementException();
}
@FrankChen021 @a2l007 @clintropolis do you have more comments? |
@yuanlihan Great work. Code change LGTM. For benchmark, could you add a combination of |
Thanks for this patch. Looking forward to using this in our clusters. It would be nice if you could add a few lines in the doc regarding a recommended value for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for your contribution
wow, awesome benchmark results! So sorry that I didn't participate more in the review of this PR, I hav been pulled away from Druid more than I'd like lately. Regardless, I see that y'all had it more than covered! So excited to start using this 💯 |
Fixes #11256.
Description
Adding a new Reservoir Sampling method to sample K elements each time instead of only one element per method invocation.
A default method
pickSegmentsToMove
will be added to interfaceBalancerStrategy
to pick K segments to move in a single method invocation.Key changed/added classes in this PR
BalancerStrategy
ReservoirSegmentSampler
BalanceSegments
This PR has: