-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make batched segment sampling the default, minor cleanup of coordinator config #13391
Conversation
@@ -115,7 +116,7 @@ public CoordinatorDynamicConfig( | |||
@JsonProperty("mergeSegmentsLimit") int mergeSegmentsLimit, | |||
@JsonProperty("maxSegmentsToMove") int maxSegmentsToMove, | |||
@Deprecated @JsonProperty("percentOfSegmentsToConsiderPerMove") @Nullable Double percentOfSegmentsToConsiderPerMove, | |||
@JsonProperty("useBatchedSegmentSampler") boolean useBatchedSegmentSampler, | |||
@Deprecated @JsonProperty("useBatchedSegmentSampler") boolean useBatchedSegmentSampler, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is setting the default value of useBatchedSegmentSampler
to true
missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this, @rohangarg ! It was in a different patch.
Updated to use the default of true
both when
- deserializing using the constructor. This would mean that configs already stored in the DB with no explicit value of
useBatchedSegmentSampler
would now start usingtrue
. - creating/deserializing using the Builder. So newly created/updated configs would now use
true
.
Also updated tests to verify this.
{ | ||
@Test | ||
public void testFindIntervalForKill() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test essentially just verifies the JodaUtils.umbrellaInterval()
method and these cases are already being verified in JodaUtilsTest
.
The required test cases from here have been merged into the other tests in this class.
@@ -115,7 +116,7 @@ public CoordinatorDynamicConfig( | |||
@JsonProperty("mergeSegmentsLimit") int mergeSegmentsLimit, | |||
@JsonProperty("maxSegmentsToMove") int maxSegmentsToMove, | |||
@Deprecated @JsonProperty("percentOfSegmentsToConsiderPerMove") @Nullable Double percentOfSegmentsToConsiderPerMove, | |||
@JsonProperty("useBatchedSegmentSampler") boolean useBatchedSegmentSampler, | |||
@Deprecated @JsonProperty("useBatchedSegmentSampler") boolean useBatchedSegmentSampler, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this, @rohangarg ! It was in a different patch.
Updated to use the default of true
both when
- deserializing using the constructor. This would mean that configs already stored in the DB with no explicit value of
useBatchedSegmentSampler
would now start usingtrue
. - creating/deserializing using the Builder. So newly created/updated configs would now use
true
.
Also updated tests to verify this.
I think the default value for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes @kfaraz.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for the changes
Thanks for the reviews, @rohangarg , @AmatyaAvadhanula ! |
Description
The batch segment sampling added in #11257 performs significantly better than the older method of sampling if there are a large number of used segments. It also avoid duplicate results in sampling.
Changes
useBatchedSegmentSampler
druid.coordinator.loadqueuepeon.repeatDelay
KillUnusedSegments
KillUnusedSegmentsTest
, add better tests, remove redundant testsRelease note
Batch sampling has been made the default method for sampling segments during balancing as it performs significantly better than the alternative when there is a large number of used segments in the cluster.
The following have been deprecated and will be removed in future releases:
useBatchedSegmentSampler
percentOfSegmentsToConsiderPerMove
BalanceSegments
The unused coordinator property
druid.coordinator.loadqueuepeon.repeatDelay
has been removed.Use only
druid.coordinator.loadqueuepeon.http.repeatDelay
to configure repeat delay for the HTTP-based segment loading queue.This PR has: