Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to configure default analysis types in SegmentMetadataQuery #4259

Merged
merged 23 commits into from
May 26, 2017
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -89,7 +89,10 @@ public SegmentMetadataQueryQueryToolChest(SegmentMetadataQueryConfig config)
}

@Inject
public SegmentMetadataQueryQueryToolChest(SegmentMetadataQueryConfig config, GenericQueryMetricsFactory queryMetricsFactory)
public SegmentMetadataQueryQueryToolChest(
SegmentMetadataQueryConfig config,
GenericQueryMetricsFactory queryMetricsFactory
)
{
this.config = config;
this.queryMetricsFactory = queryMetricsFactory;
Expand Down Expand Up @@ -233,18 +236,30 @@ public SegmentAnalysis apply(@Nullable SegmentAnalysis input)
@Override
public <T extends LogicalSegment> List<T> filterSegments(SegmentMetadataQuery query, List<T> segments)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of the changes in this method? They don't seem related to the rest of the patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, these are unrelated to the changes proposed for this patch, I'll open up another PR for this.
The current implementation does not take into consideration the time range while filtering, I added an improvement to only include segments that are within the intervals mentioned by the query.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation does not take into consideration the time range while filtering, I added an improvement to only include segments that are within the intervals mentioned by the query.

filterSegments doesn't need to worry about that, since the list of segments it gets has already been pruned down to whatever matches the intervals filter. In fact many implementations just return segments; with no changes -- check out timeseries for an example. filterSegments only has to potentially filter it down even further, if it wants (see timeBoundary for another example of that). I suppose the javadocs for filterSegments could be more clear on this.

Please also see #4259 (review) for rationale about the current behavior of defaultInterval.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression that this method was called to prune the segments to include only the required ones from a list of all segments, didn't know it was already pruned. Now that I look at the examples you suggested I understand why it is required even if we already prune it.

Reg: defaultInterval - I had some failing tests that were because of defaultInterval, I don't remember the exact test case, but reverting to the old behavior did not break any tests (probably the change of not having null configs helped I guess), so we can keep the implementation on master

I excluded these changes from this PR.

{
if (!query.isUsingDefaultInterval()) {
return segments;
}

if (segments.size() <= 1) {
return segments;
}

final T max = segments.get(segments.size() - 1);

DateTime targetEnd = max.getInterval().getEnd();
final Interval targetInterval = new Interval(config.getDefaultHistory(), targetEnd);
List<Interval> intervals = query.getIntervals();

DateTime queryStartTime = JodaUtils.ETERNITY.getEnd();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other parts of the Druid codebase new DateTime(JodaUtils.MAX_INSTANT) is used

DateTime queryEndTIme = JodaUtils.ETERNITY.getStart();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

queryEndTime


for (Interval interval : intervals) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

queryStartTime = intervals.stream().map(Interval::getEnd)
    .max(Ordering.natural()).orElseThrow(IllegalStateException::new);

queryEndTIme = queryEndTIme.isAfter(interval.getEnd()) ? queryEndTIme : interval.getEnd();
queryStartTime = queryStartTime.isBefore(interval.getStart()) ? queryStartTime : interval.getStart();
}

Interval targetInterval;
if (!query.isUsingDefaultInterval()) {
targetInterval = new Interval(queryStartTime, queryEndTIme);
} else {
DateTime finalEndTime = queryEndTIme.isBefore(targetEnd) ? queryEndTIme : targetEnd;
targetInterval = new Interval(config.getDefaultHistory(), finalEndTime);
}

return Lists.newArrayList(
Iterables.filter(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -117,13 +117,8 @@ public SegmentMetadataQuery(
if (querySegmentSpec == null) {
this.usingDefaultInterval = true;
} else {
if (querySegmentSpec.getIntervals().size() == 1 && querySegmentSpec.getIntervals()
.get(0)
.equals(DEFAULT_INTERVAL)) {
this.usingDefaultInterval = true;
} else {
this.usingDefaultInterval = false;
}
this.usingDefaultInterval = (querySegmentSpec.getIntervals().size() == 1 &&
querySegmentSpec.getIntervals().get(0).equals(DEFAULT_INTERVAL));
}
this.toInclude = toInclude == null ? new AllColumnIncluderator() : toInclude;
this.merge = merge == null ? false : merge;
Expand Down Expand Up @@ -262,9 +257,14 @@ public SegmentMetadataQuery withFinalizedAnalysisTypes(SegmentMetadataQueryConfi
return this;
}
return Druids.SegmentMetadataQueryBuilder
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't create a new query object if analysisTypes are already non-null, just return this

.copy(this)
.analysisTypes(config.getDefaultAnalysisTypes())
.build();
.copy(this)
.analysisTypes(config.getDefaultAnalysisTypes())
.build();
}

public List<Interval> getIntervals()
{
return this.getQuerySegmentSpec().getIntervals();
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1104,9 +1104,8 @@ public Interval getInterval()
Assert.assertEquals(expectedSegments3.get(i).getInterval(), filteredSegments2.get(i).getInterval());
}

Assert.assertFalse(testQuery2.isUsingDefaultInterval());
Assert.assertTrue(testQuery.isUsingDefaultInterval());

Assert.assertFalse(testQuery2.isUsingDefaultInterval());

}

Expand Down