-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding filters for TimeBoundary on backend #3168
Adding filters for TimeBoundary on backend #3168
Conversation
Signed-off-by: Balachandar Kesavan <[email protected]>
@rajk-tetration Thanks for this contribution. We'll try to get to reviewing it soon. |
Tagging for 0.9.2 to make sure it doesn't get lost in the noise |
@rajk-tetration, I haven't looked in detail yet, but I bet you'll also have to adjust For the test I see you had a comment like "no rows matched the filter--what is expected behavior in that case?". Probably the same as whatever an unfiltered timeBoundary would do if there were rows at all (I think that's returning an empty array, but double-check). |
}; | ||
|
||
if (legacyQuery.getDimensionsFilter() != null) { | ||
/* Should make a function? The only difference between these two parts is the descending boolean. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I think making this a function would be good
1.) I think the TimeBoundaryQueryRunnerFactory implementation is reasonable 2.) For the QueryRunnerTestHelper, the datasource isn't actually used if you call
that function uses a hard-coded set of segments. For testing, MultiSegmentSelectQueryTest may be a useful reference. The default The multi segment structure may also be useful for testing the
|
Thanks for the comments. @gianm I just deleted the filterSegments override and let it inherit; my reasoning was that there's no way to tell which segments are needed without the actual filter (I think similar to the TimeSeries case) - is this correct? @jon-wei Thanks, that was really helpful, I was able to use it to do testing on multiple segments. |
We really do want to filter the segments down when there is no filter - that's the common use case for timeBoundary and the optimization can make a huge difference. |
@gianm Right, that makes sense. Updated. |
@Override | ||
public Result<DateTime> apply(Cursor cursor) | ||
{ | ||
if (cursor.isDone()) { return null; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
formatting issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, should return null
be on a separate line?
private final String bound; | ||
|
||
@JsonCreator | ||
public TimeBoundaryQuery( | ||
@JsonProperty("dataSource") DataSource dataSource, | ||
@JsonProperty("intervals") QuerySegmentSpec querySegmentSpec, | ||
@JsonProperty("bound") String bound, | ||
@JsonProperty("filter") DimFilter dimFilter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This requires updating the timeboundary query documentation in docs/content/querying
👍 after docs updated. |
@rajk-tetration TimeBoundaryQuery.getCacheKey needs to be updated too. |
@@ -9,6 +9,7 @@ Time boundary queries return the earliest and latest data points of a data set. | |||
"queryType" : "timeBoundary", | |||
"dataSource": "sample_datasource", | |||
"bound" : < "maxTime" | "minTime" > # optional, defaults to returning both timestamps if not set | |||
"filter" : { "type": "and", "fields": [<filter>, <filter>, ...] } # optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this is an "and" filter rather than just <filter>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find an example of just using "filter": <filter>
anywhere else in the docs, for example in TimeSeries an actual filter is given. I can change it, just wanted to be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, fair enough, I guess this style predated your edit so feel free to leave it that way.
@gianm I'm not really sure what |
@gianm - do you have any advice/references on implementing |
@raj-kesavan sorry for the delay, I missed your first message… getCacheKey is used for the query cache, which is per-query, per-segment. It should include anything that might affect the query result on a per segment basis. Filters definitely should be included. You could check out TimeseriesQueryQueryToolChests -> getCacheStrategy -> computeCacheKey for an example of how to do that. |
👍 |
I'm really confused here, is this a PR from @raj-kesavan or @rajk-tetration ? ((edit: had double the same name)) |
((clarification is important for authorship and CLA guarantees) |
{ | ||
return false; | ||
} | ||
public boolean hasFilters() { return dimFilter != null; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formatting is off here. You can use the IntelliJ/Eclipse style bundles in https://github.com/druid-io/druid/blob/master/CONTRIBUTING.md to format the code automatically.
@rajk-tetration looking good, just a few final comments from me! |
@rajk-tetration Thanks for the contribution! Would you mind filling out the CLA at http://druid.io/community/cla.html please? |
@gianm Does this look okay? I was able to import the xml file into IntelliJ but pressing "reformat code" never does anything. @drcrallen Will do, thanks |
@rajk-tetration, interesting, I wonder if you need to choose "Druid code style" in your IntelliJ code style settings? Anyway the formatting does look better now. 👍 from me after travis + CLA are sorted. |
@rajk-tetration did you get a chance to sign the CLA? |
@fjy did you receive the CLA by e-mail? |
Got CLA. We are good 👍 |
This adds support for adding filters in TimeBoundaryQuerys.
Questions:
Thanks.