-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Additional native query tests for unnest datasource #13554
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for adding more tests, just some minor comments 🚀
* under the License. | ||
*/ | ||
|
||
package org.apache.druid.query; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if this was in org.apache.druid.query.topn
with all the other topn query tests then wouldn't need change method visibility in PooledTopNAlgorithm
* under the License. | ||
*/ | ||
|
||
package org.apache.druid.query; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: imo this should live in org.apache.druid.query.groupby
* under the License. | ||
*/ | ||
|
||
package org.apache.druid.query; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this should live in org.apache.druid.query.scan
public static final ObjectMapper DEFAULT_MAPPER = TestHelper.makeSmileMapper(); | ||
public static final DruidProcessingConfig DEFAULT_PROCESSING_CONFIG = new DruidProcessingConfig() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since these and some other stuff are public in the standard GroupByQueryRunnerTest
I wonder if we should just use those instead of redefining here
@Test | ||
public void testGroupByOnUnnestedColumn() | ||
{ | ||
// Cannot vectorize due to extraction dimension spec. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this comment isn't true
); | ||
|
||
Iterable<ResultRow> results = GroupByQueryRunnerTestHelper.runQuery(factory, runner, query); | ||
TestHelper.assertExpectedObjects(expectedResults, results, "hroupBy-on-unnested-column"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
TestHelper.assertExpectedObjects(expectedResults, results, "hroupBy-on-unnested-column"); | |
TestHelper.assertExpectedObjects(expectedResults, results, "groupBy-on-unnested-column"); |
Iterable<ResultRow> results = GroupByQueryRunnerTestHelper.runQuery(factory, runner, query); | ||
TestHelper.assertExpectedObjects(expectedResults, results, "hroupBy-on-unnested-column"); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice to also have a test with a virtual column, maybe both mv_to_array
and making a composite array from multiple columns with array
would be interesting
} | ||
|
||
@Test | ||
public void testUnnestRunner() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: testScan
?
} | ||
|
||
@Test | ||
public void testUnnestRunnerVirtualColumns() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think would also be interesting to have a test on a composite array column using array
function to create an array from multiple columns (in addition to mv_to_array
test)
} | ||
|
||
@Test | ||
public void testTopNStringVirtualColumnUnnest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar comment about virtual column created from multiple columns using array
in addition to mv_to_array
test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to understand better why you had to pipe things down into the QueryRunner. You shouldn't've needed to do that, it makes me think that these tests aren't running anything that actually does the segment mapping function...
public static <T, QueryType extends Query<T>> List<QueryRunner<T>> makeUnnestQueryRunners( | ||
QueryRunnerFactory<T, QueryType> factory, | ||
String dimensionToUnnest, | ||
String outputColumn, | ||
LinkedHashSet<String> allowSet | ||
) | ||
{ | ||
final IncrementalIndex rtIndex = TestIndex.getIncrementalTestIndex(); | ||
final IncrementalIndex noRollupRtIndex = TestIndex.getNoRollupIncrementalTestIndex(); | ||
final QueryableIndex mMappedTestIndex = TestIndex.getMMappedTestIndex(); | ||
final QueryableIndex noRollupMMappedTestIndex = TestIndex.getNoRollupMMappedTestIndex(); | ||
final QueryableIndex mergedRealtimeIndex = TestIndex.mergedRealtimeIndex(); | ||
final QueryableIndex frontCodedMappedTestIndex = TestIndex.getFrontCodedMMappedTestIndex(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused as to why you have to do this? The Unnest is defined as part of the query, not as part of the query runner, you should be able to just send the query along and it will attach the unnest on the segment via the segment mapper functions, no?
public UnnestSegment(Segment baseSegment, String dimension, String outputName, LinkedHashSet<String> allowList) | ||
{ | ||
this.baseSegment = baseSegment; | ||
this.dimension = dimension; | ||
this.renamedOutputDimension = outputName; | ||
this.allowSet = allowList; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe this should actually be required. The segment mapping fns should be doing this as part of query processing.
makeRow( | ||
query, | ||
"2011-04-01", | ||
"alias", | ||
"automotive", | ||
"rows", | ||
2L, | ||
"idx", | ||
270L | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit, I don't think these are actually wide enough to require each of the arguments being on independent lines. The test would read a lot easier if each of the makeRow()
calls was its own line.
Map<String, Object> event = new HashMap<>(); | ||
String[] values1 = input.split("\\t"); | ||
for (int i = 0; i < dimSpecs.length; i++) { | ||
if (dimSpecs[i] == null || i >= dimSpecs.length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this CodeQL thing is likely legitimate
Migrated towards using the segmentMapFunction along with |
processing/src/test/java/org/apache/druid/query/scan/UnnestScanQueryRunnerTest.java
Fixed
Show fixed
Hide fixed
private final boolean vectorize; | ||
|
||
@Rule | ||
public ExpectedException expectedException = ExpectedException.none(); |
Check notice
Code scanning / CodeQL
Deprecated method or constructor invocation
processing/src/test/java/org/apache/druid/query/topn/UnnestTopNQueryRunnerTest.java
Fixed
Show fixed
Hide fixed
processing/src/test/java/org/apache/druid/query/scan/UnnestScanQueryRunnerTest.java
Fixed
Show fixed
Hide fixed
processing/src/test/java/org/apache/druid/query/scan/UnnestScanQueryRunnerTest.java
Fixed
Show fixed
Hide fixed
processing/src/test/java/org/apache/druid/query/scan/UnnestScanQueryRunnerTest.java
Fixed
Show fixed
Hide fixed
processing/src/test/java/org/apache/druid/query/groupby/UnnestGroupByQueryRunnerTest.java
Fixed
Show resolved
Hide resolved
.idea/misc.xml
Outdated
<component name="SuppressKotlinCodeStyleNotification"> | ||
<option name="disableForAll" value="true" /> | ||
</component> | ||
</project> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes to this file appear to still be checked in? We cannot merge while this is part of the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge #13576 and then I'll merge the master with this branch and take care of this one. The UnnestColumnValueSelectorCursor is updated in that PR and this needs conflict resolution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is corrected now
The CodeQL stuff seems annoying, but if we look at what it's saying and try to make it better, I think it actually does leave the code in a better state than we found it in, so let's try to resolve those CodeQL things. |
Updated to remove duplicated code, unused exception and moved to Edit: The numbers.parse has some NPEs currently binding the codeQL part inside try catch blocks |
Native tests for the unnest datasource.
Adds the tests and changes for native tests for unnest in Scan, TopN, GroupBy etc.
This PR has: