Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DimFilterHavingSpec. #3727

Merged
merged 2 commits into from
Dec 2, 2016
Merged

Add DimFilterHavingSpec. #3727

merged 2 commits into from
Dec 2, 2016

Conversation

gianm
Copy link
Contributor

@gianm gianm commented Dec 1, 2016

Allows any DimFilters to be used as HAVING clauses.

@gianm gianm added the Feature label Dec 1, 2016
@gianm gianm added this to the 0.9.3 milestone Dec 1, 2016
@gianm gianm assigned fjy and jon-wei Dec 1, 2016
@gianm gianm force-pushed the having-dim-filter branch 2 times, most recently from ccc2ca3 to 6030148 Compare December 1, 2016 15:44
@b-slim
Copy link
Contributor

b-slim commented Dec 1, 2016

@gianm i am wondering how is this different from having a filter (out side of the having) ?

@fjy
Copy link
Contributor

fjy commented Dec 1, 2016

👍

@gianm
Copy link
Contributor Author

gianm commented Dec 1, 2016

@b-slim it's a HAVING filter, so it applies to the grouped results, not to the rows in the segment. You can use it to do things like only include groups that have sum(count) > x. Like the SQL SELECT dim1, SUM(cnt) FROM druid.foo WHERE dim2 = 'x' GROUP BY dim1 HAVING SUM(cnt) > 2 ORDER BY dim1 LIMIT 4

Copy link
Member

@leventov leventov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with minor comments


public class DimFilterHavingSpec implements HavingSpec
{
private static final byte CACHE_KEY = (byte) 0x9;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 0x9? If this is a counting byte (0..8 mean something different), maybe declare them all at the same place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it's a counting byte. Currently they aren't centralized. I could do that in a follow on PR, I don't really want to mix the two changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gianm agree


DimFilterHavingSpec that = (DimFilterHavingSpec) o;

return dimFilter != null ? dimFilter.equals(that.dimFilter) : that.dimFilter == null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dimFilter is non-null, so could be simplified unless you want to stick to "as IntelliJ generated" form.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, I think I can just tell IntelliJ this is non-null and it will generate better code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-generated with assumption that dimFilter is non-null.

@Override
public int hashCode()
{
return dimFilter != null ? dimFilter.hashCode() : 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dimFilter is non-null, so could be simplified unless you want to stick to "as IntelliJ generated" form.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-generated with assumption that dimFilter is non-null.

@b-slim
Copy link
Contributor

b-slim commented Dec 1, 2016

@gianm please bear with me the current example you posted is it possible to just use LessThanHavingSpec ?

@gianm
Copy link
Contributor Author

gianm commented Dec 1, 2016

@b-slim it is possible to use GreaterThanHavingSpec for that example, but this PR would allow you to also use BoundDimFilter. Some advantages are:

  • no need for users to learn two filter grammars; the DimFilters would work for both WHERE filters and HAVING filters.
  • allows use of any DimFilter in HAVING, including things that HavingSpecs don't currently support such as regex, "in", javascript, use of extractionFns, etc.
  • simplifies the SQL layer in Built-in SQL #3682 since it can generate DimFilters for both WHERE and HAVING (this is my main motivation for this PR).

@b-slim
Copy link
Contributor

b-slim commented Dec 1, 2016

@gianm Good point about re-using the syntax.

@gianm gianm force-pushed the having-dim-filter branch from 6030148 to 9e6b3bb Compare December 1, 2016 19:39
@jon-wei
Copy link
Contributor

jon-wei commented Dec 1, 2016

let's add a test that uses extraction fns with the having spec filters, e.g.:

@Test
  public void testDimFilterHavingSpecWithExFn()
  {
    String extractionJsFn = "function(str) { return 'super-' + str; }";
    ExtractionFn extractionFn = new JavaScriptExtractionFn(extractionJsFn, false, JavaScriptConfig.getDefault());

    String extractionJsFn2 = "function(num) { return num + 10; }";
    ExtractionFn extractionFn2 = new JavaScriptExtractionFn(extractionJsFn2, false, JavaScriptConfig.getDefault());

    List<Row> expectedResults = Arrays.asList(
        GroupByQueryRunnerTestHelper.createExpectedRow("2011-04-01", "alias", "business", "rows", 2L, "idx", 217L),
        GroupByQueryRunnerTestHelper.createExpectedRow("2011-04-01", "alias", "mezzanine", "rows", 6L, "idx", 4420L),
        GroupByQueryRunnerTestHelper.createExpectedRow("2011-04-01", "alias", "premium", "rows", 6L, "idx", 4416L)
    );

    final DimFilterHavingSpec havingSpec = new DimFilterHavingSpec(
        new AndDimFilter(
            ImmutableList.of(
                new OrDimFilter(
                    ImmutableList.of(
                        new BoundDimFilter("rows", "12", null, true, false, null, extractionFn2, StringComparators.NUMERIC),
                        new SelectorDimFilter("idx", "super-217", extractionFn)
                    )
                ),
                new SelectorDimFilter("__time", String.valueOf(new DateTime("2011-04-01").getMillis()), null)
            )
        )
    );

    GroupByQuery.Builder builder = GroupByQuery
        .builder()
        .setDataSource(QueryRunnerTestHelper.dataSource)
        .setInterval("2011-04-02/2011-04-04")
        .setDimensions(Lists.<DimensionSpec>newArrayList(new DefaultDimensionSpec("quality", "alias")))
        .setAggregatorSpecs(
            Arrays.asList(
                QueryRunnerTestHelper.rowsCount,
                new LongSumAggregatorFactory("idx", "index")
            )
        )
        .setGranularity(new PeriodGranularity(new Period("P1M"), null, null))
        .setHavingSpec(havingSpec);

    final GroupByQuery fullQuery = builder.build();
    TestHelper.assertExpectedObjects(
        expectedResults,
        GroupByQueryRunnerTestHelper.runQuery(factory, runner, fullQuery),
        ""
    );
  }

👍 otherwise

@gianm
Copy link
Contributor Author

gianm commented Dec 2, 2016

@jon-wei added your test, thanks for writing it

@fjy fjy merged commit 4c5d10f into apache:master Dec 2, 2016
dgolitsyn pushed a commit to metamx/druid that referenced this pull request Feb 14, 2017
* Add DimFilterHavingSpec.

* Add test for DimFilterHavingSpec with extractionFns.
seoeun25 pushed a commit to seoeun25/incubator-druid that referenced this pull request Jan 10, 2020
@gianm gianm deleted the having-dim-filter branch September 23, 2022 19:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants