
Try converting all inner joins to filters #13201

Merged (3 commits) on Nov 7, 2022

Conversation

rohangarg (Member)

Recently, support was added for partial pushdown of joins (those having duplicates or output-referencing columns) as filters. However, in a list of joins, we currently stop converting joins to filters as soon as one of them fails to convert fully. This change allows all joins to convert to filters, fully or partially, while maintaining that (a sketch follows the list):

  1. If a join has duplicates or output-referencing columns, then the join is partially converted.
  2. If a join table uses the output of a second join table in its matching condition, then the second join table won't be fully converted to a filter.
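
A minimal sketch of the conversion, with hypothetical table and column names (the rewritten form is shown conceptually, not as the planner's literal output):

```sql
-- Original query: inner join against a small, fully materialized build table.
SELECT f.page, SUM(f.clicks)
FROM fact f
INNER JOIN dim d ON f.country = d.country
GROUP BY f.page;

-- If no column of "dim" is projected and "d.country" has no duplicates,
-- the join clause can be replaced by a filter on the base table:
SELECT f.page, SUM(f.clicks)
FROM fact f
WHERE f.country IN (SELECT country FROM dim)  -- conceptually, an IN filter
GROUP BY f.page;
```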

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@paul-rogers (Contributor) commented Oct 12, 2022

Generic question: when we say "Try converting all inner joins to filters", does this include the case where the join would return a billion (or 100 billion) rows? That is, are we considering cardinality? Or are we doing this only for cases where we know the cardinality must be low (such as for inline tables)?

In a traditional RDBMS, one looks at the (estimated) cardinality of the tables to determine if this kind of conversion is safe. Set some threshold: 100 items? 1000? 10K? If the estimated cardinality is above that, then the memory-for-time tradeoff doesn't work and we're better off retaining the join.

If the code can't figure this out (Druid doesn't really do cost estimation), how can a user force the join choice? New keyword? Query hint in SQL? Query context entry?

@rohangarg (Member, Author)

@paul-rogers : Thanks for the suggestion and explanation! :) Yes, we already have a configurable limit (default 10k elements) on cardinality for this conversion. If the number of elements in the join table is above that limit, we don't convert the join to a filter and retain it as is. If the value is below the threshold, we convert the join to an InFilter. The advantage is that with an InFilter, the filter values can potentially use bitmap filtering instead of running the value-matching-based join filter.
Joins in Druid are currently always broadcast, and we always materialize the build table before planning the physical join execution. In the multi-join case, cost estimation (using the probe and build tables' cardinalities) might be useful to decide which join to run first.
To force execution as a join or as a filter, you can currently toggle the configurable limit, as sketched below. As the limit is decreased, more clauses run as joins without any filter conversion; as the limit is increased, more joins are converted to filters.
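
A sketch of toggling that limit through the query context. The exact context key is my assumption (the thread only says the limit defaults to 10k), so treat the name as illustrative and check the Druid docs:

```sql
-- Submitted via the SQL HTTP API with a context object, e.g.:
--   POST /druid/v2/sql
--   {
--     "query": "<the SELECT below>",
--     "context": { "joinFilterRewriteMaxSize": 0 }  -- hypothetical key;
--   }                                               -- 0 keeps joins as joins
SELECT f.page, COUNT(*)
FROM fact f
INNER JOIN dim d ON f.country = d.country
GROUP BY f.page;
```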

@gianm (Contributor) left a comment

This change is essentially reordering joins. (The old code, because it only picked off joins from the start of the clause list, did not reorder.)

I'm wondering if the new code correctly handles the verification of required columns. When we reorder clauses, we need to make sure that the reorder did not push a clause too far to the left, where it wouldn't be able to see its inputs anymore. Consider the case of two clauses, where the second clause uses output of the first in its condition, and where the second is otherwise convertible to filter but the first is not fully convertible. Is this handled properly? Do we have a test for this case?
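
A hypothetical shape of that case, for concreteness (all names invented):

```sql
SELECT f.page
FROM fact f
INNER JOIN t1 ON f.k = t1.k    -- not fully convertible, e.g. t1.k has duplicates
INNER JOIN t2 ON t1.v = t2.v;  -- convertible on its own, but its condition
                               -- references t1's output column "t1.v"
-- Converting the t2 clause to a filter effectively moves it left of t1,
-- where "t1.v" does not exist yet, so the rewrite must leave t2 as a join.
```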

@gianm (Contributor) commented Oct 17, 2022

There's another potential issue I'm wondering about, related to reordering. If one of the earlier clauses is RIGHT OUTER or FULL OUTER (a "righty" join) then they generate nulls in columns associated with the base table, for any right-hand-side rows that don't match the join clause. In this case, it wouldn't be correct to reorder a later INNER join leftwards through the RIGHT OUTER or FULL OUTER clause. Please add a test case about this too.
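
A hypothetical illustration of the hazard (all names invented):

```sql
SELECT f.page
FROM fact f
RIGHT OUTER JOIN r ON f.id = r.id       -- "righty": may null out f.* columns
INNER JOIN d ON f.country = d.country;  -- later INNER join on a base column
-- Rewriting the INNER clause as a base-table filter "f.country IN (...)"
-- would evaluate it before the RIGHT OUTER join generates its null rows,
-- changing the result; the INNER join must not move left through the
-- righty clause.
```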

@paul-rogers (Contributor)

@rohangarg, thanks for the explanation about the cutoff quantity. That raises another question: how do we know the number? If a join is against an entire table, one can just look at the table cardinality (the sum of the row counts across all segments). If there is a filter on a table, estimating the post-filter row count is a difficult challenge -- one that all query planners wrestle with (and for which Calcite provides a cost-based algorithm.) Is this PR only considering the entire table cardinality?

@paul-rogers (Contributor)

Are we using Calcite to do the join reordering? If so, then it would be surprising if Calcite would allow invalid reordering. If we're doing it ourselves, then we do need to be careful: there are lots of tricky rules, such as the one Gian mentioned, for when join ordering is or is not allowed. Basically, if we're doing it ourselves, only reorder "plain" inner joins and we won't go wrong (though we may miss optimization opportunities.) Later, perhaps we can feed Calcite some proper cost estimates (however crude) and Calcite can do the task for us.

@rohangarg (Member, Author)

@gianm :

Consider the case of two clauses, where the second clause uses output of the first in its condition, and where the second is otherwise convertible to filter but the first is not fully convertible. Is this handled properly? Do we have a test for this case?

In this case, by 'output of the join' do you mean the right table column? If so, then the second join would not be fully converted: both joins would stay intact, and the derived/inferred filters from them would be pushed to the left side (a sketch follows).
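
A conceptual sketch of that partial conversion, with hypothetical names: the join is retained, and an inferred filter is additionally pushed to the base (left) side:

```sql
-- The join remains (its output column d.v is projected), but a filter
-- inferred from the build table is pushed down to prune base-table rows:
SELECT f.page, d.v
FROM fact f
INNER JOIN dim d ON f.country = d.country
WHERE f.country IN (SELECT country FROM dim);  -- inferred, pushed-down filter
```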

There's another potential issue I'm wondering about, related to reordering. If one of the earlier clauses is RIGHT OUTER or FULL OUTER (a "righty" join) then they generate nulls in columns associated with the base table, for any right-hand-side rows that don't match the join clause. In this case, it wouldn't be correct to reorder a later INNER join leftwards through the RIGHT OUTER or FULL OUTER clause. Please add a test case about this too.

Thanks for catching that, you're right that fully converting inner joins after right/full outer joins would lead to incorrect results since the base-table columns could contain NULLs. Maybe we could keep those joins intact and also push the filter, but that can be taken up later after more thought. For now, I'll stop the clause-conversion loop as soon as we see a 'righty' join, and I'll also write a test for this case.

@paul-rogers :

how do we know the number? If a join is against an entire table, one can just look at the table cardinality (the sum of the row counts across all segments). If there is a filter on a table, estimating the post-filter row count is a difficult challenge -- one that all query planners wrestle with (and for which Calcite provides a cost-based algorithm.) Is this PR only considering the entire table cardinality?

Currently, we apply the join-to-filter optimization when the right/build-side table has been fully materialized in memory, which means we know the cardinality of that table.

Are we using Calcite to do the join reordering? If so, then it would be surprising if Calcite would allow invalid reordering. If we're doing it ourselves, then we do need to be careful: there are lots of tricky rules, such as the one Gian mentioned, for when join ordering is or is not allowed. Basically, if we're doing it ourselves, only reorder "plain" inner joins and we won't go wrong (though we may miss optimization opportunities.) Later, perhaps we can feed Calcite some proper cost estimates (however crude) and Calcite can do the task for us.

No, we're not using Calcite's join-reordering rule as of now. The current optimization can be thought of more as adaptive execution: after running all the necessary queries and building the right-side table, we decide whether to push some inferred filters to the left side before running the join. Yes, you're right: currently we've decided to do the filter inference only for inner joins among a group of inner and left outer joins (and only if the inferred filter is on left-table columns), leaving out right/full outer joins. In the future, as more join support and requirements arrive, we could also use Calcite's rules to determine the join algorithm and the join order.

Thanks a lot @gianm and @paul-rogers for your thorough review and suggestions!

(Merge commit) Conflicts:
	processing/src/test/java/org/apache/druid/segment/join/JoinableFactoryWrapperTest.java
@rohangarg (Member, Author)

Facing #13289 in CI

@gianm (Contributor) left a comment

LGTM with the latest changes.

@rohangarg rohangarg merged commit a9b39fc into apache:master Nov 7, 2022
@rohangarg (Member, Author)

Thanks a lot for the review @gianm and @paul-rogers

@kfaraz added this to the 25.0 milestone on Nov 22, 2022