-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Guard against exponential increase of filters during CNF conversion #12314
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the License is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
|
||
package org.apache.druid.segment.filter.cnf; | ||
|
||
import org.apache.druid.java.util.common.StringUtils; | ||
|
||
public class CNFFilterExplosionException extends Exception | ||
{ | ||
public CNFFilterExplosionException(String formatText, Object... arguments) | ||
{ | ||
super(StringUtils.nonStrictFormat(formatText, arguments)); | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,6 +19,7 @@ | |
|
||
package org.apache.druid.segment.filter.cnf; | ||
|
||
import org.apache.druid.query.QueryContexts; | ||
import org.apache.druid.query.filter.BooleanFilter; | ||
import org.apache.druid.query.filter.Filter; | ||
import org.apache.druid.segment.filter.AndFilter; | ||
|
@@ -78,7 +79,7 @@ public static Filter pushDownNot(Filter current) | |
return current; | ||
} | ||
|
||
public static Filter convertToCnf(Filter current) | ||
public static Filter convertToCnf(Filter current) throws CNFFilterExplosionException | ||
{ | ||
if (current instanceof NotFilter) { | ||
return new NotFilter(convertToCnf(((NotFilter) current).getBaseFilter())); | ||
|
@@ -159,7 +160,7 @@ private static void generateAllCombinations( | |
List<Filter> result, | ||
List<Filter> andList, | ||
List<Filter> nonAndList | ||
) | ||
) throws CNFFilterExplosionException | ||
{ | ||
List<Filter> children = new ArrayList<>(((AndFilter) andList.get(0)).getFilters()); | ||
if (result.isEmpty()) { | ||
|
@@ -178,6 +179,11 @@ private static void generateAllCombinations( | |
a.add(child); | ||
// Result must receive an actual OrFilter. | ||
result.add(idempotentOr(Filters.or(a))); | ||
if (result.size() >= 10_000) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. any reason for this limit to be configurable? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this could be configurable - didn't keep it as of now to avoid more configs. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've added the change where exceeding this limit for not fail queries for now. when the limit exceeds, the join flow will disable pushdown optimization and the normal filter flow will return the non-CNF filter. does that seem ok to not add a config to toggle the limit? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah, i think that is ok, the transformation shouldn't be changing results afaik, though per my other comment there should be some way to know that this has happened |
||
throw new CNFFilterExplosionException( | ||
"Atleast 10,000 filters created by CNF (conjunctive normal form) conversion of the filter. " | ||
rohangarg marked this conversation as resolved.
Show resolved
Hide resolved
|
||
+ "Please disable %s flag if enabled", QueryContexts.USE_FILTER_CNF_KEY); | ||
} | ||
} | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be a
BadQueryException
? Since i think this is user controllable by setting a context flag or not, I think doing this would result in it being a user error when it eventually surfacesThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes,
BadQueryException
makes sense but one problem is that it is an un-checked exception. I was trying to make this a checked exception so that callers must decide how to handle it. wdyt?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think that makes sense, looking closer I see now that these exceptions are always eaten and don't make it to the user, i admittedly just did a drive by scan of this PR earlier and thought they might make it to the surface 😅
though thinking about it, maybe it is worth a log of some sort to know that this is happening so that there is some indicator of why the expected query isn't happening?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess a log might be kind of noisy if this was coming from the SQL planner, and would need to be collected and printed at the end instead. We have a mechanism for collecting errors this way, but not for informative messages; this could be done at a later time I think (if is actually useful to have a way to know why the planner did or didn't do a think that was asked for).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree with that - it would make more sense to have something in a per query structure which is shown to the user.