[SPARK-26626][SQL] Maximum size for repeatedly substituted aliases in SQL expressions #23556
Conversation
ok to test
@cloud-fan could we get you to take a look here? should hopefully be a quick review
Test build #101280 has finished for PR 23556 at commit
I'm not sure about this approach. I think there is no problem if we repeat an attribute 1000 times, but it can be a problem if we repeat an expensive expression twice (like UDF). How about we just blacklist UDF in
@cloud-fan Ah sorry I think I didn't explain the problem clearly enough - the OOMs happen inside CollapseProject while performing the optimisation (and inside PhysicalOperation). The recursive alias substitution grows the expression tree so large that it OOMs. We're not using UDFs or any expensive expressions. To repro, try this in Spark Shell:
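A minimal sketch of this kind of repro (hypothetical code; the exact original snippet, column names, and depth are assumptions): each select redefines column "a" as an expression that references "a" three times, so when CollapseProject merges the stacked projects, the substituted expression tree roughly triples per step.

```scala
// Run in spark-shell. Illustrative only: the growth factor comes from the
// three references to "a" per step, so the collapsed plan's expression tree
// is on the order of 3^depth nodes.
import org.apache.spark.sql.functions.col

val depth = 10  // increase (e.g. to 20+) to reproduce the OOM / timeouts
val df = (1 to depth).foldLeft(spark.range(10).toDF("a")) { (acc, _) =>
  acc.select((col("a") + col("a") + col("a")).as("a"))
}
df.explain(true)  // the blow-up happens while optimizing this plan
```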
Not enough to just keep an
// maximum size
aliases.exists({ case (attribute, expression) =>
  referenceCounts.getOrElse(attribute, 0) > 1 &&
    expression.treeSize > SQLConf.get.maxRepeatedAliasSize
I'm not sure about using treeSize as the cost of an expression. UDF can be very expensive even if its treeSize is 1.
How about we simplify it with a blacklist? e.g. UDF is expensive and we shouldn't collapse projects if a UDF is repeated.
This isn't trying to determine the cost of the expression - the cost of the expression is irrelevant here, we're just trying to determine the size of the expression itself (using tree size as a proxy for memory size). That way, if the expression is too large (takes up too much memory) we can prevent OOMs by not de-aliasing it multiple times (and thus greatly increasing the amount of heap the expression tree takes up).
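To make the scale concrete (illustrative numbers, not taken from the PR): if an alias whose expression tree has 100 nodes is referenced 3 times in each projection, collapsing 10 stacked projections by naive substitution yields on the order of 100 × 3^10, roughly 5.9 million nodes, even though every individual projection looked small.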
So your fix only cares about memory usage of the expressions, instead of execution time?
@cloud-fan That's right, the primary concern is memory usage, since the exponential increase in memory usage currently causes crashes (due to OOMs), timeouts, and performance issues.
@maropu no I don't think this is related to CleanupAliases - we can't clean up the aliases, because they still need to be used in the expression; we just can't substitute them (if they're large) or we risk OOMing
eaaf8a3 to 22f594a (compare)
Test build #101648 has finished for PR 23556 at commit
https://issues.apache.org/jira/browse/SPARK-26626 apache#23556

What changes were proposed in this pull request?

This adds a spark.sql.maxRepeatedAliasSize config option, which specifies the maximum size of an aliased expression to be substituted (in CollapseProject and PhysicalOperation). This prevents large aliased expressions from being substituted multiple times and exploding the size of the expression tree, eventually OOMing the driver. The default config value of 100 was chosen through testing to find the optimally performant value:

[chart: query performance at different spark.sql.maxRepeatedAliasSize values]

How was this patch tested?

Added unit tests, and did manual testing.
@@ -658,7 +658,8 @@ object CollapseProject extends Rule[LogicalPlan] {

  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
Is the purpose to give up this rule basically? Why don't we consider using spark.sql.optimizer.excludedRules? I think it's a more general way to resolve such issues.
@HyukjinKwon The purpose is to improve the rule so that it only applies when it will yield a performance improvement (and not apply when it could cause memory issues). This was the preferred solution, since if we excluded the rule entirely we wouldn't benefit from it in the instances where it would be beneficial.
So, basically what you want to do is to give up a rule given a condition, because processing a huge tree causes an OOM issue only in the driver. Am I correct?
What's the difference between setting the threshold spark.sql.maxRepeatedAliasSize to a specific number based upon a rough estimation, vs explicitly excluding the rule via spark.sql.optimizer.excludedRules based on the user's rough estimation?
@HyukjinKwon it's not that processing a huge tree causes an OOM, it's that the user can write a small tree, that seems very reasonable to execute, but under the hood the optimiser turns it into a huge tree that OOMs. The user doesn't know beforehand that the optimiser issue is going to happen, in order to disable the rule. It takes a lot of debugging, looking through stack traces, etc, to identify that the OOM is caused by CollapseProject and that you can disable it. Also, we typically run many different queries within a spark session, and wouldn't want to disable CollapseProject for all of them.
This change means that we can still run CollapseProject, we just don't substitute overly large aliases. In the types of query we had problems with, this means that it will collapse the query until the aliases get too large, and then stop. So we still apply CollapseProject to every query, we just stop substituting any alias that gets too large.
spark.sql.maxRepeatedAliasSize just determines the size of alias tree that is considered too large to efficiently substitute multiple times. The default value of 100 was determined by some basic testing to find the best perf balance (see charts at top), but happy to tweak this if you don't think it's appropriate?
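For reference, a sketch of the shape of the check being discussed, reconstructed from the snippet quoted earlier in the review (the helper name, signature, and surrounding plumbing are assumptions, not the exact patch):

```scala
import org.apache.spark.sql.catalyst.expressions.{Attribute, Expression}
import org.apache.spark.sql.internal.SQLConf

// Hypothetical helper: true if merging two adjacent Projects would substitute
// a large aliased expression more than once. treeSize and maxRepeatedAliasSize
// come from this PR's changes (quoted above), not from existing Spark APIs.
def substitutionWouldExplode(
    aliases: Map[Attribute, Expression],
    referenceCounts: Map[Attribute, Int]): Boolean = {
  aliases.exists { case (attribute, expression) =>
    referenceCounts.getOrElse(attribute, 0) > 1 &&
      expression.treeSize > SQLConf.get.maxRepeatedAliasSize
  }
}
```

CollapseProject would then leave the two adjacent Projects un-merged whenever this returns true, which is why the rule still applies everywhere else and only stops once an alias grows past the threshold.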
Test build #103512 has finished for PR 23556 at commit
Test build #103516 has finished for PR 23556 at commit
Test build #103630 has finished for PR 23556 at commit
"(size defined by the number of nodes in the expression tree). " + | ||
"Used by the CollapseProject optimizer, and PhysicalOperation.") | ||
.intConf | ||
.createWithDefault(100) |
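For context, the fragment above is presumably the tail of a standard SQLConf entry; a hedged reconstruction of what the full definition might look like (the val name and the first doc sentence are assumptions, only the quoted tail is taken from the diff):

```scala
// Inside object SQLConf, following the usual buildConf pattern.
val MAX_REPEATED_ALIAS_SIZE = buildConf("spark.sql.maxRepeatedAliasSize")
  .doc("Maximum size of an aliased expression that will be substituted repeatedly " +
    "(size defined by the number of nodes in the expression tree). " +
    "Used by the CollapseProject optimizer, and PhysicalOperation.")
  .intConf
  .createWithDefault(100)
```

If this were merged, the threshold could presumably be tuned per session like any other SQL conf, e.g. spark.conf.set("spark.sql.maxRepeatedAliasSize", 200).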
It does add some automatic behaviour, but I don't think it is worth so much since we already have a general mechanism. Note that you can also increase the driver-side memory. How does that relate to this configuration?
@HyukjinKwon increasing the driver memory unfortunately isn't an option, because due to the exponential tree size explosion, the necessary memory would be much larger than that available on most servers. Also, users wouldn't know that they needed a very large driver memory size, because they can be running small queries over small data.
If there is not anything I missed, -1 from me. Because:
@HyukjinKwon I think there are a few more things. In response to your concerns:
We can still disable the rule per job. This PR sounds like it only adds some fine-grained, partially automatic logic to disable the rule under a certain condition, which still needs a manual estimation from users to set the configuration value. I don't see so much value in it, to be honest. How common is this case, and how much memory does it save before/after? And, are they only applicable in
BTW, can you describe the chart in the PR description - what are x and y?
@HyukjinKwon just to reiterate:
In the chart above, the x axis is the spark.sql.maxRepeatedAliasSize value.
Here's a simple example to illustrate the problem. Take this simple query, running in Spark shell, which simulates making multiple (10) changes to a column:
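A hedged reconstruction of the kind of query being described (column names, data, and the exact per-step transformation are assumptions; the key point is that each step references the column more than once):

```scala
// Run in spark-shell: ten successive "changes" to the same column. Each step
// references col1 twice (in the condition and in the new value), so once
// CollapseProject merges the ten Projects, the substituted expression roughly
// doubles per step.
import org.apache.spark.sql.functions._

val base = Seq("a", "b", "c").toDF("col1")
val changed = (1 to 10).foldLeft(base) { (df, i) =>
  df.withColumn("col1",
    when(col("col1").isNotNull, concat(col("col1"), lit(i.toString)))
      .otherwise(lit(i.toString)))
}
changed.explain(true)
```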
This is the original query plan:
Here is the optimized query plan, with this fix applied (using the default spark.sql.maxRepeatedAliasSize):
Now below is the optimized query plan without this fix applied, where CollapseProject collapses everything down to 1 project. You can see how the query tree explodes exponentially. This is only a very simple query, with only 10 changes - more complex queries, with 100s of changes, will easily OOM. We've seen vast reductions in the memory required for queries - without the fix, some still OOM using all available server memory; with the fix those queries run fast with only 512MB.
@j-esse, to reiterate,
Can one of the admins verify this patch?
Closing this due to author's inactivity.
… SQL expressions

We have internal applications (BS and C) prone to OOMs with repeated use of aliases. See ticket [1] and upstream PR [2].

[1] https://issues.apache.org/jira/browse/SPARK-26626
[2] apache#23556

Co-authored-by: j-esse <[email protected]>
Co-authored-by: Josh Casale <[email protected]>
Co-authored-by: Will Raschkowski <[email protected]>
What changes were proposed in this pull request?
This adds a spark.sql.maxRepeatedAliasSize config option, which specifies the maximum size of an aliased expression to be substituted (in CollapseProject and PhysicalOperation). This prevents large aliased expressions from being substituted multiple times and exploding the size of the expression tree, eventually OOMing the driver. The default config value of 100 was chosen through testing to find the optimally performant value:

[chart: query performance at different spark.sql.maxRepeatedAliasSize values]
How was this patch tested?
Added unit tests, and did manual testing