
[SPARK-35329][SQL] Split generated switch code into pieces in ExpandExec #32457

Closed
wants to merge 7 commits into from

Conversation

maropu
Member

@maropu maropu commented May 6, 2021

What changes were proposed in this pull request?

This PR intends to split the generated switch code into smaller pieces in ExpandExec. In the current master, even a simple query like the one below generates a large method whose size (maxMethodCodeSize:7448) is close to 8000 (CodeGenerator.DEFAULT_JVM_HUGE_METHOD_LIMIT):

scala> val df = Seq(("2016-03-27 19:39:34", 1, "a"), ("2016-03-27 19:39:56", 2, "a"), ("2016-03-27 19:39:27", 4, "b")).toDF("time", "value", "id")
scala> val rdf = df.select(window($"time", "10 seconds", "3 seconds", "0 second"), $"value").orderBy($"window.start".asc, $"value".desc).select("value")
scala> sql("SET spark.sql.adaptive.enabled=false")
scala> import org.apache.spark.sql.execution.debug._ 
scala> rdf.debugCodegen

Found 2 WholeStageCodegen subtrees.
== Subtree 1 / 2 (maxMethodCodeSize:7448; maxConstantPoolSize:189(0.29% used); numInnerClasses:0) ==
                                    ^^^^
*(1) Project [window#34.start AS _gen_alias_39#39, value#11]
+- *(1) Filter ((isnotnull(window#34) AND (cast(time#10 as timestamp) >= window#34.start)) AND (cast(time#10 as timestamp) < window#34.end))
   +- *(1) Expand [List(named_struct(start, precisetimestampcon...

/* 028 */   private void expand_doConsume_0(InternalRow localtablescan_row_0, UTF8String expand_expr_0_0, boolean expand_exprIsNull_0_0, int expand_expr_1_0) throws java.io.IOException {
/* 029 */     boolean expand_isNull_0 = true;
/* 030 */     InternalRow expand_value_0 =
/* 031 */     null;
/* 032 */     for (int expand_i_0 = 0; expand_i_0 < 4; expand_i_0 ++) {
/* 033 */       switch (expand_i_0) {
/* 034 */       case 0:
                  (too many code lines)
/* 517 */         break;
/* 518 */
/* 519 */       case 1:
                  (too many code lines)
/* 1002 */         break;
/* 1003 */
/* 1004 */       case 2:
                  (too many code lines)
/* 1487 */         break;
/* 1488 */
/* 1489 */       case 3:
                  (too many code lines)
/* 1972 */         break;
/* 1973 */       }
/* 1974 */       ((org.apache.spark.sql.execution.metric.SQLMetric) references[33] /* numOutputRows */).add(1);
/* 1975 */
/* 1976 */       do {
/* 1977 */         boolean filter_value_2 = !expand_isNull_0;
/* 1978 */         if (!filter_value_2) continue;

The fix in this PR can make the method smaller as follows;

Found 2 WholeStageCodegen subtrees.
== Subtree 1 / 2 (maxMethodCodeSize:1713; maxConstantPoolSize:210(0.32% used); numInnerClasses:0) ==
                                    ^^^^
*(1) Project [window#17.start AS _gen_alias_32#32, value#11]
+- *(1) Filter ((isnotnull(window#17) AND (cast(time#10 as timestamp) >= window#17.start)) AND (cast(time#10 as timestamp) < window#17.end))
   +- *(1) Expand [List(named_struct(start, precisetimestampcon...

/* 032 */   private void expand_doConsume_0(InternalRow localtablescan_row_0, UTF8String expand_expr_0_0, boolean expand_exprIsNull_0_0, int expand_expr_1_0) throws java.io.IOException {
/* 033 */     for (int expand_i_0 = 0; expand_i_0 < 4; expand_i_0 ++) {
/* 034 */       switch (expand_i_0) {
/* 035 */       case 0:
/* 036 */         expand_switchCaseCode_0(expand_exprIsNull_0_0, expand_expr_0_0);
/* 037 */         break;
/* 038 */
/* 039 */       case 1:
/* 040 */         expand_switchCaseCode_1(expand_exprIsNull_0_0, expand_expr_0_0);
/* 041 */         break;
/* 042 */
/* 043 */       case 2:
/* 044 */         expand_switchCaseCode_2(expand_exprIsNull_0_0, expand_expr_0_0);
/* 045 */         break;
/* 046 */
/* 047 */       case 3:
/* 048 */         expand_switchCaseCode_3(expand_exprIsNull_0_0, expand_expr_0_0);
/* 049 */         break;
/* 050 */       }
/* 051 */       ((org.apache.spark.sql.execution.metric.SQLMetric) references[33] /* numOutputRows */).add(1);
/* 052 */
/* 053 */       do {
/* 054 */         boolean filter_value_2 = !expand_resultIsNull_0;
/* 055 */         if (!filter_value_2) continue;
/* 056 */
...
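The shape of the fix can be sketched in plain Java (hypothetical names and bodies — the real methods are emitted by Spark's CodeGenerator). HotSpot declines to JIT-compile a method whose bytecode exceeds -XX:HugeMethodLimit (8000 bytes by default), so moving each switch arm into its own small method keeps every generated method compilable:

```java
// Hedged sketch, not Spark's actual generated code: splitting a large
// switch into per-case methods, mirroring the before/after shown above.
public class ExpandSketch {
    static long numOutputRows = 0;
    static int accumulator = 0;

    // Before the fix, the bodies of all four cases were inlined here,
    // pushing this single method past HotSpot's huge-method limit.
    static void doConsume(int value) {
        for (int i = 0; i < 4; i++) {
            switch (i) {
                case 0: switchCaseCode0(value); break;
                case 1: switchCaseCode1(value); break;
                case 2: switchCaseCode2(value); break;
                case 3: switchCaseCode3(value); break;
            }
            numOutputRows++; // the metric update stays in the caller
        }
    }

    // In the real generated code each body is hundreds of lines; each
    // extracted method stays small enough for the JIT to compile.
    static void switchCaseCode0(int v) { accumulator += v; }
    static void switchCaseCode1(int v) { accumulator += v * 2; }
    static void switchCaseCode2(int v) { accumulator += v * 3; }
    static void switchCaseCode3(int v) { accumulator += v * 4; }
}
```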

Why are the changes needed?

For better generated code.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

GA passed.

@github-actions github-actions bot added the SQL label May 6, 2021
@SparkQA

SparkQA commented May 6, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42736/

@SparkQA

SparkQA commented May 6, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42736/

@SparkQA

SparkQA commented May 6, 2021

Test build #138214 has finished for PR 32457 at commit 5448708.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 7, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42742/

@SparkQA

SparkQA commented May 7, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42742/

@SparkQA

SparkQA commented May 7, 2021

Test build #138220 has finished for PR 32457 at commit cb182b8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Member

kiszk commented May 7, 2021

This code seems to generate a number of methods equal to the number of columns. Am I correct?
Does this splitting cause any performance degradation? If that is a possibility, it would be good to introduce a threshold.

@viirya
Member

viirya commented May 7, 2021

The number of added methods should be the number of columns (distinct outputs) * the number of projections? Maybe, as @kiszk said, add a threshold?

@viirya viirya (Member) left a comment

Looks okay otherwise.

@viirya viirya (Member) left a comment

Another idea: maybe add one method for all columns per projection, instead of one per column?

@kiszk
Member

kiszk commented May 7, 2021

Totally. Two points:

  • Creating many methods may degrade performance.
  • Creating many methods may exceed JVM limitations.

I am a bit afraid that the number of methods may go over the per-class limit if there are multiple projections. For example, 100 columns x 700 projections (65535 methods per class is the upper limit; see https://docs.oracle.com/javase/specs/jvms/se16/html/jvms-4.html#jvms-4.11).
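This concern can be checked with back-of-envelope arithmetic (illustrative figures only): the class-file format stores methods_count as an unsigned 16-bit value, so a scheme that emits one method per column per projection can overflow the per-class limit:

```java
// Illustrative arithmetic only: a method-per-column-per-projection
// scheme can exceed the JVM's per-class method limit.
public class MethodCountCheck {
    // methods_count is a u2 in the class file, so at most 65535 methods.
    static final int MAX_METHODS_PER_CLASS = 65535;

    static boolean exceedsLimit(int columns, int projections) {
        return columns * projections > MAX_METHODS_PER_CLASS;
    }
}
```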

@maropu
Member Author

maropu commented May 7, 2021

@kiszk @viirya Yea, you're right; the previous approach seems bad. So I applied two improvements based on your suggestions:

  • Checks if eval code size goes over the limit (spark.sql.codegen.methodSplitThreshold)
  • Creates a single method per projection (@viirya suggested above)

Could you check the latest code?
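A minimal sketch of the split decision described above, with assumed names (in Spark the threshold comes from spark.sql.codegen.methodSplitThreshold, and parameter-length validity is checked by CodeGenerator.isValidParamLength):

```java
// Sketch (hypothetical names) of the two guards described above: split
// only when the total switch-case code size exceeds the configured
// threshold AND the extracted method's parameter list is valid.
public class SplitDecision {
    // JVM limit: a method descriptor allows at most 255 parameter slots
    // (longs/doubles count as two slots each).
    static final int MAX_JVM_METHOD_PARAM_LENGTH = 255;

    static boolean shouldSplit(int[] caseCodeSizes, int paramLength,
                               int splitThreshold) {
        int total = 0;
        for (int size : caseCodeSizes) {
            total += size;
        }
        return total > splitThreshold
            && paramLength <= MAX_JVM_METHOD_PARAM_LENGTH;
    }
}
```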

@SparkQA

SparkQA commented May 7, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42779/

@SparkQA

SparkQA commented May 7, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42779/

@SparkQA

SparkQA commented May 7, 2021

Test build #138257 has finished for PR 32457 at commit 18e87d2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42791/

@SparkQA

SparkQA commented May 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42791/

@SparkQA

SparkQA commented May 8, 2021

Test build #138269 has finished for PR 32457 at commit a1873d9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Member

kiszk commented May 9, 2021

Is the description updated? Looks good except for that.

val inputVars = inputVarSets.foldLeft(Set.empty[VariableValue])(_ ++ _)
val paramLength = CodeGenerator.calculateParamLengthFromExprValues(inputVars.toSeq)
val maybeSplitUpdateCode = if (CodeGenerator.isValidParamLength(paramLength) &&
exprCodesWithIndices.map(_._2.code.length).sum > splitThreshold) {
A reviewer (Member) commented:

Here we only check whether the code under the current switch case is under the threshold. It seems to me we need to accumulate the non-split code too; if the accumulated code goes over the threshold, we need to split it out in later switch cases.

maropu (Member Author) replied:

I see. That's a good insight. More simply, how about checking if the overall size of switch cases goes over the limit? Could you check the latest code?

@viirya viirya (Member) left a comment

lgtm if CI passes.

@SparkQA

SparkQA commented May 11, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42868/

@SparkQA

SparkQA commented May 11, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42868/

@SparkQA

SparkQA commented May 11, 2021

Test build #138346 has finished for PR 32457 at commit b01bb8c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member

viirya commented May 11, 2021

@kiszk Do you want to take another look before I merge this? There have been some changes since you approved it.

val argList = inputVars.map { v =>
s"${CodeGenerator.typeName(v.javaType)} ${v.variableName}"
val splitThreshold = SQLConf.get.methodSplitThreshold
val cases = if (switchCaseExprs.flatMap(_._2.map(_._2.code.length)).sum > splitThreshold) {
A reviewer (Member) commented:

Can we call generateUpdateCode only once, before line 198?
IMHO, the code in all three cases (lines 203-, 216-, and 225-) is generated by generateUpdateCode().

@SparkQA

SparkQA commented May 12, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42943/

@SparkQA

SparkQA commented May 12, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42943/

@SparkQA

SparkQA commented May 12, 2021

Test build #138421 has finished for PR 32457 at commit cdbbffe.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member

viirya commented May 12, 2021

Looks like a related test failure.

@maropu maropu force-pushed the splitSwitchCode branch from cdbbffe to 9e74ca6 Compare May 12, 2021 07:07
@SparkQA

SparkQA commented May 12, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42958/

@SparkQA

SparkQA commented May 12, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42958/

@SparkQA

SparkQA commented May 12, 2021

Test build #138437 has finished for PR 32457 at commit 9e74ca6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Member

kiszk commented May 13, 2021

LGTM if the failures disappear.

@viirya viirya (Member) left a comment

lgtm

@viirya
Member

viirya commented May 14, 2021

Thanks @maropu and @kiszk! Merging to master.

@viirya viirya closed this in 8fa739f May 14, 2021
@maropu
Member Author

maropu commented May 14, 2021

Thank you, @viirya and @kiszk ~

4 participants