Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refine decimal multiply, avoid cast to wider type #6331

Merged
merged 3 commits into from
May 13, 2023

Conversation

mingmwang
Copy link
Contributor

@mingmwang mingmwang commented May 11, 2023

Which issue does this PR close?

Closes #6278, bring the performance back.

Rationale for this change

For decimal multiply, avoid casting to the wider type, decimals can multiply decimals directly.

What changes are included in this PR?

Are these changes tested?

Main branch:

Running benchmarks with the following options: DataFusionBenchmarkOpt { query: Some(1), debug: false, iterations: 3, partitions: 1, batch_size: 8192, path: "./parquet_data", file_format: "parquet", mem_table: false, output_path: None, disable_statistics: true }
Query 1 iteration 0 took 1716.3 ms and returned 4 rows
Query 1 iteration 1 took 1697.0 ms and returned 4 rows
Query 1 iteration 2 took 1694.3 ms and returned 4 rows
Query 1 avg time: 1702.52 ms

Branch 23 (Tag 23.0.0):

Running benchmarks with the following options: DataFusionBenchmarkOpt { query: Some(1), debug: false, iterations: 3, partitions: 1, batch_size: 8192, path: "./parquet_data", file_format: "parquet", mem_table: false, output_path: None, disable_statistics: true, enable_scheduler: false }
Query 1 iteration 0 took 864.2 ms and returned 4 rows
Query 1 iteration 1 took 842.0 ms and returned 4 rows
Query 1 iteration 2 took 838.7 ms and returned 4 rows
Query 1 avg time: 848.29 ms

This PR:

Running benchmarks with the following options: DataFusionBenchmarkOpt { query: Some(1), debug: false, iterations: 3, partitions: 1, batch_size: 8192, path: "./parquet_data", file_format: "parquet", mem_table: false, output_path: None, disable_statistics: true }
Query 1 iteration 0 took 539.1 ms and returned 4 rows
Query 1 iteration 1 took 514.7 ms and returned 4 rows
Query 1 iteration 2 took 511.1 ms and returned 4 rows
Query 1 avg time: 521.61 ms

Are there any user-facing changes?

@mingmwang mingmwang requested review from viirya and alamb May 11, 2023 05:01
@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions labels May 11, 2023
@mingmwang mingmwang requested a review from liukun4515 May 11, 2023 05:01
@mingmwang
Copy link
Contributor Author

mingmwang commented May 11, 2023

override def resultDecimalType(p1: Int, s1: Int, p2: Int, s2: Int): DecimalType = {
    val resultScale = s1 + s2
    val resultPrecision = p1 + p2 + 1
    if (allowPrecisionLoss) {
      DecimalType.adjustPrecisionScale(resultPrecision, resultScale)
    } else {
      DecimalType.bounded(resultPrecision, resultScale)
    }
  }

  private lazy val numeric = TypeUtils.getNumeric(dataType, failOnError)

  protected override def nullSafeEval(input1: Any, input2: Any): Any = dataType match {
    case DecimalType.Fixed(precision, scale) =>
      checkDecimalOverflow(numeric.times(input1, input2).asInstanceOf[Decimal], precision, scale)
    case _: IntegerType if failOnError =>
      MathUtils.multiplyExact(
        input1.asInstanceOf[Int],
        input2.asInstanceOf[Int],
        getContextOrNull())
    case _: LongType if failOnError =>
      MathUtils.multiplyExact(
        input1.asInstanceOf[Long],
        input2.asInstanceOf[Long],
        getContextOrNull())
    case _ => numeric.times(input1, input2)
  }

@mingmwang
Copy link
Contributor Author

override def checkInputDataTypes(): TypeCheckResult = (left.dataType, right.dataType) match {
    case (l: DecimalType, r: DecimalType) if inputType.acceptsType(l) && inputType.acceptsType(r) =>
      // We allow decimal type inputs with different precision and scale, and use special formulas
      // to calculate the result precision and scale.
      TypeCheckResult.TypeCheckSuccess
    case _ => super.checkInputDataTypes()
  }

@mingmwang
Copy link
Contributor Author

And existing UTs show the result is correct.

query T
select arrow_typeof(c1*c5) from decimal_simple limit 1;
----
Decimal128(23, 13)


query R rowsort
select c1*c5 from decimal_simple;
----
0.00000000014
0.00000000033
0.00000000038
0.0000000005
0.00000000096
0.00000000105
0.0000000016
0.0000000016
0.00000000165
0.00000000176
0.00000000176
0.0000000026
0.0000000034
0.0000000039
0.000000005

Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. This fixed the tpch-q1 performance. But for the root cause, we still need to improve the i256 operations. Otherwise for queries containing decimal array to array multiplying, it still suffers performance issue.

@mingmwang
Copy link
Contributor Author

Good catch. This fixed the tpch-q1 performance. But for the root cause, we still need to improve the i256 operations. Otherwise for queries containing decimal array to array multiplying, it still suffers performance issue.

Yes, you are right. We still need to address the i256 operations performance, otherwise when it runs into the related logic, the performance is still not good.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing a bug by deleting code ❤️
Screenshot 2023-05-12 at 7 16 23 AM

@andygrove andygrove merged commit 063f99f into apache:main May 13, 2023
@viirya
Copy link
Member

viirya commented May 28, 2023

FYI, I'm going to improve the i256 operations at the upstream: apache/arrow-rs#4303.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Significant performance downgrade to tpch-q1
5 participants