Refine the size() calculation of accumulator #5904

yahoNanJing · 2023-04-07T03:26:24Z

Which issue does this PR close?

Closes #5903.

Rationale for this change

The flame graph generated by the following command:
CARGO_PROFILE_RELEASE_DEBUG=true cargo flamegraph --no-inline --freq 500 --bin tpch -- benchmark datafusion --path ./data-parquet/ --format parquet --partitions 1 -q 17 --iterations 10

shows that about 4% of the time is spent calling the accumulator's size(), which can be improved.

What changes are included in this PR?

Avoid unnecessary calls to ScalarValue's size().
Avoid recalculating the initial size of accumulator_set for every group.
Hoist the mode check to the top level to avoid an if-else branch for every group.
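
To illustrate the third item, here is a minimal sketch of hoisting a mode check out of a per-group loop. The `Mode` enum and both functions are hypothetical stand-ins written for this example (a count-style aggregate over plain `i64` states), not the actual code in row_hash.rs.

```rust
/// Hypothetical stand-in for an aggregation mode (not DataFusion's AggregateMode).
#[derive(Clone, Copy)]
pub enum Mode {
    /// Update group states from raw input rows.
    Partial,
    /// Merge previously computed partial states.
    Final,
}

/// Checks `mode` once per group, inside the loop.
pub fn update_per_group_check(mode: Mode, groups: &mut [i64], inputs: &[i64]) {
    for (state, input) in groups.iter_mut().zip(inputs) {
        match mode {
            // Count-style update: each raw row adds one.
            Mode::Partial => *state += 1,
            // Merge: add the partial count carried by the input.
            Mode::Final => *state += *input,
        }
    }
}

/// Checks `mode` once, then runs a branch-free loop for that mode.
pub fn update_hoisted(mode: Mode, groups: &mut [i64], inputs: &[i64]) {
    match mode {
        Mode::Partial => {
            for (state, _) in groups.iter_mut().zip(inputs) {
                *state += 1;
            }
        }
        Mode::Final => {
            for (state, input) in groups.iter_mut().zip(inputs) {
                *state += *input;
            }
        }
    }
}
```

Both functions produce the same group states; the hoisted version only moves the branch outside the loop. Whether that is measurably faster depends on the real per-group work, which this sketch does not model.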

After applying this PR, the flame graph shows almost no cost for the accumulator's size(). The benchmark time improves from 11s to 10.5s.

Are these changes tested?

Are there any user-facing changes?

mingmwang · 2023-04-07T03:34:25Z

datafusion/physical-expr/src/aggregate/sum.rs

@@ -266,7 +266,7 @@ impl Accumulator for SumAccumulator {
    }

    fn size(&self) -> usize {
-        std::mem::size_of_val(self) - std::mem::size_of_val(&self.sum) + self.sum.size()
+        std::mem::size_of_val(self)
    }


Will this size change after the AvgAccumulator/SumAccumulator struct is initialized?

I would suggest renaming the size method to size_in_bytes.

I think not. Option is an enum type.

From the flamegraph, it seems it's not a bottleneck now.

According to the docs

https://github.com/yahoNanJing/arrow-datafusion/blob/issue-5903/datafusion/expr/src/accumulator.rs#L88

    /// Allocated size required for this accumulator, in bytes, including `Self`.
    /// Allocated means that for internal containers such as `Vec`, the `capacity` should be used
    /// not the `len`
    fn size(&self) -> usize;

The change in this PR seems to avoid extra allocations in ScalarValue (such as ScalarValue::Utf8 which has an allocated string in it)
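
The difference between the two versions of size() in the diff can be sketched as follows. `Value` and `Acc` are hypothetical, simplified stand-ins for ScalarValue and SumAccumulator, assumed only for illustration; the point is that `std::mem::size_of_val` reports the inline size of a value and does not follow heap pointers such as the buffer owned by a `String`.

```rust
use std::mem::size_of_val;

/// Hypothetical, simplified stand-in for a scalar value.
pub enum Value {
    Int64(Option<i64>),
    Utf8(Option<String>),
}

impl Value {
    /// Inline size plus any heap capacity owned by the value.
    pub fn size(&self) -> usize {
        size_of_val(self)
            + match self {
                // Use capacity, not len, to count what is actually allocated.
                Value::Utf8(Some(s)) => s.capacity(),
                _ => 0,
            }
    }
}

/// Hypothetical accumulator holding a single value.
pub struct Acc {
    pub value: Value,
}

impl Acc {
    /// Counts only the inline bytes of the struct.
    pub fn shallow_size(&self) -> usize {
        size_of_val(self)
    }

    /// Replaces the inline size of `value` with its full (inline + heap) size.
    pub fn deep_size(&self) -> usize {
        size_of_val(self) - size_of_val(&self.value) + self.value.size()
    }
}
```

For an `Int64` value the two results are equal, because nothing lives on the heap. For a `Utf8` value holding a string, `deep_size()` exceeds `shallow_size()` by the string's capacity, which is the portion the shallow form does not report.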

mingmwang · 2023-04-07T03:45:29Z

datafusion/core/src/physical_plan/aggregates/row_hash.rs

+            .fold(acc, |acc, accumulator| acc + accumulator.size())
+    })
+}
+
 /// The state that is built for each output group.


I think the logic for collecting the memory size of the accumulators is quite complex; the size computation may cost more than the actually useful aggregation work.

Yes, I agree the calculation for the memory size is complicated.

I think @tustvold has been thinking about how to improve performance in this area, but I am not sure how far he has gotten

In general, managing individual allocations (and then accounting for their sizes) for each group is a significant additional overhead for grouping.
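
One way to reduce the per-group accounting cost, in the spirit of the cached initial size described in this PR, is to compute the size of a freshly created accumulator set once and reuse it for every new group. The sketch below assumes that each accumulator's size is fixed at creation; `SizedAcc`, `CountAcc`, and `Groups` are hypothetical names for this example and do not mirror the real row_hash.rs structures.

```rust
/// Hypothetical trait exposing only the size method discussed here.
pub trait SizedAcc {
    fn size(&self) -> usize;
}

/// Hypothetical fixed-size accumulator.
pub struct CountAcc {
    pub count: u64,
}

impl SizedAcc for CountAcc {
    fn size(&self) -> usize {
        std::mem::size_of_val(self)
    }
}

/// Builds a fresh set of `n` accumulators for one group.
fn new_set(n: usize) -> Vec<Box<dyn SizedAcc>> {
    (0..n)
        .map(|_| Box::new(CountAcc { count: 0 }) as Box<dyn SizedAcc>)
        .collect()
}

/// Sums size() over one accumulator set.
fn set_size(set: &[Box<dyn SizedAcc>]) -> usize {
    set.iter().fold(0, |acc, accumulator| acc + accumulator.size())
}

pub struct Groups {
    sets: Vec<Vec<Box<dyn SizedAcc>>>,
    accs_per_group: usize,
    /// Size of a freshly created set, computed once in `new`.
    initial_set_size: usize,
    /// Running total reported to memory accounting.
    pub allocated: usize,
}

impl Groups {
    pub fn new(accs_per_group: usize) -> Self {
        let initial_set_size = set_size(&new_set(accs_per_group));
        Self { sets: Vec::new(), accs_per_group, initial_set_size, allocated: 0 }
    }

    /// Adds a group and charges the cached size instead of calling size() again.
    pub fn add_group(&mut self) {
        self.sets.push(new_set(self.accs_per_group));
        self.allocated += self.initial_set_size;
    }

    /// Recomputes the total from scratch; used here only to check the cached total.
    pub fn recomputed(&self) -> usize {
        self.sets.iter().map(|s| set_size(s)).sum()
    }
}
```

The cached total stays correct only while accumulator sizes do not change after creation. Accumulators that grow (for example, ones holding strings or vectors) would still need their growth tracked separately.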

yahoNanJing · 2023-04-11T02:58:00Z

Hi @Dandandan, @alamb, could you help review this PR?

alamb · 2023-04-11T18:08:25Z

I will try and review this carefully later today

alamb

Thanks @yahoNanJing

The idea of caching the accumulator sizes seems like a good one.

I am not sure about the changes to Sum and Avg accumulators

Maybe we need to take a step back and figure out how to improve grouping performance more holistically rather than trying to do a special optimization on allocation accounting 🤔

For example #4973

alamb · 2023-04-11T20:49:16Z

datafusion/physical-expr/src/aggregate/sum.rs

@@ -266,7 +266,7 @@ impl Accumulator for SumAccumulator {
    }

    fn size(&self) -> usize {
-        std::mem::size_of_val(self) - std::mem::size_of_val(&self.sum) + self.sum.size()
+        std::mem::size_of_val(self)
    }


According to the docs

https://github.com/yahoNanJing/arrow-datafusion/blob/issue-5903/datafusion/expr/src/accumulator.rs#L88

    /// Allocated size required for this accumulator, in bytes, including `Self`.
    /// Allocated means that for internal containers such as `Vec`, the `capacity` should be used
    /// not the `len`
    fn size(&self) -> usize;

The change in this PR seems to avoid extra allocations in ScalarValue (such as ScalarValue::Utf8 which has an allocated string in it)

alamb · 2023-04-11T20:50:23Z

datafusion/core/src/physical_plan/aggregates/row_hash.rs

+                                })
+                        })?;
+                }
+                AggregateMode::FinalPartitioned | AggregateMode::Final => {


Maybe GitHub is rendering the diff confusingly, but this seems like a significant amount of new code.

alamb · 2023-04-11T20:51:22Z

datafusion/core/src/physical_plan/aggregates/row_hash.rs

+            .fold(acc, |acc, accumulator| acc + accumulator.size())
+    })
+}
+
 /// The state that is built for each output group.


Yes, I agree the calculation for the memory size is complicated.

I think @tustvold has been thinking about how to improve performance in this area, but I am not sure how far he has gotten

In general, managing individual allocations (and then accounting for their sizes) for each group is a significant additional overhead for grouping.

yahoNanJing · 2023-04-12T03:38:35Z

Thanks @alamb for your comments. I just refactored the code based on the latest main branch.

alamb · 2023-05-12T12:26:44Z

I am trying to clean up outstanding PRs and I came across this one. What shall we do with it @yahoNanJing -- should we pursue getting it merged?

alamb · 2023-05-30T17:55:59Z

Marking as Draft until we come to a consensus on what to do with this PR (so it is not on the review list)

alamb · 2024-04-08T21:08:28Z

Since this has been open for more than a year, closing it down. Feel free to reopen if/when you keep working on it.

github-actions bot added the core (Core DataFusion crate) and physical-expr (Physical Expressions) labels Apr 7, 2023

mingmwang reviewed Apr 7, 2023

yahoNanJing mentioned this pull request Apr 10, 2023

Change back SmallVec to Vec for JoinHashMap - Issue 5940 #5941

Closed

alamb reviewed Apr 11, 2023

yahoNanJing force-pushed the issue-5903 branch from aa85ea7 to 00fea4e April 12, 2023 02:21

Refine the size() calculation of accumulator

f05b108

yahoNanJing force-pushed the issue-5903 branch from 00fea4e to f05b108 April 12, 2023 02:56

alamb mentioned this pull request Apr 24, 2023

Export benchmark information as line protocol #6107

Open

alamb marked this pull request as draft May 30, 2023 17:56

alamb closed this Apr 8, 2024
