GroupedHashAggregate in row format #2452

Closed
yjshen opened this issue May 5, 2022 · 0 comments · Fixed by #2375
Labels
enhancement New feature or request

yjshen commented May 5, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We could improve both the performance and the memory usage of GroupedHashAggregate by employing a row format.
With Vec<u8>-backed rows, we can:

  1. Compare compound grouping keys by comparing their raw bytes directly.
  2. Create all accumulator states for a key by allocating a single Vec<u8>, and update its contents in place.
  3. Reduce the memory footprint of each group state by moving from a Vec<ScalarValue>-based state to a Vec<u8>-based state, which carries less datatype information per value.
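To make the three points above concrete, here is a minimal sketch that uses only the standard library. It is not DataFusion's row format: the key encoding, the 16-byte state layout (SUM followed by COUNT), and every function name are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical encoding of a compound grouping key (an i32 and a string).
/// The string is length-prefixed so two different keys never share an encoding,
/// which lets the hash map compare keys as raw bytes.
fn encode_key(a: i32, b: &str) -> Vec<u8> {
    let mut row = Vec::with_capacity(4 + 4 + b.len());
    row.extend_from_slice(&a.to_le_bytes());
    row.extend_from_slice(&(b.len() as u32).to_le_bytes());
    row.extend_from_slice(b.as_bytes());
    row
}

/// Assumed state layout: bytes 0..8 hold SUM (i64), bytes 8..16 hold COUNT (i64).
const STATE_WIDTH: usize = 16;

/// Read a little-endian i64 from `state` at `offset`.
fn read_i64(state: &[u8], offset: usize) -> i64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&state[offset..offset + 8]);
    i64::from_le_bytes(buf)
}

/// Update SUM and COUNT in place; no per-value type tag is stored.
fn update_state(state: &mut [u8], value: i64) {
    let sum = read_i64(state, 0) + value;
    let count = read_i64(state, 8) + 1;
    state[0..8].copy_from_slice(&sum.to_le_bytes());
    state[8..16].copy_from_slice(&count.to_le_bytes());
}

/// Group `(key_a, key_b, value)` rows; both keys and states are plain byte vectors.
fn aggregate(rows: &[(i32, &str, i64)]) -> HashMap<Vec<u8>, Vec<u8>> {
    let mut groups: HashMap<Vec<u8>, Vec<u8>> = HashMap::new();
    for &(a, b, v) in rows {
        // One zeroed Vec<u8> holds all accumulator states for a new group.
        let state = groups
            .entry(encode_key(a, b))
            .or_insert_with(|| vec![0u8; STATE_WIDTH]);
        update_state(state, v);
    }
    groups
}
```

Calling `aggregate` on a few rows and reading a group back with `read_i64(state, 0)` and `read_i64(state, 8)` gives that group's SUM and COUNT. The offsets are fixed by the layout, so the state needs no type information of its own.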

Describe the solution you'd like

  1. A new Accumulator trait that updates and merges state stored as Vec<u8>.
  2. A branch in AggregateExec::execute that uses row-based aggregation when applicable.
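A rough sketch of what these two pieces might look like is given below. Everything in it is hypothetical: `RowAccumulator`, `SumI64`, `ColumnType`, `AggregatePath`, and `choose_path` are names invented for illustration, not DataFusion APIs. The issue does not say when row-based aggregation applies, so the rule used here (every state column is fixed-width) is only an assumption.

```rust
/// Hypothetical trait: an accumulator whose state lives in a caller-owned byte buffer.
trait RowAccumulator {
    /// Number of bytes this accumulator needs in the state row.
    fn state_width(&self) -> usize;
    /// Fold one input value into the state, in place.
    fn update(&self, state: &mut [u8], value: i64);
    /// Fold another partial state into this one, in place.
    fn merge(&self, state: &mut [u8], other: &[u8]);
    /// Produce the final value from the state.
    fn evaluate(&self, state: &[u8]) -> i64;
}

/// SUM over i64, stored as 8 little-endian bytes.
struct SumI64;

fn load(state: &[u8]) -> i64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&state[..8]);
    i64::from_le_bytes(buf)
}

impl RowAccumulator for SumI64 {
    fn state_width(&self) -> usize {
        8
    }
    fn update(&self, state: &mut [u8], value: i64) {
        let sum = load(state) + value;
        state[..8].copy_from_slice(&sum.to_le_bytes());
    }
    fn merge(&self, state: &mut [u8], other: &[u8]) {
        let sum = load(state) + load(other);
        state[..8].copy_from_slice(&sum.to_le_bytes());
    }
    fn evaluate(&self, state: &[u8]) -> i64 {
        load(state)
    }
}

/// Stand-in for a column data type (not Arrow's DataType).
#[derive(Clone, Copy, PartialEq, Debug)]
enum ColumnType {
    Int64,
    Float64,
    Utf8,
}

#[derive(PartialEq, Debug)]
enum AggregatePath {
    RowBased,
    ScalarBased,
}

/// Assumed applicability rule: take the row-based path only when every
/// accumulator state column is fixed-width.
fn choose_path(state_types: &[ColumnType]) -> AggregatePath {
    let fixed_width = state_types
        .iter()
        .all(|t| matches!(t, ColumnType::Int64 | ColumnType::Float64));
    if fixed_width {
        AggregatePath::RowBased
    } else {
        AggregatePath::ScalarBased
    }
}
```

In this sketch the accumulator holds no data itself. The same `SumI64` value can serve every group, and the hash table owns one byte buffer per key.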

Describe alternatives you've considered

Additional context
#1708 and #2188

@yjshen yjshen added the enhancement New feature or request label May 5, 2022