Combine tree and shuffle methods in DataFrameGroupBy.agg tile #3051

Merged (2 commits) on May 23, 2022

hekaisheng
Contributor

@hekaisheng hekaisheng commented May 18, 2022

What do these changes do?

This PR applies both the tree and shuffle methods when DataFrameGroupBy.agg uses its auto method. It first combines map chunks into larger chunks, sized according to the sample chunk's size, and then shuffles those combined chunks. This brings improvements when executing groupby on a cluster.
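
The combine-then-shuffle idea described above can be illustrated with a minimal, pure-Python sketch. This is not the Mars implementation: the function names (`combine_by_size`, `merge_partials`, `shuffle`, `groupby_sum`), the `target_size` threshold, and the use of plain dicts as per-chunk partial sums are all assumptions made for illustration only.

```python
from collections import defaultdict


def combine_by_size(chunks, sizes, target_size):
    """Tree-style step (hypothetical): group consecutive map chunks
    until each group's total size reaches roughly target_size."""
    groups, current, current_size = [], [], 0
    for chunk, size in zip(chunks, sizes):
        current.append(chunk)
        current_size += size
        if current_size >= target_size:
            groups.append(current)
            current, current_size = [], 0
    if current:  # leftover chunks form a final, smaller group
        groups.append(current)
    return groups


def merge_partials(partials):
    """Merge per-key partial sums from several chunks into one dict."""
    merged = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            merged[key] += value
    return dict(merged)


def shuffle(combined, n_reducers):
    """Shuffle step (hypothetical): hash-partition every combined chunk
    by key, then let each reducer merge the pieces it receives."""
    buckets = [[] for _ in range(n_reducers)]
    for partial in combined:
        for key, value in partial.items():
            buckets[hash(key) % n_reducers].append({key: value})
    return [merge_partials(bucket) for bucket in buckets]


def groupby_sum(chunks, sizes, target_size, n_reducers):
    """Combine map chunks into larger ones first, then shuffle them."""
    groups = combine_by_size(chunks, sizes, target_size)
    combined = [merge_partials(group) for group in groups]
    result = {}
    for part in shuffle(combined, n_reducers):
        result.update(part)  # keys are disjoint across reducers
    return result


# Four map chunks of partial sums, with assumed sizes (e.g. row counts).
map_chunks = [{"a": 1, "b": 2}, {"a": 3}, {"c": 4, "b": 1}, {"a": 5}]
chunk_sizes = [2, 1, 2, 1]
print(sorted(groupby_sum(map_chunks, chunk_sizes, 3, 2).items()))
# → [('a', 9), ('b', 3), ('c', 4)]
```

In this sketch, combining first reduces the four map chunks to two before any data is partitioned, so the shuffle handles fewer and larger inputs.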

Related issue number

Fixes #xxxx

Check code requirements

  • tests added / passed (if needed)
  • Ensure all linting tests pass, see here for how to run them

@hekaisheng hekaisheng added the type: enhancement request, mod: dataframe, and to be backported (Indicate that the PR need to be backported to stable branch) labels May 18, 2022
@hekaisheng hekaisheng added this to the v0.10.0a1 milestone May 18, 2022
@hekaisheng hekaisheng marked this pull request as ready for review May 19, 2022 10:21
@hekaisheng hekaisheng requested a review from a team as a code owner May 19, 2022 10:21
qinxuye
qinxuye previously approved these changes May 19, 2022
Collaborator

@qinxuye qinxuye left a comment

LGTM

Co-authored-by: Wenjun Si <[email protected]>
Member

@wjsi wjsi left a comment

LGTM

Collaborator

@qinxuye qinxuye left a comment

LGTM

@qinxuye qinxuye merged commit d2fde16 into mars-project:master May 23, 2022
@qinxuye qinxuye deleted the enh/groupby-agg branch May 23, 2022 07:42
hekaisheng added a commit to hekaisheng/mars that referenced this pull request May 24, 2022
wjsi pushed a commit that referenced this pull request May 24, 2022
@wjsi wjsi added backported already PR has been backported and removed to be backported Indicate that the PR need to be backported to stable branch labels May 24, 2022